CMP Architecture Timeline

Early CMP

Initial tests validating the relational binding matrix. High memory footprint, unstable gradients over long horizons.

CMP v1

Introduced the Hash Encoder. Proved token-savings hypothesis but struggled with OOV (Out of Vocabulary) generalization and morphology.

CMP v1.5

Sparse-gated refinement introduced. Stabilized the recurrent state, allowing for deeper networks.

CMP v1.9

Deprecated the Hash Encoder. Introduced character n-gram features and the LearnedSparseEncoder in the Yudi AI framework. Current focus is on the predictive output head.

PC-CMP Direction

Active experiments testing Predictive Coding (local prediction-error updates) as a replacement for global backpropagation in CMP.
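
To make "local prediction-error updates" concrete, here is a minimal sketch of predictive coding on a single linear layer, written in plain Python. It is a generic toy and not the PC-CMP implementation: the function names, layer sizes, learning rates, and synthetic data are all assumptions chosen for illustration. The point it shows is that the weight update uses only the prediction error and the latent activity at that layer, with no gradient propagated back through other layers or time steps.

```python
import random

def predict(W, z):
    """Top-down prediction of the input from latent state z."""
    return [sum(w * z_j for w, z_j in zip(row, z)) for row in W]

def infer(W, x, steps=20, lr_z=0.1):
    """Settle the latent state by repeatedly reducing the prediction error."""
    z = [0.0] * len(W[0])
    for _ in range(steps):
        err = [x_i - p_i for x_i, p_i in zip(x, predict(W, z))]
        # Each latent unit only sees the errors it connects to: z += lr * W^T err
        for j in range(len(z)):
            z[j] += lr_z * sum(W[i][j] * err[i] for i in range(len(x)))
    err = [x_i - p_i for x_i, p_i in zip(x, predict(W, z))]
    return z, err

def local_update(W, z, err, lr_w=0.05):
    """Hebbian-style rule: each weight changes by (its error unit) * (its latent unit)."""
    for i in range(len(W)):
        for j in range(len(z)):
            W[i][j] += lr_w * err[i] * z[j]

def mean_sq_error(W, data):
    return sum(sum(e * e for e in infer(W, x)[1]) for x in data) / len(data)

def make_data(rng, n=100):
    """Toy inputs (assumption): 4-D points lying in a 2-D subspace."""
    b1, b2 = [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]
    data = []
    for _ in range(n):
        a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append([a * u + b * v for u, v in zip(b1, b2)])
    return data

rng = random.Random(0)
data = make_data(rng)
W = [[rng.uniform(-0.1, 0.1) for _ in range(2)] for _ in range(4)]

error_before = mean_sq_error(W, data)
for _ in range(5):                      # a few passes over the toy data
    for x in data:
        z, err = infer(W, x)
        local_update(W, z, err)
error_after = mean_sq_error(W, data)
print(f"mean squared prediction error: {error_before:.4f} -> {error_after:.4f}")
```

Whether this kind of rule stays stable in a recurrent, sparse-gated model is exactly the open question the experiments are probing; the sketch only illustrates the update structure.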

Recent Results

Model Version | Params | Dataset | Training Method | Result | Interpretation / Caveat
CMP v1.5 | 12M | WikiText-103 (Sub) | Global BPTT | Converged faster than dense baseline. | Transformer PPL gap remains high on this specific task.
CMP v1.9 | 34M | Synthetic Relational | Global BPTT | Perfect recall on bounded entity-role mapping. | Does not prove generalization to unstructured natural language.
PC-CMP (Toy) | 1M | MNIST Sequences | Predictive Coding | Local updates reached 92% of BPTT accuracy. | Predictive coding open questions remain at 10M+ scale.

Failure Modes & Corrections

Honest documentation of what hasn't worked.

Hash Encoder Limitations

In CMP v1, we used a Hash Encoder to rapidly sparsify token inputs. While computationally cheap, it showed severe failure modes on Out-of-Vocabulary (OOV) terms and morphologically rich languages. Hash collisions created semantic smearing that later layers could not recover from. This led to the v1.9 correction: replacing it entirely with a LearnedSparseEncoder.
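
The collision problem can be illustrated with a small sketch. The code below is not the CMP Hash Encoder or the LearnedSparseEncoder; `hash_bucket`, `char_ngrams`, the bucket count, and the example words are hypothetical stand-ins. It shows two things: hashing more tokens than there are buckets forces unrelated tokens to share a slot, and character n-grams (the feature family introduced in v1.9) give morphologically related words overlapping features even when one of them was never seen.

```python
import hashlib

def hash_bucket(token, num_buckets):
    """Stable hash of a token into one of num_buckets slots (stand-in encoder)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def char_ngrams(token, n=3):
    """Set of character n-grams, with < and > marking word boundaries."""
    padded = f"<{token}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def jaccard(a, b):
    """Overlap between two feature sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b)

# 50 distinct tokens into 16 buckets: collisions are unavoidable.
tokens = [f"tok{i}" for i in range(50)]
buckets = [hash_bucket(t, 16) for t in tokens]
shared = len(tokens) - len(set(buckets))
print(f"{shared} of {len(tokens)} tokens share a bucket with an earlier token")

# Related word forms overlap in n-gram space; unrelated words do not.
related = jaccard(char_ngrams("walk"), char_ngrams("walked"))
unrelated = jaccard(char_ngrams("walk"), char_ngrams("table"))
print(f"walk/walked overlap: {related:.2f}, walk/table overlap: {unrelated:.2f}")
```

A hash bucket carries no information about which tokens collided, whereas n-gram overlap degrades gracefully for unseen word forms, which is one plausible reading of why the v1.9 change targets OOV and morphology.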

Transformer Perplexity (PPL) Gap

Despite theoretical memory advantages, our dense-matching baselines (such as standard LLaMA-architecture Transformers) still drastically outperform current CMP versions on pure next-token prediction perplexity. We are investigating whether this reflects an architectural flaw in CMP or is simply a symptom of using token-optimized datasets for a non-token architecture.
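
For reference, perplexity is the exponential of the mean per-token negative log-likelihood. The sketch below computes it from a list of natural-log token probabilities. It also includes bits per byte, one common normalization that does not depend on how text is split into tokens; that second metric is an illustrative suggestion, and nothing in the results above says it is used in the CMP evaluation. Function names are hypothetical.

```python
import math

def perplexity(token_log_probs):
    """exp(mean negative log-likelihood); inputs are natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

def bits_per_byte(token_log_probs, num_bytes):
    """Total negative log-likelihood in bits, divided by the raw text length in bytes."""
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    total_nll_nats = -sum(token_log_probs)
    return total_nll_nats / (num_bytes * math.log(2))

# A model that assigns probability 1/4 to every token has perplexity 4.
uniform_ppl = perplexity([math.log(0.25)] * 8)
print(f"perplexity: {uniform_ppl:.2f}")
```

Because perplexity is averaged per token, two models with different input units are not directly comparable on it, which is part of why the dataset question above matters.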

Need for Matched Baselines

Early experiments compared highly optimized Transformer code against unoptimized PyTorch CMP implementations. We have since corrected our evaluation framework to use FLOP-matched baselines, which reduced some of the "miraculous" efficiency gains we initially observed.
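
One simple way to think about compute matching is sketched below. It assumes the rule of thumb that training a dense model costs roughly 6 FLOPs per parameter per token (about 2 for the forward pass and 4 for the backward pass). That constant is an approximation for dense Transformers; whether it holds for CMP, especially with sparse gating, is an open assumption, so it is exposed as a parameter. The function names and the 1B-token budget are hypothetical; the 12M and 34M sizes are taken from the results table only as example inputs.

```python
def training_flops(params, tokens, flops_per_param_token=6):
    """Rough training cost: flops_per_param_token * parameters * tokens."""
    return flops_per_param_token * params * tokens

def tokens_for_budget(flop_budget, params, flops_per_param_token=6):
    """Largest whole number of training tokens that fits inside the FLOP budget."""
    return flop_budget // (flops_per_param_token * params)

# Budget defined by a 12M-parameter model trained on a hypothetical 1B tokens.
budget = training_flops(12_000_000, 1_000_000_000)

# A 34M-parameter model gets proportionally fewer tokens under the same budget.
matched_tokens = tokens_for_budget(budget, 34_000_000)
print(f"budget: {budget:.3e} FLOPs, matched tokens for 34M model: {matched_tokens:,}")
```

Matching on an analytic FLOP count removes differences in kernel optimization from the comparison, but it says nothing about wall-clock time, so the two should be reported separately.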

Open Benchmarks

Areas where we are actively seeking collaboration or developing new evaluation metrics:

Continual Learning
Few-Shot Learning
Cross-Modal Grounding
Interpretability
OOV and Morphology
Memory and State Tracking