Experimental neural network architectures based on Self-Organized Criticality (SOC) on geometric lattices, cellular automaton dynamics, and low-rank hypernetwork weight generation. The core idea: replace traditional layers (Linear, Conv, Attention) with physics-inspired spiking neuron grids that evolve over time through local neighborhood interactions.
INPUT (images, text, tabular, time-series)
|
v
┌──────────────────────────────────────────────────────────────────────────┐
│ PROJECTOR │
│ ┌──────────────────┐ ┌──────────────────────────┐ │
│ │ ViT Patch Aggreg. │ or │ CP Tensor Rank Projector │ or │ Simple Lin. │
│ │ (vision) │ │ einsum('brx,bry,brz │ │ (tabular) │
│ │ CLS token Txfmr │ │ -> bxyz') │ │ │
│ └────────┬─────────┘ └───────────┬──────────────┘ └──────┬───────┘
│ └──────────────────────────┴──────────────────────────┘
│ │
│ v driving currents [B, L, L, L]
│ │
│ ┌──────────────────────────────────┴──────────────────────────────────┐ │
│ │ VECTORIZED SOC LATTICE (LxLxL spiking neuron grid) │ │
│ │ │ │
│ │ For t = 0 .. T_steps: │ │
│ │ spikes[t] = Heaviside(mem - threshold) ← surrogate gradient │ │
│ │ activity = sparse_adj @ (spikes * resources) │ │
│ │ mem = mem * decay + activity │ │
│ │ resources = clamp(resources - u*spikes + tau*recovery) │ │
│ │ │ │
│ │ Physics regularization: │ │
│ │ branching_ratio (sigma) → 1.0 (critical point) │ │
│ │ excess_clustering → maximize modularity │ │
│ └──────────────────────────────────┬──────────────────────────────────┘ │
│ │ │
│ v spike history [B, T, L^3] │
│ │ │
│ ┌──────────────────────────────────┴──────────────────────────────────┐ │
│ │ READOUT │ │
│ │ ┌──────────────────────────┐ ┌───────────────────────────┐ │ │
│ │ │ Holographic Tensor │ │ Class-Query Cross-Attn. │ ... │ │
│ │ │ fx,fy,fz factor mats │ │ scan temporal history │ │ │
│ │ │ rank-R spatial contract │ │ with learned class tokens │ │ │
│ │ └───────────┬─────────────┘ └────────────┬──────────────┘ │ │
│ │ └─────────────────────────────┘ │ │
│ └──────────────────────────────────┬──────────────────────────────────┘ │
│ │ │
│ v [B, steps, rank] │
│ Temporal Transformer │
│ │ │
│ v [B, num_classes] │
│ LOGITS │
└──────────────────────────────────────────────────────────────────────────┘
LOSS = task_loss + alpha * criticality_loss + beta * clustering_loss + gamma * activity_loss
- Block / BlockSparse: Dense/sparse cellular automaton on N-dimensional grids with relativistic (magnitude-gated) update rules: `M = sigmoid(alpha * (|S_i| - |S_j|))`, `S' = tanh(M * W * S)`. See `core/block.py`, `core/blockconv.py`, `core/model.py`.
- Low-Rank Hypernetworks: Generate per-sample interaction weights via FiLM-conditioned SIREN networks and CP tensor decomposition, with Adaptive Computation Time (ACT) halting. See `core/lowrankact.py`, `helena/`.
- Multi-Channel PC Logistic (LMN Apps): Simplified single-vector relativistic updates in C parallel channels with cross-channel self-attention. Wraps frozen ResNet backbones for image tasks. See `lmn_apps/`.
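The magnitude-gated update named in the Block / BlockSparse item can be sketched in plain Python. This is a minimal illustration, not the code in core/block.py. The function name `relativistic_step`, the dense list-of-lists weight matrix, and the reading that the gate for the pair (i, j) scales the contribution of cell j to cell i are all assumptions made for this example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relativistic_step(S, W, alpha=1.0):
    """One magnitude-gated update: S'_i = tanh(sum_j M_ij * W_ij * S_j).

    M_ij = sigmoid(alpha * (|S_i| - |S_j|)) gates how much cell j drives cell i.
    """
    n = len(S)
    S_new = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            gate = sigmoid(alpha * (abs(S[i]) - abs(S[j])))
            total += gate * W[i][j] * S[j]
        S_new.append(math.tanh(total))
    return S_new

# Hypothetical 3-cell example with hand-picked weights.
S = [0.5, -1.0, 0.2]
W = [[0.0, 0.3, -0.2],
     [0.1, 0.0, 0.4],
     [-0.5, 0.2, 0.0]]
S_next = relativistic_step(S, W, alpha=2.0)
```

Larger `alpha` makes the gate closer to a hard comparison of magnitudes, while `alpha = 0` reduces it to a constant factor of 0.5.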
lattice-machine-networks/
├── core/ # SOC Lattice Core (16 files)
├── vision/ # SOC Vision: DomainNet/Sketch classification (3 files)
├── nlp/ # SOC NLP: BabyLM, TinyStories, BPTT (4 files)
├── lmn_apps/ # Multi-Channel PC Logistic applications (5 files)
├── helena/ # Hypernetwork cellular automata on Helena dataset (6 files)
├── training/ # Basic Block training scripts (6 files)
├── dataloaders/ # Reusable data loading infrastructure (6 files)
├── requirements.txt
├── .gitignore
└── README.md
core/ (SOC Lattice Core)

| File | Lines | Description |
|---|---|---|
| `block.py` | 73 | Block: dense NxN weight matrix, d-dimensional grid, relativistic update |
| `blockconv.py` | 111 | Block (optimized): vectorized neighbor gather via precomputed index maps |
| `model.py` | 201 | BlockSparseGrid + TimeSeriesSpatioTemporalModel: 1D line→dD cube pipeline |
| `cnn.py` | 77 | EuclideanConnected3D: 3D conv with distance-based weight mask |
| `graph.py` | 96 | NDimLatticeNetwork: sparse adjacency graph lattice with tanh activation |
| `snn.py` | 227 | SOCLattice: first spiking variant, surrogate gradient, resource dynamics |
| `smartnet.py` | 81 | SmartNet: multi-column blocks + inter-column self-attention at each depth |
| `socmodel.py` | 443 | TabularSOCSystem: hypernetwork projection + SOC lattice + temporal attn readout |
| `soc44.py` | 937 | FullSOCSystem (vision): ViT aggregator + CP projector + holographic transformer readout |
| `lowrankact.py` | 303 | LowRankHyperNetACT: FiLM+SIREN generators, CP decomposition, ACT halting |
| `memory.py` | 503 | PriceMemoryAnalyzer: PyQt6 crypto OHLCV visualization GUI |
| `mt.py` | 724 | STSOC_DomainNet: CNN voxel projector + SOC lattice + class-query cross-attn |
| `socmodeltest2.py` | 423 | Tabular SOC on FashionMNIST with low-rank linear projection |
| `socmodeltest3.py` | 496 | 3D volumetric SOC on FashionMNIST with OverlappingGroupedReadout (CNN) |
| `socmodeltest4.py` | 982 | SOC on DomainNet sketch: GPU patchify + HolographicEmbeddingReadout |
| `socmodeltest5.py` | 930 | Colab version of socmodeltest4: InMemoryDataset + EmbeddingDotProductReadout |
vision/ (SOC Vision: DomainNet/Sketch classification)

| File | Lines | Description |
|---|---|---|
| `socVISION444_3M.py` | 1206 | Refined SOC vision: CSR lattice, clustering loss, GPU data prefetcher |
| `stsocdomainnet.py` | 1447 | Multi-domain DomainNet: single/all/lodo modes, ViT-style ConcatenatedQueryProjector |
| `pre.py` | 191 | GPU patchification pipeline: images → patched tensors saved as chunked .pt files |
nlp/ (SOC NLP: BabyLM, TinyStories, BPTT)

| File | Lines | Description |
|---|---|---|
| `socmodeltest5nlp.py` | 554 | TinyStories + GloVe, sliding window, memory token passing, full BPTT |
| `socnlp.py` | 768 | BabyLM 10M + BERT tiny frozen embeddings, CSR lattice, holographic metric readout |
| `socnlptbtt.py` | 543 | Truncated BPTT on tokenized .bin files, continuous lattice state, ArcTan surrogate |
| `soctokentorank.py` | 1692 | Unified SharedEmbeddingProjectorReadout, trainable embeddings from BERT vocab |
lmn_apps/ (Multi-Channel PC Logistic applications)

| File | Lines | Description |
|---|---|---|
| `lmnBenchtest.py` | 235 | Benchmark: frozen ResNet50 + MultiChannelRelativisticLayer head on synthetic data |
| `lmncrypto.py` | 421 | Crypto binary prediction: 1D-CNN encoder + MultiChannelPCNet on Kraken OHLCV |
| `lmnfood101.py` | 329 | Food-101 classification: frozen ResNet50 + MultiChannelPCNet head |
| `lmnIMGNET.py` | 267 | ImageNet-1K: frozen ResNet18 + MultiChannelPCNet head |
| `lmnma.py` | 391 | Crypto regression: 100 moving averages → MultiChannelPCNet → future RSI prediction |
helena/ (Hypernetwork cellular automata on Helena dataset)

| File | Lines | Description |
|---|---|---|
| `routerphelena.py` | 177 | BlockSparseHyperNet: MLP router generates per-cell dynamic weights |
| `routerhelena.py` | 208 | BlockSparseFactoredRouter: learnable cell identity embeddings + factored router |
| `helenafullroutertimes.py` | 267 | ImplicitHyperNet: INRs + spatial attention + temporal attention |
| `fullrouterhelena.py` | 306 | ImplicitHyperNet: CP low-rank decomposition + FiLM-conditioned SIREN generators |
| `hlowrankgenerate.py` | 250 | Dual low-rank factorization: emitter/listener generators + attention readout |
| `sinefilmrouter.py` | 449 | LowRankHyperNet: FiLM+SIREN + dynamic multi-head attention readout (regression) |
training/ (Basic Block training scripts)

| File | Lines | Description |
|---|---|---|
| `train.py` | 270 | HIGGS binary classification (28 features → LowRankHyperNet) |
| `imgnettrain.py` | 163 | ImageNet-1K with Block (2D grid, grayscale, 1000-class) |
| `mnisttrain.py` | 117 | MNIST digit classification with Block (2D grid, 10-class) |
| `helenatrain.py` | 173 | Helena 100-class classification with BlockSparse (4D grid, 10000 nodes) |
| `giveitatry.py` | 694 | Crypto time-series with STSOCFinance: spiking SOC + transformer regressor |
| `trial.py` | 26 | Smoke test: SmartNet instantiation and forward pass |
dataloaders/ (Reusable data loading infrastructure)

| File | Lines | Description |
|---|---|---|
| `dataloader.py` | 210 | ImageNet ILSVRC (Kaggle format): train_cls.txt + LOC_val_solution.csv |
| `dataloader_.py` | 91 | ImageNet grayscale variant, ImageFolder + CSV synset mapping |
| `helenadataloader.py` | 126 | Helena .arff loader: scipy.io.arff → pandas → sklearn preprocessing |
| `yeardataloader.py` | 165 | YearPredictionMSD: auto-download from UCI, 90 audio features → year |
| `imagenetdataloader.py` | 213 | ImageNet with XML annotation parsing, validation set reorganization |
| `domainetdownlaod.py` | 60 | DomainNet download script (6 domains, BU server, resume-capable) |
| Dataset | Task | Dim | Classes | Size | Loader |
|---|---|---|---|---|---|
| MNIST | Classification | 28x28 gray | 10 | 70k | torchvision |
| FashionMNIST | Classification | 28x28 gray | 10 | 70k | sklearn.fetch_openml |
| HIGGS | Classification | 28 features | 2 | 11M | inline CSV |
| Helena | Classification | ~27 features | 100 | 65k | helenadataloader.py |
| YearPredictionMSD | Regression | 90 features | 1 | 515k | yeardataloader.py |
| DomainNet/Sketch | Classification | 224x224 RGB | 345 | 70k | inline ImageFolder |
| ImageNet-1K | Classification | 224x224 RGB | 1000 | 1.28M | dataloader.py |
| Food-101 | Classification | 224x224 RGB | 101 | 101k | torchvision |
| BabyLM 10M | Language Model | text tokens | vocab | ~10M words | HuggingFace datasets |
| TinyStories | Language Model | text tokens | vocab | ~3B chars | inline text |
| Kraken OHLCV | Regression | 13 features | continuous | 8656 CSV | inline IterableDataset |
| Covertype | Classification | 54 features | 7 | 581k | sklearn |
# Install dependencies
pip install -r requirements.txt
# Quick smoke test — instantiate a SOC lattice
cd core && python snn.py
# Test Block forward pass
cd core && python blockconv.py
# Train on MNIST (small, fast)
cd training && python mnisttrain.py
# Train tabular SOC on FashionMNIST
cd core && python socmodeltest2.py
# Run benchmark with synthetic data (GPU)
cd lmn_apps && python lmnBenchtest.py
# Crypto time-series training (requires Kraken_OHLCVT/ data)
cd lmn_apps && python lmnma.py
# NLP training (auto-downloads TinyStories + GloVe)
cd nlp && python socmodeltest5nlp.py
# Helena hypernetwork (requires file1c556677f875.arff)
cd helena && python routerhelena.py
Most training scripts require the relevant dataset to be present. See individual file docstrings for data setup.
| Package | Purpose |
|---|---|
| `torch>=2.0` | Core framework, autograd, CUDA |
| `torchvision` | Pretrained models (ResNet), datasets, transforms |
| `numpy` | Numerical arrays, data manipulation |
| `scipy` | ARFF file loading (Helena) |
| `scikit-learn` | Preprocessing, train_test_split, StandardScaler |
| `pandas` | CSV/DataFrame handling |
| `matplotlib` | Loss plotting |
| `tqdm` | Progress bars |
| `Pillow` | Image loading |
| `requests` | Dataset downloads |
| `transformers` | BERT model/tokenizer (NLP scripts) |
| `datasets` | HuggingFace dataset loading (BabyLM) |
The lattice operates at the critical point (branching ratio σ = 1.0) between decaying (σ < 1) and exploding (σ > 1) activity regimes. This is the computational sweet spot where information propagates maximally without damping or saturating the network. The branching ratio is computed as:
σ_t = |spikes_t| / |spikes_{t-1}| → MSE(σ, 1.0) → 0
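To make the formula concrete, here is a small sketch of the branching-ratio estimate and its penalty. It assumes the total spike count per time step is already available as a list. The function names, the `eps` guard, and the choice to skip steps with no preceding activity are decisions made for this example, not details taken from the repository.

```python
def branching_ratios(spike_counts, eps=1e-8):
    """sigma_t = |spikes_t| / |spikes_{t-1}| for every step with prior activity."""
    ratios = []
    for prev, curr in zip(spike_counts, spike_counts[1:]):
        if prev > 0:  # assumption: ignore steps that follow a silent step
            ratios.append(curr / (prev + eps))
    return ratios

def criticality_loss(spike_counts, target=1.0):
    """Mean squared error between the per-step branching ratios and the target."""
    ratios = branching_ratios(spike_counts)
    if not ratios:
        return 0.0
    return sum((r - target) ** 2 for r in ratios) / len(ratios)

steady = [10, 10, 10, 10]   # activity is sustained: sigma is about 1 at every step
decaying = [16, 8, 4, 2]    # activity halves each step: sigma is about 0.5

loss_steady = criticality_loss(steady)
loss_decaying = criticality_loss(decaying)
```

The sustained sequence gives a loss near zero, while the halving sequence is penalized by roughly (0.5 - 1.0)² = 0.25.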
A 3D grid of L³ spiking neurons with:
- Membrane potential (`mem`): leaks with learned per-neuron decay ∈ [0.4, 0.99]
- Resources: depleted by firing (`u` utilization ∈ [0.01, 0.5]), recover with learned `tau` ∈ [10, 100]
- Sparse adjacency: CSR format, radius-R neighborhood, periodic boundaries
- Surrogate gradient: Gaussian/sigmoid/ArcTan approximation of the Heaviside step for backprop
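The components listed above can be combined into a single forward update. The following is a dependency-free sketch under several stated assumptions: scalar `decay`, `u`, and `tau` instead of learned per-neuron values, a uniform coupling weight `w`, a radius-1 (six-neighbor) periodic neighborhood stored as index lists rather than CSR, an additive driving current, and resources that relax toward 1 with time constant `tau`. It has no surrogate gradient because it only runs the forward pass. None of these names come from the repository.

```python
def neighbors(L):
    """Flattened indices of the 6 nearest neighbours on an LxLxL periodic grid."""
    def idx(x, y, z):
        return ((x % L) * L + (y % L)) * L + (z % L)
    table = []
    for x in range(L):
        for y in range(L):
            for z in range(L):
                table.append([idx(x + 1, y, z), idx(x - 1, y, z),
                              idx(x, y + 1, z), idx(x, y - 1, z),
                              idx(x, y, z + 1), idx(x, y, z - 1)])
    return table

def soc_step(mem, res, drive, nbrs, threshold=1.0, decay=0.9, u=0.2, tau=20.0, w=0.15):
    """One lattice update; returns (spikes, new_mem, new_res)."""
    n = len(mem)
    spikes = [1.0 if m > threshold else 0.0 for m in mem]   # Heaviside step
    out = [s * r for s, r in zip(spikes, res)]               # resource-scaled output
    activity = [w * sum(out[j] for j in nbrs[i]) for i in range(n)]
    new_mem = [mem[i] * decay + activity[i] + drive[i] for i in range(n)]
    # Assumed recovery rule: deplete by u on a spike, relax toward 1 with time constant tau.
    new_res = [min(1.0, max(0.0, res[i] - u * spikes[i] + (1.0 - res[i]) / tau))
               for i in range(n)]
    return spikes, new_mem, new_res

L = 3
nbrs = neighbors(L)
mem = [0.0] * L**3
mem[13] = 1.5                  # centre neuron (1, 1, 1) starts above threshold
res = [1.0] * L**3
spikes, mem, res = soc_step(mem, res, [0.0] * L**3, nbrs)
```

After this one step only the centre neuron has fired, its resources have dropped from 1.0 to 0.8, and each of its six neighbors has received a membrane increment of `w`.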
Input latent vectors are expanded to 3D driving currents via Canonical Polyadic (CANDECOMP/PARAFAC) decomposition:
x_vec, y_vec, z_vec = f_x(latent), f_y(latent), f_z(latent) # [B, R, L]
currents = einsum('brx, bry, brz -> bxyz', x_vec, y_vec, z_vec) # [B, L, L, L]
Holographic readout reverses this: 3D spike volumes are contracted along spatial axes using learnable factor matrices.
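As a worked illustration of what the einsum computes, the sketch below spells out the CP expansion and a matching contraction for a single sample (batch dimension omitted) with nested lists. The function names and the exact form of the readout (one scalar per rank) are assumptions for this example. The repository's readouts may contract differently.

```python
def cp_expand(x_vec, y_vec, z_vec):
    """Sum of R rank-1 outer products.

    currents[i][j][k] = sum_r x_vec[r][i] * y_vec[r][j] * z_vec[r][k]
    """
    R, L = len(x_vec), len(x_vec[0])
    return [[[sum(x_vec[r][i] * y_vec[r][j] * z_vec[r][k] for r in range(R))
              for k in range(L)] for j in range(L)] for i in range(L)]

def holographic_contract(volume, fx, fy, fz):
    """Contract a 3D volume against per-rank factor vectors: one scalar per rank."""
    R, L = len(fx), len(volume)
    return [sum(volume[i][j][k] * fx[r][i] * fy[r][j] * fz[r][k]
                for i in range(L) for j in range(L) for k in range(L))
            for r in range(R)]

# Rank R = 1, lattice side L = 2, integer factors so the arithmetic is easy to check.
x_vec, y_vec, z_vec = [[1, 2]], [[3, 4]], [[5, 6]]
currents = cp_expand(x_vec, y_vec, z_vec)

# Factors that select the corner (0, 0, 0) recover that single entry.
readout = holographic_contract(currents, [[1, 0]], [[1, 0]], [[1, 0]])
```

Here `currents[1][0][1]` equals 2 · 3 · 6 = 36 and `readout` is `[15]`, the value at corner (0, 0, 0). The expansion needs only 3 · R · L numbers to describe an L³ volume, which is the point of the low-rank parameterization.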
The non-spiking alternative. Cell states S interact through a magnitude-gated mechanism:
diff = abs(S_i) - abs(S_j) # relativistic difference
mask = sigmoid(alpha * diff) # learnable gate steepness
S_new = tanh(mask * W @ S) # gated interaction
Sample-conditional weight generation via FiLM-conditioned SIREN networks with Adaptive Computation Time:
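# Per-sample context from hypernetwork
state_context, weight_context = hypernetwork(x)
# SIREN generators produce low-rank factor vectors
U_state, V_state = siren_gen(state_context)
U_weight, V_weight = siren_gen(weight_context) # assumed: weight factors are generated from weight_context
W = einsum('bcnr, bckr -> bcnk', U_weight, V_weight) # per-sample interaction weights
# ACT halting: each sample can stop early
h_t = sigmoid(halting_net(state))
halting_sum = halting_sum + h_t # assumed: halting probabilities accumulate across steps
if halting_sum >= 1.0: stop
Most SOC models train with 3–4 loss terms: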
loss = (task_loss
        + alpha * crit_loss # push σ → 1.0
        + beta * clustering_loss # maximize modularity
        + gamma * activity_loss) # regulate firing rate
Many scripts use a burn-in phase: physics-only training for the first N epochs before adding task loss.
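The burn-in schedule can be expressed as a small helper that drops the task term for the first epochs. This is a sketch only: the function name, the default weights, and `burn_in_epochs=5` are placeholder assumptions, since the individual scripts choose their own values.

```python
def total_loss(task_loss, crit_loss, clustering_loss, activity_loss,
               epoch, burn_in_epochs=5, alpha=1.0, beta=0.1, gamma=0.1):
    """Weighted sum of the loss terms; the task term is omitted during burn-in."""
    physics = alpha * crit_loss + beta * clustering_loss + gamma * activity_loss
    if epoch < burn_in_epochs:
        return physics            # physics-only phase
    return task_loss + physics    # full objective after burn-in

during_burn_in = total_loss(2.0, 0.5, 1.0, 1.0, epoch=0)
after_burn_in = total_loss(2.0, 0.5, 1.0, 1.0, epoch=5)
```

With these inputs the two values differ by exactly the task loss, so the physics regularizers keep the same weight in both phases.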
- `soctokentorank.py`: References `CONFIG['bert_model']`, which is undefined (line 253); will crash on import
- `socnlp.py` line 764: Variable `i` is out of scope after tqdm loop
- `socmodeltest4.py` / `socmodeltest5.py`: Loss formula uses `*` instead of `+` on line ~912/865: `loss = crit_loss + task_loss * 0.1 * act_loss * 0.1` should be `crit_loss + task_loss * 0.1 + act_loss * 0.1`
- `train.py` / `helenatrain.py`: `alpha` placed on CUDA as a plain tensor (not `nn.Parameter`), so it is frozen
- `memory.py`: Uses `PyQt6` + `pyqtgraph`, an optional GUI dependency not in core requirements
- Extensive code duplication: `VectorizedSOCLattice`, `SurrogateSpike`, and CP projectors are redefined independently in 8+ files
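The loss-formula issue is easy to see with numbers. The values below are made up for illustration; only the two expressions come from the issue list above.

```python
crit_loss, task_loss, act_loss = 0.5, 2.0, 0.0

# As written in the scripts: the task and activity terms are multiplied together.
as_written = crit_loss + task_loss * 0.1 * act_loss * 0.1

# Intended form: each term contributes independently.
intended = crit_loss + task_loss * 0.1 + act_loss * 0.1
```

With `act_loss` at zero the product form evaluates to 0.5, so the task loss contributes nothing, whereas the intended sum gives 0.7.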
The name Lattice Machine Networks (LMN) reflects the unifying concept: machine learning models where computation occurs on geometric lattices through local, physics-inspired interaction rules.
Research code — provided as-is for reproducibility and further experimentation.