HeLa-Mem

Code for HeLa-Mem on LongMemEval.

LoCoMo evaluation code will be released in the future.

LongMemEval Result

The table below shows the reference results to reproduce for HeLa-Mem on LongMemEval-S, using the full 500-item benchmark:

| Method          | Overall ACC |
|-----------------|-------------|
| LangMem         | 37.20       |
| MemoryOS        | 44.80       |
| Mem0            | 53.61       |
| FullText        | 56.80       |
| NaiveRAG        | 61.00       |
| A-MEM           | 62.60       |
| HeLa-Mem (Ours) | **65.40**   |

Included Code

HeLa-Mem/
├── hela_mem/
│   ├── encode_longmemeval.py
│   ├── eval_longmemeval.py
│   ├── hebbian_knowledge_memory.py
│   ├── hebbian_memory.py
│   ├── hebbian_retriever.py
│   ├── profile_utils.py
│   ├── reranker.py
│   └── utils.py
├── scripts/
│   ├── encode_longmemeval.sh
│   └── eval_longmemeval.sh
├── pyproject.toml
└── requirements.txt

Setup

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configure your API access:

export OPENAI_API_KEY="your-key"
export OPENAI_BASE_URL="https://api.openai.com/v1"

If you want multi-key rotation, provide:

export OPENAI_API_KEYS="key1,key2,key3"

or

export OPENAI_API_KEYS_FILE="/path/to/keys.txt"

The default model is `gpt-4o-mini`.
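
A minimal sketch of how the key configuration above could be consumed, assuming a simple round-robin policy (the parsing and rotation logic below are illustrative, not the repository's actual implementation):

```python
import itertools
import os


def load_api_keys():
    """Collect API keys from OPENAI_API_KEYS, a keys file, or OPENAI_API_KEY."""
    raw = os.environ.get("OPENAI_API_KEYS", "")
    keys = [k.strip() for k in raw.split(",") if k.strip()]
    keys_file = os.environ.get("OPENAI_API_KEYS_FILE")
    if not keys and keys_file:
        with open(keys_file) as fh:
            keys = [line.strip() for line in fh if line.strip()]
    if not keys and os.environ.get("OPENAI_API_KEY"):
        keys = [os.environ["OPENAI_API_KEY"]]
    return keys


def key_cycle():
    """Round-robin iterator over the configured keys, one per request."""
    return itertools.cycle(load_api_keys())
```

`OPENAI_API_KEYS` takes precedence here; the single `OPENAI_API_KEY` is the fallback, matching the setup order above.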

Dataset Format

Expected fields per item:

  • question_id
  • question
  • answer
  • question_type
  • question_date
  • haystack_dates
  • haystack_sessions
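
As a quick sanity check before encoding, the schema can be validated against the field list above (the loader itself is a sketch; field names are taken from this section):

```python
import json

REQUIRED_FIELDS = {
    "question_id", "question", "answer", "question_type",
    "question_date", "haystack_dates", "haystack_sessions",
}


def validate_items(path):
    """Raise ValueError if any dataset item is missing a required field."""
    with open(path) as fh:
        items = json.load(fh)
    for i, item in enumerate(items):
        missing = REQUIRED_FIELDS - item.keys()
        if missing:
            raise ValueError(f"item {i} missing fields: {sorted(missing)}")
    return len(items)
```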

The complete 500-item LongMemEval-S file (data/longmemeval_s.json) is bundled in this repository.

Experiment Entry Points

The LongMemEval experiment runs in two stages:

  1. encode_longmemeval.py
  2. eval_longmemeval.py

Encoding builds:

  • *_hebbian.json
  • *_long_term.json
  • *_long_term_kb_graph.json
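
A small completeness check for the three encoding artifacts above (the suffixes come from this list; the per-item prefix layout is an assumption):

```python
import pathlib

SUFFIXES = ("_hebbian.json", "_long_term.json", "_long_term_kb_graph.json")


def encoding_complete(mem_dir, prefix):
    """Return True if all three memory files exist for the given item prefix."""
    base = pathlib.Path(mem_dir)
    return all((base / f"{prefix}{suffix}").exists() for suffix in SUFFIXES)
```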

Evaluation does:

  • episodic retrieval
  • semantic retrieval
  • answer generation
  • GPT judge scoring
  • per-item result saving
  • summary aggregation
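
How the episodic and semantic retrieval streams might be combined before answer generation can be sketched as a capped, deduplicated merge (a hypothetical helper; the actual retriever lives in hebbian_retriever.py, and the caps mirror the --top_k and --semantic_top_k flags below):

```python
def merge_retrieved(episodic, semantic, top_k=15, semantic_top_k=5):
    """Combine top episodic hits with a capped number of semantic hits, deduplicated."""
    merged = list(episodic[:top_k])
    for item in semantic[:semantic_top_k]:
        if item not in merged:
            merged.append(item)
    return merged
```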

Reproduce

Configure standard OpenAI credentials:

export OPENAI_API_KEY="your-key"
export OPENAI_BASE_URL="https://api.openai.com/v1"

Then run the full 500-item experiment.

1. Encode

bash scripts/encode_longmemeval.sh

Or directly:

python -m hela_mem.encode_longmemeval \
  --data_path data/longmemeval_s.json \
  --output_dir results/longmemeval_mem_full \
  --workers 8

2. Evaluate

bash scripts/eval_longmemeval.sh

Or directly:

python -m hela_mem.eval_longmemeval \
  --data_path data/longmemeval_s.json \
  --mem_dir results/longmemeval_mem_full \
  --workers 8 \
  --top_k 15 \
  --semantic_top_k 5

Outputs are written under:

  • results/.../eval_results/result_<question_id>.json
  • results/.../eval_results/eval_summary.json

For a smaller sanity-check run, keep the same dataset file and add `--num_items 100` (or another cap) to both the encode and eval commands.
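
Given the per-item files above, overall accuracy can be recomputed offline. This sketch assumes each result_<question_id>.json carries a boolean `correct` field, which is an assumption about the schema:

```python
import json
import pathlib


def summarize(eval_dir):
    """Recompute overall accuracy from per-item result files."""
    results = []
    for path in sorted(pathlib.Path(eval_dir).glob("result_*.json")):
        with open(path) as fh:
            results.append(json.load(fh))
    correct = sum(1 for r in results if r.get("correct"))
    return {
        "num_items": len(results),
        "overall_acc": 100.0 * correct / max(len(results), 1),
    }
```

Comparing the recomputed figure against eval_summary.json is a cheap way to confirm no per-item files were lost.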

Notes

  • This release keeps the original experiment-style environment variable names (HEBBIAN_*) so existing commands map cleanly.
  • API-key rotation is still supported, but keys must now come from environment variables or a local keys file.
  • The code uses the standard OpenAI Python SDK request pattern (client.chat.completions.create) with OPENAI_API_KEY and the official OpenAI base URL by default.
  • The repository has been cleaned for release, but the LongMemEval path is kept source-aligned rather than simplified.
