[BUG] [v0.0.7] cortex stats undercounts tokens: fallback deduplication key is always identical for ID-less messages (stats_cmd.rs:497) #53417

@nightmare0329

Bug Report

Version: v0.0.7
File: src/cortex-cli/src/stats_cmd.rs

Description

In parse_session_file(), when accumulating token counts per message, a deduplication key is generated to prevent double-counting. For messages that have no id field, the code falls back to format!("msg_{}", data.message_count) as the key. However, data.message_count is set once to the total number of messages before the loop begins and never updated during iteration. As a result, every message without an id generates the exact same fallback key. After the first ID-less message is processed, all subsequent ID-less messages hit the counted_message_ids.contains(&count_key) check and have their tokens silently skipped.

Root Cause

// stats_cmd.rs line 469 - set ONCE before the loop
data.message_count = messages.len() as u64;

for msg in messages {
    ...
    let count_key = if !msg_id.is_empty() {
        msg_id.clone()
    } else {
        // BUG: data.message_count is always the TOTAL count (e.g., 10),
        // not the current loop index. All ID-less messages get key "msg_10".
        format!("msg_{}", data.message_count)  // line 497
    };

    if !counted_message_ids.contains(&count_key) {
        data.input_tokens += ...;   // Only counted for the FIRST ID-less message
        data.output_tokens += ...;
        counted_message_ids.insert(count_key);  // "msg_10" inserted once
    }
    // All subsequent ID-less messages: count_key == "msg_10", already in set → SKIPPED
}
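
The key collision can be seen in isolation with a minimal sketch. This is a simplified model, not the actual code in stats_cmd.rs: the helper name buggy_fallback_keys is hypothetical, and it assumes every message lacks an id.

```rust
use std::collections::HashSet;

/// Simplified model of the fallback-key logic (hypothetical helper).
/// Every ID-less message formats the same `message_count` value,
/// because that value is never updated inside the loop.
fn buggy_fallback_keys(message_count: u64) -> Vec<String> {
    (0..message_count)
        .map(|_| format!("msg_{}", message_count))
        .collect()
}

fn main() {
    let keys = buggy_fallback_keys(4);
    let unique: HashSet<&String> = keys.iter().collect();
    println!("keys: {:?}", keys);
    // Four messages, but only one distinct deduplication key.
    println!("unique keys: {}", unique.len());
}
```

With four messages, all four keys are "msg_4", so the set of counted keys can only ever hold one entry for ID-less messages.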

Steps to Reproduce

Run cortex stats against a session file whose messages lack the id field (common with older formats or some providers), for example:

{
  "model": "claude-sonnet-4",
  "messages": [
    {"role": "user", "usage": {"input_tokens": 100, "output_tokens": 0}},
    {"role": "assistant", "usage": {"input_tokens": 0, "output_tokens": 200}},
    {"role": "user", "usage": {"input_tokens": 150, "output_tokens": 0}},
    {"role": "assistant", "usage": {"input_tokens": 0, "output_tokens": 300}}
  ]
}

Expected: cortex stats shows 750 total tokens (100+200+150+300)
Actual: Shows only 100 tokens (all four messages get the key "msg_4", so only the first is counted and the rest are skipped)
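
The reproduction can also be approximated without a session file. The sketch below replays the four usage entries from the JSON through a simplified copy of the buggy loop. The Msg struct, its field names, and the function names are assumptions for illustration and do not mirror the real types in cortex-cli.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a parsed message (field names are assumptions).
struct Msg {
    id: &'static str,
    input_tokens: u64,
    output_tokens: u64,
}

/// Mirrors the buggy accumulation and returns (input, output) tokens.
fn count_tokens_buggy(messages: &[Msg]) -> (u64, u64) {
    let message_count = messages.len() as u64; // set once, never updated
    let mut counted: HashSet<String> = HashSet::new();
    let (mut input, mut output) = (0u64, 0u64);
    for msg in messages {
        let key = if !msg.id.is_empty() {
            msg.id.to_string()
        } else {
            format!("msg_{}", message_count) // same key for every ID-less message
        };
        if !counted.contains(&key) {
            input += msg.input_tokens;
            output += msg.output_tokens;
            counted.insert(key);
        }
    }
    (input, output)
}

/// The four ID-less messages from the example session.
fn sample_session() -> Vec<Msg> {
    vec![
        Msg { id: "", input_tokens: 100, output_tokens: 0 },
        Msg { id: "", input_tokens: 0, output_tokens: 200 },
        Msg { id: "", input_tokens: 150, output_tokens: 0 },
        Msg { id: "", input_tokens: 0, output_tokens: 300 },
    ]
}

fn main() {
    let msgs = sample_session();
    let (input, output) = count_tokens_buggy(&msgs);
    let actual_sum: u64 = msgs.iter().map(|m| m.input_tokens + m.output_tokens).sum();
    println!("counted by buggy loop: {}", input + output);
    println!("sum of all usage entries: {}", actual_sum);
}
```

In this model the buggy loop reports 100 input tokens and 0 output tokens, while the usage entries themselves sum to 250 input and 500 output tokens.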

Fix

Use the actual loop index via enumerate():

for (idx, msg) in messages.iter().enumerate() {
    ...
    let count_key = if !msg_id.is_empty() {
        msg_id.clone()
    } else {
        format!("msg_{}", idx)  // Use loop index, not total count
    };
    ...
}
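
A self-contained version of the proposed fix might look like the following. It is a sketch under the same assumptions as the description above (the Msg struct and the function name are hypothetical), meant to show that index-based fallback keys count every ID-less message while real IDs are still deduplicated.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a parsed message (field names are assumptions).
struct Msg {
    id: &'static str,
    input_tokens: u64,
    output_tokens: u64,
}

/// Accumulates (input, output) tokens, deduplicating on the message ID and
/// falling back to the loop index for messages without one.
fn count_tokens_fixed(messages: &[Msg]) -> (u64, u64) {
    let mut counted: HashSet<String> = HashSet::new();
    let (mut input, mut output) = (0u64, 0u64);
    for (idx, msg) in messages.iter().enumerate() {
        let key = if !msg.id.is_empty() {
            msg.id.to_string()
        } else {
            format!("msg_{}", idx) // unique per position in the list
        };
        // `insert` returns false when the key was already present.
        if counted.insert(key) {
            input += msg.input_tokens;
            output += msg.output_tokens;
        }
    }
    (input, output)
}

fn main() {
    // The four ID-less messages from the reproduction are all counted.
    let id_less = [
        Msg { id: "", input_tokens: 100, output_tokens: 0 },
        Msg { id: "", input_tokens: 0, output_tokens: 200 },
        Msg { id: "", input_tokens: 150, output_tokens: 0 },
        Msg { id: "", input_tokens: 0, output_tokens: 300 },
    ];
    assert_eq!(count_tokens_fixed(&id_less), (250, 500));

    // A repeated real ID is still counted only once.
    let with_ids = [
        Msg { id: "a", input_tokens: 10, output_tokens: 0 },
        Msg { id: "a", input_tokens: 10, output_tokens: 0 },
        Msg { id: "", input_tokens: 5, output_tokens: 7 },
    ];
    assert_eq!(count_tokens_fixed(&with_ids), (15, 7));
}
```

One caveat with this approach: fallback keys share a namespace with real IDs, so a message whose actual id happens to be "msg_2" could still collide with the fallback key of an ID-less message at index 2. Whether that matters depends on the ID formats the session files use in practice.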

Hotkey: 5CzBkL6CJWFa7QSFoWxqiodobsGmxD6oLxLQa24tNxvsT9dn
UID: 137

Labels: invalid (This doesn't seem right)
