[Bug] Output Token Count Is Zero for Streaming LLM Responses #91

## Summary

When the application uses streaming mode, TokenFirewall records `outputTokens: 0` because it does not accumulate tokens from streamed response chunks. As a result, the recorded cost of every streamed request omits output tokens and is undercounted.

## Current Behavior

For SSE streaming responses, `inputTokens` is recorded correctly from the request body. `outputTokens` is always 0 because the middleware does not process the streamed response body.

## Expected Behavior

TokenFirewall should track output tokens for streaming responses by either reading the final `usage` chunk (when the provider includes it) or counting accumulated response text.

## Proposed Fix

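```typescript
async function handleStreamingResponse(
  response: ReadableStream,
  metadata: RequestMetadata,
  adapter: ProviderAdapter
): Promise<void> {
  let outputText = '';
  let reportedTokens: number | undefined;
  for await (const chunk of parseSSEStream(response)) {
    // Accumulate text in case the provider never reports usage.
    outputText += chunk.delta?.content ?? '';
    // Prefer the provider-reported count when a usage chunk arrives.
    if (chunk.usage) {
      reportedTokens = chunk.usage.completion_tokens;
    }
  }
  // Fall back to counting tokens from the accumulated text.
  const outputTokens =
    reportedTokens ?? (await adapter.countTokens(outputText, metadata.model));
  // Record exactly once, after the stream has been fully drained.
  await recordUsage(metadata, { outputTokens });
}
```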

For OpenAI, pass `stream_options: { include_usage: true }` in the request body so the final SSE chunk before `data: [DONE]` contains usage data, avoiding the need for manual token counting.
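One way the middleware could set this option is sketched below, assuming it is able to rewrite the outgoing JSON body before forwarding it. `withUsageReporting` and `ChatRequestBody` are hypothetical names used only for illustration; only `stream` and `stream_options.include_usage` come from the OpenAI request format.

```typescript
// Hypothetical shape of the outgoing request body; only the fields used here are typed.
interface ChatRequestBody {
  model: string;
  stream?: boolean;
  stream_options?: { include_usage?: boolean };
  [key: string]: unknown;
}

// Returns a copy of the body that asks the provider to append a usage chunk
// to the stream. Non-streaming requests are returned unchanged.
function withUsageReporting(body: ChatRequestBody): ChatRequestBody {
  if (!body.stream) return body;
  return {
    ...body,
    // Preserve any stream options the caller already set.
    stream_options: { ...body.stream_options, include_usage: true },
  };
}
```

Whether rewriting the client's request is acceptable is a design decision for this project. If it is not, the text-counting fallback remains the only source of output tokens for clients that do not opt in.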

## Acceptance Criteria

- [ ] Streaming responses record correct `outputTokens` in the usage log.
- [ ] When the provider includes usage in the final SSE chunk, that value is used directly.
- [ ] When no usage chunk is present, token count is estimated from accumulated output text.
- [ ] Streaming functionality and pass-through to the client are not disrupted.
- [ ] A test using a mock SSE stream verifies that output tokens are correctly recorded.
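The mock-stream test from the last criterion might be built on fixtures like the following. This is a rough sketch under simplifying assumptions: the chunk shape mirrors the fields used in the proposed fix rather than any provider's exact wire format, and `parseSSEText` and `summarize` are hypothetical synchronous stand-ins for `parseSSEStream` and the accumulation loop.

```typescript
// Hypothetical chunk shape, mirroring the fields read by the proposed fix.
interface StreamChunk {
  delta?: { content?: string };
  usage?: { completion_tokens: number };
}

// Parses a raw SSE payload into chunks. Assumes events are separated by a blank
// line, each carries a single `data:` line, and `[DONE]` ends the stream.
function parseSSEText(raw: string): StreamChunk[] {
  const chunks: StreamChunk[] = [];
  for (const event of raw.split('\n\n')) {
    const line = event.trim();
    if (!line.startsWith('data:')) continue;
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') break;
    chunks.push(JSON.parse(payload) as StreamChunk);
  }
  return chunks;
}

// Reduces chunks to the accumulated text and the provider-reported count, if any.
function summarize(chunks: StreamChunk[]): {
  outputText: string;
  reportedTokens: number | undefined;
} {
  let outputText = '';
  let reportedTokens: number | undefined;
  for (const chunk of chunks) {
    outputText += chunk.delta?.content ?? '';
    if (chunk.usage) reportedTokens = chunk.usage.completion_tokens;
  }
  return { outputText, reportedTokens };
}

// Mock stream that ends with a usage chunk.
const withUsage = [
  'data: {"delta":{"content":"Hel"}}',
  'data: {"delta":{"content":"lo"}}',
  'data: {"usage":{"completion_tokens":2}}',
  'data: [DONE]',
].join('\n\n');

// Mock stream where the provider never reports usage.
const withoutUsage = [
  'data: {"delta":{"content":"Hi"}}',
  'data: [DONE]',
].join('\n\n');

console.log(summarize(parseSSEText(withUsage)));
console.log(summarize(parseSSEText(withoutUsage)));
```

A real test would feed the same fixtures through a `ReadableStream` and assert on what `recordUsage` receives. The first fixture covers the usage-chunk path and the second covers the fallback path.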

