
[iOS] Add cancel/stop generation API for LlmInference #6252

@DenisovAV

Description


The iOS SDK (the `MediaPipeTasksGenAI` CocoaPod) does not expose a cancel/stop generation API for `LlmInference`, while Android and Web already support one.

Current State

| Platform | Cancel API | Status |
| --- | --- | --- |
| Android | `cancelGenerateResponseAsync()` | ✅ Available |
| Web | `cancelProcessing()` | ✅ Available since `@mediapipe/tasks-genai@0.10.26` (#6128) |
| iOS | (none) | ❌ Missing |

Use Case

Users want to cancel an ongoing LLM response mid-stream, a standard UX pattern in chat AI applications. Without cancellation on iOS, the only real workaround is to close the entire session and create a new one, which is slow and wastes resources.
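Short of tearing down the session, the closest an app can get today is to stop *delivering* partial results once the user taps stop; the model still runs to completion on device, so compute is wasted. A minimal sketch of that pattern (the `generateResponseAsync` closure shape follows the example below; the wrapper type and names are illustrative):

```swift
import Foundation
import MediaPipeTasksGenAI

/// Delivery-side "cancel": partial results are dropped once `cancel()` is
/// called. The underlying inference is NOT stopped, which is exactly why a
/// real cancel API is needed.
final class PartialDropper {
    private var cancelled = false
    private let lock = NSLock()

    func start(_ llm: LlmInference, prompt: String,
               onPartial: @escaping (String) -> Void) throws {
        lock.lock(); cancelled = false; lock.unlock()
        try llm.generateResponseAsync(inputText: prompt) { [weak self] partial, _ in
            guard let self else { return }
            self.lock.lock(); let stopped = self.cancelled; self.lock.unlock()
            if !stopped, let partial { onPartial(partial) }
        }
    }

    /// Stops delivery only; tokens keep being generated on device.
    func cancel() {
        lock.lock(); cancelled = true; lock.unlock()
    }
}
```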

Requested API

Add a cancelProcessing() or equivalent method to LlmInference on iOS (Swift), matching the Android and Web APIs.

Swift example:

```swift
let llmInference = try LlmInference(options: options)
// Start async generation
llmInference.generateResponseAsync(inputText: prompt) { partial, error in ... }
// Cancel mid-stream
llmInference.cancelProcessing()  // <-- this doesn't exist on iOS
```
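For context, this is roughly what the call site would look like in a typical chat screen once such an API exists; `cancelProcessing()` is the requested (not yet existing) method, and the view-model shape is purely illustrative:

```swift
import Combine
import Foundation
import MediaPipeTasksGenAI

/// Illustrative only: `cancelProcessing()` does not exist on iOS today.
final class ChatViewModel: ObservableObject {
    @Published var reply = ""
    private let llm: LlmInference

    init(llm: LlmInference) { self.llm = llm }

    func send(_ prompt: String) throws {
        reply = ""
        try llm.generateResponseAsync(inputText: prompt) { [weak self] partial, _ in
            if let partial {
                DispatchQueue.main.async { self?.reply += partial }
            }
        }
    }

    /// Wired to a "Stop" button; this is the API this issue requests.
    func stop() {
        llm.cancelProcessing()
    }
}
```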

Context

We maintain flutter_gemma, a Flutter plugin for on-device LLM inference. Cancel generation is one of the most requested features by our users (see DenisovAV/flutter_gemma#34, DenisovAV/flutter_gemma#194).

Labels

- `platform:ios`: MediaPipe iOS issues
- `task:LLM inference`: Issues related to MediaPipe LLM Inference Gen AI setup
- `type:feature`: Enhancement in the New Functionality or Request for a New Solution
