fix(tongyi): avoid gevent busy loop in STT by Zenine · Pull Request #3261 · langgenius/dify-official-plugins

Zenine · 2026-06-08T03:55:52Z

Summary

- patch DashScope's async-to-sync websocket bridge in Tongyi STT when running under gevent monkey patching
- keep TONGYI_STT_SUBPROCESS=1 as an explicit fallback path, instead of making subprocess execution the default
- document the STT runtime behavior and bump the Tongyi plugin version to 0.2.1
- constrain dashscope to the verified 1.25.x range because this fix intentionally shims an SDK internal helper
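
The opt-in switch mentioned in the summary can be pictured as a small environment check. This is a minimal sketch, not the plugin's code: the helper name `stt_subprocess_enabled` and the set of accepted truthy spellings are assumptions. The PR only confirms that `TONGYI_STT_SUBPROCESS=1` selects the subprocess path.

```python
import os
from typing import Mapping, Optional

# Assumed truthy spellings; the PR itself only documents the value "1".
_TRUTHY = {"1", "true", "yes", "on"}


def stt_subprocess_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when the subprocess fallback is explicitly requested."""
    env = os.environ if env is None else env
    raw = env.get("TONGYI_STT_SUBPROCESS", "")
    # An unset or empty variable keeps the default in-process path.
    return raw.strip().lower() in _TRUTHY
```

Treating an unset variable as "off" matches the stated intent that subprocess execution is a fallback rather than the default.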

Why

Dify's plugin runtime imports dify_plugin, which globally applies gevent.monkey.patch_all(sys=True). DashScope Recognition.call() uses websocket duplex streaming and internally converts an async generator into a synchronous iterator through dashscope.common.utils.iter_over_async().

That bridge creates a thread and queue. Under gevent-patched threading/queue behavior, the Tongyi plugin process can remain in a busy event-loop state after STT requests, causing sustained high CPU in self-hosted plugin_daemon deployments.
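
To make the thread-and-queue pattern concrete, here is a rough sketch of an async-to-sync bridge. It is not DashScope's implementation of `iter_over_async()`: the function name `iter_over_async_sketch`, the `_Failure` wrapper, and the `thread_cls` / `queue_cls` parameters are all hypothetical and only show where the thread and queue classes enter the picture.

```python
import asyncio
import queue
import threading
from typing import AsyncIterator, Callable, Iterator, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the stream


class _Failure:
    """Wraps an exception raised inside the worker thread."""

    def __init__(self, exc: BaseException) -> None:
        self.exc = exc


def iter_over_async_sketch(
    make_agen: Callable[[], AsyncIterator[T]],
    thread_cls=threading.Thread,
    queue_cls=queue.Queue,
) -> Iterator[T]:
    """Drain an async generator on a worker thread and yield items synchronously."""
    items = queue_cls()

    async def drain() -> None:
        async for item in make_agen():
            items.put(item)

    def worker() -> None:
        try:
            asyncio.run(drain())  # private event loop owned by this thread
        except Exception as exc:
            items.put(_Failure(exc))
        finally:
            items.put(_DONE)

    thread = thread_cls(target=worker, daemon=True)
    thread.start()
    while True:
        item = items.get()  # blocks the caller until the worker produces something
        if item is _DONE:
            break
        if isinstance(item, _Failure):
            raise item.exc
        yield item
    thread.join()


async def numbers() -> AsyncIterator[int]:
    """Tiny async generator used only to exercise the bridge."""
    for i in range(3):
        await asyncio.sleep(0)
        yield i
```

Calling `list(iter_over_async_sketch(numbers))` consumes the async generator from ordinary synchronous code. Because the thread and queue classes are looked up when the bridge runs, whichever versions are visible at that moment (patched or native) determine how the blocking `get()` behaves.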

This change keeps the normal in-process Recognition path, but swaps DashScope's async bridge to use native threading.Thread and native queue.Queue when gevent has patched threading. The existing subprocess worker remains available by explicitly setting TONGYI_STT_SUBPROCESS=1.

The shim is guarded and process-local:

- no-op when gevent is unavailable
- no-op when threading has not been monkey-patched
- logs a warning and falls back to the old behavior if DashScope internals are incompatible
- applies once per plugin process
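
The four guards above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `install_shim`, `threading_is_gevent_patched`, and the `fake_sdk` stand-in are hypothetical names, and the compatibility check is reduced to "the target attribute exists and is callable". The only gevent call used is `gevent.monkey.is_module_patched`.

```python
import logging
import types
from typing import Any, Callable

logger = logging.getLogger(__name__)


def threading_is_gevent_patched() -> bool:
    """Return True only if gevent is importable and has patched `threading`."""
    try:
        from gevent import monkey
    except ImportError:
        return False  # gevent unavailable: nothing to work around
    return bool(monkey.is_module_patched("threading"))


def install_shim(
    target: Any,
    attr_name: str,
    replacement: Callable[..., Any],
    is_patched: Callable[[], bool] = threading_is_gevent_patched,
) -> bool:
    """Swap `target.attr_name` for `replacement`; return True if a swap happened."""
    current = getattr(target, attr_name, None)
    if current is replacement:
        return False  # already applied in this process
    if not is_patched():
        return False  # threading is not monkey-patched: leave the SDK alone
    if not callable(current):
        # Assumed compatibility check: the helper we expect to replace is missing.
        logger.warning("cannot shim %r: unexpected SDK layout, keeping old behavior", attr_name)
        return False
    setattr(target, attr_name, replacement)
    return True


# Stand-in for the SDK module; the real target is not reproduced here.
fake_sdk = types.SimpleNamespace(iter_over_async=lambda agen: "original")
```

Passing `is_patched` as a parameter is a design choice for testability: the guard logic can be exercised without gevent installed.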

No upstream DashScope issue has been filed yet from this PR; this is currently handled as a Tongyi plugin compatibility shim for Dify's gevent-based plugin runtime.

Field validation

Tested on a self-hosted BsoftRAG/Dify deployment on a Hygon/Haiguang server.

Setup:

- Removed TONGYI_STT_SUBPROCESS and TONGYI_STT_RECOGNITION_TIMEOUT from the plugin daemon runtime.
- Copied this PR's models/speech2text/speech2text.py into the installed Tongyi plugin venv.
- Restarted the Tongyi plugin process.
- Ran 3 waves of 30 concurrent /v1/audio-to-text requests against the chat-audio-test app.
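
The PR does not say which tool generated the load. As one possible shape for the "3 waves of 30 concurrent requests" step, the sketch below fires a fixed number of concurrent calls per wave and collects the returned status codes. `run_waves`, `send_request`, and `fake_send` are hypothetical; a real run would replace `fake_send` with a function that posts audio to the deployment's /v1/audio-to-text endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_waves(
    send_request: Callable[[int], int],
    waves: int = 3,
    concurrency: int = 30,
    pause_s: float = 0.0,
) -> List[List[int]]:
    """Run `waves` bursts of `concurrency` parallel calls; return status codes per wave."""
    results: List[List[int]] = []
    for wave in range(waves):
        # One worker per request so the whole wave is in flight at once.
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            statuses = list(pool.map(send_request, range(concurrency)))
        results.append(statuses)
        if pause_s and wave < waves - 1:
            time.sleep(pause_s)  # idle gap in which post-wave CPU can be sampled
    return results


def fake_send(index: int) -> int:
    """Placeholder for an HTTP call; always reports success."""
    return 200
```

The pause between waves corresponds to the window in which the post-wave CPU samples were taken.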

Observed result:

- STT requests returned HTTP 200.
- plugin_daemon showed transient CPU usage during active STT processing.
- After each wave, plugin_daemon returned to low CPU instead of staying at sustained single-core usage.
- Observed post-wave CPU samples:
  - wave 1: 0.19% after 5s, 0.24% after 15s, 0.33% after 30s
  - wave 2: 0.47% after 5s, 0.31% after 15s, 0.27% after 30s
  - wave 3: 0.30% after 5s, 0.54% after 15s, 0.32% after 30s
The server was restored to its prior TONGYI_STT_SUBPROCESS=1 / TONGYI_STT_RECOGNITION_TIMEOUT=120 configuration after the experiment.

Local validation

Passed:

cd models/tongyi
uv run --python 3.12 pytest tests -k 'not test_llm_invoke'

Result: 17 passed, 75 deselected.

Passed:

uvx black models/tongyi/models/speech2text/speech2text.py models/tongyi/tests/test_speech2text.py -C -l 100
uvx ruff check models/tongyi/models/speech2text/speech2text.py models/tongyi/tests/test_speech2text.py

Result: All checks passed.

Not run successfully:

python3 -m pytest -q

The local global pre-commit hook runs the repository-wide pytest suite with the host Python 3.14 environment, where plugin dependencies such as dify_plugin, dashscope, google, numpy, and tiktoken are not installed. This fails during test collection before exercising this Tongyi change.

Also not run locally: tests/test_llm_call.py, because it requires a real DASHSCOPE_API_KEY.

gemini-code-assist

Code Review

This pull request updates the speech-to-text subprocess execution logic to be enabled by default, disabling it only when the TONGYI_STT_SUBPROCESS environment variable is set to a false-like value. The tests have been updated to accommodate this default behavior. Feedback on the tests recommends avoiding clearing the entire environment using patch.dict(os.environ, {}, clear=True) as it can cause unexpected failures, and suggests selectively filtering out only the target environment variable instead.

gemini-code-assist · 2026-06-08T03:56:41Z


    audio = MagicMock(frame_rate=16000)
-    with patch.dict(os.environ, {"TONGYI_STT_SUBPROCESS": "1"}):
+    with patch.dict(os.environ, {}, clear=True):


Using clear=True with an empty dictionary in patch.dict(os.environ, {}, clear=True) completely clears all environment variables (including critical ones like PATH, PYTHONPATH, TMPDIR, etc.) for the duration of the block. This can cause unexpected failures in tests, subprocesses, or test runners in certain environments.

Instead, you should preserve the existing environment variables and only exclude or remove the TONGYI_STT_SUBPROCESS variable.

Suggested change

-    with patch.dict(os.environ, {}, clear=True):
+    with patch.dict(os.environ, {k: v for k, v in os.environ.items() if k != "TONGYI_STT_SUBPROCESS"}, clear=True):
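
A small standalone sketch of the suggested pattern shows that the target variable disappears inside the block while everything else survives, and that the original environment comes back afterwards. The helper `env_without` and the variable `SKETCH_KEEP_ME` are made up for this illustration.

```python
import os
from unittest.mock import patch


def env_without(name: str) -> dict:
    """Copy of the current environment with one variable left out."""
    return {k: v for k, v in os.environ.items() if k != name}


os.environ["TONGYI_STT_SUBPROCESS"] = "1"
os.environ["SKETCH_KEEP_ME"] = "yes"  # stands in for PATH, TMPDIR, and friends

with patch.dict(os.environ, env_without("TONGYI_STT_SUBPROCESS"), clear=True):
    inside = dict(os.environ)  # snapshot of what the code under test would see

# Equivalent alternative: patch without clearing, then drop the one variable.
with patch.dict(os.environ):
    os.environ.pop("TONGYI_STT_SUBPROCESS", None)
    inside_alt = dict(os.environ)
```

In both forms `patch.dict` restores `os.environ` when the block exits, so the variable is visible again to later tests.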

Zenine temporarily deployed to models/tongyi June 8, 2026 03:56 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed Jun 8, 2026

Zenine force-pushed the codex/tongyi-stt-subprocess-default branch from b2acd0c to fd8426f Compare June 8, 2026 04:08

Zenine changed the title ~~fix(tongyi): enable STT subprocess by default~~ fix(tongyi): avoid gevent busy loop in STT Jun 8, 2026

Zenine temporarily deployed to models/tongyi June 8, 2026 04:09 — with GitHub Actions Inactive

fix(tongyi): avoid gevent busy loop in STT

4708347

Zenine force-pushed the codex/tongyi-stt-subprocess-default branch from fd8426f to 4708347 Compare June 8, 2026 05:44

Zenine deployed to models/tongyi June 8, 2026 05:45 — with GitHub Actions Active

Zenine wants to merge 1 commit into langgenius:main from Zenine:codex/tongyi-stt-subprocess-default
