Skip to content

Speed up Local#551

Open
akx wants to merge 3 commits into
django:mainfrom
akx:real-context
Open

Speed up Local#551
akx wants to merge 3 commits into
django:mainfrom
akx:real-context

Conversation

@akx
Copy link
Copy Markdown
Contributor

@akx akx commented Feb 25, 2026

While profiling a Django app with viztracer, I noticed asgiref Local access/locking being actually visible in the trace, and figured there could be something to fix perf-wise :)

@contextlib.contextmanager can be slow due to the fact that the interpreter needs to juggle a generator frame, etc. This PR replaces it with a small real context manager (that is stored on the instance for reuse instead, in the first commit.

The second commit further specializes Local (using __new__ magic, so this is transparent to consumers) to 2 classes depending on if thread_critical is set, so accesses do not require branches at all – and indeed, in the thread-critical case, nullcontext isn't required anymore. (Further, the guards in the setattr case become unnecessary since we can just use object.__setattr__ to sidestep the classes' impls.)

EDIT: the third commit gets rid of the context manager altogether in favor of just inlining the same things as a try:finally: block, as per comments here, for even more performance. 😄

I pytest-benchmarked this to be pleasantly faster (and given that these are often very hot paths in Django, this is a nice win for not an awful lot of additional code):

Real context manager

Benchmark Kops/s before Kops/s after Change
test_getattr_thread_critical 1,041.50 3,715.52 +257%
test_setattr_thread_critical 1,040.11 3,575.43 +244%
test_getattr_default 1,034.84 2,493.43 +141%
test_setattr_default 913.03 1,833.74 +101%
test_delattr_thread_critical 504.15 1,747.49 +247%
test_getattr_missing_default 544.32 968.90 +78%
test_delattr_default 444.68 1,030.41 +132%
test_getattr_thread_critical_async 31.40 32.50 +4%
test_setattr_thread_critical_async 32.69 33.34 +2%

No context manager (only affects default)

Benchmark Kops/s before Kops/s after Change
test_getattr_thread_critical 1,047.4632 3,649.6306 +248.4%
test_setattr_thread_critical 1,013.8543 3,500.2482 +245.2%
test_getattr_default 986.0904 3,586.0984 +263.7%
test_setattr_default 859.1037 2,673.6265 +211.2%
test_getattr_missing_default 520.0822 1,068.1260 +105.4%
test_delattr_thread_critical 502.1324 1,712.0688 +241.0%
test_delattr_default 423.2087 1,347.7917 +218.5%
test_setattr_thread_critical_async 31.4873 32.1533 +2.1%
test_getattr_thread_critical_async 30.7843 31.7018 +3.0%

@akx akx marked this pull request as draft February 25, 2026 17:05
@bluetech
Copy link
Copy Markdown
Contributor

I have seen Local show up in profiles as well, and I don't even use async, so it'd be great to have its performance improved.

I wonder given that Local is performance sensitive if it won't be better to just use try/finally for the lock instead of the _LockContext helper? Sure it's not as nice and a bit more error prone, but I think this is a case where efficiency beats beauty (of course as long as avoiding _LockContext is actually faster).

@akx
Copy link
Copy Markdown
Contributor Author

akx commented Mar 24, 2026

I wonder given that Local is performance sensitive if it won't be better to just use try/finally for the lock instead of the _LockContext helper?

Sure. Triplicating the logic into the functions would probably make a lot of sense for performance, but those functions should probably be comment-marked "there be dragons here, be very careful" 😄

@akx
Copy link
Copy Markdown
Contributor Author

akx commented Mar 24, 2026

@bluetech Great news: according to my measurements it is indeed a whole lot faster, another +100% on top of the context-manager version 😄

Name (time in ns)                      0001_baselin(*) OPS (Kops/s)  0002_real-co OPS (Kops/s)  ΔOPS (Kops/s)  0003_real-co OPS (Kops/s)  ΔOPS (Kops/s)
-------------------------------------------------------------------------------------------------------------------------------------------------------
test_getattr_default                                       986.0904                 2,318.5320        +135.1%                 3,586.0984        +263.7%
test_setattr_default                                       859.1037                 1,839.8386        +114.2%                 2,673.6265        +211.2%
test_getattr_missing_default                               520.0822                   941.0258         +80.9%                 1,068.1260        +105.4%
test_delattr_default                                       423.2087                   988.1233        +133.5%                 1,347.7917        +218.5%

Comment thread asgiref/local.py Outdated
Thanks to bluetech for the idea.
@akx
Copy link
Copy Markdown
Contributor Author

akx commented May 21, 2026

Gentle review nudge, @carltongibson? 😄 (Ditto for #552)

@carltongibson
Copy link
Copy Markdown
Member

@akx Nice 😃 — I'm just working my way back to asgiref now, as it happens. 🎁

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants