
fix(test) fixes flaky es-index/alias race #7209

Open
Brennan-Chesley-FLP wants to merge 3 commits into main from fix/flaky-es-get_alias-race

Conversation

@Brennan-Chesley-FLP
Contributor

Fixes

This fixes a race condition around get_alias() in django-elasticsearch-dsl: if an index is deleted while the list of aliases is being fetched without ignore_unavailable=True, the call raises an error. The PR migrates setup/teardown for a set of tests that relied on the library's search_index command so that they no longer go through the code path that throws the error.
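
For illustration only, here is a minimal sketch of what a tolerant alias lookup could look like. The helper name `list_alias_entries` is hypothetical and is not part of this PR (which avoids the `search_index` code path rather than patching it). It assumes `es` is an elasticsearch-py client, or anything exposing `indices.get_alias`.

```python
def list_alias_entries(es):
    """Return the per-index alias entries from `es.indices.get_alias()`.

    Hypothetical helper for illustration; `es` is assumed to be an
    elasticsearch-py client (or any object exposing `indices.get_alias`).
    """
    # ignore_unavailable=True asks Elasticsearch to skip missing or closed
    # indices rather than failing the whole request, which is the flag the
    # description above says the library call omits.
    response = es.indices.get_alias(ignore_unavailable=True)
    return list(response.values())
```

Taking the client as a parameter keeps the sketch independent of how the connection is configured, so it can be exercised against a stub as well as a live cluster.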

Repro for the curious:

"Concurrent index deletion causes get_alias() to 404"

import threading

from elasticsearch_dsl import connections

es = connections.get_connection()

# Setup: create a few indices
for i in range(5):
    es.indices.create(index=f"race-test-{i}", ignore=[400])

errors = []
ITERATIONS = 2000

def hammer_get_alias():
    for _ in range(ITERATIONS):
        try:
            list(es.indices.get_alias().values())
        except Exception as e:
            errors.append(e)


def hammer_delete_create():
    for _ in range(ITERATIONS):
        es.indices.delete(index="race-test-0", ignore=[404])
        es.indices.create(index="race-test-0", ignore=[400])


t1 = threading.Thread(target=hammer_get_alias)
t2 = threading.Thread(target=hammer_delete_create)
t1.start()
t2.start()
t1.join()
t2.join()

# Cleanup
for i in range(5):
    es.indices.delete(index=f"race-test-{i}", ignore=[404])

print(f"\n{'=' * 40}")
print(f"Iterations: {ITERATIONS}")
print(f"Errors: {len(errors)}")
for e in errors[:3]:
    print(f"  {type(e).__name__}: {e}")

@Brennan-Chesley-FLP
Contributor Author

The tracebacks for these errors looked approximately like this (the failing test varied depending on how the race played out):

======================================================================
ERROR: test_index_parent_or_child_docs_in_es (cl.search.tests.tests_es_opinion.IndexOpinionDocumentsCommandTest.test_index_parent_or_child_docs_in_es)
Confirm the command can properly index missing clusters when
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.13/unittest/case.py", line 58, in testPartExecutor
    yield
  File "/usr/local/lib/python3.13/unittest/case.py", line 654, in run
    self._callTearDown()
    
  File "/usr/local/lib/python3.13/unittest/case.py", line 611, in _callTearDown
    self.tearDown()
    ~~~~~~~~~~~~~^^
  File "/opt/courtlistener/cl/search/tests/tests_es_opinion.py", line 3473, in tearDown
    self.delete_index("search.OpinionCluster")
    ^^^^^^^^^^^^^^^
  File "/opt/courtlistener/cl/tests/cases.py", line 160, in delete_index
    call_command("search_index", "--delete", "-f", "--models", model)
    ^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/django/core/management/__init__.py", line 195, in call_command
    return command.execute(*args, **defaults)
      ^^^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/django/core/management/base.py", line 464, in execute
    output = self.handle(*args, **options)
    ^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 301, in handle
    for index in self.es_conn.indices.get_alias().values():
    ^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/elasticsearch/_sync/client/utils.py", line 452, in wrapped
    return api(*args, **kwargs)
    ^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/elasticsearch/_sync/client/indices.py", line 2276, in get_alias
    return self.perform_request(  # type: ignore[return-value]
    ^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/elasticsearch/_sync/client/_base.py", line 422, in perform_request
    return self._client.perform_request(
    ^^^^^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/elasticsearch/_sync/client/_base.py", line 271, in perform_request
    response = self._perform_request(
    ^^^^^^^^^^^
  File "/opt/venv/lib/python3.13/site-packages/elasticsearch/_sync/client/_base.py", line 351, in _perform_request
    raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
    ^^^^^^^^^^^
elasticsearch.NotFoundError: NotFoundError(404, 'index_not_found_exception', 'no such index [case_law_index-opinionfeedtest]', case_law_index-opinionfeedtest, index_or_alias)

@albertisfu albertisfu self-requested a review April 9, 2026 21:00
@albertisfu albertisfu moved this to To Do in Sprint (Web Team) Apr 9, 2026
@albertisfu albertisfu moved this from To Do to In progress in Sprint (Web Team) Apr 10, 2026
Contributor

@albertisfu albertisfu left a comment


Thanks @Brennan-Chesley-FLP, this looks pretty good. It’ll fix this long-standing, annoying flaky test.

Just a small suggestion before we merge it.

Comment on lines +156 to +158
models = model if isinstance(model, list) else [model]
model_classes = [apps.get_model(m) for m in models]
for index in registry.get_indices(models=model_classes):

These 3 lines are repeated in all 3 methods. Can we move them to a helper method like:

  @classmethod
  def _get_indices(cls, model):
      models = model if isinstance(model, list) else [model]
      model_classes = [apps.get_model(m) for m in models]
      return registry.get_indices(models=model_classes)

And then just call it like:

for index in cls._get_indices(model):
    index.create(ignore=[400])

To prevent duplicated code.
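
To make the suggestion concrete, here is a rough, self-contained sketch of how the helper could be shared by two of the methods. Everything other than the three quoted lines and `index.create(ignore=[400])` is an assumption: `FakeIndex`, `FakeApps`, `FakeRegistry` and the `ESIndexHelpers` class name are hypothetical stand-ins for `django.apps.apps`, the django-elasticsearch-dsl registry and the real test case class, and the `ignore=[404]` on delete simply mirrors the repro script above.

```python
class FakeIndex:
    """Hypothetical stand-in for an elasticsearch_dsl Index; records calls."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(("create", kwargs))

    def delete(self, **kwargs):
        self.calls.append(("delete", kwargs))


class FakeApps:
    """Hypothetical stand-in for django.apps.apps."""

    def get_model(self, label):
        return label  # the label doubles as the "model" in this sketch


class FakeRegistry:
    """Hypothetical stand-in for the django-elasticsearch-dsl registry."""

    def __init__(self):
        self._by_model = {}

    def get_indices(self, models):
        # One index per model; repeated lookups return the same object.
        return [self._by_model.setdefault(m, FakeIndex(f"idx-{m}")) for m in models]


apps = FakeApps()
registry = FakeRegistry()


class ESIndexHelpers:
    @classmethod
    def _get_indices(cls, model):
        # Accept a single model label or a list of labels.
        models = model if isinstance(model, list) else [model]
        model_classes = [apps.get_model(m) for m in models]
        return registry.get_indices(models=model_classes)

    @classmethod
    def create_index(cls, model):
        for index in cls._get_indices(model):
            index.create(ignore=[400])  # tolerate "already exists"

    @classmethod
    def delete_index(cls, model):
        for index in cls._get_indices(model):
            index.delete(ignore=[404])  # assumed: tolerate "already gone"
```

With the lookup in one place, each public method shrinks to a loop over `cls._get_indices(model)`, and none of them needs to call `search_index`.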

@albertisfu albertisfu moved this from In progress to To Do in Sprint (Web Team) Apr 10, 2026