Skip to content

switchedToDwithin#829

Open
FirePheonix wants to merge 9 commits into
pysal:mainfrom
FirePheonix:dwithinPredicateCheck
Open

switchedToDwithin#829
FirePheonix wants to merge 9 commits into
pysal:mainfrom
FirePheonix:dwithinPredicateCheck

Conversation

@FirePheonix

Copy link
Copy Markdown
Contributor

Hello! Please make sure to check all these boxes before submitting a Pull Request
(PR). Once you have checked the boxes, feel free to remove all text except the
justification in point 5.

  1. You have run tests on this submission locally using pytest on your changes. Continuous integration will be run on all PRs with GitHub Actions, but it is good practice to test changes locally prior to a making a PR.
  2. This pull request is directed to the pysal/master branch.
  3. This pull introduces new functionality covered by
    docstrings and
    unittests?
  4. You have assigned a
    reviewer
    and added relevant labels

This PR replaces "buffer + intersects predicate" with "dwithin predicate". Fixing: #817 . I also did some performance checks which showed significant improvements

I have added a 2x multiplier on values so the tests remain consistent. The 2x factor is required because the previous buffer intersects approach expanded both geometries, effectively allowing distances up to 2r, whereas dwithin measures the direct inter-geometry distance.

image image

@FirePheonix

Copy link
Copy Markdown
Contributor Author

It's not complete yet, i'm still making some changes.

@codecov

codecov Bot commented Jan 10, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 82.05128% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.4%. Comparing base (174d24c) to head (ee11385).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
libpysal/weights/util.py 53.8% 6 Missing ⚠️
libpysal/graph/_contiguity.py 90.9% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##            main    #829     +/-   ##
=======================================
- Coverage   85.5%   85.4%   -0.0%     
=======================================
  Files        151     151             
  Lines      16063   16084     +21     
=======================================
+ Hits       13728   13742     +14     
- Misses      2335    2342      +7     
Files with missing lines Coverage Δ
libpysal/graph/base.py 96.5% <ø> (ø)
libpysal/graph/tests/test_contiguity.py 100.0% <100.0%> (ø)
libpysal/graph/_contiguity.py 97.9% <90.9%> (-1.0%) ⬇️
libpysal/weights/util.py 76.9% <53.8%> (-1.2%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@FirePheonix

Copy link
Copy Markdown
Contributor Author

hey @martinfleis
The tests are failing because the dwithin predicate with parameter requires GEOS >= 3.10, but the CI tests on older environments (like 311-oldest.yaml) that have older GEOS versions.

what shall I do? Update the tests? or put a fallback?

@martinfleis

Copy link
Copy Markdown
Member

Conditionally skip those tests.

Comment thread libpysal/graph/_contiguity.py Outdated
def _fuzzy_contiguity(
geoms, ids, tolerance=None, buffer=None, predicate="intersects", **kwargs
):
def _fuzzy_contiguity(geoms, ids, tolerance=None, buffer=None, predicate="intersects"):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of removing **kwargs, shall we actually pass them into the geoms.sindex.query calls (which we are not currently doing)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this change sir. Could you please elaborate how that would help up us?

@martinfleis martinfleis left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it should not replace buffer entirely. I can imagine a use case where you want to do buffer followed up by a custom predicate. Replacing it like this is technically a breaking change as some workflows might stop working.

Though I am not sure how the API should look. We're already have it quite complicated with tolerance and buffer keywords...

@FirePheonix

Copy link
Copy Markdown
Contributor Author

I think that it should not replace buffer entirely. I can imagine a use case where you want to do buffer followed up by a custom predicate. Replacing it like this is technically a breaking change as some workflows might stop working.

Though I am not sure how the API should look. We're already have it quite complicated with tolerance and buffer keywords...

yeah, this is actually a breaking change. I was thinking about this and one option could be to add a new distance parameter for dwithin, and keep buffer working as before (geometry expansion + predicate)?

so, for an example: if distance = 5000 is passed -> it would simply call dwithin. (requires GEOS 3.10)
and, if buffer = 5000 is passed -> buffer the geometries + apply predicate (current behaviour)
tolerance can convert to either? i personally felt like distance would make more sense..

we can actually do that since I realized earlier, in my first commit of this pr, only one test, the oldest was failing.. maybe i can change the oldest test to use the buffer method / OR if GEOS < 3.10 (which i'm not sure about using) and we can work like that?

@FirePheonix

FirePheonix commented Jan 16, 2026

Copy link
Copy Markdown
Contributor Author

i think i figured out a way how the API could be like. the test cases too are passing (lemme know if something is wrong and needs a change.. i'll change it immediately)

(and yeah, the reason for test not skipping earlier was rather shapely version itself, and not geos..., one can switch to buffer + predicate method if shapely version is older.)

# fast (shapely >= 2.1): can use dwithin predicate
w = fuzzy_contiguity(gdf, distance=1000)

# flexible (all versions): custom predicate 
w = fuzzy_contiguity(gdf, buffering=True, buffer=1000, predicate='intersects')

# auto (all versions): calculated tolerance
w = fuzzy_contiguity(gdf, buffering=True)

in this way, we can have the buffer use-case, handle tolerance, and use dwithin for faster execution..

@FirePheonix FirePheonix marked this pull request as ready for review January 16, 2026 18:17
@martinfleis

Copy link
Copy Markdown
Member

I am not excited about the bloated API here. I think this needs a discussion during a dev call. Will keep you posted.

@ljwolf

ljwolf commented Mar 18, 2026

Copy link
Copy Markdown
Member

I don't recall the output of this discussion?

@martinfleis

martinfleis commented Mar 18, 2026

Copy link
Copy Markdown
Member

We haven't discussed it.

Edit: technically we did decide to postpone the discussion to the summit and then we did not discuss it.

@FirePheonix

Copy link
Copy Markdown
Contributor Author

yea, i was on the call actually... this is to be discussed later on due it's complexity..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants