v.overlay: add option to remove small areas #7370

Open
metzm wants to merge 6 commits into OSGeo:main from metzm:v.overlay_rmarea

@metzm
Contributor

@metzm metzm commented May 6, 2026

From the updated manual:

When overlaying two vectors with areas, very small areas can occur in the
output. This can happen when, for example, one vector is a slightly
modified version of the other (buffered or simplified). These very small
areas can be removed by setting minsize to a value larger than 0.
The value is interpreted as square meters. To remove only noise
from slightly mismatching boundaries, the value of minsize should be
small, e.g. in the range 0.0001 to 1.

This is useful not only to remove noise, but also to reduce the size of the output vector in cases where a lot of very small areas are created by the overlay operation.
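
As a rough illustration of the thresholding idea only, here is a minimal sketch in plain Python. It is not how v.overlay is implemented (this description does not show that, and the module may merge small areas into neighbors rather than simply dropping them). The function names `polygon_area` and `drop_small_areas` and the sample coordinates are made up for the example. Coordinates are assumed to be in meters, and areas smaller than `minsize` are assumed to be the ones removed.

```python
def polygon_area(ring):
    """Planar area of a simple polygon given as (x, y) vertices (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def drop_small_areas(polygons, minsize):
    """Keep only polygons whose area is at least minsize (assumed square meters)."""
    return [ring for ring in polygons if polygon_area(ring) >= minsize]


# Hypothetical data: a 10 m x 10 m square and a 1 cm wide sliver next to it,
# as might result from slightly mismatching boundaries.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
sliver = [(10, 0), (10.01, 0), (10.01, 0.5), (10, 0.5)]

# A threshold inside the suggested range 0.0001 to 1 removes the sliver only.
kept = drop_small_areas([square, sliver], minsize=0.01)
```

With these inputs the sliver covers about 0.005 square meters, so it falls below the threshold while the square is kept.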

The PRs #7333, #7338, #7366, and #7370 belong together.

@metzm metzm added this to the 8.6.0 milestone May 6, 2026
@metzm metzm added the vector, C, and module labels May 6, 2026
@github-actions github-actions Bot added the HTML, docs, and markdown labels May 6, 2026
@metzm metzm added the enhancement label May 6, 2026
@metzm metzm requested a review from wenzeslaus May 7, 2026 13:00
@metzm metzm force-pushed the v.overlay_rmarea branch from cd1961e to 092c6d6 Compare May 8, 2026 14:00
@metzm
Contributor Author

metzm commented May 8, 2026

The new test contains two tests:

The second test is relatively slow; together they take 10.7 s on my laptop, which I find a bit long for the whole GRASS test suite. Any opinion on the test duration?

@echoix
Member

echoix commented May 8, 2026

Limit to 2-3 polygons maybe? We wish to have unit tests, not really integration tests.

@github-actions github-actions Bot added the Python and tests labels May 8, 2026
@metzm
Contributor Author

metzm commented May 8, 2026

> Limit to 2-3 polygons maybe? We wish to have unit tests, not really integration tests

OK, the running time is now down to 1.5 s, which I think is fine. My statements about the new tests still hold true.

@metzm
Contributor Author

metzm commented May 13, 2026

There are now unittest tests and pytest tests. Both do the same tests. Delete the unittest tests?

@echoix
Copy link
Copy Markdown
Member

echoix commented May 13, 2026

> There are now unittest tests and pytest tests. Both do the same tests. Delete the unittest tests?

Yes, if possible. Or, to solve the failures, rename one of the files. But if both are the same test, keep only the pytest version.

@metzm
Contributor Author

metzm commented May 19, 2026

Does anybody have any idea why the pytest fails at the very first GRASS command?

It succeeds locally; otherwise I would not have added this pytest. Because I cannot reproduce the failure, I have no idea how to fix it.

@echoix
Member

echoix commented May 19, 2026

Yes, something weird happens as a side effect of calling into the C-based library or tools, which makes it fail on the first call (per worker). Subsequent calls then work as normal. The Python side doesn't seem affected.

In more than 2 years, I haven't managed to find out what magic is going on. This behavior is even more apparent when randomizing the test order. And on Windows it's worse (in my experience), as more tests can fail.

@echoix
Member

echoix commented May 19, 2026

If you find out why, it would unblock many things

Comment on lines +10 to +11
```python
# create test data
@pytest.fixture(scope="class", autouse=True)
```
Member

Is autouse a pattern we want to use, or do we prefer being explicit about the side effects?

Comment on lines +7 to +48
```python
class TestVOverlay:
    """Test v.overlay output against expected output"""

    # create test data
    @pytest.fixture(scope="class", autouse=True)
    def create_testdata(self):
        # set up
        gs.run_command(
            "v.extract",
            input="boundary_county",
            output="boundary_county_extract1",
            where="NAME in ('CURRITUCK')",
        )
        gs.run_command(
            "v.extract",
            input="boundary_county",
            output="boundary_county_extract2",
            where="NAME in ('CAMDEN')",
        )
        # modify extract 1
        gs.run_command(
            "v.buffer",
            input="boundary_county_extract1",
            output="boundary_county_extract1_buffer_out",
            type="area",
            distance=2,
        )
        gs.run_command(
            "v.buffer",
            input="boundary_county_extract1_buffer_out",
            output="boundary_county_extract1_buffer_in",
            type="area",
            distance=-2,
        )

        # run the tests
        yield

        # clean up test data regardless of test success/failure
        gs.run_command(
            "g.remove", type="vector", flags="f", pattern="boundary_county_extract*"
        )
```
Member

Where is the temporary GRASS session/project created?

With pytest, no data is available by default (data that could be affected destructively by a low-quality test). This is unlike gunittest, where tests already assume that a project with certain maps is available and loaded (and end up being integration tests because of that).

Contributor Author

Thanks for the explanation! Indeed, I assumed that a default GRASS session with the NC data was already active. So how can I make use of the NC data in a pytest test for a simple, fast test?

Alternatively, I could extract the test data from the NC data and add them to the GRASS source code, but that seems wrong.

@echoix
Member

echoix commented May 19, 2026

The pytest error trace for Linux additionally mentions that there is no active GRASS session: the GISRC env var isn't set.

Here:

ERROR: No active GRASS session: GISRC environment variable not set

So a good hypothesis is that the temporary project is not being created for the data setup fixture; see the PR review comment.
But you would probably not have access to the demo project by default (which we avoid for pytest for now).
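
One way to act on this hypothesis might look like the sketch below. It assumes GRASS 8.4 or newer, where `gs.create_project` and `gs.setup.init` are available from `grass.script`; the helper names `has_active_session` and `temporary_session` are hypothetical and not part of GRASS or this PR. The project created here is empty, so the NC sample maps such as `boundary_county` would still not be available and the test data would have to come from somewhere else.

```python
import os
from contextlib import contextmanager


def has_active_session(env=None):
    """Return True if the environment names a GRASS session file via GISRC."""
    env = os.environ if env is None else env
    return bool(env.get("GISRC"))


@contextmanager
def temporary_session(project_path):
    """Create a new, empty project at project_path and yield a session handle.

    Assumption: GRASS >= 8.4 Python packages are importable. The import is
    deferred so that this module can be loaded without GRASS installed.
    """
    import grass.script as gs  # deferred import, requires GRASS

    gs.create_project(project_path)
    # Work on a copy so the calling process environment stays untouched.
    with gs.setup.init(project_path, env=os.environ.copy()) as session:
        yield session
```

In a pytest fixture, the path could come from `tmp_path_factory.mktemp(...)`, and each GRASS call in the test would then pass `env=session.env` so that it runs inside the temporary session rather than relying on a globally set GISRC.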
