v.geometry: new tool (v.to.db wrapper) for more obvious access to vector geometry computation #7329

Open. petrasovaa wants to merge 3 commits into OSGeo:main from petrasovaa:v_geometry

@petrasovaa (Contributor) commented Apr 18, 2026

When somebody wants to compute the perimeter, length, or bounding box of vector features, the tool is hard to find because v.to.db's name doesn't convey that. In addition, for many data science workflows, printing the values and loading them into a dataframe is easier than dealing with the attribute table. The v.to.db interface is centered around the table; for example, column is required even when just printing. Here is a quick summary of the tool:

  • New tool v.geometry: read-only, JSON-first frontend to v.to.db for printing vector geometry metrics (area, perimeter, length, count, compactness, fractal_dimension, slope, sinuosity, azimuth, coordinates, start, end, bbox).
  • Multiple metrics in one call are dispatched in parallel (ThreadPoolExecutor, controlled by nprocs) and merged by category; results stay deterministic in metric order.
  • Output formats: json (default), csv (comma default), plain (pipe default); positional per-metric units.
  • Rejects mixing metrics from different feature-type families (area/line/point) to avoid silent cat-collision merges; count is universal.
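The parallel dispatch, merge by category, and family check summarized above could look roughly like the following. This is a minimal sketch, not the code in this PR: `compute_metric` is a hypothetical stand-in for a single v.to.db call, its values are made up, and the `FAMILY` table is an illustrative subset that may not match the PR's actual mapping.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative subset only (assumption); "count" is treated as universal.
FAMILY = {"area": "area", "perimeter": "area", "length": "line", "coordinates": "point"}


def check_families(metrics):
    """Reject metric lists that mix feature-type families."""
    families = {FAMILY[m] for m in metrics if m != "count"}
    if len(families) > 1:
        raise ValueError(f"Cannot mix metric families: {sorted(families)}")


def compute_metric(metric):
    """Hypothetical stand-in for one v.to.db call; returns {category: value}."""
    fake_results = {
        "area": {1: 120.5, 2: 80.0},
        "perimeter": {1: 46.2, 2: 37.9},
    }
    return fake_results[metric]


def compute_all(metrics, nprocs=2):
    """Run one task per metric in parallel and merge the results by category."""
    check_families(metrics)
    with ThreadPoolExecutor(max_workers=nprocs) as pool:
        # pool.map yields results in input order, not completion order,
        # so the merged records keep the metric order that was requested.
        results = list(pool.map(compute_metric, metrics))
    merged = {}
    for metric, values in zip(metrics, results):
        for cat, value in values.items():
            merged.setdefault(cat, {"category": cat})[metric] = value
    return [merged[cat] for cat in sorted(merged)]


records = compute_all(["area", "perimeter"])
print(records)
```

Using `pool.map` rather than collecting futures as they complete is what keeps the output order independent of which task finishes first.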

It includes functions for resolving the number of cores. I think this will need to go into a library eventually, but I kept it here for now.

Before:

v.to.db map=geology option=bbox column=unused -p format=json

Instead, using v.geometry:

v.geometry map=geology option=bbox

@github-actions (bot) added labels on Apr 18, 2026: vector, Python, HTML, module, docs, markdown, tests, CMake
@cwhite911 (Contributor) left a comment

Overall, everything looks good. I just have a few minor suggestions.

```python
for record in data["records"]:
    print(record["category"], record["compactness"])
```


Want to add a grass.tools example?
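Such an example might look like the sketch below. It assumes the `grass.tools` `Tools` interface and that the v.geometry result exposes parsed JSON in the same shape as the snippet above (a `records` list with `category` and one key per metric). The GRASS call is left in comments because it needs an active session, and the sample data is made up so the sketch runs on its own.

```python
def compactness_by_category(data):
    """Map category -> compactness from v.geometry-style JSON output."""
    return {rec["category"]: rec["compactness"] for rec in data["records"]}


# With an active GRASS session, the data would come from the tool itself
# (assumed API; v.geometry is the tool proposed in this PR):
#
#     from grass.tools import Tools
#     tools = Tools()
#     data = tools.v_geometry(map="geology", option="compactness").json
#
# Made-up sample with the same assumed shape:
data = {
    "records": [
        {"category": 1, "compactness": 1.8},
        {"category": 2, "compactness": 2.4},
    ]
}

for category, compactness in compactness_by_category(data).items():
    print(category, compactness)
```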

# %end

# %option G_OPT_F_SEP
# % answer: {NULL}

Does {NULL} work?

Comment on lines +57 to +61
# %option G_OPT_F_FORMAT
# % options: plain,json,csv
# % answer: json
# % descriptions: plain;Plain text with pipe separator by default;json;JSON (JavaScript Object Notation);csv;CSV (Comma Separated Values)
# %end

We can address this in another PR, but we should make the order of options for G_OPT_F_FORMAT consistent across all tools.

Comment on lines +125 to +152
def _available_cpus():
    """Number of CPUs this process may actually use.

    Prefers affinity-aware sources over ``os.cpu_count()``, which reports
    the host total and overcounts in containers and cgroup-limited jobs.
    """
    if hasattr(os, "process_cpu_count"):  # Python 3.13+
        return os.process_cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):  # Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def _resolve_nprocs(nprocs):
    """Resolve G_OPT_M_NPROCS into a worker count for ThreadPoolExecutor.

    Mirrors the semantics of G_set_omp_num_threads() in
    lib/gis/omp_threads.c: 0 means use all available cores, a positive
    number is used as-is, a negative number means cpu_count + nprocs
    (clamped to at least 1). Belongs in a library helper eventually.
    """
    nprocs = int(nprocs)
    if nprocs > 0:
        return nprocs
    available = _available_cpus()
    if nprocs == 0:
        return available
    return max(1, available + nprocs)

General note: we should consider adding a version of this to grass.script, since the same logic is widely reused.

Something like:

gs.resolve_nprocs(nprocs=0)
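A library version could be sketched as below. The name `resolve_nprocs` follows the suggestion above and is hypothetical, as is the extra `available` parameter, which is added here only so the behavior can be shown without depending on the host's CPU count. The logic mirrors `_resolve_nprocs` from this PR.

```python
import os


def resolve_nprocs(nprocs=0, available=None):
    """Hypothetical library helper mirroring _resolve_nprocs from this PR.

    `available` is an illustrative addition (not in the PR); when omitted,
    the CPU count usable by the current process is detected.
    """
    nprocs = int(nprocs)
    if nprocs > 0:
        return nprocs
    if available is None:
        if hasattr(os, "process_cpu_count"):  # Python 3.13+
            available = os.process_cpu_count() or 1
        elif hasattr(os, "sched_getaffinity"):  # Linux
            available = len(os.sched_getaffinity(0))
        else:
            available = os.cpu_count() or 1
    # 0 means all available CPUs; a negative value means
    # "all but |nprocs|", clamped to at least 1.
    return max(1, available + nprocs)


print(resolve_nprocs(4))                 # 4: positive values pass through
print(resolve_nprocs(0, available=8))    # 8: all available
print(resolve_nprocs(-2, available=8))   # 6: all but two
print(resolve_nprocs(-16, available=8))  # 1: clamped
```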

@petrasovaa (Contributor, Author) replied:

Ok, this is now in a separate PR #7414. Once that is merged I will update it here.
