Collections API: RasterLinesJoin improvements #72

@kellyi

Spoke with @lossyrob a bit about how we might improve the performance of RasterLinesJoin and he pointed out two optimizations we could make:

Reducing the number of stream lines looped over per tile

Currently we loop over the whole set of MultiLines for each tile here: https://github.com/WikiWatershed/mmw-geoprocessing/blob/develop/api/src/main/scala/Geoprocessing.scala#L106

However, we could consider looping over only the subset of lines which actually intersect the tile. Depending on how many tiles there are for an AOI, this would reduce the number of iterations of the lines loop, since for each tile we'd only be dealing with lines that can actually contribute values.

We'd have to check whether any improvement here would be offset by the cost of the extra pass over the lines that the intersection test presumably requires before the main loop.
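A rough sketch of the per-tile filtering idea, using a cheap bounding-box test rather than a full geometric intersection. The `Extent` and `Line` types below are simplified, hypothetical stand-ins for the GeoTrellis types (not the real classes), and `linesForTile` is an illustrative name, not something that exists in the codebase:

```scala
object TileFilterSketch {
  // Simplified stand-in for a tile's bounding box (hypothetical, not the GeoTrellis class).
  final case class Extent(xmin: Double, ymin: Double, xmax: Double, ymax: Double) {
    // True when the two boxes overlap or touch.
    def intersects(other: Extent): Boolean =
      !(other.xmax < xmin || other.xmin > xmax ||
        other.ymax < ymin || other.ymin > ymax)
  }

  // Simplified stand-in for a stream line: an ordered, non-empty list of (x, y) vertices.
  final case class Line(points: Seq[(Double, Double)]) {
    // Bounding box of the vertices, computed once per line.
    lazy val envelope: Extent = Extent(
      points.map(_._1).min, points.map(_._2).min,
      points.map(_._1).max, points.map(_._2).max
    )
  }

  // Keep only lines whose bounding box overlaps the tile extent.
  // This is a prefilter: it can keep a line that misses the tile,
  // but it never drops a line that actually crosses it.
  def linesForTile(tileExtent: Extent, lines: Seq[Line]): Seq[Line] =
    lines.filter(_.envelope.intersects(tileExtent))
}
```

Because each envelope is computed once and reused for every tile, the extra pass is a handful of comparisons per line per tile. Whether that beats the current loop is still something we'd need to measure.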

Using Lines rather than MultiLines

Currently we do some processing on the input to transform the input stream vectors into MultiLines: https://github.com/WikiWatershed/mmw-geoprocessing/blob/develop/api/src/main/scala/Utils.scala#L120

However, GeoTrellis (GT) apparently unpacks the MultiLines into individual Lines anyway, so we could flatMap the stream vectors into a Seq[Line] up front and then try using something like a forEachByLineString method in the loop.
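The flattening step could look something like the following. Again, `Line` and `MultiLine` here are simplified, hypothetical stand-ins rather than the real GeoTrellis classes, and `toLines` is an illustrative name:

```scala
object FlattenSketch {
  // Simplified stand-ins for the geometry types (hypothetical, for illustration only).
  final case class Line(points: Seq[(Double, Double)])
  final case class MultiLine(lines: Seq[Line])

  // Unpack every MultiLine into its component Lines, preserving order,
  // so the per-tile loop can work on a flat Seq[Line].
  def toLines(multiLines: Seq[MultiLine]): Seq[Line] =
    multiLines.flatMap(_.lines)
}
```

Doing this once while parsing the input would mean the per-tile loop never has to unpack MultiLines itself.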
