Add collapse_comments option and extract includedActivities fields#345

Merged: cmutel merged 2 commits into main from feature/ecospold2-collapse-comments on Apr 24, 2026

cmutel (Member) commented on Apr 24, 2026

Summary

  • New collapse_comments kwarg (default True) on Ecospold2DataExtractor.extract() and extract_activity(). When False, the comment field is a dict keyed by source (e.g. "general", "included activities start", "geography", …) instead of a flat concatenated string. Only non-empty keys are included.
  • included_activities_start and included_activities_end are now always extracted as top-level dataset fields from activityDescription/activity/includedActivitiesStart and …End.
  • Fixed a pre-existing double space in the collapsed comment separator (": " instead of ":  "). The old code used " ".join() on a label that already ended with a space.
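
The two comment modes and the spacing bug can be illustrated with a minimal sketch. `build_comment` is a hypothetical helper written for this example, not the extractor's actual code, and the flat "label: text" layout joined by newlines is an assumption; only the dict-versus-string behaviour, the omission of empty keys, and the `" ".join()` cause of the double space come from the description above.

```python
def build_comment(parts, collapse_comments=True):
    """Combine comment fragments.

    `parts` maps a source label (e.g. "general") to its text, or None.
    Hypothetical helper for illustration; not the extractor's actual code.
    """
    # Keep only sources that actually carry text.
    non_empty = {k: v.strip() for k, v in parts.items() if v and v.strip()}
    if not collapse_comments:
        return non_empty
    # Assumed flat layout: one "label: text" entry per line.
    return "\n".join(f"{label}: {text}" for label, text in non_empty.items())


parts = {
    "general": "Steel production.",
    "included activities start": "From ore delivery.",
    "geography": None,  # missing field: dropped in both modes
}
collapsed = build_comment(parts)
as_dict = build_comment(parts, collapse_comments=False)

# The spacing bug: joining with " " when the label already ends in a
# space produces two spaces after the colon.
old_style = " ".join(["geography: ", "Europe"])
```

Here `as_dict` contains only the "general" and "included activities start" keys, and `old_style` shows the doubled space that the fix removes.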

Test plan

  • Existing test_extraction_without_synonyms and test_extraction_with_synonyms updated for new fields and corrected spacing
  • New test_collapse_comments_false verifies dict output, correct key values, and absent keys for missing fields
  • All 9 extractor tests pass (pytest tests/ecospold2/ecospold2_extractor.py)

cmutel added 2 commits April 24, 2026 09:55
… ecospold2 extractor

- New `collapse_comments` kwarg (default True) on `extract()` and `extract_activity()`;
  when False, `comment` is a dict keyed by source field instead of a flat string
- Always extract `included_activities_start` and `included_activities_end` as top-level
  dataset fields from activityDescription/activity
- Fix pre-existing double-space in collapsed comment separator (": " not ":  ")
- Update existing tests for new fields and corrected spacing; add test for dict mode
…_period)

Add start_date, end_date, and valid_for_entire_period (bool) as top-level dataset
fields extracted from activityDescription/timePeriod XML attributes.
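
A rough sketch of this kind of attribute extraction, using only the standard library. The attribute names `startDate`, `endDate`, and `isDataValidForEntirePeriod`, the sample XML, and the `extract_time_period` helper are assumptions made for illustration; real ecospold2 files are also namespaced, which this sketch omits.

```python
import xml.etree.ElementTree as ET

# Simplified sample; attribute names are assumed, namespace omitted.
SAMPLE = """<activityDescription>
  <timePeriod startDate="2015-01-01" endDate="2020-12-31"
              isDataValidForEntirePeriod="true"/>
</activityDescription>"""


def extract_time_period(activity_description):
    """Return the time period fields as a dict (hypothetical helper)."""
    time_period = activity_description.find("timePeriod")
    if time_period is None:
        return {}
    return {
        "start_date": time_period.get("startDate"),
        "end_date": time_period.get("endDate"),
        # Convert the XML boolean string to a Python bool.
        "valid_for_entire_period": time_period.get("isDataValidForEntirePeriod") == "true",
    }


fields = extract_time_period(ET.fromstring(SAMPLE))
```
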
cmutel merged commit 69396ba into main on Apr 24, 2026
36 checks passed
cmutel deleted the feature/ecospold2-collapse-comments branch on April 24, 2026 08:37
cmutel mentioned this pull request on Apr 24, 2026