
BERVO gap analysis: variables needed to represent CHESS (Colorado Headwaters Ecological Spectroscopy Study) datasets #29

Description

@cmungall

Summary

A systematic analysis of the CHESS datasets on ESS-DIVE (16 datasets) reveals significant gaps in BERVO's coverage of field ecology observational data. BERVO excels at representing EcoSIM's biogeochemical model parameters but lacks terms for many variables commonly collected in field campaigns that serve as model inputs, calibration targets, or validation benchmarks.

This analysis was generated from the chess-data repo, which downloads all CHESS data from ESS-DIVE and maps each variable to a BERVO term where one exists.
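
As a rough illustration of what "mapping variables to BERVO" and finding gaps can look like, here is a minimal sketch. It is not the chess-data implementation: the dictionary layout, the `find_gaps` helper, and the column names `Aspect`, `Slope`, and `Plot_ID` are assumptions made for the example. Only the two BERVO IDs and the other column names come from this issue.

```python
# Hypothetical curated mapping: CHESS column name -> BERVO term ID.
# None means "no BERVO term found". The layout is an assumption, not the
# format used by the chess-data repo.
CURATED_MAPPING = {
    "Aspect": "BERVO:0000685",   # ID cited in this issue; column name assumed
    "Slope": "BERVO:8000031",    # ID cited in this issue; column name assumed
    "Stem_DBH": None,
    "Vegetation_Cover": None,
    "twi": None,
}


def find_gaps(columns, mapping):
    """Split columns into (mapped, gaps). Columns absent from the mapping count as gaps."""
    mapped = {c: mapping[c] for c in columns if mapping.get(c)}
    gaps = [c for c in columns if not mapping.get(c)]
    return mapped, gaps


# "Plot_ID" is a made-up column that the mapping has never seen.
mapped, gaps = find_gaps(["Aspect", "Stem_DBH", "twi", "Plot_ID"], CURATED_MAPPING)
print("mapped:", mapped)
print("gaps:", gaps)
```

Treating unknown columns as gaps keeps the report conservative: anything not explicitly curated shows up for review.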


Gap Categories

1. 🔴 Vegetation Structure (HIGH PRIORITY — core EcoSIM inputs)

These are standard field measurements that parameterize or validate plant functional type (PFT) models:

| Missing Variable | CHESS Source | Why BERVO Needs It |
| --- | --- | --- |
| Vegetation cover fraction | Vegetation_Cover, Cover_Percent, FractionalCover | Ground-truth for canopy coverage. Maps to PFT fractional area in EcoSIM grid cells. No BERVO term exists for this fundamental measurement. |
| Stem diameter (DBH) | Stem_DBH, DBH_1_CM, DBH_2_CM, DBH_Avg_CM | Standard forest inventory metric. Allometric equations use DBH to estimate biomass, carbon stocks, and canopy dimensions, all of which are EcoSIM inputs. |
| Crown class / canopy position | Crown_Class, Canopy_Position | Dominant/codominant/intermediate/suppressed classification. Determines light interception partitioning in canopy models. |
| Basal area | ba (forest structure dataset) | Stand-level metric derived from DBH. Direct input for stand density parameterization. |
| Tree/stem density | density, abla_density, pien_density, pico_density | Species-specific stem counts per area. Required for PFT abundance initialization. |

Existing close matches: BERVO:0000695 (Pft canopy height) covers height but nothing else in this category.
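
To make the relationship between DBH, basal area, and stem density concrete, here is a small sketch using the standard geometric definition (stem cross-section as a circle of diameter DBH). The function names, the plot area, and the per-hectare units are assumptions for illustration. The issue does not state the units or the procedure behind the CHESS `ba` and `density` columns.

```python
import math


def tree_basal_area_m2(dbh_cm):
    """Cross-sectional stem area in m^2 for a DBH given in cm."""
    radius_m = dbh_cm / 200.0  # cm -> m, then diameter -> radius
    return math.pi * radius_m ** 2


def stand_basal_area_m2_per_ha(dbh_values_cm, plot_area_m2):
    """Sum of per-tree basal areas, scaled from the plot to one hectare."""
    total_m2 = sum(tree_basal_area_m2(d) for d in dbh_values_cm)
    return total_m2 * 10_000.0 / plot_area_m2


def stem_density_per_ha(n_stems, plot_area_m2):
    """Stem count scaled from the plot to one hectare."""
    return n_stems * 10_000.0 / plot_area_m2


# Hypothetical 20 m x 20 m plot with three measured stems.
dbh_cm = [10.0, 20.0, 30.0]
print(stand_basal_area_m2_per_ha(dbh_cm, plot_area_m2=400.0))
print(stem_density_per_ha(len(dbh_cm), plot_area_m2=400.0))
```

Because both stand-level quantities are derived from per-stem records, a BERVO term for DBH plus terms for basal area and stem density would let the derived values be traced back to the raw measurement.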

2. 🔴 Leaf Area Index & Canopy Optics (HIGH PRIORITY)

CHESS has an entire LAI dataset with sophisticated measurements:

| Missing Variable | CHESS Source | Why BERVO Needs It |
| --- | --- | --- |
| Field-measured LAI (as distinct from modeled LAI) | L_2200, L_SCATCOR, L_WN, L_LANG, L_ELLIP, L_FV2200 | Multiple LAI estimation methods (LAI-2200, scattering-corrected, Warren-Neilson, Lang, Ellipsoidal). These are observational LAI values used to calibrate/validate the modeled BERVO:8000164 (Leaf area index). BERVO should distinguish observed vs. modeled LAI. |
| Effective LAI (Le) | Le_2200, Le_FV2200, Le_WN | Clumping-uncorrected LAI. Distinct from true LAI. Important for radiation transfer models. |
| Clumping index | CII, ACF_2200, ACF_SCATCOR | Canopy clumping correction factors. BERVO:0000814 references clumping but only as a parameter, not an observed metric. |
| Scattering correction factors | SCATCOR_*, CHI_SCATCOR, S_SCATCOR | Wood-to-total area ratio corrections for LAI instruments. |
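
The distinction between effective and true LAI can be sketched with the commonly used relation Le = Ω · L, where Ω is a clumping index. The snippet below is an illustration under that assumption only. Whether the CHESS `Le_*`, `L_*`, and `ACF_*` columns follow exactly this convention is not stated in this issue, and the function name is hypothetical.

```python
def true_lai_from_effective(effective_lai, clumping_index):
    """Invert Le = clumping_index * L to recover clumping-corrected LAI.

    Assumes a single canopy-level clumping index and ignores any
    woody-area correction.
    """
    if clumping_index <= 0.0:
        raise ValueError("clumping_index must be positive")
    return effective_lai / clumping_index


# A clumped canopy (index below 1) has a true LAI larger than its effective LAI.
print(true_lai_from_effective(effective_lai=2.0, clumping_index=0.5))
```

Since the same plot can carry an effective value, a corrected value, and the correction factor itself, separate terms for each make it clear which quantity a model is being compared against.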

3. 🟡 Soil Properties — Observational (MEDIUM PRIORITY)

CHESS metagenome dataset includes extensive soil characterization. Some overlap with existing BERVO soil terms, but many are missing:

| Missing Variable | CHESS Source | Why BERVO Needs It |
| --- | --- | --- |
| Soil moisture (field-measured) | SoilMoisture_%_1/2/3, water content, VWC_1/2 | Volumetric water content field measurements. Validation target for EcoSIM hydrology. |
| Bulk electrical conductivity | bulk electrical conductivity | TDR-measured. Proxy for soil moisture and salinity. |
| Saturated hydraulic conductivity | ksat | Direct input for EcoSIM soil water flow. |
| Available water capacity | awc | Plant-available water. Key soil parameter. |
| Infiltration rate | infiltrations | Surface hydrology parameterization. |
| Microbial biomass C/N | microbial biomass carbon, microbial biomass nitrogen | Microbial pool initialization for biogeochemical cycling. |
| Soil chemical properties | ammonium nitrogen, nitrate_nitrogen, nitrite_nitrogen, organic nitrogen, total phosphorus, phosphate, aluminum saturation, manganese, zinc | Nutrient pools and chemistry. Some may exist in BERVO under different names and need cross-referencing. |

4. 🟡 Terrain & Topography (MEDIUM PRIORITY)

| Missing Variable | CHESS Source | Why BERVO Needs It |
| --- | --- | --- |
| Elevation | Elevation, Elevation_m, Topographical_Elevation | BERVO has slope-related terms but no explicit elevation term. |
| Topographic wetness index (TWI) | twi | Hydrological connectivity metric. Important for subsurface flow modeling. |
| Topographic position index (TPI) | tpi | Ridge/valley classification. Affects soil depth, moisture. |
| Curvature | curvature | Surface curvature. Controls flow convergence/divergence. |
| Heat load index | heat_load | Aspect-derived radiation metric. |
| Folded aspect | folded_aspect_205 | Southwest-normalized aspect. Common in ecological modeling. |

Existing matches: BERVO:0000685 (Aspect) and BERVO:8000031 (Slope). These cover the primary terrain attributes, but the indices derived from them are missing.
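
To show why these indices are derived quantities rather than restatements of aspect and slope, here is a sketch of two of them using their textbook forms: folded aspect as 180° minus the angular distance to a fold axis, and TWI as ln(a / tan β). The fold axis of 205° is only inferred from the column name `folded_aspect_205`. The function names, the minimum-slope guard, and the input values are assumptions, and the CHESS products may have been computed differently.

```python
import math


def folded_aspect(aspect_deg, fold_axis_deg):
    """Return 180 minus the angular distance between aspect and the fold axis.

    The result is 180 when the slope faces the fold axis and 0 when it
    faces the opposite direction.
    """
    return abs(180.0 - abs((aspect_deg % 360.0) - (fold_axis_deg % 360.0)))


def topographic_wetness_index(specific_catchment_area_m, slope_deg, min_slope_deg=0.1):
    """TWI = ln(a / tan(beta)), with a small floor on slope to avoid dividing by zero."""
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(beta))


# Assumed fold axis of 205 degrees, taken from the column name only.
print(folded_aspect(205.0, fold_axis_deg=205.0))  # faces the fold axis
print(folded_aspect(25.0, fold_axis_deg=205.0))   # faces the opposite direction
print(topographic_wetness_index(100.0, slope_deg=45.0))
```

Both functions need inputs beyond a single aspect or slope value (a fold axis, an upslope contributing area), which is part of why dedicated terms are useful.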

5. 🟡 Climate / Meteorological (MEDIUM PRIORITY)

| Missing Variable | CHESS Source | Why BERVO Needs It |
| --- | --- | --- |
| Actual evapotranspiration (AET) | aet | Water balance component. Key validation target. |
| Snow water equivalent (SWE) | swe, delta_swe | Snowpack hydrology. Critical for mountain ecosystem modeling. |
| Climatic water deficit (CWD) | cwd | Drought stress metric. Drives vegetation response. |
| Mean annual/seasonal temperature | mean annual temperature, mean seasonal temperature | Site characterization. May already exist among BERVO's climate forcing terms and needs checking. |
| Mean annual/seasonal precipitation | mean annual precipitation, average seasonal precipitation | Site characterization. |

6. 🟢 Spectroscopy & Remote Sensing (LOW PRIORITY for EcoSIM, but important for BioEPIC)

CHESS includes imaging spectroscopy (NEON AOP), field spectra, reflectance mosaics, LiDAR. These are primarily used for spatial scaling and are less directly tied to EcoSIM inputs, but are central to the BioEPIC AI transduction workflow.

| Missing Variable | Notes |
| --- | --- |
| Hyperspectral reflectance | NEON AOP 426-band reflectance. Not a single variable but a data type. |
| Canopy water content | Derived from spectroscopy. Relevant to plant stress. |
| Shade fraction | Canopy structural metric from LiDAR/spectroscopy. |
| LiDAR-derived canopy height model (CHM) | Spatial canopy height. Validates BERVO:0000708. |
| Digital terrain/surface models (DTM/DSM) | Gridded elevation products. |

7. 🟢 Taxonomy & Biodiversity (OUT OF SCOPE for BERVO, but worth noting)

CHESS has extensive taxonomic data (Taxon_binomial, Taxon_family, GBIF_Taxon_ID, species lists). These are outside BERVO's scope as a biogeochemical variable ontology, but the link between species identity and PFT assignment is a critical workflow gap. Consider cross-referencing GBIF/NCBI taxonomy to BERVO PFT categories.


Recommendations

  1. Immediate (for CHESS integration): Add terms for vegetation cover fraction, stem DBH, field-measured LAI (multiple methods), soil moisture, and basic topographic indices. These are the most commonly measured variables in field ecology and the most direct EcoSIM inputs.

  2. Distinguish observed vs. modeled quantities. BERVO currently treats LAI as a single concept, but field measurements (LAI-2200, hemispherical photo, etc.) are fundamentally different from modeled LAI. Consider an "observation method" axis or parallel observed/modeled term pairs.

  3. Add a "field measurement" category. Many BERVO categories are model-centric ("Plant growth parameters", "Canopy data type"). A "Field observation" or "Measurement" category would naturally house the ground-truth variables that validate model outputs.

  4. Consider extending to derived terrain indices (TWI, TPI, heat load). These are standard inputs for spatially distributed models and increasingly important for CONUS-scale parameterization.

  5. Cross-reference with ESS-DIVE Community Reporting Formats. ESS-DIVE has standardized reporting formats for CSV headers, sample metadata, etc. Aligning BERVO with these would enable automatic mapping.
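
One way to picture the "observation method" axis from recommendation 2 is a record that keeps the BERVO term fixed and carries provenance and method as separate fields. This is only a sketch of the idea, not a proposed BERVO schema: the class names, field names, and enum values are all hypothetical. The term ID, label, method name, and column name are taken from this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Provenance(Enum):
    OBSERVED = "observed"
    MODELED = "modeled"


@dataclass(frozen=True)
class VariableRecord:
    bervo_id: str                        # shared concept, e.g. leaf area index
    label: str
    provenance: Provenance               # the observed/modeled axis
    method: Optional[str] = None         # instrument or estimation method, if observed
    source_column: Optional[str] = None  # dataset column the value came from


modeled_lai = VariableRecord("BERVO:8000164", "Leaf area index", Provenance.MODELED)
observed_lai = VariableRecord(
    "BERVO:8000164",
    "Leaf area index",
    Provenance.OBSERVED,
    method="LAI-2200",
    source_column="L_2200",
)

# Same concept, but the two records stay distinguishable.
print(modeled_lai.bervo_id == observed_lai.bervo_id)
print(modeled_lai == observed_lai)
```

The alternative in recommendation 2, parallel observed/modeled term pairs, would instead give each record its own term ID. The axis approach keeps the term count lower at the cost of pushing the distinction into annotations.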


Data Source

Analysis based on 16 CHESS datasets (ESS-DIVE), spanning:

  • Field vegetation surveys (meadow, shrub, tree plots)
  • Leaf Area Index (LAI-2200 instrument, multiple correction methods)
  • Spectroscopy (field ASD spectrometer, NEON AOP imaging spectroscopy)
  • LiDAR (discrete return + waveform)
  • Geophysics (TDR soil moisture, EMI survey)
  • Metagenomics (soil microbiome + soil properties)
  • Forest structure (census data, individual tree detection)

Code and curated mapping: https://github.com/bioepic-data/chess-data
