This repository was archived by the owner on Nov 17, 2022. It is now read-only.

Extract data from Schemas #303

@hayfield

Schemas define the structure that IATI XML is expected to follow. They contain a number of elements and attributes, each of which carries information that would be useful to extract: descriptions, occurrence properties, and the XPaths at which they occur. Research into this area has not turned up a standard method for doing this with open tooling.

#64 provides an initial attempt at extracting this information. However, it uses tools that aren't really designed for the job, which has led to hundreds of lines of confusing code that is hard to follow, doesn't handle all the cases it needs to, and would be a challenge to maintain.

It is therefore proposed to implement this functionality using a two-stage process:

  • Use XSLT to transform the Schema into an Intermediate Representation (IR) that structures the information in an easy-to-query format
  • Provide capabilities within the schemas module to access the information in the IR through a defined Python API
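
To make the second stage more concrete, here is a minimal sketch of what a Python API over the IR might look like. Everything in it is an assumption for illustration: the IR layout (`<ir>` / `<node>` with `xpath`, `min_occurs` and `max_occurs` attributes), the `SchemaInfo` class, and its method names are hypothetical and are not part of pyIATI.

```python
import xml.etree.ElementTree as ET

# Hypothetical IR: one <node> per schema element or attribute.
# This layout is an assumption, not an agreed format.
EXAMPLE_IR = """
<ir>
  <node xpath="iati-activity/title" min_occurs="1" max_occurs="1">
    <description>A short, human-readable title.</description>
  </node>
  <node xpath="iati-activity/@xml:lang" min_occurs="0" max_occurs="1">
    <description>Default language of the activity.</description>
  </node>
</ir>
"""


class SchemaInfo:
    """Hypothetical read-only view over an IR document."""

    def __init__(self, ir_xml):
        root = ET.fromstring(ir_xml)
        # Index every node by its XPath so lookups are a plain dict access.
        self._nodes = {node.get("xpath"): node for node in root.iter("node")}

    def xpaths(self):
        """Return every known XPath in sorted order."""
        return sorted(self._nodes)

    def description(self, xpath):
        """Return the description text for the given XPath."""
        return self._nodes[xpath].findtext("description", default="").strip()

    def occurs(self, xpath):
        """Return (min_occurs, max_occurs); max_occurs is None if unbounded."""
        node = self._nodes[xpath]
        max_raw = node.get("max_occurs")
        max_occurs = None if max_raw == "unbounded" else int(max_raw)
        return int(node.get("min_occurs")), max_occurs


info = SchemaInfo(EXAMPLE_IR)
print(info.xpaths())
print(info.occurs("iati-activity/@xml:lang"))
```

The point of the sketch is that once the IR exists, the Python side can stay small: it only reads a flat structure and never needs to understand XSD itself.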

Based on preliminary investigation, the IR will likely:

  • Treat elements and attributes as equivalents
    • i.e. an optional attribute would become: min_occurs = 0 and max_occurs = 1
  • Be designed such that the primary key is an XPath

Metadata

    Labels

      • api: Changes to the pyIATI API.
      • enhancement: Some sort of new functionality (rather than fixing or tweaking something that already existed).
      • missing-feature: A major feature that should exist, but does not.
      • parent-issue: An issue that makes reference to a number of other issues that split the large task into parts.
      • schemas: Relating to IATI Schemas.
