Constructor Design for the library #4

@OlehOnyshchak

Description

Think about design choices that would make the library easy to extend. For example, the query function could accept, as an argument, a list of functions for processing text and images. A text handler could take the page's HTML and its URL as input and return a key-value pair to be added to the dataset.
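A minimal sketch of what such a handler-based query could look like. Everything here is an assumption made for illustration, not the library's current API: the names `TextHandler`, `title_handler`, `url_handler`, and the `query` signature are hypothetical, and `pages` (a URL to HTML mapping) stands in for the real download step.

```python
import re
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical handler signature: (html, url) -> (key, value).
TextHandler = Callable[[str, str], Tuple[str, object]]


def title_handler(html: str, url: str) -> Tuple[str, object]:
    """Extract the contents of the first <title> tag, or None if absent."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return "title", (match.group(1).strip() if match else None)


def url_handler(html: str, url: str) -> Tuple[str, object]:
    """Record the page URL itself."""
    return "url", url


def query(pages: Dict[str, str],
          text_handlers: Iterable[TextHandler]) -> List[Dict[str, object]]:
    """Run every handler on every page and collect one record per page.

    `pages` maps URL -> HTML; a real implementation would download the
    pages here instead of receiving them ready-made.
    """
    handlers = list(text_handlers)  # allow generators to be passed safely
    dataset = []
    for url, html in pages.items():
        # Each handler contributes exactly one key-value pair to the record.
        record = dict(handler(html, url) for handler in handlers)
        dataset.append(record)
    return dataset
```

Usage under the same assumptions: `query({"https://example.org/a": "<title>A</title>"}, [title_handler, url_handler])` yields one record with the keys `title` and `url`.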

With this approach, a user who wants to parse an additional field would only need to define a function with the appropriate parsing and pass it as a parameter to the query function, where all the meaty, common processing is done. The user could also select what to download by modifying the list of pre-created handlers, such as those for wikitext or caption parsing. We could also design a uniform way to pass cache-related parameters to such functions.
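One possible shape for the handler list and the shared cache parameters, again only as a sketch. `CacheOptions`, `DEFAULT_HANDLERS`, the handler names, and the three-argument handler signature are all hypothetical; the two "pre-created" handlers are trivial stand-ins, not real wikitext or caption parsers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CacheOptions:
    """Hypothetical cache settings, passed unchanged to every handler."""
    directory: str = ".cache"
    invalidate: bool = False


# Hypothetical signature: (html, url, cache) -> (key, value).
Handler = Callable[[str, str, CacheOptions], Tuple[str, object]]


def caption_handler(html: str, url: str, cache: CacheOptions) -> Tuple[str, object]:
    """Stand-in for a pre-created handler: count <figcaption> elements."""
    return "caption_count", html.count("<figcaption")


def wikitext_handler(html: str, url: str, cache: CacheOptions) -> Tuple[str, object]:
    """Stand-in for a pre-created handler: report the raw page size."""
    return "html_length", len(html)


# Hypothetical default list that a user can copy, shorten, or extend.
DEFAULT_HANDLERS: List[Handler] = [caption_handler, wikitext_handler]


def infobox_handler(html: str, url: str, cache: CacheOptions) -> Tuple[str, object]:
    """User-defined handler: the only code needed to add a new field."""
    return "has_infobox", 'class="infobox' in html


def query(pages: Dict[str, str],
          handlers: Optional[Sequence[Handler]] = None,
          cache: Optional[CacheOptions] = None) -> List[Dict[str, object]]:
    """Apply the chosen handlers to each page, giving all of them the same cache options."""
    handlers = DEFAULT_HANDLERS if handlers is None else handlers
    cache = cache if cache is not None else CacheOptions()
    return [dict(h(html, url, cache) for h in handlers)
            for url, html in pages.items()]
```

Under these assumptions, a user picks fields by editing the list, for example `query(pages, handlers=[caption_handler, infobox_handler], cache=CacheOptions(invalidate=True))`, and every handler sees the same `cache` object without `query` needing to know which handlers actually use it.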

Might be a very good idea, but it requires a lot of work. Will probably be suspended until there is reasonable interest in the script.
