This directory contains examples for extracting structured information from unstructured text using Mellea.
Basic information extraction using generative stubs to extract person names from text.
Key Features:
- Using the @generative decorator for extraction tasks
- Type-safe extraction with a list[str] return type
- Simple, declarative approach to information extraction
- Example with NYTimes article text
More advanced extraction patterns using m.instruct() with structured outputs.
- Named Entity Recognition: Extracting person names, locations, etc.
- Structured Extraction: Getting typed, structured data from text
- Type Safety: Using Python types to constrain extraction format
- Declarative Extraction: Describing what to extract in docstrings
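The last two concepts can be illustrated without any model call. The following is a minimal standard-library sketch, not Mellea's implementation: `build_prompt`, `parse_string_list`, and `extract_all_locations` are hypothetical names introduced here for illustration. It assumes the docstring supplies the instructions, the return annotation names the expected output type, and the model replies with JSON.

```python
import inspect
import json
from typing import get_type_hints


def build_prompt(func, **kwargs) -> str:
    """Assemble a prompt from a function's docstring, arguments, and return type."""
    instructions = inspect.cleandoc(func.__doc__ or "")
    return_type = get_type_hints(func).get("return", str)
    args = "\n".join(f"{name}: {value}" for name, value in kwargs.items())
    return f"{instructions}\n\nExpected output type: {return_type}\n\n{args}"


def parse_string_list(raw: str) -> list[str]:
    """Parse a JSON reply and reject anything that is not a list of strings."""
    value = json.loads(raw)
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"expected a JSON list of strings, got: {raw!r}")
    return value


# Hypothetical declarative stub: the docstring says what to extract,
# the annotation says what shape the answer must have.
def extract_all_locations(doc: str) -> list[str]:
    """
    Given a document, extract names of ALL mentioned locations.
    Return these names as list of strings.
    """


prompt = build_prompt(extract_all_locations, doc="The summit was held in Berlin.")
# Stand-in for a model reply; no model is called in this sketch.
locations = parse_string_list('["Berlin"]')
```

A reply that is valid JSON but has the wrong shape, such as `{"city": "Berlin"}`, raises `ValueError` instead of reaching the caller, which is the practical benefit of constraining the extraction format with a Python type.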
from mellea import generative, start_session


@generative
def extract_all_person_names(doc: str) -> list[str]:
    """
    Given a document, extract names of ALL mentioned persons.
    Return these names as list of strings.
    """


# article_text is assumed to hold the document text, such as the NYTimes article used in the example.
m = start_session()
names = extract_all_person_names(m, doc=article_text)
print(names)  # e.g. ['President Obama', 'Angela Merkel']

- Document Processing: Extract key information from documents
- Data Mining: Pull structured data from unstructured sources
- Content Analysis: Identify entities, relationships, and facts
- Metadata Generation: Create structured metadata from text
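As an illustration of the metadata use case, the sketch below turns a list of extracted names into a small metadata record. It is plain Python and makes no Mellea calls; `build_metadata`, the record fields, and the document ID are hypothetical choices made for this example.

```python
def build_metadata(doc_id: str, names: list[str]) -> dict:
    """Collapse extracted names into a small metadata record."""
    seen = set()
    unique = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.casefold()  # case-insensitive duplicate detection
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return {"doc_id": doc_id, "persons": unique, "person_count": len(unique)}


# The input list stands in for the output of an extraction call.
record = build_metadata(
    "article-001",
    ["President Obama", "Angela Merkel", "president obama "],
)
```

Deduplicating on a case-folded, stripped key keeps the first spelling of each name, which helps when a model returns the same person more than once with minor formatting differences.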
- See generative_stubs/ for more on the @generative decorator
- See mellea/stdlib/components/genstub.py for implementation details