PICO Agent Developer Documentation

Overview

PICOAgent extracts structured PICO (Population, Intervention, Comparison, Outcome) components from biomedical research papers. It subclasses BaseAgent, attaches a 0-1 confidence score to each extraction, and retries when the model's response cannot be parsed as JSON.

Architecture

Class Hierarchy

BaseAgent (base.py)
    ↓
PICOAgent (pico_agent.py)

Core Components

PICOExtraction (dataclass): Represents extracted PICO components with metadata
PICOAgent (class): Main agent for extraction logic
Statistics Tracking: Built-in extraction performance monitoring

API Reference

PICOExtraction Dataclass

@dataclass
class PICOExtraction:
    """Represents extracted PICO components from a study."""

    # Core PICO components (required)
    population: str
    intervention: str
    comparison: str
    outcome: str

    # Metadata (required)
    document_id: str
    document_title: str
    extraction_confidence: float  # 0-1 overall confidence

    # Optional enrichment fields
    study_type: Optional[str] = None  # e.g., "RCT", "cohort study"
    sample_size: Optional[str] = None  # e.g., "N=150"
    pmid: Optional[str] = None
    doi: Optional[str] = None

    # Component-level confidence scores
    population_confidence: Optional[float] = None
    intervention_confidence: Optional[float] = None
    comparison_confidence: Optional[float] = None
    outcome_confidence: Optional[float] = None

    # Timestamp
    created_at: Optional[datetime] = None  # Auto-set in __post_init__
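
The auto-set timestamp can be illustrated with a trimmed stand-in. This is a minimal sketch, not the actual class: PICOExtractionSketch is a hypothetical name, it carries only a subset of the fields above, and the real __post_init__ may handle time zones differently. The field values are illustrative only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PICOExtractionSketch:
    """Hypothetical, trimmed stand-in for PICOExtraction."""

    population: str
    intervention: str
    comparison: str
    outcome: str
    document_id: str
    document_title: str
    extraction_confidence: float
    study_type: Optional[str] = None
    created_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        # Fill in the timestamp only when the caller did not supply one.
        if self.created_at is None:
            self.created_at = datetime.now(timezone.utc)


example = PICOExtractionSketch(
    population="Adults with type 2 diabetes",
    intervention="Metformin",
    comparison="Placebo",
    outcome="Change in HbA1c",
    document_id="12345",
    document_title="Example trial",
    extraction_confidence=0.9,
)
```

Because every optional field defaults to None, callers only need to supply the four PICO components and the three required metadata fields.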

PICOAgent Class

Constructor

def __init__(
    self,
    model: str = "gpt-oss:20b",
    host: str = "http://localhost:11434",
    temperature: float = 0.1,
    top_p: float = 0.9,
    max_tokens: int = 2000,
    callback: Optional[Callable[[str, str], None]] = None,
    orchestrator=None,
    show_model_info: bool = True,
    max_retries: int = 3
)

Parameters:

model: Ollama model name (default: gpt-oss:20b for high accuracy)
host: Ollama server URL
temperature: Low value (0.1) for consistent, factual extraction
top_p: Nucleus sampling parameter
max_tokens: Maximum response length (2000 sufficient for detailed PICO)
callback: Progress callback function(step: str, data: str)
orchestrator: Optional orchestrator for queue-based processing
show_model_info: Display initialization info
max_retries: Retry attempts for failed extractions

Core Methods

extract_pico_from_document()

def extract_pico_from_document(
    self,
    document: Dict[str, Any],
    min_confidence: float = 0.5
) -> Optional[PICOExtraction]

Purpose: Extract PICO components from a single document

Parameters:

document: Dictionary with keys:
- id (required): Document identifier
- title (required): Paper title
- abstract (required if no full_text): Paper abstract
- full_text (optional): Complete paper text (preferred over abstract)
- pmid (optional): PubMed ID
- doi (optional): Digital Object Identifier
- publication_date (optional): Publication date
min_confidence: Minimum overall confidence threshold (0.0-1.0)
- 0.8-1.0: High confidence (systematic reviews)
- 0.5-0.7: Medium confidence (screening)
- 0.3-0.5: Low confidence (exploratory)

Returns:

PICOExtraction object if successful and confidence >= threshold
None if extraction failed, confidence too low, or no text available

Process Flow:

Validate Ollama connection
Get document text (prefer full_text, fall back to abstract)
Truncate text longer than 8000 characters to stay within the model's context window
Build extraction prompt with detailed instructions
Call LLM with retry logic (via _generate_and_parse_json())
Parse JSON response
Validate required fields
Check confidence threshold
Update statistics

Create and return PICOExtraction object

Error Handling:

Returns None on Ollama connection failure
Returns None on missing text
Returns None on parse failure after max retries
Returns None on missing required PICO fields
Returns None on confidence below threshold
Logs all failures for debugging
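
The field-validation and threshold checks above can be sketched as a single helper. This is an illustration under stated assumptions: rejection_reason is a hypothetical name, and the key overall_confidence is taken from the prompt requirements described later in this document; the agent's actual internals may differ.

```python
from typing import Any, Dict, Optional

REQUIRED_FIELDS = ("population", "intervention", "comparison", "outcome")


def rejection_reason(
    pico_data: Dict[str, Any], min_confidence: float = 0.5
) -> Optional[str]:
    """Return why a parsed response would be rejected, or None if it passes."""
    # A field counts as missing when it is absent or empty.
    missing = [name for name in REQUIRED_FIELDS if not pico_data.get(name)]
    if missing:
        return "missing fields: " + ", ".join(missing)

    confidence = float(pico_data.get("overall_confidence", 0.0))
    if confidence < min_confidence:
        return f"confidence {confidence:.2f} below threshold {min_confidence:.2f}"
    return None
```

Returning a reason string rather than a bare boolean makes it straightforward to log why a document was dropped.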

extract_pico_batch()

def extract_pico_batch(
    self,
    documents: List[Dict[str, Any]],
    min_confidence: float = 0.5,
    progress_callback: Optional[Callable[[int, int, str], None]] = None
) -> List[PICOExtraction]

Purpose: Extract PICO from multiple documents with progress tracking

Parameters:

documents: List of document dictionaries
min_confidence: Minimum confidence threshold
progress_callback: Optional callback(current: int, total: int, title: str)

Returns: List of successful PICOExtraction objects (excludes failures)

Usage Example:

def show_progress(current, total, doc_title):
    print(f"[{current}/{total}] {doc_title[:50]}...")

extractions = agent.extract_pico_batch(
    documents=docs,
    progress_callback=show_progress
)

get_extraction_stats()

def get_extraction_stats(self) -> Dict[str, Any]

Purpose: Get extraction performance statistics

Returns: Dictionary with:

total_extractions: Total attempted
successful_extractions: Successfully extracted
failed_extractions: Failed to extract
low_confidence_extractions: Below confidence threshold
parse_failures: JSON parse errors
success_rate: successful / total (0.0-1.0)
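
As a rough sketch, success_rate could be derived from the raw counters as shown below. The helper name with_success_rate is hypothetical, and returning 0.0 when nothing has been attempted is an assumption about how the zero-total case is handled.

```python
from typing import Any, Dict


def with_success_rate(stats: Dict[str, int]) -> Dict[str, Any]:
    """Return a copy of the counters with a derived success_rate (0.0-1.0)."""
    total = stats.get("total_extractions", 0)
    successful = stats.get("successful_extractions", 0)
    # Guard against division by zero before any extraction has run.
    rate = successful / total if total else 0.0
    return {**stats, "success_rate": rate}
```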

format_pico_summary()

def format_pico_summary(self, extraction: PICOExtraction) -> str

Purpose: Format PICO extraction as human-readable text

Returns: Multi-line formatted string with all PICO components

export_to_json()

def export_to_json(
    self,
    extractions: List[PICOExtraction],
    output_file: str
) -> None

Purpose: Export extractions to JSON file with metadata

Output Structure:

{
  "extractions": [
    {
      "population": "...",
      "intervention": "...",
      "comparison": "...",
      "outcome": "...",
      "extraction_confidence": 0.95,
      "document_id": "12345",
      "document_title": "...",
      "study_type": "RCT",
      "sample_size": "N=150",
      "pmid": "12345",
      "doi": "10.1000/x",
      "population_confidence": 0.9,
      "intervention_confidence": 0.95,
      "comparison_confidence": 0.9,
      "outcome_confidence": 0.95,
      "created_at": "2025-01-15T10:30:00Z"
    }
  ],
  "metadata": {
    "total_extractions": 1,
    "extraction_date": "2025-01-15T10:30:00Z",
    "agent_model": "gpt-oss:20b",
    "statistics": {...}
  }
}
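
A payload with this shape could be assembled roughly as follows. This is a sketch only: build_export_payload is a hypothetical helper, extractions are represented as plain dictionaries, and the real method may serialize timestamps differently.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_export_payload(
    extractions: List[Dict[str, Any]],
    model: str,
    statistics: Dict[str, Any],
) -> Dict[str, Any]:
    """Wrap already-serialized extractions in the documented envelope."""
    return {
        "extractions": extractions,
        "metadata": {
            "total_extractions": len(extractions),
            "extraction_date": datetime.now(timezone.utc).isoformat(),
            "agent_model": model,
            "statistics": statistics,
        },
    }


payload = build_export_payload(
    [{"population": "...", "document_id": "12345"}],
    model="gpt-oss:20b",
    statistics={"total_extractions": 1, "successful_extractions": 1},
)
text = json.dumps(payload, indent=2)
```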

export_to_csv()

def export_to_csv(
    self,
    extractions: List[PICOExtraction],
    output_file: str
) -> None

Purpose: Export extractions to CSV for systematic review tools

CSV Columns:

document_id, document_title, pmid, doi
study_type, sample_size
population, intervention, comparison, outcome
population_confidence, intervention_confidence, comparison_confidence, outcome_confidence
extraction_confidence, created_at
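
The column layout above maps naturally onto csv.DictWriter from the standard library. The sketch below writes to an in-memory buffer and takes plain dictionaries; rows_to_csv is a hypothetical helper, not the agent's export method.

```python
import csv
import io
from typing import Any, Dict, Iterable

CSV_COLUMNS = [
    "document_id", "document_title", "pmid", "doi",
    "study_type", "sample_size",
    "population", "intervention", "comparison", "outcome",
    "population_confidence", "intervention_confidence",
    "comparison_confidence", "outcome_confidence",
    "extraction_confidence", "created_at",
]


def rows_to_csv(rows: Iterable[Dict[str, Any]]) -> str:
    """Serialize rows in the documented column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=CSV_COLUMNS,
        restval="",             # missing keys become empty cells
        extrasaction="ignore",  # unknown keys are dropped, not errors
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Fixing the column order in one list keeps the header and the rows in sync if fields are added later.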

Prompt Engineering

Extraction Prompt Structure

The extraction prompt is built from the following parts:

Role Definition: "You are a medical research expert..."
Task Description: Extract PICO components
Detailed Instructions: For each PICO component with examples
Confidence Scoring: Guidelines for 0.0-1.0 scale
Critical Requirements:
- Extract ONLY information actually present in text
- DO NOT invent or fabricate information
- Use "Not clearly stated" for missing components
- Calculate overall_confidence as average of component confidences
Response Format: JSON schema with all required fields
Strict Output: "Respond ONLY with valid JSON"

Confidence Scoring Guidelines

The prompt instructs the LLM to score each component:

1.0: Explicitly stated, no ambiguity
0.8: Clearly stated but some details missing
0.6: Can be inferred but not explicit
0.4: Partially mentioned, significant uncertainty
0.2: Barely mentioned, high uncertainty
0.0: Not found in text
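
The prompt asks the model to report overall_confidence as the average of the four component scores. The same arithmetic can be reproduced on the client side as a sanity check. This is a sketch: overall_confidence here is a hypothetical helper, and treating a missing component as 0.0 is an assumption that matches the "Not found in text" score.

```python
from statistics import mean
from typing import Mapping

COMPONENTS = ("population", "intervention", "comparison", "outcome")


def overall_confidence(component_scores: Mapping[str, float]) -> float:
    """Average the four component scores; a missing component counts as 0.0."""
    return mean([float(component_scores.get(name, 0.0)) for name in COMPONENTS])


scores = {"population": 1.0, "intervention": 0.8, "comparison": 0.6, "outcome": 0.8}
average = overall_confidence(scores)  # (1.0 + 0.8 + 0.6 + 0.8) / 4
```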

Handling Missing Information

The agent explicitly instructs the LLM to:

Write "Not clearly stated" for missing PICO components
Give low confidence scores (0.0-0.4) for uncertain extractions
Never fabricate or assume information

This reduces the risk of fabricated content and makes each extraction easier to verify against the source text.

Internal Implementation Details

Text Preprocessing

# Prefer full text if available
text_to_analyze = full_text if full_text else abstract

# Truncate to avoid context limit issues
if len(text_to_analyze) > 8000:
    text_to_analyze = text_to_analyze[:8000] + "..."

Rationale:

8000 characters ≈ 2000 tokens (safe for most models)
Typically covers the abstract and opening sections; later sections of a full-text paper may be cut off
Prevents out-of-memory errors
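
The snippet above can be wrapped into a self-contained helper that also covers the no-text case. This is a sketch: select_text is a hypothetical name, and the agent's own implementation may be structured differently.

```python
from typing import Any, Dict, Optional

MAX_CHARS = 8000  # roughly 2000 tokens at about 4 characters per token


def select_text(document: Dict[str, Any], max_chars: int = MAX_CHARS) -> Optional[str]:
    """Pick full_text over abstract and truncate; None when no text exists."""
    text = document.get("full_text") or document.get("abstract")
    if not text:
        return None
    if len(text) > max_chars:
        # The trailing "..." marks the cut, as in the snippet above.
        return text[:max_chars] + "..."
    return text
```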

JSON Parsing with Retry

Uses BaseAgent._generate_and_parse_json():

pico_data = self._generate_and_parse_json(
    prompt,
    max_retries=self.max_retries,
    retry_context=f"PICO extraction (doc {doc_id})",
    num_predict=self.max_tokens
)

Features:

Automatic retry on JSON parse failures
Regenerates LLM response (not just re-parse)
Detailed logging of retry attempts
Raises JSONDecodeError once retries are exhausted; extract_pico_from_document() reports this case to callers as a None return
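
The regenerate-then-parse behaviour can be sketched independently of BaseAgent. The function below is illustrative only: generate_and_parse and its generate callable are hypothetical, and reading max_retries as "retries after the first attempt" is an assumption.

```python
import json
from typing import Any, Callable, Optional


def generate_and_parse(generate: Callable[[], str], max_retries: int = 3) -> Any:
    """Parse generate()'s output as JSON, regenerating after each parse failure.

    Makes one initial attempt plus up to max_retries retries, then
    re-raises the last JSONDecodeError.
    """
    last_error: Optional[json.JSONDecodeError] = None
    for _ in range(max(0, max_retries) + 1):
        raw = generate()  # a fresh response each time, not a re-parse
        try:
            return json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = exc
    assert last_error is not None  # the loop always runs at least once
    raise last_error


# Stand-in for an LLM call: the first reply is malformed, the second is valid.
replies = iter(["Sure! Here is the JSON:", '{"population": "Adults"}'])
result = generate_and_parse(lambda: next(replies))
```

Regenerating instead of re-parsing matters because a malformed response will fail the same way on every parse attempt.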

Statistics Tracking

self._extraction_stats = {
    'total_extractions': 0,
    'successful_extractions': 0,
    'failed_extractions': 0,
    'low_confidence_extractions': 0,
    'parse_failures': 0
}

Updated throughout extraction lifecycle:

total_extractions: Incremented on every attempt
successful_extractions: Incremented on successful extraction
failed_extractions: Incremented on errors/exceptions
low_confidence_extractions: Incremented when confidence < threshold
parse_failures: Incremented on JSON parse errors

Integration with BMLibrarian

Configuration System

The agent integrates with BMLibrarian's configuration:

from bmlibrarian.config import get_model, get_agent_config

# Get configured model
model = get_model('pico_agent')  # Returns "gpt-oss:20b" by default

# Get agent configuration
config = get_agent_config('pico')
agent = PICOAgent(
    model=model,
    **config  # temperature, top_p, max_tokens, etc.
)

Factory Pattern

Create via AgentFactory:

from bmlibrarian.agents import AgentFactory

agent = AgentFactory.create_agent('pico')

Database Integration

Extract PICO from documents in database:

from bmlibrarian.database import get_db_manager

db_manager = get_db_manager()
with db_manager.get_connection() as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT id, title, abstract FROM document LIMIT 100")
        documents = [
            {
                'id': row[0],
                'title': row[1],
                'abstract': row[2]
            }
            for row in cur.fetchall()
        ]

extractions = agent.extract_pico_batch(documents)

Testing

Unit Tests

See tests/test_pico_agent.py for the test suite, which includes:

Initialization tests
Successful extraction tests
Error handling tests (low confidence, missing fields, parse errors)
Batch processing tests
Export tests (JSON, CSV)
Statistics tracking tests
Callback integration tests

Running Tests

uv run python -m pytest tests/test_pico_agent.py -v

Test Coverage

The test suite covers:

✓ Agent initialization
✓ PICO extraction success path
✓ Low confidence filtering
✓ Missing abstract/text handling
✓ Incomplete PICO fields
✓ JSON parse errors
✓ Connection failures
✓ Batch processing
✓ Progress callbacks
✓ Statistics tracking
✓ Export to JSON
✓ Export to CSV
✓ Text truncation
✓ Dataclass methods

Performance Considerations

Extraction Speed

Approximate times (with gpt-oss:20b on M1 Mac):

Single document: 3-5 seconds
Batch of 10: 30-50 seconds
Batch of 100: 5-8 minutes

Optimization Tips:

Use batch processing (extract_pico_batch() not individual calls)
Use a faster model for initial screening (medgemma-27b, or medgemma4B for first-pass filtering; see Model Selection)
Process documents in parallel (future enhancement)
Filter documents before extraction (e.g., by study type metadata)

Memory Usage

Minimal memory overhead (~10 MB per agent instance)
Document text truncated to 8000 chars (prevents OOM)
Batch processing handles lists of any size
Export methods write incrementally (no memory bottleneck)

Model Selection

Model	Speed	Accuracy	Use Case
gpt-oss:20b	Slow	High	Systematic reviews, meta-analyses
medgemma-27b	Medium	Medium-High	General screening
medgemma4B	Fast	Medium	Initial filtering only

Extension Points

Custom Extraction Schema

Extend PICOExtraction for specialized fields:

from dataclasses import dataclass
from typing import Optional

from bmlibrarian.agents import PICOExtraction

@dataclass
class ExtendedPICOExtraction(PICOExtraction):
    """Extended PICO with additional fields."""
    study_duration: Optional[str] = None
    funding_source: Optional[str] = None
    registration_number: Optional[str] = None  # ClinicalTrials.gov

Custom Prompts

Subclass PICOAgent to customize prompts:

class CustomPICOAgent(PICOAgent):
    def extract_pico_from_document(self, document, min_confidence=0.5):
        # Custom prompt engineering
        custom_prompt = self._build_custom_prompt(document)
        # ... rest of implementation

Integration with Orchestrator

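Queue-based submission is a planned feature (see Future Enhancements); the intended usage looks like this:

# TaskPriority is assumed to be exported by bmlibrarian.agents; adjust the import if it lives elsewhere
from bmlibrarian.agents import AgentOrchestrator, PICOAgent, TaskPriority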

orchestrator = AgentOrchestrator(max_workers=4)
agent = PICOAgent(orchestrator=orchestrator)

# Submit batch tasks (future enhancement)
task_ids = agent.submit_pico_extraction_tasks(
    documents=documents,
    priority=TaskPriority.NORMAL
)

# Wait for completion
results = orchestrator.wait_for_completion(task_ids)

Common Pitfalls

1. Using Abstracts Instead of Full Text

Problem: Abstracts often lack detailed PICO components

Solution: Extract full text from PDFs when available

2. Setting Confidence Too High

Problem: min_confidence=0.9 rejects most extractions

Solution: Use 0.5-0.7 for initial screening and manually verify high-impact papers

3. Not Checking Statistics

Problem: Low success rate goes unnoticed

Solution: Always check get_extraction_stats() after batch processing

4. Ignoring Component Confidence

Problem: An acceptable overall confidence can hide a low-confidence individual component

Solution: Check individual component confidences for critical analyses
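
One way to surface weak components is a small helper that reads the *_confidence attributes of an extraction. This is a sketch: weak_components and the 0.6 cut-off are illustrative choices, and a SimpleNamespace stands in for a real PICOExtraction.

```python
from types import SimpleNamespace
from typing import Any, List

COMPONENTS = ("population", "intervention", "comparison", "outcome")


def weak_components(extraction: Any, threshold: float = 0.6) -> List[str]:
    """List components whose confidence is missing or below threshold."""
    weak = []
    for name in COMPONENTS:
        score = getattr(extraction, f"{name}_confidence", None)
        if score is None or score < threshold:
            weak.append(name)
    return weak


# Stand-in object; a real PICOExtraction exposes the same attribute names.
sample = SimpleNamespace(
    population_confidence=0.9,
    intervention_confidence=0.95,
    comparison_confidence=0.4,
    outcome_confidence=None,
)
flagged = weak_components(sample)
```

Treating a missing score as weak is the conservative choice for critical analyses, since the component-level fields are optional.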

5. Processing Too Many Documents at Once

Problem: Long batch processing times without feedback

Solution: Use progress_callback and process in smaller batches (10-50 documents)
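
Splitting a document list into smaller batches needs only a few lines. The chunked helper below is a hypothetical utility, not part of the agent.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each with at most size elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


batches = list(chunked(list(range(5)), 2))
```

Each batch can then be passed to extract_pico_batch() with a progress_callback, and get_extraction_stats() checked between batches to catch a falling success rate early.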

Future Enhancements

Planned features:

Parallel Processing: Process documents concurrently
Queue Integration: Submit PICO tasks to AgentOrchestrator
Incremental Export: Stream results to CSV/JSON during processing
PICO Validation: Cross-check extracted components against metadata
Enhanced Confidence: Machine learning model for confidence calibration
Multi-Model Extraction: Use ensemble of models for improved accuracy
Interactive Refinement: Allow users to refine extractions iteratively
