This guide explains how to use BMLibrarian's Reporting Agent to create professional medical publication-style reports from extracted citations.
The Reporting Agent takes citations extracted by the Citation Finder and synthesizes them into cohesive, evidence-based reports formatted like peer-reviewed medical publications. It automatically creates proper reference lists, assesses evidence strength, and writes in professional medical style.
Key Benefits:
- ✅ Professional Format: Medical publication-style writing with proper citations
- ✅ Vancouver References: Industry-standard reference formatting
- ✅ Evidence Assessment: Automatic evaluation of evidence strength
- ✅ Quality Control: Validates citations and detects issues
- ✅ Complete Reports: Includes methodology notes and metadata
Citations → Validation → Reference Creation → Evidence Assessment → LLM Synthesis → Formatted Report
- Input: Takes citations from Citation Finder Agent
- Validation: Checks citation quality and completeness
- References: Creates numbered reference list with Vancouver formatting
- Assessment: Evaluates evidence strength (Strong/Moderate/Limited/Insufficient)
- Synthesis: Uses AI to write cohesive medical publication-style content
- Output: Generates complete formatted report with references
First, extract citations using the Citation Finder (see Citation Guide):
from bmlibrarian.agents import CitationFinderAgent, ReportingAgent
# Extract citations from your research question
citation_agent = CitationFinderAgent()
citations = citation_agent.process_scored_documents_for_citations(
user_question="What are the cardiovascular benefits of exercise?",
scored_documents=scored_docs,
score_threshold=2.5
)
print(f"Found {len(citations)} citations for report generation")Create a professional medical report from the citations:
# Initialize reporting agent
reporting_agent = ReportingAgent()
# Generate formatted report
report = reporting_agent.generate_citation_based_report(
user_question="What are the cardiovascular benefits of exercise?",
citations=citations,
format_output=True
)
if report:
print(report)
else:
print("❌ Could not generate report - check citations and connection")The generated report includes:
- Research Question: Your original question
- Evidence Strength: Assessment of evidence quality
- Synthesized Answer: Professional medical writing with numbered citations [1], [2], [3]
- References: Vancouver-style reference list
- Methodology: Description of synthesis approach
- Metadata: Generation details and statistics
Research Question: What are the cardiovascular benefits of exercise?
================================================================================
Evidence Strength: Moderate
Regular aerobic exercise provides significant cardiovascular benefits, including
improved coronary blood flow and reduced risk of major adverse cardiovascular
events [1]. Studies demonstrate exercise reduces systolic blood pressure by
4-9 mmHg and improves lipid profiles with HDL cholesterol increases of 5-10% [2].
Meta-analyses indicate 30 minutes of moderate exercise 5 days per week reduces
cardiovascular mortality by approximately 30% [3].
REFERENCES
--------------------
1. Smith, J., Johnson, A., Brown, K.. Exercise and Coronary Blood Flow Study. 2023; PMID: 37123456
2. Davis, M., Wilson, R., Garcia, L.. Blood Pressure Response to Exercise Training. 2023; PMID: 37234567
3. Miller, P., Anderson, S., Taylor, D., et al.. Exercise and Cardiovascular Mortality Meta-Analysis. 2023; PMID: 37345678
METHODOLOGY
--------------------
Synthesis based on 3 high-quality citations from peer-reviewed studies with
relevance scores ≥0.80. Evidence strength assessed considering citation quantity,
source diversity, and methodological rigor.
REPORT METADATA
--------------------
Generated: 2023-06-15 14:25:30 UTC
Citations analyzed: 3
Unique references: 3
Evidence strength: Moderate
Strong Evidence:
- ≥5 citations from ≥3 different sources
- High relevance scores (≥0.85 average)
- Comprehensive coverage of the research question
Moderate Evidence:
- 3-4 citations from ≥2 different sources
- Good relevance scores (≥0.75 average)
- Adequate coverage with minor gaps
Limited Evidence:
- 2-3 citations with decent relevance (≥0.70 average)
- May have significant evidence gaps
- Conclusions should be interpreted cautiously
Insufficient Evidence:
- <2 citations or very low relevance scores
- Report generation may fail or produce low-quality output
Control report generation standards:
# Generate report with custom minimum citation requirement
report = reporting_agent.synthesize_report(
user_question=question,
citations=citations,
min_citations=3 # Require at least 3 citations (default: 2)
)Choose appropriate AI model for synthesis:
# For high-quality medical writing (default)
reporting_agent = ReportingAgent(model="gpt-oss:20b")
# For faster processing
reporting_agent = ReportingAgent(model="medgemma4B_it_q8:latest")Choose between formatted and unformatted output:
# Full formatted report with references and metadata
formatted_report = reporting_agent.generate_citation_based_report(
user_question=question,
citations=citations,
format_output=True # Default
)
# Just the synthesized content without formatting
synthesis_only = reporting_agent.generate_citation_based_report(
user_question=question,
citations=citations,
format_output=False
)Always validate citations before report generation:
# Check citation quality
valid_citations, errors = reporting_agent.validate_citations(citations)
if errors:
print("⚠️ Citation Issues Found:")
for error in errors:
print(f" - {error}")
if len(valid_citations) >= 2:
print(f"✅ {len(valid_citations)} valid citations ready for reporting")
# Proceed with report generation
else:
print("❌ Insufficient valid citations for report generation")Check evidence strength before interpreting results:
# Assess evidence strength
strength = reporting_agent.assess_evidence_strength(citations)
print(f"Evidence Strength: {strength}")
# Adjust interpretation based on strength
if strength == "Strong":
print("✅ High confidence in findings")
elif strength == "Moderate":
print("⚠️ Moderate confidence - consider additional sources")
elif strength == "Limited":
print("⚠️ Limited evidence - interpret cautiously")
else:
print("❌ Insufficient evidence - additional research needed")For critical research, review key references:
# Create and review references
references = reporting_agent.create_references(citations)
print("📋 References for Manual Review:")
for ref in references:
vancouver_formatted = ref.format_vancouver_style()
print(f"{ref.number}. {vancouver_formatted}")
print(f" Document ID: {ref.document_id}")
if ref.pmid:
print(f" PubMed: https://pubmed.ncbi.nlm.nih.gov/{ref.pmid}")
print()question = "What are the safety concerns with long-term metformin use?"
# Get high-quality citations focused on safety
citations = citation_agent.process_scored_documents_for_citations(
user_question=question,
scored_documents=safety_focused_docs,
score_threshold=3.0, # High threshold for safety data
min_relevance=0.8 # High relevance requirement
)
# Validate citations
valid_citations, errors = reporting_agent.validate_citations(citations)
if len(valid_citations) >= 3:
# Generate comprehensive safety report
report = reporting_agent.generate_citation_based_report(
user_question=question,
citations=valid_citations,
format_output=True
)
# Save report
with open('metformin_safety_report.txt', 'w') as f:
f.write(report)
print("✅ Safety report generated and saved")
else:
print("❌ Insufficient evidence for comprehensive safety assessment")question = "How effective is cognitive behavioral therapy for depression?"
# Process citations with moderate thresholds for broad coverage
citations = citation_agent.process_scored_documents_for_citations(
user_question=question,
scored_documents=treatment_docs,
score_threshold=2.5,
min_relevance=0.75
)
# Generate report with custom synthesis
report_obj = reporting_agent.synthesize_report(
user_question=question,
citations=citations,
min_citations=4 # Require substantial evidence
)
if report_obj:
print(f"Evidence Assessment: {report_obj.evidence_strength}")
print(f"Based on {report_obj.citation_count} citations from {report_obj.unique_documents} studies")
# Format for display
formatted = reporting_agent.format_report_output(report_obj)
print(formatted)
# Extract just the synthesis for other uses
clinical_summary = report_obj.synthesized_answer
print(f"\nClinical Summary:\n{clinical_summary}")questions = [
"What is the effectiveness of aspirin for cardiovascular prevention?",
"What are the bleeding risks associated with aspirin use?",
"How do benefits and risks of aspirin compare?"
]
reports = []
for question in questions:
# Get relevant citations for each question
citations = get_citations_for_question(question) # Your citation extraction
# Generate focused report
report = reporting_agent.generate_citation_based_report(
user_question=question,
citations=citations,
format_output=False # Just get synthesis content
)
if report:
reports.append({
'question': question,
'content': report,
'citations': len(citations)
})
# Combine into comparative analysis
print("Aspirin Risk-Benefit Analysis")
print("=" * 50)
for report in reports:
print(f"\n{report['question']}")
print("-" * 40)
print(report['content'])
print(f"(Based on {report['citations']} citations)")Start with high-quality citations for better reports:
# Before generating reports, check citation statistics
from bmlibrarian.agents.citation_agent import CitationFinderAgent
citation_stats = citation_agent.get_citation_stats(citations)
print(f"📊 Citation Quality Check:")
print(f" Average relevance: {citation_stats['average_relevance']:.3f}")
print(f" Unique sources: {citation_stats['unique_documents']}")
print(f" Total citations: {citation_stats['total_citations']}")
# Recommend thresholds based on stats
if citation_stats['average_relevance'] < 0.75:
print("⚠️ Consider using higher relevance threshold in citation extraction")
if citation_stats['unique_documents'] < 3:
print("⚠️ Limited source diversity - consider broader search terms")Select appropriate models for different scenarios:
# For high-stakes medical reports (slower but higher quality)
clinical_reporting_agent = ReportingAgent(model="gpt-oss:20b")
# For rapid literature screening (faster but adequate quality)
screening_reporting_agent = ReportingAgent(model="medgemma4B_it_q8:latest")
# For development and testing
test_reporting_agent = ReportingAgent(model="test-model")For important decisions, manually verify key claims:
report = reporting_agent.generate_citation_based_report(question, citations)
if report and "clinical trial" in report.lower():
print("🔍 Clinical trial data detected - recommend manual verification")
# Extract cited documents for review
references = reporting_agent.create_references(citations)
high_impact_refs = [ref for ref in references if ref.pmid]
print("High-priority references for verification:")
for ref in high_impact_refs:
print(f" PMID {ref.pmid}: {ref.title}")Manage cases with limited evidence appropriately:
report_obj = reporting_agent.synthesize_report(question, citations)
if not report_obj:
print("❌ Insufficient evidence for report generation")
print("Recommendations:")
print(" - Lower score thresholds in citation extraction")
print(" - Broaden search terms")
print(" - Consider splitting complex questions")
print(" - Check database coverage for your topic")
elif report_obj.evidence_strength == "Limited":
print(f"⚠️ Limited evidence report generated")
print(f" Based on only {report_obj.citation_count} citations")
print(f" Consider as preliminary findings only")
elif report_obj.evidence_strength == "Insufficient":
print(f"⚠️ Very low evidence quality")
print(f" Report may not be reliable for decision-making")"No report generated"
- Check if you have at least 2 valid citations
- Verify Ollama service is running:
curl http://localhost:11434/api/tags - Run citation validation:
validate_citations(citations) - Try lowering
min_citationsparameter
"Poor quality synthesis"
- Improve citation quality (higher relevance scores)
- Use more citations (aim for ≥3)
- Check citation passage content for relevance
- Consider using a higher-quality model
"References look wrong"
- Check original citation author lists and titles
- Verify publication dates are properly formatted
- Ensure document IDs are from database
- Test Vancouver formatting:
reference.format_vancouver_style()
"Evidence strength seems incorrect"
- Review citation relevance scores
- Check number of unique source documents
- Verify citation count meets thresholds
- Consider evidence assessment criteria
"Report generation is slow"
- Use faster model:
medgemma4B_it_q8:latest - Reduce citation count if possible
- Check Ollama model is downloaded locally
- Monitor system resources during generation
"Running out of memory"
- Process smaller batches of citations
- Use lighter AI models
- Check citation content length
- Restart Ollama service if needed
"Cannot connect to Ollama"
# Check Ollama status
curl http://localhost:11434/api/tags
# If not running, start Ollama
ollama serve
# Check available models
ollama list
# Download required model if missing
ollama pull gpt-oss:20b"Model not found"
# Test connection and model availability
if reporting_agent.test_connection():
print("✅ Ollama connected")
else:
print("❌ Ollama connection failed")
print(" Check: ollama serve")
print(" Check: ollama pull gpt-oss:20b")# Get structured report object for custom processing
report_obj = reporting_agent.synthesize_report(question, citations)
if report_obj:
# Extract components for custom formatting
synthesis = report_obj.synthesized_answer
references = report_obj.references
evidence = report_obj.evidence_strength
# Custom output format
print(f"CLINICAL SUMMARY")
print(f"Question: {report_obj.user_question}")
print(f"Evidence Level: {evidence}")
print(f"Summary: {synthesis}")
# Export references in custom format
print(f"\nSOURCES ({len(references)} references):")
for ref in references:
print(f"[{ref.number}] {ref.title} ({ref.publication_date[:4]})")
if ref.pmid:
print(f" PubMed: {ref.pmid}")# Export report data for external systems
report_data = {
'question': report_obj.user_question,
'synthesis': report_obj.synthesized_answer,
'evidence_strength': report_obj.evidence_strength,
'generation_date': report_obj.created_at.isoformat(),
'references': [
{
'number': ref.number,
'title': ref.title,
'authors': ref.authors,
'date': ref.publication_date,
'pmid': ref.pmid,
'vancouver': ref.format_vancouver_style()
}
for ref in report_obj.references
]
}
# Save as JSON
import json
with open('report_export.json', 'w') as f:
json.dump(report_data, f, indent=2)# Generate multiple reports efficiently
questions_and_citations = [
("Question 1", citations_1),
("Question 2", citations_2),
("Question 3", citations_3)
]
reports = []
for question, citations in questions_and_citations:
print(f"Generating report for: {question[:50]}...")
report = reporting_agent.generate_citation_based_report(
user_question=question,
citations=citations,
format_output=False
)
if report:
reports.append({
'question': question,
'content': report,
'timestamp': datetime.now().isoformat()
})
print(f"Generated {len(reports)} reports")For additional support:
- Test Connection: Use
reporting_agent.test_connection()to verify Ollama - Validate Citations: Run
validate_citations()to check input quality - Check Logs: Review error messages for specific issues
- Try Demo: Run
python examples/reporting_demo.pyfor examples - Review Documentation: See developer docs for technical details
The Reporting Agent transforms citations into professional medical publication-style reports, providing the evidence-based synthesis needed for informed decision-making in biomedical research and clinical practice.