Generate CanvasXpress visualizations from natural language descriptions using Large Language Models (LLMs).
A backend service that lets users create scientific visualizations by describing them in plain English. It is powered by LLMs and Retrieval-Augmented Generation (RAG).
- Natural Language Interface: Describe visualizations in plain English
- Multi-LLM Support: Works with OpenAI GPT-4o, Google Gemini, AWS Bedrock, and Ollama
- RAG-Enhanced Generation: Uses vector similarity search with BGE-M3 embeddings
- High Accuracy: Achieves >97% accuracy through engineered prompts and few-shot examples
- Backend Service Architecture: Designed as a service for CanvasXpress integration
- Docker
- Make (for using Makefile commands)
- Python 3.9+ Docker image (used by the containerized system)
- Git
Production Setup:
git clone https://github.com/buddyroo30/canvasxpress_gen.git
cd canvasxpress_gen
make build
make build_schema_context # Generate schema information
make build_vector_db # Create vector database for RAG
make run # Run as daemon (or 'make runi' for interactive) on port 5008
Development Setup:
git clone https://github.com/buddyroo30/canvasxpress_gen.git
cd canvasxpress_gen
# Build and run development environment
make build_dev
make build_schema_context_dev
make build_vector_db_dev
make run_dev # Runs on port 5009
Fresh Environment Setup (for testing without cached dependencies):
git clone https://github.com/buddyroo30/canvasxpress_gen.git
cd canvasxpress_gen
make buildfresh # Build without using Docker cache
make build_schema_context # Generate schema information
make build_vector_db # Create vector database for RAG
make run # Run as daemon on port 5008
Configure LLM API access and system behavior by setting environment variables. The complete list of available variables follows:
LLM API Configuration:
export OPENAI_API_TYPE="azure" # OpenAI API type (azure or openai)
export OPENAI_API_KEY="your-openai-key" # OpenAI API key
export AZURE_OPENAI_API_KEY="your-azure-key" # Azure OpenAI API key
export AZURE_OPENAI_ENDPOINT="your-endpoint" # Azure OpenAI endpoint URL
export OPENAI_API_BASE="your-base-url" # OpenAI API base URL
export OPENAI_API_VERSION="2023-05-15" # OpenAI API version
export AZURE_OPENAI_API_VERSION="2024-02-01" # Azure OpenAI API version
export GOOGLE_API_KEY="your-google-key" # For Google Gemini models
SiteMinder Authentication (Corporate Environments):
export SMVAL="True" # Enable SiteMinder validation
export SMLOGIN="your-login-url" # SiteMinder login URL
export SMTARGET="your-target-url" # SiteMinder target URL
export SMFAILREGEX=".*<html.*AUTHENTICATION.*" # Login failure regex pattern
export SMFETCHFAILREGEX=".*<title>BMS.*" # Fetch failure regex pattern
System Configuration:
export DEV="True" # Enable development mode
export NUM_FEW_SHOTS=25 # Number of RAG examples to retrieve
export PORT=5000 # Server port (cx_llm_service only)
export SERVICE_URL="your-service-url" # Service URL (cx_llm_service only)
- Open browser to http://localhost:5008 (or your domain if deployed online)
- Upload a CSV/TSV data file with headers
- Describe your visualization in plain English
Example with automotive data:
- "Box plot of cty grouped by manufacturer"
- "Scatter plot of hwy vs cty colored by drv"
- "Area graph of hwy with title 'Highway MPG Distribution'"
Note: This web interface is a quick and easy way to see the system in action and confirm it's working. The production interface is integrated directly into CanvasXpress.
Configure CanvasXpress to use your running service:
// In your CanvasXpress configuration
var config = {
  // Your existing CanvasXpress configuration
  graphType: "Bar",
  title: "My Visualization",
  // Add LLM service configuration
  llmServiceURL: "http://localhost:5008/ask" // or your domain: "https://your-domain.com:5008/ask"
};
var cx = new CanvasXpress("canvasId", data, config);
curl -X POST http://localhost:5008/ask \
  -F "prompt=Create a scatter plot of hwy vs cty colored by manufacturer" \
  -F "datafile_contents=[[\"manufacturer\",\"hwy\",\"cty\"],[\"toyota\",35,28],[\"ford\",30,25]]"
- Public: Use the publicly available CanvasXpress instance at canvasxpress.org
- Private: Run your own instance for data security within corporate networks
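The curl request above passes the data file as a JSON array of rows with the header row first. The following is a minimal sketch of building those two form fields from CSV or TSV text in Python. The function names (`csv_to_rows`, `build_ask_fields`) are hypothetical and not part of this project, and the numeric coercion is an assumption inferred from the unquoted numbers in the curl example.

```python
import csv
import io
import json
import math


def _coerce(cell):
    """Convert numeric-looking strings to int or float; leave other text unchanged."""
    try:
        return int(cell)
    except ValueError:
        pass
    try:
        value = float(cell)
    except ValueError:
        return cell
    # Keep "nan"/"inf" as text so the result stays valid JSON.
    return value if math.isfinite(value) else cell


def csv_to_rows(csv_text, delimiter=","):
    """Parse CSV/TSV text into a list of rows, header row first."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    rows = [row for row in reader if row]  # skip blank lines
    if not rows:
        raise ValueError("data file is empty")
    header, body = rows[0], rows[1:]
    return [header] + [[_coerce(cell) for cell in row] for row in body]


def build_ask_fields(prompt, csv_text, delimiter=","):
    """Return the two form fields used by the curl example (hypothetical helper)."""
    return {
        "prompt": prompt,
        "datafile_contents": json.dumps(csv_to_rows(csv_text, delimiter)),
    }


if __name__ == "__main__":
    sample = "manufacturer,hwy,cty\ntoyota,35,28\nford,30,25\n"
    fields = build_ask_fields(
        "Create a scatter plot of hwy vs cty colored by manufacturer", sample
    )
    print(fields["datafile_contents"])
```

The resulting dictionary can be sent as form fields to the `/ask` endpoint with any HTTP client; the sketch stops short of the network call so it can run without a live service.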
This system is designed as a backend service for CanvasXpress, not as a traditional Python library. Users interact with the system through:
- CanvasXpress Integration: Primary intended usage via CanvasXpress UI
- Direct API Calls: For custom integrations
- Development Interface: For testing and verification
- LLM Integration: Multiple LLM providers through unified interface
- RAG System: Vector database (Milvus) with BGE-M3 embeddings for semantic search
- Guided Autocomplete: Automatic synthetic example generation (part of main CanvasXpress library)
- Modular Architecture: Professional Python package structure
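To illustrate the retrieval step of the RAG system listed above, here is a toy sketch of picking the most similar few-shot examples for a query and placing them in a prompt. It is not the project's implementation: the real service uses BGE-M3 embeddings stored in Milvus, whereas this sketch uses hand-written 3-dimensional vectors, plain Python lists, and cosine similarity as an assumed metric. All function names are hypothetical; `k` plays the role of `NUM_FEW_SHOTS`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_examples(query_vec, examples, k):
    """Return the texts of the k examples whose vectors are most similar to query_vec.

    examples is a list of (text, vector) pairs.
    """
    ranked = sorted(
        examples,
        key=lambda ex: cosine_similarity(query_vec, ex[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, few_shots):
    """Join the retrieved examples and the user's question into one prompt string."""
    shots = "\n".join(f"Example: {shot}" for shot in few_shots)
    return f"{shots}\nQuestion: {question}"


if __name__ == "__main__":
    # Toy vectors standing in for real embeddings (assumption for illustration).
    examples = [
        ("Box plot of cty grouped by manufacturer", [1.0, 0.0, 0.0]),
        ("Scatter plot of hwy vs cty colored by drv", [0.0, 1.0, 0.0]),
        ("Area graph of hwy", [0.0, 0.0, 1.0]),
    ]
    shots = top_k_examples([0.9, 0.1, 0.0], examples, k=2)
    print(build_prompt("Box plot of hwy grouped by drv", shots))
```

In a real deployment the vector database performs the similarity search, so the sorting step here only stands in for that query.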
Create a .env file with the variables you need (common ones are shown below; the full list appears above):
# LLM API Keys (choose what you need)
GOOGLE_API_KEY=your_google_api_key_here
AZURE_OPENAI_API_KEY=your_azure_openai_key
AZURE_OPENAI_ENDPOINT=your_azure_endpoint
# RAG Configuration
NUM_FEW_SHOTS=25 # Number of examples to retrieve
# Optional: SiteMinder SSO for enterprise
SMVAL=False # Set to True for enterprise SSO
Edit llm_models.json and copy to ~/.cache/:
{
  "gemini-1.5-flash": {
    "type": "google_gemini",
    "provider": "google"
  },
  "gpt-4o": {
    "type": "openai",
    "provider": "openai"
  }
}
The system includes 85+ comprehensive automated tests covering all components:
# Run all tests (uses real APIs if configured, mocks otherwise)
python -m pytest
# Run with coverage
python -m pytest --cov=src/canvasxpress_gen --cov-report=term-missing
# Integration tests only
python -m pytest tests/test_integration.py -v
Test Features:
- Real API integration when keys available
- Graceful fallback to mocks when unavailable
- End-to-end RAG workflow validation
- Works in fresh environments
For detailed testing instructions, see TESTING.md.
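The fallback from real APIs to mocks mentioned above can be pictured with a small sketch. This is an assumption about the general pattern, not the project's actual test fixtures: the class names, the `make_client` helper, and the canned reply are all hypothetical, and only the `GOOGLE_API_KEY` variable name comes from this README.

```python
import os


class MockLLMClient:
    """Stand-in used when no API key is configured; returns a canned reply."""

    def generate(self, prompt):
        return '{"graphType": "Bar"}'


class RealLLMClient:
    """Placeholder for a client that would call a real provider (hypothetical)."""

    def __init__(self, api_key):
        self.api_key = api_key

    def generate(self, prompt):
        raise NotImplementedError("wire up a real provider SDK here")


def make_client(env=None):
    """Pick the real client when GOOGLE_API_KEY is set and non-empty, else the mock."""
    env = os.environ if env is None else env
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if api_key:
        return RealLLMClient(api_key)
    return MockLLMClient()


if __name__ == "__main__":
    client = make_client({})  # empty environment, so the mock is chosen
    print(type(client).__name__, client.generate("Box plot of cty"))
```

Passing the environment as a parameter keeps the selection logic testable without changing the real process environment.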
The codebase follows Python packaging standards:
src/canvasxpress_gen/
├── llm/ # LLM service and model management
├── rag/ # RAG system with embeddings and retrieval
└── utils/ # JSON, text, file, and auth utilities
- API Documentation: Complete API reference for service endpoints
- Integration Guide: CanvasXpress integration details
- Testing Guide: Comprehensive testing instructions
- Contributing Guide: Development and contribution guidelines
We welcome contributions! See CONTRIBUTING.md for detailed guidelines on:
- Development setup and workflow
- Code standards and testing requirements
- Submission process
MIT License - see LICENSE file.
If you use this software in research, please cite:
@article{smith2024canvasxpress,
title={Generating Visualizations Conversationally using Guided Autocomplete and LLMs},
author={Smith, Andrew K and Neuhaus, Isaac},
year={2024}
}
- Issues: Report bugs and request features via GitHub Issues
- Documentation: Check this README and linked guides
- Questions: Contact maintainers or open a discussion
Ready to integrate natural language visualization generation?