We have implemented multiple layers of defenses to mitigate prompt injection attacks in our LLM pipelines:
- Input Validation: All user inputs are sanitized before being fed to any language model.
- Context Limiting: We restrict the amount of context and system prompts to prevent malicious overrides.
- Guardrails Enforcement: The system uses predefined guardrails to reject prompts containing forbidden instructions or attempts to manipulate model behavior.
- Minimal Data Storage: Only necessary user inputs are temporarily stored for processing; no sensitive data is logged persistently.
- Anonymization: User identifiers and sensitive information are anonymized wherever feasible.
- Secure Transmission: All data exchanged with the system is encrypted via HTTPS.
We use pip-audit for continuous dependency scanning. Critical CVEs in Python packages fail the CI pipeline, ensuring that insecure dependencies are not deployed.
Guardrails are implemented to enforce responsible AI usage:
- Content Filtering: Prompts that may generate harmful, illegal, or unsafe content are blocked.
- Role-Based Responses: The system ensures outputs are aligned with the intended purpose (e.g., energy insights) and do not hallucinate unrelated content.
- Audit Logging: Certain actions and rejected prompts are logged for review and compliance purposes.
- Continuous Monitoring: We periodically update rules and models to reflect best practices in AI safety.
If you discover a vulnerability or issue, please contact the project maintainers securely.