Senior AI Architect / Senior LLM Engineer building production-grade agentic LLM systems, RAG platforms, AI market-intelligence pipelines, and multi-agent automation systems.
I turn real-world business and data problems into reliable AI products, covering routing, retrieval, context assembly, structured outputs, eval loops, observability, and production deployment. My work sits at the intersection of LLM platform engineering, multi-agent orchestration, AI safety, and market/decision intelligence systems.
- 🧠 5+ years building AI, ML, and production LLM systems
- 🏢 Senior LLM Engineer @ AIR Property, Dubai
- 🤖 Former AI Engineer, Agentic Systems @ PropTy Global
- 🏗️ Former Chief AI Officer & Multi-Agent Architect @ Novel Mind Scientist
- 🔬 Research Intern, Safe & Generative AI @ Mathis Lab, EPFL
- 🎓 B.Sc. in Computer Science, Shahid Beheshti University
📄 See my full CV
I design AI systems that are not just chatbots, but controlled intelligence pipelines with clear routing, evidence handling, structured outputs, and measurable quality.
- Agentic AI Systems: Multi-agent architectures with task routing, memory, tool orchestration, handoff logic, fallback strategies, and production guardrails.
- RAG & Retrieval Platforms: Retrieval pipelines with vector databases, hybrid search, reranking, context assembly, caching, evaluation, and governance.
- AI Market Intelligence Systems: Pipelines for collecting, structuring, and reasoning over external data sources such as market data, financial signals, crypto/on-chain data, news, and domain-specific knowledge.
- LLM Evaluation & Observability: Evals, tracing, prompt/context/output logging, quality metrics, regression tests, dashboards, and feedback loops tied to product KPIs.
- AI Safety & Security: Red-team evaluation, prompt/tool governance, fallback policies, robust evaluation, and research-to-production security practices.
- Production RAG + agent services
- Routing-first LLM architectures instead of one universal agent
- Structured-output pipelines for reliable automation
- AI observability with tracing, logging, evals, and debugging workflows
- Market-intelligence and decision-support systems powered by LLMs
- Research-to-production transfer in AI safety and model security
- Productized reusable multi-agent RAG platforms with standardized prompts, tools, memory, evaluations, and guardrails
- Built production agentic workflows for autonomous recommendations and business decision support
- Achieved 85%+ end-to-end task completion in real-world business workflows
- Designed SLO-driven AI services with tracing, dashboards, fallback trees, and safety metrics
- Improved reliability under peak load using vector caching, retrieval tuning, and inference optimization
- Built eval/feedback loops connected to live KPIs for continuous improvement and drift monitoring
- Co-authored AI safety research accepted at ICCV 2025 and NeurIPS 2024
Aug 2025 – Apr 2026
- Architected and shipped production LLM services across RAG, agents, and decision-support workflows
- Owned model selection, prompt/agent design, retrieval strategy, evaluation harnesses, and fallback trees
- Built governed AI toolchains with tool registries, policies, approvals, and reusable architecture patterns
- Implemented reliability workflows with SLOs, error budgets, monitoring, tracing, and safety metrics
- Optimized retrieval and inference through data curation, vector caching, routing improvements, and capacity planning
Aug 2024 – Sep 2025
- Architected multi-agent systems using LangChain, custom orchestration, and RAG pipelines
- Built context-aware planning with agent-to-agent protocols, memory, dynamic routing, and escalation paths
- Productionized AI services with Docker, Kubernetes, Prometheus/Grafana, and centralized logging
- Closed the loop with automated evaluation and feedback tied to live business KPIs
Oct 2022 – Sep 2025
- Led delivery of LLM-powered agents across SaaS, healthcare, and education domains
- Designed multi-agent automation pipelines using LangChain, Celery, APIs, and knowledge systems
- Established engineering standards including design docs, ADRs, evaluation protocols, onboarding guides, and review processes
- Integrated text, vision, and knowledge-graph components into reliable AI services
May 2025 – Sep 2025
- Co-authored ICCV 2025 accepted work on data-free diffusion-based trigger inversion for Trojaned models
- Built latent-diffusion pipelines with classifier-guided feedback for exposing adversarial vulnerabilities
- Developed zero-shot and data-free defense methods with large-scale benchmarking and robust evaluation practices
Sharif University & Shahid Beheshti University | 2023 – 2025
- Prototyped secure agentic ML pipelines with RAG, routing, memory management, and evaluation loops
- Worked on AI model security, agent evaluation, and optimization research
- Contributed to publication-driven research and mentored junior engineers
Mar 2023 – Feb 2024
- Delivered a Django-based agentic recommender system with hybrid search and automated workflows
- Led Agile delivery, CI/CD, and platform iteration for production AI features
- Agentic AI
- Multi-Agent Systems
- RAG
- Routing & Context Assembly
- Structured Outputs
- Prompt Engineering
- LLM Evaluation
- Safety Guardrails
- Memory Systems
- Tool Orchestration
- LangChain
- LangGraph
- LlamaIndex
- Eval Harnesses
- Prompt/Context/Output Logging
- Tracing
- Regression Testing
- SLOs & Error Budgets
- Drift Monitoring
- Prometheus
- Grafana
- ELK
- Langfuse-style Observability
- Python
- FastAPI
- Flask
- REST / GraphQL
- Docker
- Kubernetes
- Celery
- Redis
- MLflow
- Airflow
- CI/CD
- PostgreSQL
- MongoDB
- Redis
- Pinecone
- Weaviate
- Chroma
- pgvector
- Neo4j
- AWS
- GCP
- Azure
- Python
- C++
- Java
- C#
- Routing-First Agent Systems: Designing workflows where user intent is routed to specialized pipelines instead of relying on one generic agent for every task.
- Multi-Agent RAG Platforms: Standardized prompts, tools, memory, retrieval, evals, fallback policies, and guardrails for reusable AI services.
- Context Assembly & Structured Outputs: Building controlled pipelines that retrieve, rank, compress, format, and validate context before producing machine-readable outputs.
- Evaluation & Optimization: Prompt A/B testing, routing comparisons, retrieval-quality checks, automated evals, and KPI-connected feedback loops.
- Safety & Governance: Prompt/tool policies, red-team evaluations, fallback strategies, incident-response-ready production patterns, and robust AI security practices.
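
As a rough illustration of the routing-first idea above, a minimal sketch might look like the following. Everything here is a hypothetical stand-in rather than code from any system described in this profile: the keyword-based `classify_intent` takes the place of what would typically be an LLM or trained classifier, and the pipeline names and the `Answer` type are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Answer:
    """Hypothetical structured result returned by every pipeline."""
    route: str
    text: str


def classify_intent(query: str) -> str:
    """Toy keyword router; assumed stand-in for an LLM or classifier."""
    q = query.lower()
    if any(word in q for word in ("price", "market", "token")):
        return "market"
    if any(word in q for word in ("document", "policy", "contract")):
        return "rag"
    return "unknown"


def market_pipeline(query: str) -> Answer:
    # Placeholder for a specialized market-intelligence pipeline.
    return Answer(route="market", text=f"[market analysis] {query}")


def rag_pipeline(query: str) -> Answer:
    # Placeholder for a retrieval-augmented pipeline.
    return Answer(route="rag", text=f"[retrieved answer] {query}")


def fallback_pipeline(query: str) -> Answer:
    # Last node of the fallback tree: no specialized pipeline matched.
    return Answer(route="fallback", text="Could not route this request; escalating.")


PIPELINES: Dict[str, Callable[[str], Answer]] = {
    "market": market_pipeline,
    "rag": rag_pipeline,
}


def route(query: str) -> Answer:
    """Send the query to a specialized pipeline, falling back on any failure."""
    handler = PIPELINES.get(classify_intent(query), fallback_pipeline)
    try:
        return handler(query)
    except Exception:
        return fallback_pipeline(query)
```

Each pipeline stays small and testable on its own, and unknown or failing requests land in an explicit fallback instead of being improvised by one generic agent.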
A reusable platform for building agentic AI services with routing, retrieval, memory, tool orchestration, evaluation, and guardrails.
Core ideas: LangChain/LangGraph-style orchestration, vector search, prompt governance, fallback trees, eval harnesses, and observability.
A decision-intelligence architecture for collecting and analyzing market, news, crypto, and domain-specific signals through specialized LLM pipelines.
Core ideas: external data ingestion, retrieval, scenario-specific pipelines, structured outputs, signal summarization, tracing, and evaluation.
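
To make the structured-output idea concrete, here is a minimal sketch of validating a model's raw JSON before it reaches downstream automation. The `Signal` schema, its field names, and the allowed directions are assumptions made up for this example and do not describe the actual architecture.

```python
import json
from dataclasses import dataclass

# Assumed label set for the illustration only.
ALLOWED_DIRECTIONS = {"bullish", "bearish", "neutral"}


@dataclass(frozen=True)
class Signal:
    """Hypothetical machine-readable summary of one market signal."""
    asset: str
    direction: str
    confidence: float


def parse_signal(raw: str) -> Signal:
    """Parse raw model output; raise ValueError on anything malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    asset = data.get("asset")
    direction = data.get("direction")
    confidence = data.get("confidence")

    if not isinstance(asset, str) or not asset:
        raise ValueError("asset must be a non-empty string")
    if not isinstance(direction, str) or direction not in ALLOWED_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(ALLOWED_DIRECTIONS)}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    return Signal(asset=asset, direction=direction, confidence=float(confidence))
```

A caller can catch `ValueError` to trigger a retry or a fallback path, so malformed output never propagates silently.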
Research and engineering work on Trojan/backdoor detection, robust evaluation, and data-free defenses for safer AI systems.
Core ideas: diffusion-based inversion, adversarial scanning, benchmark evaluation, zero-shot defenses, and production risk translation.
- DISTIL: Data-Free Inversion of Suspicious Trojan Inputs via Latent Diffusion — ICCV 2025 (accepted)
- Scanning Trojaned Models Using Out-of-Distribution Samples — NeurIPS 2024 (accepted)
- Comparison of Pre-Training and Classification Models for Early Detection of Alzheimer’s Disease Using MRI — I4C 2023
- Best Ideator — National Young Scientists Festival (2023)
- Top 0.2% National Entrance Exam — Rank 352 / 150,000 (2020)
B.Sc. in Computer Science
Shahid Beheshti University, Tehran
2021 – 2025
GPA: 3.4 / 4.0
- Persian — Native
- English — Professional
- Email: sepehrrezaee2002@gmail.com
- GitHub: github.com/SepehrRezaee
- LinkedIn: linkedin.com/in/sepehr-rezaee
- Website: sepehrrezaee.com



