I'm an ML & AI engineer building production-grade AI systems at the intersection of biology and computer science. I specialize in turning complex requirements into reliable, deployed ML workflows, from RAG pipelines and agentic systems to computer vision and MLOps infrastructure.
I currently ship AI tools used in clinical and enterprise environments and contribute to open source in my free time.
- RAG & Document Intelligence — Building retrieval-augmented pipelines for compliance and clinical document analysis using LangChain, Pinecone, OpenAI, and AWS
- Agentic AI Systems — Designing multi-agent workflows with LangChain and LangGraph that integrate voice, image, and document inputs
- Data Engineering & MLOps — End-to-end pipelines from SharePoint/SQL ingestion to Power BI dashboards, with anomaly detection and automated reporting
- Bioinformatics & Computational Research — PET neuroimaging analysis, DNA methylation, RNA-seq, and published ML research in neuroscience
AI/ML
Cloud & MLOps
Data & Databases
Languages
- 🎓 M.S. Bioinformatics — Johns Hopkins University (GPA: 3.8 | 2023–2025)
- 🎓 B.S. Computer Science & Biology — UC Irvine (GPA: 3.7 | 2019–2023)
- ☁️ AWS Certified AI Practitioner (AIF-C01)
- 📊 Data Science Certification — HarvardX
- 🤖 Machine Learning Specialization — Stanford
Bai L., Sarkar R., Lee F., Wu J.C., Vawter M.P. Exploratory Analysis of Sleep Deprivation Effects on Gene Expression and Regional Brain Metabolism. Complex Psychiatry, 2025 · doi.org/10.1159/000545461
| Project | Description |
|---|---|
| Agentic-ResearchPaperChatbot | Agentic AI system combining automated document parsing with a conversational chatbot powered by Amazon Bedrock |
| document-intelligence-project | RAG-based Q&A system that accepts questions through a Streamlit interface and retrieves answers from documents using LangChain and Pinecone |
| mlops-experiment-tracker | Lightweight, dependency-free experiment tracking tool for Python — a minimal alternative to MLflow for fast iteration |
| batch_record_mvp | Suite of MVPs for automated batch record analysis in clinical/regulated data environments |
| sentinelScan | CNN-based crack detection systems (PyTorch/U-Net) with FastAPI inference pipelines and damage heatmap generation |
Open to collaborations.


