Ramanuj-Sarkar/README.md

Hello!

I'm an ML & AI Engineer building production-grade systems at the intersection of biology and computer science, with a focus on artificial intelligence. I specialize in turning complex requirements into reliable, deployed ML workflows, from RAG pipelines and agentic systems to computer vision and MLOps infrastructure.

I currently ship AI tools used in real clinical and enterprise environments, and I contribute to open source in my free time.


What I've Worked On

  • RAG & Document Intelligence — Building retrieval-augmented pipelines for compliance and clinical document analysis using LangChain, Pinecone, OpenAI, and AWS
  • Agentic AI Systems — Designing multi-agent workflows with LangChain and LangGraph that integrate voice, image, and document inputs
  • Data Engineering & MLOps — End-to-end pipelines from SharePoint/SQL ingestion to Power BI dashboards, with anomaly detection and automated reporting
  • Bioinformatics & Computational Research — PET neuroimaging analysis, DNA methylation, RNA-seq, and published ML research in neuroscience

Tech Stack

AI/ML

PyTorch TensorFlow Scikit-learn LangChain LangGraph OpenAI Hugging Face

Cloud & MLOps

AWS GCP Azure FastAPI Streamlit Docker CI/CD MLflow

Data & Databases

NumPy Pandas OpenCV PaddlePaddle Pinecone MySQL Azure SQL Power BI Tableau

Languages

Python SQL R Bash HTML/CSS JavaScript

Java C C++ Racket Rust


Education & Credentials

  • 🎓 M.S. Bioinformatics — Johns Hopkins University (GPA: 3.8 | 2023–2025)
  • 🎓 B.S. Computer Science & Biology — UC Irvine (GPA: 3.7 | 2019–2023)
  • ☁️ AWS Certified AI Practitioner (AIF-C01)
  • 📊 Data Science Certification — HarvardX
  • 🤖 Machine Learning Specialization — Stanford

Publication

Bai L., Sarkar R., Lee F., Wu J.C., Vawter M.P. Exploratory Analysis of Sleep Deprivation Effects on Gene Expression and Regional Brain Metabolism. Complex Psychiatry, 2025 · doi.org/10.1159/000545461


Featured Projects

| Project | Description |
| --- | --- |
| Agentic-ResearchPaperChatbot | Agentic AI system combining automated document parsing with a conversational chatbot powered by Amazon Bedrock |
| document-intelligence-project | RAG-based Q&A system that takes questions through Streamlit and retrieves answers from documents using LangChain and Pinecone |
| mlops-experiment-tracker | Lightweight, dependency-free experiment tracking tool for Python; a minimal alternative to MLflow for fast iteration |
| batch_record_mvp | Suite of MVPs for automated batch record analysis in clinical/regulated data environments |
| sentinelScan | CNN-based crack detection system (PyTorch/U-Net) with a FastAPI inference pipeline and damage heatmap generation |

Let's Connect!

LinkedIn Email GitHub


Open to collaborations.

Pinned

  1. Agentic-ResearchPaperChatbot (Public)

    This repository combines automated document parsing with a conversational AI agent powered by Amazon Bedrock.

    Python

  2. document-intelligence-project (Public)

    This document intelligence implementation takes questions through Streamlit and analyzes relevant documents with LangChain and Pinecone to deliver answers.

    Python

  3. mlops-experiment-tracker (Public)

    Prototype for a lightweight, dependency-free alternative to MLflow for quick experiment tracking in Python.

    Python

  4. batch_record_mvp (Public)

    A collection of MVPs to analyze batch records.

    Python

  5. sentinelScan (Public)

    CNN-based computer vision system using PyTorch (U-Net) to detect structural cracks in inspection images. It generates damage heatmaps and severity scores and includes a deployable inference pipeline.

    Python