SQL-driven analytics pipeline for hospital readmission patterns in diabetic patients. Covers relational data modeling, ETL, analytical queries, and interactive visualization across 101,766 encounters from 71,518 patients.
Dataset: Diabetes 130-US Hospitals (UCI ML Repository) — real-world clinical records, publicly available and de-identified.
CSV clinical data → SQLite relational model → SQL analytical queries → Python integration (Pandas) → Interactive visualization (Plotly)
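As a rough illustration of the first two stages of this flow, the sketch below loads flat CSV rows into a small SQLite schema and queries it back. It is a minimal sketch under stated assumptions, not the project's actual ETL code: the table names `patients` and `encounters`, the helper `load_csv_into_sqlite`, and the sample rows are all hypothetical. The column names (`encounter_id`, `patient_nbr`, `age`, `number_diagnoses`, `A1Cresult`, `readmitted`) follow the public UCI CSV.

```python
import csv
import io
import sqlite3

# Synthetic rows for illustration only; these are NOT real dataset records.
SAMPLE_CSV = """encounter_id,patient_nbr,age,number_diagnoses,A1Cresult,readmitted
1,100,[50-60),5,None,NO
2,100,[50-60),9,>8,<30
3,200,[70-80),9,>7,>30
4,300,[30-40),3,Norm,NO
"""


def load_csv_into_sqlite(csv_text, conn):
    """Split flat encounter rows into a patients table and an encounters table.

    Hypothetical schema: one row per patient, one row per encounter,
    linked by patient_nbr.
    """
    conn.executescript("""
        CREATE TABLE patients (
            patient_nbr INTEGER PRIMARY KEY
        );
        CREATE TABLE encounters (
            encounter_id     INTEGER PRIMARY KEY,
            patient_nbr      INTEGER NOT NULL REFERENCES patients(patient_nbr),
            age              TEXT,
            number_diagnoses INTEGER,
            a1c_result       TEXT,
            readmitted       TEXT
        );
    """)
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # A patient can appear in several encounters, so duplicates are ignored.
    conn.executemany(
        "INSERT OR IGNORE INTO patients (patient_nbr) VALUES (?)",
        [(int(r["patient_nbr"]),) for r in rows],
    )
    conn.executemany(
        "INSERT INTO encounters VALUES (?, ?, ?, ?, ?, ?)",
        [
            (
                int(r["encounter_id"]),
                int(r["patient_nbr"]),
                r["age"],
                int(r["number_diagnoses"]),
                r["A1Cresult"],
                r["readmitted"],
            )
            for r in rows
        ],
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load_csv_into_sqlite(SAMPLE_CSV, conn)
encounter_count, patient_count = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT patient_nbr) FROM encounters"
).fetchone()
print(loaded, encounter_count, patient_count)
```

From here, a query result would typically be handed to Pandas with `pandas.read_sql_query(sql, conn)` and the resulting DataFrame passed to Plotly for charting.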
| Metric | Value |
|---|---|
| Hospital encounters | 101,766 |
| Unique patients | 71,518 |
| Global readmission rate | 46% |
| Low-risk group readmission | 35.52% |
| High-risk group readmission | 50.58% |
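Headline metrics of this kind can be computed with a single aggregate query. The sketch below is illustrative only: the `encounters` table and its rows are hypothetical, and it assumes that any `readmitted` value other than `NO` counts as a readmission, which may or may not match how the figures in the table were defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounters ("
    "encounter_id INTEGER PRIMARY KEY, patient_nbr INTEGER, readmitted TEXT)"
)
# Synthetic rows for illustration only; NOT real dataset records.
conn.executemany(
    "INSERT INTO encounters VALUES (?, ?, ?)",
    [(1, 100, "NO"), (2, 100, "<30"), (3, 200, ">30"), (4, 300, "NO")],
)

# Assumption: '<30' and '>30' both count as readmissions.
# In SQLite a comparison evaluates to 0 or 1, so SUM() counts matching rows.
SUMMARY_SQL = """
SELECT
    COUNT(*)                                             AS encounters,
    COUNT(DISTINCT patient_nbr)                          AS patients,
    ROUND(100.0 * SUM(readmitted <> 'NO') / COUNT(*), 2) AS readmission_pct
FROM encounters
"""

encounters, patients, readmission_pct = conn.execute(SUMMARY_SQL).fetchone()
print(encounters, patients, readmission_pct)
```

Multiplying by `100.0` rather than `100` keeps the division in floating point; with integer operands SQLite would truncate the result.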
→ Full findings report — Baseline v1.0
Older patient groups show progressively higher readmission rates across all cohorts.
Patients with a higher diagnostic burden are readmitted more often: 50.58% in the high-risk group versus 35.52% in the low-risk group. Risk buckets are derived from composite clinical features.
Poor glycemic control correlates with higher readmission rates across patient segments.
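Comparisons like these usually reduce to a grouped readmission-rate query. The sketch below shows one possible shape, under loud assumptions: the `encounters` table, the helper `readmission_rate_by`, and the sample rows are hypothetical, and the risk bucket here is a single cut-off on `number_diagnoses` (`HIGH_BURDEN_THRESHOLD`), a deliberate simplification of the composite features the project actually uses.

```python
import sqlite3

# Hypothetical cut-off for illustration; the project's real buckets
# are derived from composite clinical features, not this single rule.
HIGH_BURDEN_THRESHOLD = 8

# Synthetic rows: (encounter_id, age, number_diagnoses, readmitted).
ROWS = [
    (1, "[30-40)", 3, "NO"),
    (2, "[30-40)", 4, "NO"),
    (3, "[50-60)", 5, "NO"),
    (4, "[50-60)", 9, "<30"),
    (5, "[70-80)", 6, ">30"),
    (6, "[70-80)", 9, ">30"),
    (7, "[70-80)", 8, "NO"),
    (8, "[70-80)", 9, ">30"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE encounters ("
    "encounter_id INTEGER PRIMARY KEY, age TEXT, "
    "number_diagnoses INTEGER, readmitted TEXT)"
)
conn.executemany("INSERT INTO encounters VALUES (?, ?, ?, ?)", ROWS)


def readmission_rate_by(conn, group_expr, params=()):
    """Return (group, encounters, readmission_pct) rows for a grouping expression.

    group_expr is trusted, hard-coded SQL; only values go through params.
    Any readmitted value other than 'NO' is counted as a readmission.
    """
    sql = f"""
        SELECT {group_expr} AS grp,
               COUNT(*) AS encounters,
               ROUND(100.0 * SUM(readmitted <> 'NO') / COUNT(*), 2) AS readmission_pct
        FROM encounters
        GROUP BY grp
        ORDER BY grp
    """
    return conn.execute(sql, params).fetchall()


by_age = readmission_rate_by(conn, "age")
by_risk = readmission_rate_by(
    conn,
    "CASE WHEN number_diagnoses >= ? THEN 'high' ELSE 'low' END",
    (HIGH_BURDEN_THRESHOLD,),
)
print(by_age)
print(by_risk)
```

The same helper could group on a glycemic-control column such as an A1C result, assuming that column has been loaded into the table.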
SQL · SQLite · Python · Pandas · Plotly
Baseline analytics complete: relational modeling, ETL pipeline, SQL analysis, risk stratification, and visualization implemented.
Next phase (not yet started): data quality validation, feature engineering, and predictive modeling of readmission risk.
Engineering and research project — not medical advice. All data is publicly available and de-identified.


