Description: This project analyzes and visualizes medical examination data using Python, Pandas, Matplotlib, and Seaborn. The dataset contains information about patients’ body measurements, blood tests, lifestyle choices, and the presence of cardiovascular disease. The project demonstrates how to process, clean, and visualize data to extract meaningful insights.
Features:
- Calculates BMI and adds an overweight column to classify patients.
- Normalizes cholesterol and glucose values for easier analysis.
- Generates categorical plots to compare features between patients with and without cardiovascular disease.
- Generates a heat map showing correlations between numerical variables after cleaning the data.
- Handles incorrect or extreme data (e.g., diastolic pressure higher than systolic, outliers in height and weight).
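
The preprocessing features listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code. The function names are hypothetical, and several details are assumptions not stated in this README: height in centimetres and weight in kilograms, a BMI threshold of 25 for "overweight", cholesterol and gluc coded so that 1 means normal and higher values mean above normal, and outlier cutoffs at the 2.5th and 97.5th percentiles.

```python
import numpy as np
import pandas as pd


def add_overweight(df: pd.DataFrame, bmi_threshold: float = 25.0) -> pd.DataFrame:
    """Add a 0/1 'overweight' column based on BMI = weight_kg / height_m ** 2."""
    out = df.copy()
    # Assumption: height is in centimetres and weight in kilograms.
    bmi = out["weight"] / (out["height"] / 100) ** 2
    out["overweight"] = (bmi > bmi_threshold).astype(int)
    return out


def normalize_levels(df: pd.DataFrame, columns=("cholesterol", "gluc")) -> pd.DataFrame:
    """Map each column to 0 (normal) or 1 (above normal)."""
    out = df.copy()
    for col in columns:
        # Assumption: a raw value of 1 means normal, anything higher is above normal.
        out[col] = (out[col] > 1).astype(int)
    return out


def clean(df: pd.DataFrame, lower: float = 0.025, upper: float = 0.975) -> pd.DataFrame:
    """Drop rows with diastolic > systolic or height/weight outside the percentile band."""
    keep = df["ap_lo"] <= df["ap_hi"]
    for col in ("height", "weight"):
        lo, hi = df[col].quantile([lower, upper])
        keep &= df[col].between(lo, hi)  # inclusive on both ends
    return df[keep]


def masked_correlation(df: pd.DataFrame):
    """Return the correlation matrix and a boolean mask hiding the upper triangle."""
    corr = df.corr()
    mask = np.triu(np.ones_like(corr, dtype=bool))
    return corr, mask
```

The pair returned by `masked_correlation` can be passed to Seaborn as `sns.heatmap(corr, mask=mask, annot=True)` so that each correlation appears only once in the heat map.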
Dataset:
The dataset used is medical_examination.csv, which includes the following columns:
- age, height, weight, gender
- ap_hi (systolic blood pressure), ap_lo (diastolic blood pressure)
- cholesterol, gluc (glucose levels)
- smoke, alco, active (lifestyle factors)
- cardio (presence of cardiovascular disease)
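
As a rough sketch, the file could be loaded and checked against the column list above like this. The helper `load_dataset` and the constant `EXPECTED_COLUMNS` are hypothetical names introduced for illustration, and the single sample row is made up so the example runs without the real file.

```python
import io

import pandas as pd

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "age", "height", "weight", "gender",
    "ap_hi", "ap_lo",
    "cholesterol", "gluc",
    "smoke", "alco", "active",
    "cardio",
]


def load_dataset(source) -> pd.DataFrame:
    """Read the CSV (path or file-like object) and fail early if a column is missing."""
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# Illustrative in-memory data (made-up values) standing in for the real file.
sample = io.StringIO(
    "age,height,weight,gender,ap_hi,ap_lo,cholesterol,gluc,smoke,alco,active,cardio\n"
    "50,168,62.0,2,110,80,1,1,0,0,1,0\n"
)
df = load_dataset(sample)
```

With the real data, the call would be `load_dataset("medical_examination.csv")`.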