StatMorph is a Streamlit dashboard for morphing 2D datasets into target shapes while preserving key statistical properties, then evaluating the impact on downstream machine learning performance. It wraps the data-morph-ai library with a visual, interactive workflow and adds statistical, ML, and clustering diagnostics plus exportable reports.
This project is especially useful for teaching or exploring how different geometric representations can keep means, variances, and correlations intact while dramatically changing visual structure.
Status: The app is fully runnable from app.py. The repository does not currently include a license file.
- Morph 2D point clouds into many predefined shapes while preserving statistics.
- Built-in starter datasets from `data-morph-ai` (e.g., `dino`, `cat`, `dog`).
- Real-world datasets via public APIs (stock market and global weather).
- Custom shape creator using mathematical or parametric equations.
- Statistical integrity report with KS tests and correlation preservation.
- ML performance comparison across multiple classifiers.
- Clustering diagnostics (K-means inertia and silhouette).
- Export: animated GIF, CSV summaries, and PDF reports.
- Load a dataset (upload, starter set, real-world, or custom equation).
- Choose a target shape and morphing precision/iterations.
- Visualize original vs morphed datasets side-by-side.
- Review statistical integrity and ML preservation metrics.
- Export reports and data for downstream use.
- Python 3.9+ recommended
- Internet access if you want real-world datasets
Install the dependencies with `pip install -r requirements.txt`, then start the app with `streamlit run app.py`. Open the local URL shown by Streamlit in your browser.
StatMorph provides multiple dataset sources:
Your CSV must contain numeric x and y columns with at least 10 points. Example:
x,y
1.0,2.0
2.0,3.5
3.5,1.2
Available shapes include:
dino, cat, dog, bunny, panda, sheep, gorilla, music, pi, python, soccer_ball, superdatascience.
The app includes two real-world datasets:
- Stock Market (AAPL): closing price vs trading volume (via Yahoo Finance / `yfinance`)
- World Weather Data: temperature vs humidity (via `wttr.in`)
If those APIs fail or are unavailable, the weather loader falls back to realistic synthetic data.
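The fallback behaviour described above can be sketched as a small wrapper. This is an illustration only, not the code in real_world_datasets.py: the names `load_with_fallback` and `synthetic_weather`, the temperature range, and the humidity formula are all hypothetical.

```python
import random

def load_with_fallback(fetch, fallback):
    """Try fetch(); on any error return fallback() and flag the result as synthetic."""
    try:
        return fetch(), False
    except Exception:
        return fallback(), True

def synthetic_weather(n=50, seed=0):
    """Hypothetical stand-in data: humidity loosely decreasing with temperature."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        temp_c = rng.uniform(-5.0, 35.0)
        humidity = 80.0 - 1.2 * temp_c + rng.gauss(0.0, 8.0)
        # Clamp humidity to a valid percentage.
        points.append((temp_c, min(100.0, max(0.0, humidity))))
    return points

def broken_fetch():
    # Simulates an unreachable API so the fallback path runs.
    raise ConnectionError("simulated network failure")

points, is_synthetic = load_with_fallback(broken_fetch, synthetic_weather)
```

Returning a flag alongside the data lets the UI tell the user when they are looking at synthetic rather than live values.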
You can generate datasets from:
- Mathematical functions: `y = f(x)` (e.g., `sin(x/10)`, `x**2`)
- Parametric equations: `x = f(t), y = g(t)` (e.g., `x=cos(t)`, `y=sin(t)`)
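As a rough illustration of the two equation types, the sketch below samples points from Python callables on an evenly spaced grid. It is not the code in custom_shapes.py: the function names, default ranges, and point counts are assumptions, and how the app parses equation strings typed into the UI is not covered here.

```python
import math

def sample_parametric(fx, fy, t_min=0.0, t_max=2 * math.pi, n=100):
    """Return n (x, y) points by evaluating x = fx(t), y = fy(t) on an even grid of t."""
    if n < 2:
        raise ValueError("n must be at least 2")
    step = (t_max - t_min) / (n - 1)
    return [(fx(t_min + i * step), fy(t_min + i * step)) for i in range(n)]

def sample_function(f, x_min=0.0, x_max=100.0, n=100):
    """Return n (x, y) points for y = f(x); a function is the parametric case x = t."""
    return sample_parametric(lambda x: x, f, x_min, x_max, n)

circle = sample_parametric(math.cos, math.sin)            # x=cos(t), y=sin(t)
wave = sample_function(lambda x: math.sin(x / 10))        # y = sin(x/10)
parabola = sample_function(lambda x: x ** 2, -5.0, 5.0)   # y = x**2
```

Treating `y = f(x)` as a special case of the parametric form keeps a single sampling routine for both input modes.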
The sidebar exposes key controls:
- Target Shape: choose from built-in shape list
- Decimal Precision: lower values allow more visual change
- Iterations: higher values improve shape fidelity (slower)
- Animation: optional morphing GIF with easing and freeze frames
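To see why a lower Decimal Precision leaves more room for visual change, consider a check that only compares summary statistics after rounding. The sketch below is an assumption about the general idea, not the library's actual test: `summary` and `stats_preserved` are hypothetical names, and the real implementation may truncate rather than round.

```python
import statistics

def summary(points, decimals):
    """Means, sample standard deviations, and Pearson r, rounded to `decimals` places."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in points) / (len(points) - 1)
    r = cov / (sx * sy)
    return tuple(round(v, decimals) for v in (mx, my, sx, sy, r))

def stats_preserved(original, candidate, decimals):
    """True when every rounded statistic is identical for both datasets."""
    return summary(original, decimals) == summary(candidate, decimals)

original = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
# Nudge one point slightly to the right.
nudged = [(0.0, 0.0), (1.004, 2.0), (2.0, 1.0), (3.0, 3.0)]

ok_at_2 = stats_preserved(original, nudged, decimals=2)
ok_at_3 = stats_preserved(original, nudged, decimals=3)
```

Here the nudge passes the check at 2 decimals but fails at 3, because the mean of x moves from 1.5 to 1.501. A coarser precision therefore accepts larger point movements per step.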
StatMorph compares original vs morphed datasets using:
- Means, standard deviations, and correlation
- Kolmogorov-Smirnov tests for distribution similarity
- Basic ML evaluation (Logistic Regression)
- Comprehensive ML evaluation (RF, SVM, KNN, Naive Bayes, Decision Tree, MLP)
- Clustering comparison (KMeans inertia and silhouette)
- Overall preservation score combining stats + ML metrics
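For readers unfamiliar with the Kolmogorov-Smirnov comparison listed above, the two-sample statistic is the largest vertical gap between the two empirical CDFs. The pure-Python sketch below shows that definition only; it is not the code in evaluator.py. In practice `scipy.stats.ks_2samp` computes the same statistic together with a p-value, though whether the app uses it is an assumption.

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """Two-sample KS statistic: the maximum gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for v in a + b:
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

identical = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A value near 0 means the original and morphed marginal distributions are nearly indistinguishable; a value of 1 means they do not overlap at all.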
From the UI you can generate:
- Animated GIF of the morphing process
- CSV summary of stats and ML metrics
- PDF report of the analysis
- Morphed dataset CSV
.
├─ app.py # Streamlit UI + orchestration
├─ morph_engine.py # Wrapper around data-morph-ai morphing
├─ evaluator.py # Statistical + ML evaluation
├─ real_world_datasets.py # Public API dataset loaders
├─ custom_shapes.py # Equation-based custom shape utilities
├─ requirements.txt # Python dependencies
├─ sample_data.csv # Example dataset
└─ sample_shape.png # Example shape image (not currently used in UI)
- Some advanced ML evaluation uses `GridSearchCV` for MLP; this can be slow on large datasets.
- Real-world dataset fetching requires internet access; `yfinance` also requires its API to be reachable.
- No license file is included in this repository.
- If `yfinance` is missing, install it with `pip install yfinance`.
- If Streamlit fails to start, ensure you are in the project directory and using the right Python environment.
- If morphing fails, confirm your dataset has numeric `x`/`y` columns and at least 10 rows.
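When morphing fails on an uploaded file, a quick pre-check of the CSV can narrow down the cause. The sketch below uses only the standard library and encodes the two requirements stated in this README (numeric x and y columns, at least 10 rows). The function name `validate_xy_csv` and the error messages are hypothetical, and the app's own validation may differ.

```python
import csv
import io
import math

MIN_POINTS = 10  # minimum row count stated in this README

def validate_xy_csv(text):
    """Return a list of (x, y) floats, or raise ValueError describing the problem."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames is None or not {"x", "y"} <= set(reader.fieldnames):
        raise ValueError("CSV must have 'x' and 'y' columns")
    points = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            x, y = float(row["x"]), float(row["y"])
        except (TypeError, ValueError):
            raise ValueError(f"non-numeric value on line {line_no}") from None
        if not (math.isfinite(x) and math.isfinite(y)):
            raise ValueError(f"non-finite value on line {line_no}")
        points.append((x, y))
    if len(points) < MIN_POINTS:
        raise ValueError(f"need at least {MIN_POINTS} rows, got {len(points)}")
    return points

sample = "x,y\n" + "\n".join(f"{i},{i * 2}" for i in range(10))
points = validate_xy_csv(sample)
```

To check a file on disk, read it as text first, for example `validate_xy_csv(open("sample_data.csv").read())`.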
StatMorph builds on the data-morph-ai library for the morphing algorithm and uses standard data science tooling for evaluation.