Data Scientist (analysis/experimentation) building decision-support models and clear, visual analytics for real-world problems.
- Strengths: Experiment Design, Feature Engineering, Robust Validation, Model Building, and Stakeholder Translation.
- Domains: Finance, Healthcare, Business, Retail/Commercial Analytics, and Sports Analytics
- Main Languages: Python, R, SQL, and TypeScript (React)
- Email: palashcscs@outlook.com
- LinkedIn: in/pgds
- GitHub: PGupta-Git
- Start with a decision and a measurable target (KPI definition comes first).
- Establish baselines, define leakage-safe splits, and validate with time-aware backtesting when appropriate.
- Prefer simple, explainable models when they perform; add complexity only when it wins on validated metrics.
- Communicate uncertainty (calibration, intervals, sensitivity checks) and translate results into concrete actions.
- Published Article: A pragmatic, parallel-arm, randomised trial on the effects of two repeated-sprint training protocols on fitness outcomes in semi-professional male soccer players: preliminary report (Science and Medicine in Football, 2026) | Data & Code Repository
- ORCID: https://orcid.org/0009-0000-0172-4009
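As an illustration of the "leakage-safe splits" and "time-aware backtesting" principle listed above, here is a minimal sketch of an expanding-window splitter. The function name `expanding_window_splits` and its `gap` parameter are hypothetical, chosen for this example only; they are not taken from any of the linked repositories.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_samples: int, n_folds: int, gap: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for rows sorted by time.

    Each fold trains only on rows that come before its test block,
    optionally leaving `gap` rows unused in between, so no future
    observation can leak into training.
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    test_size = n_samples // (n_folds + 1)
    for fold in range(n_folds):
        # Test blocks are laid out back to back at the end of the series.
        test_start = n_samples - (n_folds - fold) * test_size
        test_end = test_start + test_size
        train_end = test_start - gap
        if train_end <= 0:
            raise ValueError("gap leaves no training data")
        yield list(range(train_end)), list(range(test_start, test_end))


if __name__ == "__main__":
    for train_idx, test_idx in expanding_window_splits(10, n_folds=3, gap=1):
        print(train_idx, test_idx)
```

The `gap` is one simple way to guard against features built from rolling windows that overlap the test period; whether it is needed depends on how the features are constructed.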
Active contributor to popular open-source projects, libraries, and desktop utilities:
- brilliantnz/flickernaut (GNOME Shell Extension adding custom Nautilus context menu entries)
- Invalid Apps & Desktop Files Fix (PR #9): Resolved a critical bug causing extension crashes by implementing robust `try-except` error handling for invalid or missing desktop files (handling null `Gio.DesktopAppInfo` returns).
- Duplicate Preferences Entries Fix (PR #9): Prevented application entries from appearing twice in the preferences dialog by ensuring the chooser respects the `NoDisplay=true` desktop configuration.
- andrewRowlinson/mplsoccer (Matplotlib-based soccer visualization library)
- Speedometer & Gauge Charts (PR #118): Designed and implemented the core `Speedometer` class, enabling highly customizable gauge and speedometer charts (implements Issue #16).
- Curved Radar Text Labels (PR #120): Authored curved label support for radar charts along circular arcs using vector glyph paths (`TextPath`/`PathPatch`) and `TextToPath` metrics, resolving rounding jitter (fixes Issue #35). Supports multi-line curved labels and auto-flipping for bottom-half readability.
- Wikipedia Rate-Limiting Fixes (PR #120): Resolved build failures in docstring gallery examples by standardizing Wikipedia thumbnail sizes and adding proper User-Agent headers to requests.
- ageron/handson-mlp (Aurélien Géron's Hands-On Machine Learning book repository)
- Polars Dataframe Integration (PR #41): Maintained a community-supported fork migrating the book's notebook exercises (all chapters) from Pandas to Polars dataframes, including custom Polars tools and database connection guides.
| Project | What it shows | Links |
|---|---|---|
| Repeated sprint training trial (research) | Statistical rigor, reproducible analysis artifacts | Open Access Paper (Science and Medicine in Football); Data & Code Repo |
| Drill Design App (production) | Product thinking, data modeling, shipping a real app | https://github.com/PGupta-Git/case-study-drill-design-app https://www.drilldesignapp.com |
| Off-ball movement metric & player similarity (Premier League) | Metric design from raw tracking data, dual-signal similarity framework, structural-null semantics, sensitivity checks, visual storytelling | https://github.com/PGupta-Git/case-study-btla-framework |
| Player availability & decision support | KPI redesign, feature engineering, time-aware validation, decision framing | https://github.com/PGupta-Git/case-study-player-availability-decision-support |
| Tactical / recruitment / performance analysis | Benchmarking, robustness checks, communication through visuals | https://github.com/PGupta-Git/case-study-tactical-recruitment-performance-analysis |
| Open-data experimentation lab | Public code: clean DS workflow, evaluation, visual storytelling | https://github.com/PGupta-Git/open-football-experimentation-lab |
Note: case studies are anonymised (organisation names and private code/data are omitted).
- Designed a novel geometric metric (BtLA) from 25 Hz optical tracking data to profile a player's off-ball movement between lines, with continuous passing-lane openness scoring and segment-aware smoothing (see BtLA Framework Case Study).
- Built a dual-signal player similarity framework (cosine on movement shape + Euclidean on scaled level/volume) across a full season of Premier League event data, bridging tracking and event data paradigms (see BtLA Framework Case Study).
- Built non-linear models on high-frequency biometric telemetry to separate signal from noise and predict failure-mode risk (injury-risk proxy; see Player Availability Case Study).
- Engineered new "availability" features with domain experts, replacing legacy KPIs with more predictive signals (see Player Availability Case Study).
- Built forecasting models and automated reporting workflows, saving ~15 hours/week and reducing data retrieval latency by ~30%.
- Delivered executive dashboards for decision-making (market sentiment, product performance, performance monitoring; see Drill Design App and Tactical/Recruitment Case Study).
- Led cross-functional delivery (Agile/Scrum) and bridged data engineering and non-technical stakeholders.
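
The dual-signal similarity idea mentioned above (cosine on movement shape plus Euclidean on scaled level/volume) can be sketched roughly as follows. This is an illustrative sketch, not the code from the case study: the function names, the example feature vectors, and the per-feature `level_scales` are all assumptions made for this example.

```python
import math
from typing import Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two profiles: compares shape, ignores magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def scaled_euclidean(
    a: Sequence[float], b: Sequence[float], scales: Sequence[float]
) -> float:
    """Euclidean distance after dividing each feature difference by its scale."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))


def dual_signal(
    shape_a: Sequence[float],
    shape_b: Sequence[float],
    level_a: Sequence[float],
    level_b: Sequence[float],
    level_scales: Sequence[float],
) -> Tuple[float, float]:
    """Return (shape similarity, level distance) as two separate signals."""
    return (
        cosine_similarity(shape_a, shape_b),
        scaled_euclidean(level_a, level_b, level_scales),
    )


if __name__ == "__main__":
    # Hypothetical players: same movement shape, different volume.
    shape_sim, level_dist = dual_signal(
        [1.0, 2.0, 3.0], [2.0, 4.0, 6.0],
        [10.0, 4.0], [16.0, 12.0],
        level_scales=[2.0, 4.0],
    )
    print(shape_sim, level_dist)
```

Keeping the two signals separate, rather than blending them into one score, lets a reader distinguish "moves the same way" from "does it as often"; how (or whether) to combine them is a design decision this sketch leaves open.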

