[RFC]: Implement a broader range of statistical distributions #226

@Md-Zaid45

Full name

Moh Zaid Khan

University status

Yes

University name

Madhav Institute of Technology and Science, Gwalior

University program

B.Tech in Mathematics and Computing

Expected graduation

June 2027

Short biography

I am Moh Zaid Khan, a third-year Mathematics and Computing student at MITS Gwalior. I enjoy applying mathematics to solve real-world problems. Through my degree curriculum, I have built a strong foundation in advanced mathematics and in programming with C, C++, and Python. Recently I have been working on a personal full-stack project using the MERN stack. Together, these give me the prerequisite skills to work on the challenges in this proposal.

Timezone

IST (UTC +5:30)

Contact details

mz1944680@gmail.com

Platform

Windows

Editor

I use VSCode. It is fast, and the integrated terminal keeps my workflow centralized.

Programming experience

I've been building full-stack web apps with a strong focus on algorithmic logic. Recently, I built Noto, a full-stack notes and flashcard application. To schedule flashcards, I wrote a custom implementation of the SM-2 spaced-repetition algorithm in pure JavaScript. It computes review intervals from user recall data, with each interval growing geometrically (the previous interval multiplied by an easiness factor).
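For readers unfamiliar with SM-2, the update rule can be sketched roughly as follows. This is a minimal illustration based on the published description of the algorithm, not the code used in Noto; the card shape `{ reps, interval, ef }` and the function name `sm2` are hypothetical.

```javascript
// Minimal SM-2 sketch. The card shape { reps, interval, ef } is a hypothetical
// illustration, not the data model used in Noto.
function sm2( card, quality ) {
	// Recall quality is graded on an integer scale from 0 (blackout) to 5 (perfect).
	var q = Math.max( 0, Math.min( 5, Math.round( quality ) ) );
	var reps = card.reps;
	var ef = card.ef;
	var interval;
	var d;
	if ( q < 3 ) {
		// Failed recall: restart the repetition sequence, easiness unchanged.
		reps = 0;
		interval = 1;
	} else {
		if ( reps === 0 ) {
			interval = 1;
		} else if ( reps === 1 ) {
			interval = 6;
		} else {
			// Geometric growth: previous interval times the easiness factor.
			interval = Math.round( card.interval * card.ef );
		}
		reps += 1;
		// Easiness-factor update, floored at 1.3.
		d = 5 - q;
		ef = Math.max( 1.3, ef + ( 0.1 - ( d * ( 0.08 + ( d * 0.02 ) ) ) ) );
	}
	return { reps: reps, interval: interval, ef: ef };
}
```

Starting from `{ reps: 0, interval: 0, ef: 2.5 }`, three perfect reviews in a row give intervals of 1, 6, and 16 days under this sketch.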

JavaScript experience

I write a lot of JS for both the frontend and backend.

Favorite feature: I really like its asynchronous model and first-class functions. They make writing functional, event-driven code very natural.

Least favorite feature: Implicit type coercion. When writing numerical algorithms, having JS silently propagate NaN or coerce types instead of throwing an error makes debugging incredibly frustrating.

Node.js experience

I use Node.js for my backend projects. I'm comfortable with package management, the CommonJS/ESM module systems, and writing asynchronous scripts for file system operations.

C/Fortran experience

My university coursework involves writing C for Data Structures and Algorithm Design. I am comfortable with static typing and manual memory management.

Interest in stdlib

I really appreciate how @stdlib brings the mathematical rigor you usually only see in C or Python over to the JavaScript ecosystem. This lets developers use JavaScript for statistics, data analysis, machine learning, and other mathematical applications.

Version control

Yes

Contributions to stdlib

PR #11232: Fixed the wrong header filename in @stdlib/math/base/special/fast/pow-int by renaming pow.h to pow_int.h to match the package name.
[Status: Open]

stdlib showcase

I built a Node.js CLI tool that simulates and analyzes spaced-repetition study sessions.

I used @stdlib/random-base-normal to generate mock daily study scores, and then applied @stdlib/stats-base-mean and @stdlib/stats-base-stdev to calculate average retention rates and study consistency over time.

Goals

My goal is to implement various statistical distributions for @stdlib while ensuring they hold up against extreme edge cases.

Key deliverables:

  • Core API: Writing standard methods (PDF, CDF, quantile, and random generation) for each target distribution.
  • Numerical Stability: Handling floating point limitations to prevent overflow, underflow, and precision loss at extreme parameter boundaries.
  • Performance Optimization: Improving root finding and approximation algorithms to reach machine-level accuracy faster.
  • Validation: Testing all code against established libraries like SciPy to guarantee high-precision results.
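The validation deliverable could look roughly like the fixture check sketched below. The fixture shape (parallel `x` and `expected` arrays) and the helper names are assumptions for illustration only. The unit-rate exponential CDF stands in for a target distribution because its reference values here are exact identities rather than library output.

```javascript
// Hypothetical fixture shape: parallel arrays of inputs and reference outputs.
// A real fixture would be generated by a reference library such as SciPy; the
// two values below are exact identities for the unit-rate exponential CDF.
var fixture = {
	x: [ 0.0, Math.LN2 ],
	expected: [ 0.0, 0.5 ]
};

// Relative error, falling back to absolute error when the reference is zero.
function relativeError( actual, expected ) {
	if ( expected === 0.0 ) {
		return Math.abs( actual );
	}
	return Math.abs( ( actual - expected ) / expected );
}

// Returns the indices of fixture entries whose error exceeds `tol`.
function failures( fn, fixture, tol ) {
	var out = [];
	var err;
	var i;
	for ( i = 0; i < fixture.x.length; i++ ) {
		err = relativeError( fn( fixture.x[ i ] ), fixture.expected[ i ] );
		if ( !( err <= tol ) ) { // written this way so that NaN also fails
			out.push( i );
		}
	}
	return out;
}

// Unit-rate exponential CDF, written with expm1 to avoid cancellation near 0.
function expCdf( x ) {
	return ( x < 0.0 ) ? 0.0 : -Math.expm1( -x );
}

var failed = failures( expCdf, fixture, 10 * Number.EPSILON );
console.log( failed.length === 0 ? 'all fixtures pass' : 'failed indices: ' + failed );
```

Comparing relative rather than absolute error matters in the tails, where probabilities are tiny and an absolute tolerance would accept badly wrong values.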

Why this project?

As a Mathematics and Computing student, I find that this project aligns closely with my academic focus and my love for JavaScript. I am excited by the challenge of translating pure mathematical limits into numerically stable floating-point code.

Qualifications

My university coursework directly supports this work:

  • Real and Complex Analysis
  • Numerical Techniques
  • Algorithm Analysis
  • Probability and Statistics
  • Optimization Techniques
  • C/C++/Python

I have also learned full-stack development with the MERN stack. Together, these give me the prerequisite skills to work on this proposal.

Prior art

These distributions are well established in Python's scipy.stats and C++'s Boost.Math. I plan to use SciPy's source code as a reference for handling edge cases and to use its outputs to generate my JSON test fixtures. R's extRemes package is also a useful reference for the GEV and GPD logic.

Commitment

I plan to commit 35-40 hours a week to this project and can extend working hours if needed.

Schedule

Community Bonding (May 1 – May 25)
Commit generate_fixtures.py to establish the testing pipeline. Draft API signatures for the new distributions and review @stdlib sub-package structures to ensure seamless integration.

Weeks 1–2 (May 26 – Jun 8)
Establish JSDoc and benchmarks. Implement the genpareto (GPD) distribution. Focus heavily on implementing Taylor expansions for shape parameter limits.
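To illustrate the shape-parameter limit, here is a minimal sketch of a GPD CDF that switches to a Taylor expansion when the shape is close to zero. The function name, argument order, location fixed at 0, and the `1e-8` switch threshold are all assumptions made for illustration, not a proposed API.

```javascript
// CDF of the generalized Pareto distribution with location 0, scale `sigma`,
// and shape `xi`. Names and the switch threshold are illustrative assumptions.
function gpdCdf( x, sigma, xi ) {
	var z = x / sigma;
	var u;
	var t;
	if ( z <= 0.0 ) {
		return 0.0;
	}
	u = xi * z;
	if ( u <= -1.0 ) {
		// For xi < 0 the support ends at x = -sigma/xi.
		return 1.0;
	}
	if ( Math.abs( u ) < 1.0e-8 ) {
		// Taylor expansion of log1p(u)/xi about u = 0: z*(1 - u/2 + u^2/3 - ...).
		// This also covers xi === 0, where the closed form would evaluate 0/0.
		t = z * ( 1.0 - ( u / 2.0 ) + ( ( u * u ) / 3.0 ) );
	} else {
		t = Math.log1p( u ) / xi;
	}
	// F(x) = 1 - (1 + xi*z)^(-1/xi) = 1 - exp(-t), without cancellation for small t.
	return -Math.expm1( -t );
}
```

With `xi` equal to 0 this reduces to the exponential CDF, so `gpdCdf( Math.LN2, 1.0, 0.0 )` is 0.5 up to rounding, and values on either side of the threshold agree closely.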

Weeks 3–4 (Jun 9 – Jun 22)
Implement the genextreme (GEV) distribution. Handle parameter limit continuity and implement Halley's Method for quantile refinement, which converges cubically near a simple root given a good starting point.
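A minimal sketch of Halley refinement for a quantile is shown below. The function signature is hypothetical. The Gumbel distribution (GEV with shape 0) serves as the test case because its quantile has a closed form, so the iteration can be checked against it.

```javascript
// Refines a root of cdf(x) - p = 0 with Halley's method.
// `cdf`, `pdf`, and `dpdf` (the derivative of the PDF) are caller-supplied.
function halleyQuantile( p, cdf, pdf, dpdf, x0, tol, maxIter ) {
	var denom;
	var step;
	var x = x0;
	var f;
	var f1;
	var f2;
	var i;
	for ( i = 0; i < maxIter; i++ ) {
		f = cdf( x ) - p;
		f1 = pdf( x );
		f2 = dpdf( x );
		denom = ( 2.0 * f1 * f1 ) - ( f * f2 );
		if ( denom === 0.0 ) {
			break; // flat region: a production routine would fall back to bisection
		}
		step = ( 2.0 * f * f1 ) / denom;
		x -= step;
		if ( Math.abs( step ) <= tol * Math.max( 1.0, Math.abs( x ) ) ) {
			break;
		}
	}
	return x;
}

// Test case: Gumbel distribution, with closed-form quantile -ln(-ln(p)).
function gumbelCdf( x ) {
	return Math.exp( -Math.exp( -x ) );
}
function gumbelPdf( x ) {
	return Math.exp( -x - Math.exp( -x ) );
}
function gumbelDpdf( x ) {
	return gumbelPdf( x ) * ( Math.exp( -x ) - 1.0 );
}

var refined = halleyQuantile( 0.9, gumbelCdf, gumbelPdf, gumbelDpdf, 0.0, 1e-14, 50 );
var exact = -Math.log( -Math.log( 0.9 ) );
console.log( Math.abs( refined - exact ) );
```

Cubic convergence holds only locally, so a real implementation would likely pair this with a bracketing fallback for poor starting points and for the far tails, where the PDF is nearly flat.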

Weeks 5–6 (Jun 23 – Jul 6)
Implement the invgauss distribution. Focus on mitigating log-space floating-point overflow and refining Halley root-finding for edge cases.
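One way to see why log-space evaluation matters here is to compare the textbook formula with a log-space version. The sketch below assumes the mean/shape `(mu, lambda)` parameterization, which differs from the one SciPy's `invgauss` uses; the function names are illustrative.

```javascript
var LN_TWO_PI = Math.log( 2.0 * Math.PI );

// Log-PDF of the inverse Gaussian distribution in the (mu, lambda) mean/shape
// parameterization (an assumption made for this sketch).
function invgaussLogPdf( x, mu, lambda ) {
	var d;
	if ( !( x > 0.0 ) ) {
		return ( x <= 0.0 ) ? -Infinity : NaN;
	}
	d = ( x - mu ) / mu;
	return ( 0.5 * ( Math.log( lambda ) - LN_TWO_PI - ( 3.0 * Math.log( x ) ) ) ) -
		( ( lambda * d * d ) / ( 2.0 * x ) );
}

function invgaussPdf( x, mu, lambda ) {
	return Math.exp( invgaussLogPdf( x, mu, lambda ) );
}

// Direct evaluation of the textbook formula, for comparison only.
function naivePdf( x, mu, lambda ) {
	return Math.sqrt( lambda / ( 2.0 * Math.PI * x * x * x ) ) *
		Math.exp( ( -lambda * ( x - mu ) * ( x - mu ) ) / ( 2.0 * mu * mu * x ) );
}

console.log( naivePdf( 1.0e-120, 1.0, 1.0 ) );    // Infinity * 0
console.log( invgaussPdf( 1.0e-120, 1.0, 1.0 ) ); // finite
```

At `x = 1e-120` the naive form underflows `x*x*x` to zero, so the square-root factor overflows to `Infinity` while the exponential factor underflows to `0`, and their product is `NaN`. The log-space form adds the two pieces before exponentiating and returns `0`.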

Weeks 7–8 (Jul 7 – Jul 20)
Implement the vonmises and rice distributions. The primary focus will be on handling Bessel function log-space expansions to maintain IEEE 754 precision.
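As a rough sketch of the idea for the von Mises PDF: both `exp(kappa)` and `I0(kappa)` overflow once `kappa` exceeds roughly 709, but the exponentially scaled function `exp(-kappa) * I0(kappa)` stays finite. The code below uses a plain power series and the standard large-argument asymptotic series. The switch point of 20, the term counts, and the function names are illustrative choices, not a tuned or proposed implementation.

```javascript
// Exponentially scaled modified Bessel function exp(-x) * I0(x) for x >= 0.
// The switch point (20) and the term counts are illustrative choices.
function besselI0e( x ) {
	var term = 1.0;
	var sum = 1.0;
	var q;
	var k;
	if ( x < 20.0 ) {
		// Power series: I0(x) = sum_k (x^2/4)^k / (k!)^2.
		q = 0.25 * x * x;
		for ( k = 1; k < 200; k++ ) {
			term *= q / ( k * k );
			sum += term;
			if ( term < sum * 1.0e-17 ) {
				break;
			}
		}
		return Math.exp( -x ) * sum;
	}
	// Asymptotic series: I0(x) ~ e^x / sqrt(2*pi*x) * sum_k a_k, where
	// a_0 = 1 and a_k = a_{k-1} * (2k-1)^2 / (8*k*x).
	for ( k = 1; k <= 20; k++ ) {
		term *= ( ( ( 2 * k ) - 1 ) * ( ( 2 * k ) - 1 ) ) / ( 8.0 * k * x );
		sum += term;
		if ( term < sum * 1.0e-17 ) {
			break;
		}
	}
	return sum / Math.sqrt( 2.0 * Math.PI * x );
}

// von Mises PDF with the exp(kappa) factors cancelled analytically.
function vonMisesPdf( x, mu, kappa ) {
	// cos(t) - 1 = -2*sin(t/2)^2 avoids cancellation near the mode.
	var s = Math.sin( 0.5 * ( x - mu ) );
	return Math.exp( -2.0 * kappa * s * s ) /
		( 2.0 * Math.PI * besselI0e( kappa ) );
}

console.log( Math.exp( 1000.0 ) );                // Infinity
console.log( vonMisesPdf( 0.0, 0.0, 1000.0 ) );   // finite
```

A production version would more likely use polynomial or Chebyshev approximations for the scaled Bessel function; the series here are only meant to show where the overflow is removed.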

Weeks 9–10 (Jul 21 – Aug 3)
Implement the betabinom and skellam distributions. Address the combinatorial overflow and recurrence drift typical of these discrete distributions.
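The combinatorial overflow in the beta-binomial PMF can be sidestepped by working with log-gamma values, as in the sketch below. The local `lgamma` helper is a textbook Lanczos approximation written out only to keep the example self-contained, and it assumes arguments of at least 0.5. An actual stdlib implementation would presumably build on the existing `@stdlib/math/base/special/gammaln` and `betaln` packages instead. All function names here are illustrative.

```javascript
// Lanczos approximation (g = 7, 9 coefficients) to ln(Gamma(x)).
// This sketch assumes x >= 0.5; no reflection formula is applied.
var LANCZOS = [
	0.99999999999980993, 676.5203681218851, -1259.1392167224028,
	771.32342877765313, -176.61502916214059, 12.507343278686905,
	-0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7
];

function lgamma( x ) {
	var y = x - 1.0;
	var a = LANCZOS[ 0 ];
	var t = y + 7.5;
	var i;
	for ( i = 1; i < LANCZOS.length; i++ ) {
		a += LANCZOS[ i ] / ( y + i );
	}
	return ( 0.5 * Math.log( 2.0 * Math.PI ) ) +
		( ( y + 0.5 ) * Math.log( t ) ) - t + Math.log( a );
}

// ln B(a, b) via log-gamma.
function logBeta( a, b ) {
	return lgamma( a ) + lgamma( b ) - lgamma( a + b );
}

// Log-PMF of the beta-binomial distribution: ln[ C(n,k) * B(k+alpha, n-k+beta) / B(alpha, beta) ].
function betabinomLogPmf( k, n, alpha, beta ) {
	var lchoose;
	if ( k < 0 || k > n || k !== Math.floor( k ) ) {
		return -Infinity;
	}
	lchoose = lgamma( n + 1 ) - lgamma( k + 1 ) - lgamma( n - k + 1 );
	return lchoose + logBeta( k + alpha, n - k + beta ) - logBeta( alpha, beta );
}

// With alpha = beta = 1 the distribution is uniform on 0..n.
console.log( Math.exp( betabinomLogPmf( 500, 1000, 1.0, 1.0 ) ), 1 / 1001 );
```

The binomial coefficient C(1000, 500) is far beyond the double-precision range, yet every intermediate quantity in the log-space version stays moderate in size.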

Weeks 11–12 (Aug 4 – Aug 25)
Implement the zipf distribution. Run a comprehensive full test pass against all scipy.stats golden fixtures. Complete any stretch goals (e.g., lomax) and address final mentor feedback before submission.

Related issues

N/A

Checklist

  • I have read and understood the Code of Conduct.
  • I have read and understood the application materials found in this repository.
  • I understand that plagiarism will not be tolerated, and I have authored this application in my own words.
  • I have read and understood the patch requirement which is necessary for my application to be considered for acceptance.
  • I have read and understood the stdlib showcase requirement which is necessary for my application to be considered for acceptance.
  • The issue name begins with [RFC]: and succinctly describes your proposal.
  • I understand that, in order to apply to be a GSoC contributor, I must submit my final application to https://summerofcode.withgoogle.com/ before the submission deadline.

Labels: 2026 (2026 GSoC proposal), rfc (Project proposal)