This repository was archived by the owner on Apr 14, 2026. It is now read-only.

brycewang-stanford/PyStataR

PyStataR

🇬🇧 English · 🇨🇳 中文


⚠️ DEPRECATED — Superseded by StatsPAI (2026-04-14)

PyStataR is no longer maintained. All functionality has been migrated to the new package StatsPAI, which also extends its scope to a complete causal-inference and econometrics toolkit — 390+ functions covering classical econometrics, modern ML-based causal inference (DID, RDD, IV, DML, Causal Forest, Meta-Learners, TMLE, Neural Causal, Causal Discovery, Policy Learning, and more), plus publication-ready Excel / Word / LaTeX output and LLM-friendly workflows.

Migrate in 3 steps

```bash
pip install statspai
```

```python
# Old (PyStataR)
from pystatar import pyegen, pywinsor2, pdtab, pyoutreg
pyegen.rowmean(df, ['x1', 'x2'])
pywinsor2.winsor2(df, ['wage'], cuts=(1, 99))
pdtab.tab2(df, 'x', 'y')
pyoutreg.outreg(model, 'out.xlsx')

# New (StatsPAI)
import statspai as sp
sp.rowmean(df, ['x1', 'x2'])
sp.winsor(df, ['wage'], cuts=(1, 99))
sp.tab(df, 'x', 'y')
sp.outreg2(model, filename='out.xlsx')
```

Full API mapping: see MIGRATION.md in this repository or the identical copy inside StatsPAI.
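
To locate call sites that still need migrating, a small scanner can help. The sketch below is only an illustration: it covers just the four renames shown in the migration snippet (the full mapping lives in MIGRATION.md and is not reproduced here), and `find_old_calls` is a hypothetical helper, not part of PyStataR or StatsPAI. It reports matches rather than rewriting them, because some calls also change their arguments (for example, `outreg2` takes `filename=` as a keyword).

```python
import re

# Only the four renames shown in the migration snippet; not the full mapping.
RENAMES = {
    "pyegen.rowmean": "sp.rowmean",
    "pywinsor2.winsor2": "sp.winsor",
    "pdtab.tab2": "sp.tab",
    "pyoutreg.outreg": "sp.outreg2",
}

# Match any old dotted name immediately followed by an opening parenthesis.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in RENAMES) + r")\s*\("
)


def find_old_calls(source):
    """Return (line_number, old_name, new_name) for each old call in `source`."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, RENAMES[old]))
    return hits


example = """from pystatar import pyegen, pywinsor2
pyegen.rowmean(df, ['x1', 'x2'])
pywinsor2.winsor2(df, ['wage'], cuts=(1, 99))
"""
hits = find_old_calls(example)
for lineno, old, new in hits:
    print(f"line {lineno}: {old} -> {new}")
```

Each reported line still needs a manual look, since the scanner does not check whether the arguments are compatible.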

Promises to existing users

  • PyPI package preserved forever: pip install pystatar continues to work; existing versions will never be removed from PyPI.
  • GitHub repo preserved — this repository will be archived (read-only) but never deleted; all Issues, Stars, and citations remain intact.
  • No new features or bug fixes — please migrate to StatsPAI for ongoing updates.
  • Issues are frozen — new questions should be opened at StatsPAI/issues.

Why deprecated

PyStataR began as a Stata-to-Python bridge for the four most-used academic commands (egen, winsor2, tabulate, outreg2). Over time, the bigger vision became clear: build a Python-native, AI-era causal-inference toolbox rather than a Stata emulator. That package is StatsPAI. Running two packages in parallel only confuses users and doubles maintenance — so PyStataR's mission is being merged into StatsPAI, which does everything PyStataR did, and much more.

— Bryce Wang, 2026-04-14


Historical overview (for reference)

PyStataR was a unified interface to four mature Stata-equivalent PyPI packages:

| Submodule | Stata equivalent | Purpose |
|---|---|---|
| pyegen | egen | Extended data generation: ranks, row statistics, group operations |
| pywinsor2 | winsor2 | Winsorizing / trimming at percentile or IQR cutoffs |
| pdtab | tabulate | One- and two-way cross-tabulation with statistical tests |
| pyoutreg | outreg2 | Publication-quality regression tables exported to Excel / Word |

All four are fully covered — and substantially extended — in StatsPAI under the unified sp.* namespace.
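
For readers unfamiliar with the Stata commands, the first two rows of the table can be approximated in plain pandas. This is a rough sketch of the concepts only, not the PyStataR or StatsPAI implementation: the helper names `rowmean` and `winsorize` are chosen for illustration, and pandas' default linear-interpolation percentiles may differ slightly from the percentile rule Stata uses.

```python
import pandas as pd


def rowmean(df, cols):
    """Row-wise mean over `cols`, ignoring missing values (as egen rowmean does)."""
    return df[cols].mean(axis=1, skipna=True)


def winsorize(series, cuts=(1, 99)):
    """Clip values below/above the given lower/upper percentiles."""
    lower, upper = series.quantile([cuts[0] / 100, cuts[1] / 100])
    return series.clip(lower=lower, upper=upper)


df = pd.DataFrame({"x1": [1.0, 2.0, None], "x2": [3.0, 4.0, 5.0]})
row_means = rowmean(df, ["x1", "x2"])  # third row uses x2 only

wage = pd.Series(range(1, 102), dtype=float)  # 1.0, 2.0, ..., 101.0
clipped = winsorize(wage, cuts=(1, 99))       # extremes pulled in to the cutoffs
```

Winsorizing replaces extreme values with the cutoff value, whereas trimming would drop or blank them; the sketch shows winsorizing only.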

Final versions

  • v0.4.1 — adds the DeprecationWarning and migration notice.
  • v0.4.0 — last feature release, integrated pyoutreg.
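
Because v0.4.1 emits a DeprecationWarning, one way to catch leftover PyStataR usage is to promote that warning to an error during a test run. The sketch below uses only the standard-library `warnings` module. `legacy_call` is a hypothetical stand-in for a PyStataR entry point, and its message text is invented; whether the real package warns at import time or at call time is not stated here.

```python
import warnings


def legacy_call():
    """Hypothetical stand-in for a deprecated PyStataR function."""
    warnings.warn(
        "PyStataR is deprecated; use StatsPAI instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return "ok"


def uses_deprecated_api(func):
    """Return True if calling `func` triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Inside this block, DeprecationWarning is raised as an exception.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False


flagged = uses_deprecated_api(legacy_call)
clean = uses_deprecated_api(lambda: "ok")
print(flagged, clean)
```

The same effect is available for a whole script without code changes by running `python -W error::DeprecationWarning your_script.py`.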

Citation

If you used PyStataR in a publication, you may still cite the original repository; its GitHub URL and release history will remain available indefinitely. For new work, please cite StatsPAI instead.


License

MIT — see LICENSE.

About

Comprehensive Python package providing Stata-equivalent commands for pandas DataFrames - includes tabulate, egen, reghdfe, and winsor2 implementations
