Releases: business-science/pytimetk
pytimetk 2.5.0
Full Changelog: v2.4.1...v2.5.0
pytimetk 2.4.1
- Compatibility with pandas-flavor 0.8.0
pytimetk 2.4.0
pytimetk 2.4.0 Release Notes
Overview
This release introduces significant enhancements to visualization capabilities, integrates tidy selectors for improved column handling, completes the migration to Ray for parallelism, and includes various performance optimizations and bug fixes. Key themes include better diagnostics plotting, more flexible data manipulation, and improved documentation.
New Features
- Advanced Plotting Diagnostics: Added new APIs for visualizing time series diagnostics. These functions provide interactive and insightful plots to analyze correlation, seasonality, and trends.
| Function | Description |
|---|---|
| plot_acf_diagnostics | Plot ACF, PACF, and CCF with interactive dropdowns. |
| plot_seasonal_diagnostics | Box/violin plots for seasonal features (e.g., hour, weekday, month). |
| plot_stl_diagnostics | Decompose series into observed, season, trend, remainder, and seasonally adjusted components. |
| plot_time_series_boxplot | Rolling distribution boxplots with optional smoothers. |
| plot_time_series_regression | Fit and visualize linear regressions with observed vs. fitted/residuals views. |
- Tidy Selectors Integration:
  - Added support for tidy selectors (contains, starts_with, ends_with, matches) in multiple functions.
  - Enables human-readable column selection (e.g., contains("date")).
  - Integrated into: acf_diagnostics, augment_adx, augment_atr, augment_bbands, augment_cmo, augment_diffs, augment_ewm, augment_fourier, augment_hilbert, augment_hurst_exponent, augment_lags, augment_leads, augment_macd, augment_ppo, augment_qsmomentum, augment_roc, augment_rsi, augment_stochastic_oscillator, augment_wavelet, future_frame, pad_by_time, seasonal_diagnostics, stl_diagnostics, summarize_by_time.
- Human-Friendly Durations:
  - Added parse_human_duration for converting strings like "3 days" or "2 weeks" to offsets.
  - Integrated into lag/lead specs and bounds for future_frame and pad_by_time.
- Ray Parallelism:
  - Completed migration to Ray for distributed computing.
  - Added Ray migration guide and multiprocessing support.
- Plotly Theme:
  - Introduced theme_plotly_timetk for consistent styling on custom Plotly figures.
- New Guides:
  - Added guides for Polars workflows, tidy selectors, and human-friendly periods.
  - Enhanced documentation with new examples and visual elements.
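To make the human-friendly duration idea concrete, here is a minimal standard-library sketch of turning strings such as "3 days" into a fixed-length offset. It is not pytimetk's parse_human_duration: the function name parse_duration_sketch, the unit table, and the use of datetime.timedelta as the offset type are all assumptions made for illustration, and calendar units such as months are left out because they have no fixed length.

```python
import re
from datetime import timedelta

# Hypothetical unit table for this sketch; a real parser may accept more units.
_UNIT_SECONDS = {
    "second": 1,
    "minute": 60,
    "hour": 3600,
    "day": 86400,
    "week": 604800,
}

# A count, optional whitespace, then a unit name with an optional plural "s".
_PATTERN = re.compile(r"^\s*(\d+)\s*([a-zA-Z]+?)s?\s*$")


def parse_duration_sketch(text: str) -> timedelta:
    """Convert strings like '3 days' or '2 weeks' to a timedelta."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"Cannot parse duration: {text!r}")
    count = int(match.group(1))
    unit = match.group(2).lower()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"Unsupported unit: {unit!r}")
    return timedelta(seconds=count * _UNIT_SECONDS[unit])


three_days = parse_duration_sketch("3 days")
two_weeks = parse_duration_sketch("2 weeks")
```

A parsed value like this could then serve as a lag size or a padding bound in place of a raw frequency alias.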
Improvements
- Performance:
  - Speed improvements in future_frame and pad_by_time.
  - Migrated from hmmlearn to pomegranate for better performance in regime detection.
  - Upgraded pandas datetime aliases for compatibility.
- Visualization:
  - Updated plotting APIs roadmap.
  - Enhanced examples in docs and docstrings.
- Other:
  - Added fillna option to pad_by_time.
  - Improved sorting in plot_timeseries.
  - Added visual elements to docs.
Bug Fixes
- Fixed dropdown issues in plot_acf_diagnostics.
- Resolved missing multiprocessing in Ray setup.
- Fixed bugs in Fourier, Hilbert, and Wavelet transforms.
- Corrected connected smoothers in plot_timeseries.
- GH Actions fixes.
- Sorting review and fixes (#286).
Documentation
- Multiple doc updates for accuracy and completeness.
- Updated Quarto sidebar and YAML for better navigation.
- New guides for selectors, dates, and Polars.
- Enhanced examples and docstrings.
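For readers new to tidy selectors, the sketch below shows the general idea of resolving a selector against a list of column names. The four helpers reuse the selector names from this release but are local re-implementations written for illustration, not pytimetk's own functions. The callable-returning design and the sample column names are assumptions.

```python
import re
from typing import Callable, List

# A selector takes the available column names and returns the ones it matches.
Selector = Callable[[List[str]], List[str]]


def contains(text: str) -> Selector:
    return lambda columns: [c for c in columns if text in c]


def starts_with(prefix: str) -> Selector:
    return lambda columns: [c for c in columns if c.startswith(prefix)]


def ends_with(suffix: str) -> Selector:
    return lambda columns: [c for c in columns if c.endswith(suffix)]


def matches(pattern: str) -> Selector:
    compiled = re.compile(pattern)
    return lambda columns: [c for c in columns if compiled.search(c)]


# Hypothetical column names, used only to exercise the selectors.
columns = ["order_date", "ship_date", "price", "price_usd", "quantity"]

date_columns = contains("date")(columns)        # ['order_date', 'ship_date']
price_columns = starts_with("price")(columns)   # ['price', 'price_usd']
usd_columns = ends_with("_usd")(columns)        # ['price_usd']
q_columns = matches(r"^q")(columns)             # ['quantity']
```

Deferring resolution until the column names are known is what lets the same selector be reused across frames with different schemas.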
Distribution
- Built and added pytimetk-2.4.0-py3-none-any.whl and pytimetk-2.4.0.tar.gz.
Breaking Changes
- None identified in this release.
For full details, see the commit history. Thanks to @mdancho84 for all contributions!
Full Changelog: v2.3.0...v2.4.0
pytimetk 2.3.0
pytimetk 2.3.0 Release Notes
We're excited to announce the release of pytimetk 2.3.0! This update focuses on significant performance and memory optimizations, particularly for users of the Polars engine. We've introduced native Polars implementations for several key functions, reducing reliance on pandas fallbacks and enabling faster, more memory-efficient operations on large datasets. Benchmarks show roughly 7X speedups in exponentially weighted moving (EWM) calculations.
Key Improvements
- Polars-Native Optimizations:
  - Added dedicated Polars paths for pad_by_time, future_frame, augment_ewm, augment_rolling_apply, augment_expanding_apply, and the scalar branch of apply_by_time. This eliminates unnecessary conversions to pandas, keeping data in Arrow buffers for zero-copy chaining and reduced memory overhead.
  - For EWM operations (augment_ewm), Polars users now benefit from direct ewm_mean/std/var applications, avoiding round-trips to pandas. Benchmark results on 200k rows demonstrate ~7X faster execution (0.004s vs. 0.030s in pandas).
  - Tightened conversion plumbing to use shallow copies (copy(deep=False)) only when necessary, reusing cached pandas views during fallbacks to minimize copy-on-write churn.
- Memory Efficiency:
  - Implemented shallow pandas copies in various modules to avoid deep-copy overhead.
  - Polars fallbacks now clone prepared frames before mutations, preserving originals for reuse and further reducing memory spikes.
  - Enhanced row identifier handling inside Polars/cuDF, eliminating mutations on pandas frames.
- cuDF Support:
  - Added cuDF dataframe operations, extending GPU-accelerated capabilities for compatible environments.
- Benchmarking and Documentation:
  - New benchmarks (e.g., ewm_polars_profile.py) to quantify improvements, reproducible with commands like PYTHONPATH=src python zzz_local/benchmarks/ewm_polars_profile.py --rows 200000 --groups 25 --repeats 5.
  - Updated roadmap (performance_memory_review.md) with completed items and remaining todos, including Polars-native wide-format for apply_by_time and extending optimizations to finance indicators.
Breaking Changes
- None in this release. Existing pandas-based workflows remain fully compatible, with Polars enhancements opt-in via the engine="polars" parameter.
Installation
Update via pip:
pip install --upgrade pytimetk
If you encounter any issues, please open a ticket on GitHub.
Full Changelog: v2.2.1...v2.3.0
pytimetk 2.2.1
Key Updates
- augment_ewm Enhancements: Now supports passing a sequence of alpha values to compute multiple EWM columns in one call. Improved Polars integration and documentation.
- augment_qsmomentum Fixes: Resolved issues with grouped Polars DataFrames, including proper handling of nulls and ensuring sufficient data length (at least max(roc_slow_period, returns_period)). Fixed AttributeError: 'DataFrame' object has no attribute 'to_list'.
- GPU Acceleration Docs: Updated examples to demonstrate augment_rolling on Polars with GPU engine, including multi-symbol grouping.
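As a rough picture of what "multiple EWM columns in one call" means, the sketch below computes an exponentially weighted mean for several alpha values over one series. It is a pure-Python illustration, not augment_ewm itself. The helper names are hypothetical, and it uses the simple recursive form y[t] = alpha * x[t] + (1 - alpha) * y[t-1], which may differ from the weighting convention pytimetk, pandas, or Polars apply by default.

```python
from typing import Dict, List, Sequence


def ewm_mean(values: Sequence[float], alpha: float) -> List[float]:
    """Recursive exponentially weighted mean, seeded with the first value."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    previous = None
    for x in values:
        # The first observation seeds the recursion; later ones are blended in.
        previous = x if previous is None else alpha * x + (1 - alpha) * previous
        smoothed.append(previous)
    return smoothed


def ewm_for_alphas(values: Sequence[float], alphas: Sequence[float]) -> Dict[float, List[float]]:
    """Compute one smoothed series per alpha, keyed by the alpha value."""
    return {alpha: ewm_mean(values, alpha) for alpha in alphas}


series = [1.0, 2.0, 3.0]
result = ewm_for_alphas(series, alphas=[0.5, 1.0])
```

With alpha=1.0 the output equals the input, and smaller alphas weight history more heavily, which is why comparing several alphas side by side is useful.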
Full Changelog: v2.2.0...v2.2.1
pytimetk 2.2.0
What's New
- Polars .tk Accessor for LazyFrames: Added a .tk accessor for pl.LazyFrame objects, enabling direct access to pytimetk helpers (e.g., df.lazy().tk.augment_rolling(...)) with performance parity to pandas and Polars DataFrame paths. This enhances support for lazy evaluation pipelines, improving efficiency for large datasets.
- GPU Installation Improvements: Updated GPU support with cudf-cu12 dependency for Python 3.10+ on Linux (x86_64 or aarch64), replacing the generic cudf dependency. This ensures better compatibility and performance for GPU-accelerated workflows.
- Polars LazyFrame Compatibility: Extended Polars support to include LazyFrame checks in utility functions (e.g., check_date_column, check_value_column), ensuring seamless integration with lazy evaluation pipelines.
Enhancements
- Refactored GroupBy Handling: Replaced direct .obj access on pandas GroupBy objects with resolve_pandas_groupby_frame across the codebase. This improves compatibility with accelerated backends like cudf.pandas, ensuring robust DataFrame extraction even when proxies are used.
- Memory Optimization: Enhanced reduce_memory_usage to handle GroupBy objects more reliably by using resolve_pandas_groupby_frame and attempting categorical conversion only when safe, improving memory efficiency for large datasets.
- Documentation Updates: Improved GPU acceleration guide (production/02_gpu_acceleration.html) with streamlined examples and updated changelog (changelog-news.html) to reflect new features and fixes.
Bug Fixes
- Sorting Issue in GPU Workflows: Fixed a sorting issue in GPU-compatible operations to ensure consistent ordering in Polars and pandas pipelines.
- Documentation Corrections: Addressed inconsistencies in the GPU guide documentation for clarity and accuracy.
Breaking Changes
- The cudf dependency has been replaced with cudf-cu12 for GPU support, requiring users to update their installation commands to pip install pytimetk[gpu_cu12] --extra-index-url=https://pypi.nvidia.com for Python 3.10+ on supported Linux platforms.
Installation
To install pytimetk 2.2.0 with GPU support:
pip install pytimetk[gpu_cu12] --extra-index-url=https://pypi.nvidia.com
pip install "polars[gpu]" --extra-index-url=https://pypi.nvidia.com
Full Changelog: v2.1.0...v2.2.0
pytimetk 2.1.0
Release Summary
The pytimetk 2.1.0 release introduces GPU acceleration (Beta) with NVIDIA RAPIDS, enhancing performance for feature engineering and Polars lazy pipelines. This release focuses on computational efficiency while maintaining backward compatibility with CPU-based workflows.
What's New in pytimetk 2.1.0
GPU Acceleration (Beta)
This release adds optional GPU acceleration for faster computation of:
- Feature engineering (lags, differences, leads, rolling/expanding stats, financial indicators)
- Polars lazy pipelines with collect(engine="gpu")
Key Features:
- Opt-in: Activates only with cudf and polars[gpu] installed.
- CPU Fallback: Automatically reverts to CPU if GPU is unavailable or unsupported.
- Utilities: pytimetk.utils.gpu_support provides is_cudf_available() and is_polars_gpu_available() for runtime checks.
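The opt-in and fallback behavior described above can be pictured with a small standard-library sketch. It does not call pytimetk.utils.gpu_support. The helpers module_available and choose_engine are hypothetical, and checking that the cudf and polars packages are importable is only an assumed stand-in for real GPU detection, which would also need to confirm a usable device.

```python
from importlib.util import find_spec


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or partially initialized modules.
        return False


def choose_engine() -> str:
    """Pick 'gpu' only when the assumed GPU packages are present, else 'cpu'."""
    if module_available("cudf") and module_available("polars"):
        return "gpu"
    return "cpu"


engine = choose_engine()
print(f"Selected engine: {engine}")
```

Because the check happens at runtime, the same script can run unchanged on a GPU workstation and on a CPU-only machine.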
Installation
pip install pytimetk[gpu] --extra-index-url=https://pypi.nvidia.com
pip install "polars[gpu]" --extra-index-url=https://pypi.nvidia.com
See the GPU Acceleration Guide for setup details.
Example: GPU-Accelerated RSI
import polars as pl
import pytimetk as tk
df = tk.load_dataset("stocks_daily", parse_dates=["date"])
result = (
    pl.from_pandas(df.query("symbol == 'AAPL'"))
    .lazy()
    .tk.augment_rsi(date_column="date", close_column="close", periods=[14, 28])
    .collect(engine="gpu")
)
Documentation and Examples
- New Guide: Added 02_gpu_acceleration.html for GPU setup and usage.
- Examples: Included GPU-accelerated examples for augment_rolling and financial indicators.
- Optimizations: Improved Polars lazy frame handling and group-by performance.
Performance Highlights
- Rolling Calculations: Up to 10X–100X faster with Polars GPU engine for large datasets.
- Financial Indicators: GPU-accelerated RSI/MACD 5X–50X faster on large datasets.
Backward Compatibility
- No Breaking Changes: GPU is opt-in; CPU workflows are unchanged.
- Fallback: Automatic CPU fallback for unsupported GPU operations.
Known Issues
- Custom Functions: Not yet GPU-compatible; fall back to CPU.
- Polars GPU: Limited to expression-based queries.
- Documentation: Multi-GPU setups need further documentation.
Next Steps
Explore the GPU Acceleration Guide and report issues on the pytimetk GitHub.
Full Changelog: v2.0.1...v2.1.0
pytimetk 2.0.1
augment_spline now accepts date_column and value_column, validates both, sorts rows by date before basis generation, and restores the original row order once the spline columns are computed, for both the pandas and Polars backends.
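The sort-then-restore pattern mentioned here can be sketched in plain Python. This is an illustration of the general technique, not the augment_spline implementation: the helper compute_in_date_order is hypothetical, and a running total stands in for spline basis generation simply because it is an easy order-dependent computation to check.

```python
from datetime import date
from itertools import accumulate
from typing import Callable, List, Sequence


def compute_in_date_order(
    dates: Sequence[date],
    values: Sequence[float],
    transform: Callable[[List[float]], List[float]],
) -> List[float]:
    """Apply an order-dependent transform in date order, then undo the sort."""
    # Positions of the rows once sorted by date.
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    sorted_values = [values[i] for i in order]
    sorted_result = transform(sorted_values)
    # Scatter each result back to the row it came from.
    result: List[float] = [0.0] * len(values)
    for position, original_index in enumerate(order):
        result[original_index] = sorted_result[position]
    return result


dates = [date(2023, 1, 3), date(2023, 1, 1), date(2023, 1, 2)]
values = [30.0, 10.0, 20.0]
running_total = compute_in_date_order(dates, values, lambda v: list(accumulate(v)))
```

The caller gets results aligned with the rows it passed in, even though the computation itself ran on date-sorted data.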
Full Changelog: v2.0.0...v2.0.1
pytimetk 2.0.0
Release Date: October 13, 2025
Highlights
This major release introduces first-class support for Polars DataFrames across core, feature engineering, and plotting functions, enabling faster computations on large datasets. Additionally, we've added a beta Feature Store & Caching system to persist and reuse expensive feature engineering steps, with optional MLflow integration for reproducible pipelines. Other key additions include spline basis expansions and various documentation improvements.
New Capabilities
| Feature | Description | Supported Backends | Example Use Case |
|---|---|---|---|
| Polars .tk Accessor | Direct access to core, feature engineering, and plotting functions on Polars DataFrames | Polars | df.tk.plot_timeseries(...) |
| Feature Store | Persist and reuse feature sets with versioning and metadata | pandas, Polars | Cache time-series signatures for ML pipelines |
| MLflow Integration | Log feature sets and artifacts for reproducible experiments | pandas, Polars | Track feature versions in MLflow experiments |
| Spline Features | Generate spline basis expansions for non-linear modeling | pandas, Polars | Model non-linear trends in sales data |
| Financial Indicators | Full Polars support for momentum and risk metrics | pandas, Polars | Calculate RSI or MACD on stock data |
New Features
Polars now has first-class support in pytimetk via a new .tk accessor
- New .tk accessor for Polars DataFrames, supporting core functions like augment_timeseries_signature, augment_rolling, summarize_by_time, and plotting helpers (plot_timeseries, plot_anomalies, etc.). Use engine='polars' for pandas-compatible functions to leverage Polars' performance. (#297)
- Supports direct integration for financial indicators (e.g., ADX, ATR, BBands, MACD, PPO, RSI, ROC) and feature engineering (diffs, lags, leads).
- Example usage:
import polars as pl
import pytimetk as tk
from datetime import date
# Sample Polars DataFrame
df_pl = pl.DataFrame(
    {
        "date": pl.date_range(
            start=date(2023, 1, 1),
            end=date(2023, 1, 10),
            interval="1d",
            eager=True,
        ),
        "value": [10, 12, 15, 14, 18, 20, 22, 19, 17, 16],
    }
)
# Use .tk accessor to add timeseries signature features
result = df_pl.tk.augment_timeseries_signature(date_column="date")
# Display result
print(result)
Feature Store & Caching (Beta)
- New FeatureStore class to register, build, and load feature sets with automatic versioning and metadata. Supports local disk or pyarrow filesystems (e.g., S3). Optional MLflow logging for experiment tracking. (#308)
- Example usage:
import pandas as pd
import pytimetk as tk
df = tk.load_dataset("bike_sales_sample", parse_dates=["order_date"])
store = tk.FeatureStore()
store.register(
    "sales_signature",
    lambda data: tk.augment_timeseries_signature(data, date_column="order_date", engine="pandas"),
    default_key_columns=("order_id",),
    description="Calendar signatures for sales orders.",
)
result = store.build("sales_signature", df)
print(result.from_cache)  # False first run, True on subsequent builds
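To show why the second build in the example above can report from_cache as True, here is a toy in-memory sketch of the register/build caching idea. It is not the FeatureStore implementation: the TinyFeatureCache class is hypothetical, and it keys the cache on a hash of the input rows as an assumed strategy. It also omits the on-disk persistence, versioning, and metadata that the release describes.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple

Rows = List[Dict[str, Any]]


class TinyFeatureCache:
    """Hypothetical in-memory cache keyed by feature name and input fingerprint."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Callable[[Rows], Rows]] = {}
        self._cache: Dict[Tuple[str, str], Rows] = {}

    def register(self, name: str, transform: Callable[[Rows], Rows]) -> None:
        self._transforms[name] = transform

    def build(self, name: str, rows: Rows) -> Tuple[Rows, bool]:
        # Fingerprint the input so identical data maps to the same cache entry.
        payload = json.dumps(rows, sort_keys=True, default=str).encode("utf-8")
        key = (name, hashlib.sha256(payload).hexdigest())
        from_cache = key in self._cache
        if not from_cache:
            self._cache[key] = self._transforms[name](rows)
        return self._cache[key], from_cache


def add_year(rows: Rows) -> Rows:
    """Stand-in for an expensive feature step: extract the year from a date string."""
    return [{**row, "year": int(row["order_date"][:4])} for row in rows]


cache = TinyFeatureCache()
cache.register("sales_signature", add_year)
orders = [{"order_id": 1, "order_date": "2023-01-05"}]
first_result, first_from_cache = cache.build("sales_signature", orders)
second_result, second_from_cache = cache.build("sales_signature", orders)
```

Changing the input rows produces a different fingerprint, so the transform runs again instead of returning stale features.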
Spline Basis Expansions
- Added augment_spline to generate spline features for numeric columns, useful for modeling non-linear relationships. Includes Polars support. (#300)
- Example:
import pandas as pd
import polars as pl
import pytimetk as tk
df = tk.load_dataset('m4_daily', parse_dates=['date'])
df_spline = (
    df
    .query("id == 'D10'")
    .augment_spline(
        date_column='date',
        value_column='value',
        spline_type='bs',
        df=5,
        degree=3,
        prefix='value_bs'
    )
)
df_spline.head()
Comprehensive Testing
- Added tests for financial momentum indicators (ADX, ATR, BBands, CMO, ROC, RSI, Stochastic Oscillator, etc.) and risk metrics to ensure reliability.
Improvements
- Upgraded pandas_flavor to 0.7.0.
- Refactored test folder structure to match source code.
- Fixed deprecation warnings in anomalize tests.
- Improved docstrings and examples for various functions.
- Enhanced Polars compatibility, including direct integration for diffs, pct change, lags/leads, MACD, PPO, RSI, BBands, ROC, and more.
- Added Hilbert transform bug fix and missing args for EWM.
- Updated README with Polars plotting support and Feature Store details.
- Pruned tracking on generated artifacts and added Polars 1.2.0 compatibility.
Bug Fixes
- Fixed x-axis date labels in plot_timeseries functions. (#307)
- Resolved failing GitHub Actions.
- Fixed bugs in Hilbert transform and EWM args. (#297)
- Addressed issues with time series sequence, summarize_by_time, filter_by_time, and core module for Polars.
Breaking Changes
- Removed explicit import pytimetk.polars_namespace requirement.
- Feature Store is in beta; APIs and on-disk format may change.
Documentation
- Comprehensive updates for Polars support, augment_spline, and Feature Store.
- New clustering tutorial for stock data analysis.
- Added examples and improved docstrings throughout.
Upgrade via pip install pytimetk --upgrade. We welcome feedback and bug reports on GitHub.
Full Changelog: v1.2.5...v2.0.0
pytimetk 1.2.5
Fixes for Polars: .over() requires partition_by and order_by.
Full Changelog: v1.2.4...v1.2.5