Releases: business-science/pytimetk

pytimetk 2.5.0

30 Mar 21:30

pytimetk 2.4.1

26 Nov 23:37

  • Compatibility with pandas-flavor 0.8.0

pytimetk 2.4.0

07 Nov 23:12

pytimetk 2.4.0 Release Notes

Overview

This release introduces significant enhancements to visualization capabilities, integrates tidy selectors for improved column handling, completes the migration to Ray for parallelism, and includes various performance optimizations and bug fixes. Key themes include better diagnostics plotting, more flexible data manipulation, and improved documentation.

New Features

  • Advanced Plotting Diagnostics: Added new APIs for visualizing time series diagnostics. These functions provide interactive and insightful plots to analyze correlation, seasonality, and trends.

    Function                    | Description
    ----------------------------|------------------------------------------------------------
    plot_acf_diagnostics        | Plot ACF, PACF, and CCF with interactive dropdowns.
    plot_seasonal_diagnostics   | Box/violin plots for seasonal features (e.g., hour, weekday, month).
    plot_stl_diagnostics        | Decompose series into observed, season, trend, remainder, and seasonally adjusted components.
    plot_time_series_boxplot    | Rolling distribution boxplots with optional smoothers.
    plot_time_series_regression | Fit and visualize linear regressions with observed vs. fitted/residuals views.
  • Tidy Selectors Integration:

    • Added support for tidy selectors (contains, starts_with, ends_with, matches) in multiple functions.
    • Enables human-readable column selection (e.g., contains("date")).
    • Integrated into: acf_diagnostics, augment_adx, augment_atr, augment_bbands, augment_cmo, augment_diffs, augment_ewm, augment_fourier, augment_hilbert, augment_hurst_exponent, augment_lags, augment_leads, augment_macd, augment_ppo, augment_qsmomentum, augment_roc, augment_rsi, augment_stochastic_oscillator, augment_wavelet, future_frame, pad_by_time, seasonal_diagnostics, stl_diagnostics, summarize_by_time.
  • Human-Friendly Durations:

    • Added parse_human_duration for converting strings like "3 days" or "2 weeks" to offsets.
    • Integrated into lag/lead specs and bounds for future_frame and pad_by_time.
  • Ray Parallelism:

    • Completed migration to Ray for distributed computing.
    • Added Ray migration guide and multiprocessing support.
  • Plotly Theme:

    • Introduced theme_plotly_timetk for consistent styling on custom Plotly figures.
  • New Guides:

    • Added guides for Polars workflows, tidy selectors, and human-friendly periods.
    • Enhanced documentation with new examples and visual elements.
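
As a rough illustration of the tidy selector and human-friendly duration ideas listed above, the sketch below uses only the Python standard library. It is not pytimetk's implementation: the helper bodies, `select_columns`, `parse_duration`, and the unit table are all assumptions made for this example.

```python
import re
from datetime import timedelta

# Hypothetical stand-ins for the selector helpers named above. Each one
# returns a predicate that is applied to column names.
def contains(text):
    return lambda name: text in name

def starts_with(prefix):
    return lambda name: name.startswith(prefix)

def ends_with(suffix):
    return lambda name: name.endswith(suffix)

def matches(pattern):
    return lambda name: re.search(pattern, name) is not None

def select_columns(columns, selector):
    """Return the column names accepted by a selector, in original order."""
    return [name for name in columns if selector(name)]

# Assumed unit table: fixed-length units only, so months and years are omitted.
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400, "week": 604800}

def parse_duration(text):
    """Convert strings such as "3 days" or "2 weeks" to a timedelta."""
    match = re.fullmatch(r"\s*(\d+)\s*([a-z]+?)s?\s*", text.lower())
    if match is None or match.group(2) not in _UNIT_SECONDS:
        raise ValueError(f"unrecognized duration: {text!r}")
    return timedelta(seconds=int(match.group(1)) * _UNIT_SECONDS[match.group(2)])

columns = ["order_date", "ship_date", "price", "price_usd"]
date_columns = select_columns(columns, contains("date"))
price_columns = select_columns(columns, starts_with("price"))
three_days = parse_duration("3 days")
```

Returning predicates keeps the selectors composable: any function that receives a list of column names can resolve a selector without knowing which helper produced it.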

Improvements

  • Performance:

    • Speed improvements in future_frame and pad_by_time.
    • Migrated from hmmlearn to pomegranate for better performance in regime detection.
    • Upgraded pandas datetime aliases for compatibility.
  • Visualization:

    • Updated plotting APIs roadmap.
    • Enhanced examples in docs and docstrings.
  • Other:

    • Added fillna option to pad_by_time.
    • Improved sorting in plot_timeseries.
    • Added visual elements to docs.
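
To make the padding-with-a-fill-value idea concrete, here is a minimal standard-library sketch of filling gaps in a daily series. It only illustrates the concept; `pad_daily` and its `fill_value` parameter are hypothetical and do not reflect the actual signature or behavior of pad_by_time.

```python
from datetime import date, timedelta

def pad_daily(rows, fill_value=None):
    """Insert a row for every missing calendar day between the first and last date.

    `rows` is a list of (date, value) pairs. `fill_value` is assigned to the
    inserted rows (a hypothetical analogue of a fill option).
    """
    if not rows:
        return []
    known = dict(rows)
    start, end = min(known), max(known)
    padded = []
    current = start
    while current <= end:
        # Keep observed values; use the fill value for the gaps.
        padded.append((current, known.get(current, fill_value)))
        current += timedelta(days=1)
    return padded

rows = [(date(2024, 1, 1), 10), (date(2024, 1, 4), 13)]
padded = pad_daily(rows, fill_value=0)
```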

Bug Fixes

  • Fixed dropdown issues in plot_acf_diagnostics.
  • Resolved missing multiprocessing in Ray setup.
  • Fixed bugs in Fourier, Hilbert, and Wavelet transforms.
  • Corrected connected smoothers in plot_timeseries.
  • GH Actions fixes.
  • Sorting review and fixes (#286).

Documentation

  • Multiple doc updates for accuracy and completeness.
  • Updated Quarto sidebar and YAML for better navigation.
  • New guides for selectors, dates, and Polars.
  • Enhanced examples and docstrings.

Distribution

  • Built and added pytimetk-2.4.0-py3-none-any.whl and pytimetk-2.4.0.tar.gz.

Breaking Changes

  • None identified in this release.

For full details, see the commit history. Thanks to @mdancho84 for all contributions!

Full Changelog: v2.3.0...v2.4.0

pytimetk 2.3.0

05 Nov 22:25

pytimetk 2.3.0 Release Notes

We're excited to announce the release of pytimetk 2.3.0! This update focuses on significant performance and memory optimizations, particularly for users leveraging the Polars engine. We've introduced native Polars implementations for several key functions, reducing reliance on pandas fallbacks and unlocking faster, more efficient operations on large datasets. Benchmarks show impressive gains, including roughly 7X speedups in exponentially weighted moving (EWM) calculations.

Key Improvements

  • Polars-Native Optimizations:

    • Added dedicated Polars paths for pad_by_time, future_frame, augment_ewm, augment_rolling_apply, augment_expanding_apply, and the scalar branch of apply_by_time. This eliminates unnecessary conversions to pandas, keeping data in Arrow buffers for zero-copy chaining and reduced memory overhead.
    • For EWM operations (augment_ewm), Polars users now benefit from direct ewm_mean/std/var applications, avoiding round-trips to pandas. Benchmark results on 200k rows demonstrate ~7X faster execution (0.004s vs. 0.030s in pandas).
    • Tightened conversion plumbing to use shallow copies (copy(deep=False)) only when necessary, reusing cached pandas views during fallbacks to minimize copy-on-write churn.
  • Memory Efficiency:

    • Implemented shallow pandas copies in various modules to avoid deep-copy overhead.
    • Polars fallbacks now clone prepared frames before mutations, preserving originals for reuse and further reducing memory spikes.
    • Enhanced row identifier handling inside Polars/cuDF, eliminating mutations on pandas frames.
  • cuDF Support:

    • Added cuDF dataframe operations, extending GPU-accelerated capabilities for compatible environments.
  • Benchmarking and Documentation:

    • New benchmarks (e.g., ewm_polars_profile.py) to quantify improvements—reproducible with commands like PYTHONPATH=src python zzz_local/benchmarks/ewm_polars_profile.py --rows 200000 --groups 25 --repeats 5.
    • Updated roadmap (performance_memory_review.md) with completed items and remaining todos, including Polars-native wide-format for apply_by_time and extending optimizations to finance indicators.
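
For readers unfamiliar with what an EWM mean computes, the following pure-Python sketch shows the recursive (non-adjusted) form. It is an illustration only, not the code path used by augment_ewm. Note that pandas and Polars default to the adjusted weighting (`adjust=True`), which gives different values for the early observations.

```python
def ewm_mean(values, alpha):
    """Exponentially weighted mean: y[t] = alpha * x[t] + (1 - alpha) * y[t-1]."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    result = []
    previous = None
    for x in values:
        # The first observation seeds the recursion.
        previous = x if previous is None else alpha * x + (1 - alpha) * previous
        result.append(previous)
    return result

smoothed = ewm_mean([10.0, 12.0, 15.0, 14.0], alpha=0.5)
```

A larger `alpha` weights recent observations more heavily; `alpha=1` returns the input unchanged.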

Breaking Changes

  • None in this release. Existing pandas-based workflows remain fully compatible, with Polars enhancements opt-in via the engine="polars" parameter.

Installation

Update via pip:

pip install --upgrade pytimetk

If you encounter any issues, please open a ticket on GitHub.

Full Changelog: v2.2.1...v2.3.0

pytimetk 2.2.1

05 Nov 17:17

Key Updates

  • augment_ewm Enhancements: Now supports passing a sequence of alpha values to compute multiple EWM columns in one call. Improved Polars integration and documentation.
  • augment_qsmomentum Fixes: Resolved issues with grouped Polars DataFrames, including proper handling of nulls and ensuring sufficient data length (at least max(roc_slow_period, returns_period)). Fixed AttributeError: 'DataFrame' object has no attribute 'to_list'.
  • GPU Acceleration Docs: Updated examples to demonstrate augment_rolling on Polars with GPU engine, including multi-symbol grouping.
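
The minimum-length requirement mentioned for augment_qsmomentum can be pictured with the sketch below, which separates groups that have at least max(roc_slow_period, returns_period) rows from those that do not. The function name, the period values, and the choice to report short groups separately are assumptions for illustration; the release notes do not say how pytimetk treats groups that are too short.

```python
from collections import defaultdict

def split_by_min_length(rows, roc_slow_period, returns_period):
    """Separate groups that have enough rows from those that do not.

    `rows` is an iterable of (group_key, value) pairs. A group qualifies when
    it has at least max(roc_slow_period, returns_period) rows.
    """
    required = max(roc_slow_period, returns_period)
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    eligible = {k: v for k, v in groups.items() if len(v) >= required}
    too_short = sorted(k for k, v in groups.items() if len(v) < required)
    return eligible, too_short

# Illustrative data: AAPL has 5 rows, MSFT has 2; periods are arbitrary.
rows = [("AAPL", p) for p in range(5)] + [("MSFT", p) for p in range(2)]
eligible, too_short = split_by_min_length(rows, roc_slow_period=3, returns_period=4)
```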

Full Changelog: v2.2.0...v2.2.1

pytimetk 2.2.0

18 Oct 13:46

What's New

  • Polars .tk Accessor for LazyFrames: Added a .tk accessor for pl.LazyFrame objects, enabling direct access to pytimetk helpers (e.g., df.lazy().tk.augment_rolling(...)) with performance parity to pandas and Polars DataFrame paths. This enhances support for lazy evaluation pipelines, improving efficiency for large datasets.
  • GPU Installation Improvements: Updated GPU support with cudf-cu12 dependency for Python 3.10+ on Linux (x86_64 or aarch64), replacing the generic cudf dependency. This ensures better compatibility and performance for GPU-accelerated workflows.
  • Polars LazyFrame Compatibility: Extended Polars support to include LazyFrame checks in utility functions (e.g., check_date_column, check_value_column), ensuring seamless integration with lazy evaluation pipelines.

Enhancements

  • Refactored GroupBy Handling: Replaced direct .obj access on pandas GroupBy objects with resolve_pandas_groupby_frame across the codebase. This improves compatibility with accelerated backends like cudf.pandas, ensuring robust DataFrame extraction even when proxies are used.
  • Memory Optimization: Enhanced reduce_memory_usage to handle GroupBy objects more reliably by using resolve_pandas_groupby_frame and attempting categorical conversion only when safe, improving memory efficiency for large datasets.
  • Documentation Updates: Improved GPU acceleration guide (production/02_gpu_acceleration.html) with streamlined examples and updated changelog (changelog-news.html) to reflect new features and fixes.

Bug Fixes

  • Sorting Issue in GPU Workflows: Fixed a sorting issue in GPU-compatible operations to ensure consistent ordering in Polars and pandas pipelines.
  • Documentation Corrections: Addressed inconsistencies in the GPU guide documentation for clarity and accuracy.

Breaking Changes

  • The cudf dependency has been replaced with cudf-cu12 for GPU support, requiring users to update their installation commands to pip install pytimetk[gpu_cu12] --extra-index-url=https://pypi.nvidia.com for Python 3.10+ on supported Linux platforms.

Installation

To install pytimetk 2.2.0 with GPU support:

pip install pytimetk[gpu_cu12] --extra-index-url=https://pypi.nvidia.com
pip install "polars[gpu]" --extra-index-url=https://pypi.nvidia.com

Full Changelog: v2.1.0...v2.2.0

pytimetk 2.1.0

16 Oct 13:36

Release Summary

The pytimetk 2.1.0 release introduces GPU acceleration (Beta) with NVIDIA RAPIDS, enhancing performance for feature engineering and Polars lazy pipelines. This release focuses on computational efficiency while maintaining backward compatibility with CPU-based workflows.


What's New in pytimetk 2.1.0

GPU Acceleration (Beta)

This release adds optional GPU acceleration for faster computation of:

  • Feature engineering (lags, differences, leads, rolling/expanding stats, financial indicators)
  • Polars lazy pipelines with collect(engine="gpu")

Key Features:

  • Opt-in: Activates only with cudf and polars[gpu] installed.
  • CPU Fallback: Automatically reverts to CPU if GPU is unavailable or unsupported.
  • Utilities: pytimetk.utils.gpu_support provides is_cudf_available() and is_polars_gpu_available() for runtime checks.
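
The opt-in and fallback behavior described above could be approximated as follows. This is a hedged sketch that does not call pytimetk.utils.gpu_support: `module_installed` and `choose_engine` are hypothetical helpers, and the module names `cudf` and `cudf_polars` are assumed to be what the GPU extras install. Finding a module only shows that it is installed, not that a usable GPU is present.

```python
from importlib.util import find_spec

def module_installed(name):
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        return False

def choose_engine(prefer_gpu=True):
    """Pick "gpu" only when the assumed GPU modules are installed, else "cpu"."""
    gpu_ready = module_installed("cudf") and module_installed("cudf_polars")
    return "gpu" if prefer_gpu and gpu_ready else "cpu"

engine = choose_engine()
```

The resulting string could then be passed to a call such as `collect(engine=engine)` in a Polars lazy pipeline.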

Installation

pip install pytimetk[gpu] --extra-index-url=https://pypi.nvidia.com
pip install "polars[gpu]" --extra-index-url=https://pypi.nvidia.com

See the GPU Acceleration Guide for setup details.

Example: GPU-Accelerated RSI

import polars as pl
import pytimetk as tk

df = tk.load_dataset("stocks_daily", parse_dates=["date"])
result = (
    pl.from_pandas(df.query("symbol == 'AAPL'"))
    .lazy()
    .tk.augment_rsi(date_column="date", close_column="close", periods=[14, 28])
    .collect(engine="gpu")
)

Documentation and Examples

  • New Guide: Added 02_gpu_acceleration.html for GPU setup and usage.
  • Examples: Included GPU-accelerated examples for augment_rolling and financial indicators.
  • Optimizations: Improved Polars lazy frame handling and group-by performance.

Performance Highlights

  • Rolling Calculations: Up to 10X–100X faster with Polars GPU engine for large datasets.
  • Financial Indicators: GPU-accelerated RSI/MACD 5X–50X faster on large datasets.

Backward Compatibility

  • No Breaking Changes: GPU is opt-in; CPU workflows are unchanged.
  • Fallback: Automatic CPU fallback for unsupported GPU operations.

Known Issues

  • Custom Functions: Not yet GPU-compatible; fall back to CPU.
  • Polars GPU: Limited to expression-based queries.
  • Documentation: Multi-GPU setups need further documentation.

Next Steps

Explore the GPU Acceleration Guide and report issues on the pytimetk GitHub.

Full Changelog: v2.0.1...v2.1.0

pytimetk 2.0.1

13 Oct 19:37

augment_spline now accepts date_column and value_column and validates both. It sorts rows by date before basis generation and restores the original row order once the spline columns are computed, for both the pandas and Polars backends.
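
The sort-then-restore pattern described here can be sketched in plain Python. This is not the augment_spline code: `apply_in_date_order` is a hypothetical helper, and a running total stands in for the order-dependent spline computation.

```python
def apply_in_date_order(dates, values, func):
    """Apply an order-dependent function in date order, then restore row order."""
    # Row positions sorted by date (ISO date strings sort chronologically).
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    computed = func([values[i] for i in order])
    restored = [None] * len(values)
    for sorted_pos, original_pos in enumerate(order):
        # Write each result back to the row it came from.
        restored[original_pos] = computed[sorted_pos]
    return restored

def running_total(xs):
    """Order-dependent stand-in for the real computation."""
    total, out = 0, []
    for x in xs:
        total += x
        out.append(total)
    return out

dates = ["2024-01-03", "2024-01-01", "2024-01-02"]
values = [30, 10, 20]
result = apply_in_date_order(dates, values, running_total)
```

The output lines up row-for-row with the unsorted input, so it can be attached as a new column without reordering the caller's data.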

Full Changelog: v2.0.0...v2.0.1

pytimetk 2.0.0

13 Oct 18:41

Release Date: October 13, 2025

Highlights

This major release introduces first-class support for Polars DataFrames across core, feature engineering, and plotting functions, enabling faster computations on large datasets. Additionally, we've added a beta Feature Store & Caching system to persist and reuse expensive feature engineering steps, with optional MLflow integration for reproducible pipelines. Other key additions include spline basis expansions and various documentation improvements.

New Capabilities

Feature | Description | Supported Backends | Example Use Case
--------|-------------|--------------------|-----------------
Polars .tk Accessor | Direct access to core, feature engineering, and plotting functions on Polars DataFrames | Polars | df.tk.plot_timeseries(...)
Feature Store | Persist and reuse feature sets with versioning and metadata | pandas, Polars | Cache time-series signatures for ML pipelines
MLflow Integration | Log feature sets and artifacts for reproducible experiments | pandas, Polars | Track feature versions in MLflow experiments
Spline Features | Generate spline basis expansions for non-linear modeling | pandas, Polars | Model non-linear trends in sales data
Financial Indicators | Full Polars support for momentum and risk metrics | pandas, Polars | Calculate RSI or MACD on stock data

New Features

Polars now has first-class support in pytimetk via a new .tk accessor

  • New .tk accessor for Polars DataFrames, supporting core functions like augment_timeseries_signature, augment_rolling, summarize_by_time, and plotting helpers (plot_timeseries, plot_anomalies, etc.). Use engine='polars' for pandas-compatible functions to leverage Polars' performance. (#297)
  • Supports direct integration for financial indicators (e.g., ADX, ATR, BBands, MACD, PPO, RSI, ROC) and feature engineering (diffs, lags, leads).
  • Example usage:
import polars as pl
import pytimetk as tk
from datetime import date

# Sample Polars DataFrame
df_pl = pl.DataFrame(
    {
        "date": pl.date_range(
            start=date(2023, 1, 1),
            end=date(2023, 1, 10),
            interval="1d",
            eager=True,
        ),
        "value": [10, 12, 15, 14, 18, 20, 22, 19, 17, 16],
    }
)

# Use .tk accessor to add timeseries signature features
result = df_pl.tk.augment_timeseries_signature(date_column="date")

# Display result
print(result)
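
For intuition about what a time series signature contains, the sketch below derives a few calendar features from a single date using only the standard library. The feature names are hypothetical and chosen for this example; the columns produced by augment_timeseries_signature may be named and defined differently.

```python
from datetime import date

def calendar_signature(d):
    """Derive simple calendar features from a date (illustrative names only)."""
    return {
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "day": d.day,
        "weekday": d.weekday(),  # Monday = 0, Sunday = 6
        "day_of_year": d.timetuple().tm_yday,
        "is_weekend": d.weekday() >= 5,
    }

signature = calendar_signature(date(2023, 1, 1))
```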

Feature Store & Caching (Beta)

  • New FeatureStore class to register, build, and load feature sets with automatic versioning and metadata. Supports local disk or pyarrow filesystems (e.g., S3). Optional MLflow logging for experiment tracking. (#308)
  • Example usage:
    import pandas as pd
    import pytimetk as tk
    
    df = tk.load_dataset("bike_sales_sample", parse_dates=["order_date"])
    store = tk.FeatureStore()
    store.register(
        "sales_signature",
        lambda data: tk.augment_timeseries_signature(data, date_column="order_date", engine="pandas"),
        default_key_columns=("order_id",),
        description="Calendar signatures for sales orders."
    )
    result = store.build("sales_signature", df)
    print(result.from_cache)  # False first run, True on subsequent builds
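
The caching behavior in the example above (a first build that computes, later builds that reuse) can be illustrated with a small in-memory sketch. `TinyFeatureCache` is a hypothetical class written for this note; it is not the FeatureStore implementation and omits versioning, metadata, disk storage, and MLflow logging.

```python
import hashlib
import json

class TinyFeatureCache:
    """In-memory cache keyed by feature name plus a hash of the input rows."""

    def __init__(self):
        self._transforms = {}
        self._cache = {}

    def register(self, name, transform):
        self._transforms[name] = transform

    def build(self, name, rows):
        """Return (result, from_cache) for the named transform applied to rows."""
        payload = json.dumps(rows, sort_keys=True, default=str).encode("utf-8")
        key = (name, hashlib.sha256(payload).hexdigest())
        if key in self._cache:
            return self._cache[key], True
        result = self._transforms[name](rows)
        self._cache[key] = result
        return result, False

cache = TinyFeatureCache()
cache.register("doubled", lambda rows: [{"x": r["x"] * 2} for r in rows])
first, first_from_cache = cache.build("doubled", [{"x": 1}, {"x": 2}])
second, second_from_cache = cache.build("doubled", [{"x": 1}, {"x": 2}])
```

Hashing the input means a changed dataset produces a new cache entry instead of silently returning stale features.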

Spline Basis Expansions

  • Added augment_spline to generate spline features for numeric columns, useful for modeling non-linear relationships. Includes Polars support. (#300)
  • Example:
import pandas as pd
import polars as pl
import pytimetk as tk


df = tk.load_dataset('m4_daily', parse_dates=['date'])

df_spline = (
    df
        .query("id == 'D10'")
        .augment_spline(
            date_column='date',
            value_column='value',
            spline_type='bs',
            df=5,
            degree=3,
            prefix='value_bs'
        )
)

df_spline.head()

Comprehensive Testing

  • Added tests for financial momentum indicators (ADX, ATR, BBands, CMO, ROC, RSI, Stochastic Oscillator, etc.) and risk metrics to ensure reliability.

Improvements

  • Upgraded pandas_flavor to 0.7.0.
  • Refactored test folder structure to match source code.
  • Fixed deprecation warnings in anomalize tests.
  • Improved docstrings and examples for various functions.
  • Enhanced Polars compatibility, including direct integration for diffs, pct change, lags/leads, MACD, PPO, RSI, BBands, ROC, and more.
  • Fixed a Hilbert transform bug and added missing arguments for EWM.
  • Updated README with Polars plotting support and Feature Store details.
  • Pruned tracking on generated artifacts and added Polars 1.2.0 compatibility.

Bug Fixes

  • Fixed x-axis date labels in plot_timeseries functions. (#307)
  • Resolved failing GitHub Actions.
  • Fixed bugs in Hilbert transform and EWM args. (#297)
  • Addressed issues with time series sequence, summarize_by_time, filter_by_time, and core module for Polars.

Breaking Changes

  • An explicit import of pytimetk.polars_namespace is no longer required.
  • Feature Store is in beta; APIs and on-disk format may change.

Documentation

  • Comprehensive updates for Polars support, augment_spline, and Feature Store.
  • New clustering tutorial for stock data analysis.
  • Added examples and improved docstrings throughout.

Upgrade via pip install pytimetk --upgrade. We welcome feedback and bug reports on GitHub.

Full Changelog: v1.2.5...v2.0.0

pytimetk 1.2.5

17 Aug 15:12

Fixes for Polars: .over() requires partition_by and order_by.

Full Changelog: v1.2.4...v1.2.5