harshitt13/Stock-Market-Prediction-Model

Hybrid Stock Market Prediction Model

🎯 Project Overview

This project implements a stock price prediction model built on a stacking meta-ensemble. The system combines three machine learning approaches:

  1. Tree Ensemble (XGBoost + Random Forest)
  2. Deep Bidirectional LSTM (PyTorch)
  3. Time-Series Transformer (PyTorch)

A non-linear XGBoost meta-learner weights the predictions of these three models to produce future price forecasts with volatility-adjusted 95% confidence intervals.
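
The stacking flow described above can be sketched without any dependencies. This is a minimal illustration, not the project's code: a linear least-squares combiner stands in for the XGBoost meta-learner (so it cannot reproduce non-linear, VIX-dependent weighting), and every function name and number below is hypothetical. The targets are built from known weights so the fit can be checked.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs are not mutated.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def fit_linear_stacker(meta_features, targets):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    k = len(meta_features[0])
    xtx = [[sum(row[i] * row[j] for row in meta_features) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(meta_features, targets))
           for i in range(k)]
    return solve_linear_system(xtx, xty)


def stack_predict(weights, row):
    """Blend one row of base-model predictions with the learned weights."""
    return sum(w * value for w, value in zip(weights, row))


# Hypothetical validation-set outputs: (tree, lstm, transformer, VIX).
validation_rows = [
    [100.0, 102.0, 101.0, 15.0],
    [105.0, 103.0, 106.0, 18.0],
    [98.0, 99.0, 97.0, 25.0],
    [110.0, 108.0, 111.0, 14.0],
    [103.0, 105.0, 102.0, 30.0],
    [107.0, 104.0, 109.0, 20.0],
]
# Synthetic targets from known weights, purely to make the fit verifiable.
true_weights = [0.5, 0.3, 0.2, 0.0]
validation_targets = [stack_predict(true_weights, row) for row in validation_rows]

learned_weights = fit_linear_stacker(validation_rows, validation_targets)
blended = stack_predict(learned_weights, [104.0, 106.0, 105.0, 17.0])
print([round(w, 6) for w in learned_weights], round(blended, 4))
```

The key design point is that the meta-learner is fitted on the base models' validation outputs rather than on their training-set outputs, so it learns how much to trust each model on data the base models have not memorised.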

Video Demonstration - https://youtu.be/z8sXhWrwU0o

💻 Features

  • Macro-Economic Engine: Retrieves data from Yahoo Finance and enriches the target stock with global indicators: S&P 500 (^GSPC), Volatility Index (^VIX), and Treasury Yield (^TNX).
  • Time-Series Transformer: Deep neural architecture using nn.TransformerEncoder with custom Positional Encoding to capture long-term trends in noisy price series.
  • Tree Ensemble: Uses a VotingRegressor combining tuned XGBoost and Random Forest.
  • Bidirectional LSTM: Deep 3-layer architecture with Batch Normalization, Dropout, Huber Loss, and learning rate scheduling in PyTorch.
  • Bayesian Optimization (Optuna): Optional CLI flag that runs a series of Optuna trials to search for well-performing hyperparameters for a specific stock.
  • Non-Linear Stacking Meta-Ensemble: An XGBoost meta-learner learns how to combine the underlying sub-models' predictions, taking the current VIX as an input so the weighting can adapt during periods of market stress.
  • Comprehensive Evaluation: Built-in module scoring RMSE, MAE, MAPE, R², and Directional Accuracy.
  • Visualization Dashboard: Automatically generates professional, dark-themed charts containing actual vs predicted data, residual histograms, future forecasts, and feature importance.
  • CLI Interface: Dynamic command-line inputs for custom tickers, date ranges, and forecast windows.
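
As an illustration of the positional encoding mentioned for the Time-Series Transformer, here is a minimal sketch of the standard sinusoidal scheme in plain Python. The project describes its encoding only as "custom", so the formula, the function name, and the dimensions used here are assumptions; in a PyTorch model the table would normally be stored as a tensor and added to the input embeddings.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position features.

    Even columns hold sin(pos / 10000^(i / d_model)) and the following odd
    column holds the matching cosine, where i is the even column index.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even for paired sin/cos columns")
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


# Hypothetical sizes: a 90-step input window and an 8-dimensional embedding.
encoding = sinusoidal_positional_encoding(seq_len=90, d_model=8)
print(len(encoding), len(encoding[0]))
```

Because attention layers have no built-in notion of order, adding a table like this is what lets the encoder distinguish yesterday's bar from one 60 days ago.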

📚 Prerequisites

  • Python 3.x
  • PyTorch (torch)
  • XGBoost
  • scikit-learn
  • yfinance
  • pandas
  • numpy
  • matplotlib
  • joblib

🚀 Installation

  1. Clone the repository:
git clone https://github.com/harshitt13/Stock-Market-Prediction-Model.git
cd Stock-Market-Prediction-Model
  2. Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  3. Install required packages:
pip install -r requirements.txt

📂 Project Structure

Stock-Market-Prediction-Model/
│
├── data/                    # Raw, predicted CSVs and model comparison tables
├── src/                     
│   ├── fetch_data.py        # Data retrieval and 27-feature engineering
│   ├── tree_model.py        # XGBoost + RF ensemble implementation
│   ├── lstm_model.py        # Deep BiLSTM implementation (PyTorch)
│   ├── evaluate.py          # Metrics module (MAPE, R², RMSE, DA)
│   ├── visualize.py         # Matplotlib dashboard generator
│   └── main.py              # CLI orchestrator & stacking meta-ensemble
├── models/                  # Saved .pkl and .pt model files
├── tests/                   # Pytest suite
├── images/                  # Generated performance and forecast visualizations
├── requirements.txt         
├── pytest.ini               
└── README.md                

🔧 Usage

The entire pipeline (fetching data, training the three sub-models, building the meta-ensemble, and evaluation) is executed via the CLI.

python src/main.py --ticker AAPL --start 2015-01-01 --days 30

Command Line Arguments

  • --ticker: The stock symbol to predict (default: AAPL).
  • --start: Historical data start date in YYYY-MM-DD (default: 2010-01-01).
  • --days: Number of future business days to project (default: 30).
  • --optimize: Triggers Optuna to perform Bayesian hyperparameter searching before training.
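
The arguments listed above could be declared with argparse roughly as follows. This is a sketch under stated assumptions, not the contents of main.py: only the flag names and defaults come from this README, while the function names, help strings, and the weekend-only business-day helper (which ignores exchange holidays) are hypothetical.

```python
import argparse
from datetime import date, timedelta


def build_parser():
    """Declare the four documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(description="Illustrative CLI sketch")
    parser.add_argument("--ticker", default="AAPL",
                        help="Stock symbol to predict")
    parser.add_argument("--start", default="2010-01-01",
                        help="Historical data start date (YYYY-MM-DD)")
    parser.add_argument("--days", type=int, default=30,
                        help="Number of future business days to project")
    parser.add_argument("--optimize", action="store_true",
                        help="Run an Optuna hyperparameter search first")
    return parser


def next_business_days(last_date, count):
    """Return the next `count` weekdays after last_date (holidays ignored)."""
    days = []
    current = last_date
    while len(days) < count:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days.append(current)
    return days


# An explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--ticker", "MSFT", "--days", "3"])
forecast_dates = next_business_days(date(2024, 1, 5), args.days)
print(args.ticker, args.start, args.optimize, forecast_dates)
```

A real entry point would typically call parse_args() with no arguments so the values come from the command line.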

⚖️ Model Components

  1. Time-Series Transformer (transformer_model.py)

    • Uses multi-head attention to attend directly to long-range patterns in the price history, avoiding the fading memory of recurrent networks.
  2. Bidirectional LSTM (lstm_model.py)

    • Deep neural network capturing complex temporal dependencies over a 90-day lookback window.
  3. Tree Model (tree_model.py)

    • Combines XGBoost and Random Forest. Learns non-linear feature interactions and isolates the most important technical indicators.
  4. Hybrid Meta-Ensemble (main.py)

    • Uses an XGBoost Regressor on the sub-models' validation outputs.
    • Given the Volatility Index (VIX) as an additional input, it can learn to shift its trust between the deep neural networks and the tree ensemble depending on current market volatility.
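
The 90-day lookback window used by the Bidirectional LSTM implies that the price history is sliced into overlapping sequences before training. The sketch below shows one common way to do that in plain Python; the function name, the single-feature series, and the next-day target are assumptions, and the project itself may window a multi-feature matrix instead.

```python
def make_lookback_windows(series, lookback=90):
    """Split a series into (window, next value) training pairs.

    Each window holds `lookback` consecutive observations and its target
    is the observation immediately after the window.
    """
    windows, targets = [], []
    for end in range(lookback, len(series)):
        windows.append(series[end - lookback:end])
        targets.append(series[end])
    return windows, targets


# Hypothetical closing prices: 100 trading days of a simple upward ramp.
closing_prices = [100.0 + day for day in range(100)]
windows, targets = make_lookback_windows(closing_prices, lookback=90)
print(len(windows), len(windows[0]), targets[0])
```

With 100 observations and a 90-day lookback only 10 training pairs are produced, which is one reason a long history (the default start date is 2010-01-01) helps the sequence models.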

📏 Performance Metrics

In historical backtesting (e.g., AAPL 2022-2024), the hybrid XGBoost meta-learner reports the following results:

  • Price Magnitude Accuracy: ~99.11% (100% minus a MAPE of 0.89%)
  • Directional Accuracy: ~59.46% (share of days on which the up/down direction of the next trading day was predicted correctly)
  • R-squared (R²) Score: 0.975 (about 97.5% of the variance in the actual prices is explained)

Models are evaluated via:

  • Mean Squared Error (MSE) / Root Mean Squared Error (RMSE)
  • Mean Absolute Error (MAE)
  • Mean Absolute Percentage Error (MAPE)
  • R-squared (R²) Score
  • Directional Accuracy (%)

Comparison tables are saved to data/model_comparison.csv and visualized in the images/ directory.
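
For reference, the metrics listed above can be computed as in the following dependency-free sketch. It is not the code in evaluate.py: the function name and sample numbers are hypothetical, and the directional-accuracy definition (comparing each predicted and actual price against the previous actual price) is an assumption about how "correct direction" is scored.

```python
import math


def regression_metrics(actual, predicted):
    """Compute RMSE, MAE, MAPE (%), R^2 and directional accuracy (%)."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    squared_error_sum = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    total_variation = sum((a - mean_actual) ** 2 for a in actual)

    # A day counts as a hit when the predicted and actual prices move in
    # the same direction relative to the previous actual price.
    hits = 0
    for t in range(1, n):
        actual_up = actual[t] > actual[t - 1]
        predicted_up = predicted[t] > actual[t - 1]
        if actual_up == predicted_up:
            hits += 1

    return {
        "rmse": math.sqrt(squared_error_sum / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE requires every actual value to be non-zero.
        "mape": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "r2": 1.0 - squared_error_sum / total_variation,
        "directional_accuracy": 100.0 * hits / (n - 1),
    }


# Hypothetical prices, chosen so the results are easy to verify by hand.
actual_prices = [100.0, 102.0, 101.0, 105.0]
predicted_prices = [101.0, 103.0, 103.0, 104.0]
metrics = regression_metrics(actual_prices, predicted_prices)
print(metrics)
```

Reporting directional accuracy alongside MAPE matters because a forecast that simply repeats the previous close can score a low MAPE while carrying no information about the next move.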

🧪 Testing

Run the test suite via pytest to verify the components are functioning properly:

python -m pytest tests/

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

⚠️ Limitations

  • Stock predictions are inherently probabilistic. This model utilizes advanced technicals and macroeconomic contexts, but omits fundamental balance sheet analysis (P/E, Debt/Equity) and NLP sentiment (news headlines).
  • Model performance depends heavily on structural market regimes.
  • Past performance does not guarantee future results. Do not use for real financial trading.

📝 License

Distributed under the MIT License. See LICENSE for more information.

📫 Contact

GitHub

Harshit Kushwaha 🧑‍💻
Developer

📧 find.harshitkushwaha@gmail.com

