This project implements a stock price prediction model built on a stacking meta-ensemble. The system combines three machine learning approaches:
- Tree Ensemble (XGBoost + Random Forest)
- Deep Bidirectional LSTM (PyTorch)
- Time-Series Transformer (PyTorch)
A non-linear XGBoost meta-learner learns how to weight the predictions of these three models and produces future price forecasts with volatility-adjusted 95% confidence intervals.
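The README does not spell out how the volatility-adjusted interval is computed. As a rough sketch of one common approach (an assumption, not necessarily what `main.py` does), the band can be widened with the square root of the forecast horizon and scaled by recent daily volatility. The function name and the inputs below are hypothetical.

```python
import math
import statistics


def volatility_adjusted_interval(history, forecasts, z=1.96):
    """Return a (lower, upper) pair for each forecast step.

    Assumption: the half-width is z * daily_vol * sqrt(step) * price, where
    daily_vol is the standard deviation of historical daily log returns.
    z = 1.96 corresponds to a 95% interval under a normal approximation.
    """
    log_returns = [math.log(b / a) for a, b in zip(history, history[1:])]
    daily_vol = statistics.stdev(log_returns)

    bounds = []
    for step, price in enumerate(forecasts, start=1):
        half_width = z * daily_vol * math.sqrt(step) * price
        bounds.append((price - half_width, price + half_width))
    return bounds


# Hypothetical closing prices and a three-day forecast.
history = [100.0, 101.0, 100.0, 102.0, 101.0]
forecasts = [102.0, 103.0, 104.0]
for day, (low, high) in enumerate(volatility_adjusted_interval(history, forecasts), start=1):
    print(f"day {day}: {low:.2f} to {high:.2f}")
```

Under this assumption the interval grows as the horizon lengthens, which matches the intuition that uncertainty compounds over a multi-day forecast.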
Video Demonstration - https://youtu.be/z8sXhWrwU0o
- Macro-Economic Engine: Retrieves data from Yahoo Finance and enriches the target stock with global indicators: S&P 500 (^GSPC), Volatility Index (^VIX), and Treasury Yield (^TNX).
- Time-Series Transformer: Deep neural architecture using nn.TransformerEncoder with custom Positional Encoding to capture long-term chaotic trends.
- Tree Ensemble: Uses a VotingRegressor combining tuned XGBoost and Random Forest.
- Bidirectional LSTM: Deep 3-layer architecture with Batch Normalization, Dropout, Huber Loss, and learning rate scheduling in PyTorch.
- Bayesian Optimization (Optuna): Optional CLI flag that runs hundreds of Optuna trials to search for well-performing hyperparameters for a specific stock.
- Non-Linear Stacking Meta-Ensemble: An XGBoost meta-learner learns how to combine the underlying sub-models, taking the current VIX as an extra input so the weighting can adapt when markets turn volatile.
- Comprehensive Evaluation: Built-in module scoring RMSE, MAE, MAPE, R², and Directional Accuracy.
- Visualization Dashboard: Automatically generates professional, dark-themed charts containing actual vs predicted data, residual histograms, future forecasts, and feature importance.
- CLI Interface: Dynamic command-line inputs for custom tickers, date ranges, and forecast windows.
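The Transformer feature above mentions a custom positional encoding without describing it. A minimal sketch of the standard sinusoidal scheme follows, written in plain Python so the arithmetic is visible; whether the project uses this exact form is an assumption, and the function name is hypothetical.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Build a seq_len x d_model table of sinusoidal position codes.

    Even columns hold sin(pos / 10000^(i / d_model)) and the following odd
    column holds the matching cosine. d_model must be even.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")

    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table


# Each time step in the input window gets its own row, which is added to
# that step's feature embedding before the encoder layers.
encoding = sinusoidal_positional_encoding(seq_len=4, d_model=8)
for row in encoding:
    print([round(value, 3) for value in row])
```

In a PyTorch model this table would typically be stored as a tensor buffer and added to the input of `nn.TransformerEncoder`, since attention by itself has no notion of order.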
- Python 3.x
- PyTorch (torch)
- XGBoost
- scikit-learn
- yfinance
- pandas
- numpy
- matplotlib
- joblib
- Clone the repository:
git clone https://github.com/harshitt13/Stock-Market-Prediction-Model.git
cd Stock-Market-Prediction-Model
- Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
- Install required packages:
pip install -r requirements.txt
Stock-Market-Prediction-Model/
│
├── data/ # Raw, predicted CSVs and model comparison tables
├── src/
│ ├── fetch_data.py # Data retrieval and 27-feature engineering
│ ├── tree_model.py # XGBoost + RF ensemble implementation
│ ├── lstm_model.py # Deep BiLSTM implementation (PyTorch)
│ ├── evaluate.py # Metrics module (MAPE, R², RMSE, DA)
│ ├── visualize.py # Matplotlib dashboard generator
│ └── main.py # CLI orchestrator & stacking meta-ensemble
├── models/ # Saved .pkl and .pt model files
├── tests/ # Pytest suite
├── images/ # Generated performance and forecast visualizations
├── requirements.txt
├── pytest.ini
└── README.md
The entire pipeline (fetching, training the three sub-models, creating the meta-ensemble, and evaluation) is executed via the CLI.
python src/main.py --ticker AAPL --start 2015-01-01 --days 30
- --ticker: The stock symbol to predict (default: AAPL).
- --start: Historical data start date in YYYY-MM-DD (default: 2010-01-01).
- --days: Number of future business days to project (default: 30).
- --optimize: Triggers Optuna to perform Bayesian hyperparameter searching before training.
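For readers who want to see how flags like these are usually wired up, here is a small `argparse` sketch that mirrors the options and defaults listed above. It is an illustration, not the contents of `main.py`; the helper names are hypothetical, and the business-day helper skips weekends only (no exchange holiday calendar).

```python
import argparse
from datetime import date, timedelta


def build_parser():
    """Create a parser with the four documented flags and their defaults."""
    parser = argparse.ArgumentParser(description="Stock forecast pipeline (sketch)")
    parser.add_argument("--ticker", default="AAPL", help="stock symbol to predict")
    parser.add_argument("--start", default="2010-01-01", help="history start date, YYYY-MM-DD")
    parser.add_argument("--days", type=int, default=30, help="future business days to project")
    parser.add_argument("--optimize", action="store_true", help="run a hyperparameter search first")
    return parser


def future_business_days(last_date, count):
    """Return the next `count` weekdays after last_date (Mon-Fri only)."""
    days = []
    current = last_date
    while len(days) < count:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days.append(current)
    return days


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--ticker", "MSFT", "--days", "3"])
horizon = future_business_days(date(2024, 1, 5), args.days)
print(args.ticker, args.start, [d.isoformat() for d in horizon])
```

Counting business days rather than calendar days keeps the forecast index aligned with trading sessions, although a production version would also need to skip market holidays.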
- Time-Series Transformer (transformer_model.py)
  - Uses multi-head attention to relate every day in the input window to every other day directly, so long-range patterns are not attenuated the way they can be in a recurrent network.
- Bidirectional LSTM (lstm_model.py)
  - Deep neural network capturing complex temporal dependencies over a 90-day lookback window.
- Tree Model (tree_model.py)
  - Combines XGBoost and Random Forest. Learns non-linear feature interactions and isolates the most important technical indicators.
- Hybrid Meta-Ensemble (main.py)
  - Uses an XGBoost Regressor on the sub-models' validation outputs.
  - Fed with the Volatility Index (VIX), it learns to shift its trust between the deep neural networks and the tree ensemble depending on current market volatility.
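The stacking step described above can be sketched on synthetic data. To keep the example free of heavy dependencies, a linear least-squares blend stands in for the XGBoost meta-learner, so it shows the data flow (sub-model validation predictions plus VIX in, one blended prediction out) and not the project's actual model. All arrays and names here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000

# Synthetic stand-ins (assumptions, not project data): a random-walk "price",
# three sub-model predictions with different noise levels, and a VIX-like series.
y = 100 + np.cumsum(rng.normal(0, 1, n))
preds = np.column_stack([
    y + rng.normal(0, 1.0, n),  # stands in for the tree ensemble
    y + rng.normal(0, 2.0, n),  # stands in for the BiLSTM
    y + rng.normal(0, 3.0, n),  # stands in for the Transformer
])
vix = rng.uniform(12, 35, n)


def meta_features(sub_preds, vix_values):
    """Stack sub-model predictions, VIX, and an intercept column."""
    return np.column_stack([sub_preds, vix_values, np.ones(len(vix_values))])


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Chronological split: fit the meta-learner on the earlier half only,
# then evaluate on the later half, so no future data leaks into the fit.
split = n // 2
weights, *_ = np.linalg.lstsq(meta_features(preds[:split], vix[:split]), y[:split], rcond=None)
blend = meta_features(preds[split:], vix[split:]) @ weights

base_rmse = [rmse(preds[split:, k], y[split:]) for k in range(3)]
blend_rmse = rmse(blend, y[split:])
print("sub-model RMSE:", [round(v, 3) for v in base_rmse])
print("blended RMSE:  ", round(blend_rmse, 3))
```

A linear blend can only learn fixed weights. A tree-based meta-learner such as XGBoost can additionally make the weighting depend on VIX, which is the behaviour the list above describes.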
In historical backtesting (e.g., AAPL 2022-2024), the Hybrid XGBoost Meta-Learner reports the following results:
- Price Magnitude Accuracy: ~99.11% (MAPE of 0.89%)
- Directional Accuracy: ~59.46% (Predicting the correct up/down direction of the next trading day)
- R-squared (R²) Score: 0.975 (the predictions explain about 97.5% of the variance in the actual prices)
Models are evaluated via:
- Mean Squared Error (MSE) / Root Mean Squared Error (RMSE)
- Mean Absolute Error (MAE)
- Mean Absolute Percentage Error (MAPE)
- R-squared (R²) Score
- Directional Accuracy (%)
Comparison tables are saved to data/model_comparison.csv and visualized in the images/ directory.
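The metrics listed above have standard definitions, sketched here in plain Python for reference. This is not the code in `evaluate.py`. In particular, the directional accuracy definition (comparing the predicted move from the previous actual price with the actual move) is an assumption, and other conventions exist.

```python
import math


def rmse(actual, predicted):
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / len(actual)


def r2_score(actual, predicted):
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def directional_accuracy(actual, predicted):
    """Percent of days where the predicted move from yesterday's actual price
    has the same sign as the actual move. A flat move counts as a miss."""
    hits = 0
    for prev, a, p in zip(actual, actual[1:], predicted[1:]):
        if (a - prev) * (p - prev) > 0:
            hits += 1
    return 100.0 * hits / (len(actual) - 1)


# Hypothetical prices, chosen only to exercise the functions.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 102.0, 104.0]
print(f"RMSE={rmse(actual, predicted):.3f}  MAE={mae(actual, predicted):.3f}")
print(f"MAPE={mape(actual, predicted):.3f}%  R2={r2_score(actual, predicted):.3f}")
print(f"Directional accuracy={directional_accuracy(actual, predicted):.1f}%")
```

Note that MAPE and R² are computed on price levels, while directional accuracy depends only on the sign of day-to-day changes, so the two kinds of metric can diverge considerably.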
Run the test suite via pytest to verify the components are functioning properly:
python -m pytest tests/
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Stock predictions are inherently probabilistic. This model utilizes advanced technicals and macroeconomic contexts, but omits fundamental balance sheet analysis (P/E, Debt/Equity) and NLP sentiment (news headlines).
- Model performance depends heavily on structural market regimes.
- Past performance does not guarantee future results. Do not use for real financial trading.
Distributed under the MIT License. See LICENSE for more information.