Skip to content

Akajiaku11/Automated-ETL-Pipeline-for-Weather-Data

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Automated ETL Pipeline for Weather Data

Overview

This project is an Automated ETL (Extract, Transform, Load) Pipeline designed to collect, process, and store weather data from various sources. The pipeline ensures seamless data integration for analytics, reporting, and visualization.

Features

  • Automated Data Extraction: Retrieves weather data from APIs, CSV files, or databases.
  • Data Transformation: Cleans, formats, and enriches the data to ensure consistency.
  • Efficient Data Loading: Stores the processed data into a database or a cloud storage solution.
  • Scheduled Execution: Uses cron jobs or workflow orchestration tools for automation.
  • Logging & Monitoring: Tracks data processing stages and errors.

Technologies Used

  • Python: Main scripting language for ETL tasks.
  • Pandas: Data manipulation and transformation.
  • SQL / PostgreSQL: Database storage.
  • Apache Airflow / Prefect: Workflow orchestration.
  • APIs / Web Scraping: For data extraction from online sources.
  • AWS S3 / Google Cloud Storage (Optional): Cloud-based data storage.

Installation

Prerequisites

  • Python 3.x installed
  • PostgreSQL or any other database service (optional)
  • Required Python packages (listed in requirements.txt)

Setup

  1. Clone the repository:
    git clone https://github.com/yourusername/Automated-ETL-Pipeline-for-Weather-Data.git
    cd Automated-ETL-Pipeline-for-Weather-Data
  2. Install dependencies:
    pip install -r requirements.txt
  3. Configure the .env file with API keys, database credentials, and other configurations.

Usage

Run the ETL script manually:

python main.py

Or schedule the pipeline using Airflow, cron jobs, or Prefect.

Project Structure

Automated-ETL-Pipeline-for-Weather-Data/
│-- src/
│   │-- extract.py       # Handles data extraction
│   │-- transform.py     # Cleans and processes data
│   │-- load.py          # Stores processed data
│   │-- config.py        # Configuration settings
│-- main.py              # Main execution script
│-- requirements.txt     # Dependencies
│-- README.md            # Project documentation

Contributing

Feel free to contribute by opening an issue or submitting a pull request.

License

This project is licensed under the MIT License.

Contact

For any inquiries, reach out via akajiakuflowz@gmail.com

About

This project implements an Automated ETL (Extract, Transform, Load) Pipeline that fetches daily weather data, processes it, and stores it locally. It uses the OpenWeatherMap API to extract current weather data for a given city

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors