Hamza-Abbasi222/web_scraping_project

Web scraping project broken down into 16 parts. Project Breakdown and Understanding



  1. T1_automation_script.py Python Automation with Pandas Library:

Use Python and pandas to automate report creation and table extraction from websites. Tasks: Fetch data from websites. Process and clean the data using pandas. Generate reports in various formats (CSV, Excel).



  2. T1_scraping_script.py Extracting Tables from PDFs using Camelot:

Learn to extract tables from PDFs using the Camelot library. Tasks: Install and configure Camelot. Extract tables from sample PDF files. Save the extracted data into a pandas DataFrame.
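A minimal sketch of the Camelot workflow (the import is deferred into the function so the helper can be defined even where Camelot is not installed; the `pdf_path` argument is whatever sample PDF you are working with):

```python
def extract_pdf_tables(pdf_path, pages="1"):
    """Extract tables from a PDF and return them as pandas DataFrames."""
    import camelot  # pip install "camelot-py[cv]"
    # read_pdf returns a TableList; each table exposes its data as a .df DataFrame
    tables = camelot.read_pdf(pdf_path, pages=pages)
    return [table.df for table in tables]
```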



  3. T3_xpath_example.py HTML Basics and XPath for Web Scraping:

Understand HTML structure and XPath syntax for effective web scraping. Tasks: Learn HTML tags and structure. Practice writing XPath queries to select elements.
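A small XPath practice example using lxml, with an inline HTML snippet standing in for a fetched page (the class names and structure here are made up for illustration):

```python
from lxml import html  # pip install lxml

# A small HTML snippet standing in for a fetched page.
page = html.fromstring("""
<html><body>
  <div class="product"><h2>Widget</h2><span class="price">$9.99</span></div>
  <div class="product"><h2>Gadget</h2><span class="price">$19.99</span></div>
</body></html>
""")

# XPath: select the <h2> text nodes inside elements with class="product"
names = page.xpath('//div[@class="product"]/h2/text()')
prices = page.xpath('//span[@class="price"]/text()')
print(names)   # ['Widget', 'Gadget']
print(prices)  # ['$9.99', '$19.99']
```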



  4. T4_selenium_automation.py Automating Websites Using Selenium 4:

Automate website interactions using Selenium 4. Tasks: Install Selenium and configure WebDriver. Write scripts to interact with web pages (click, fill forms, etc.).
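A sketch of a form-filling interaction with the Selenium 4 API (requires a local Chrome install; the element name "q" is an assumption, adjust it to the target site):

```python
def fill_search_form(url, query):
    """Open a page, type into a search box, submit it, and return the page title."""
    from selenium import webdriver       # pip install selenium
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # Selenium 4.6+ resolves the driver binary itself
    try:
        driver.get(url)
        # Locate the search box by its name attribute ("q" is a common convention).
        box = driver.find_element(By.NAME, "q")
        box.send_keys(query)
        box.submit()
        return driver.title
    finally:
        driver.quit()
```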



  5. T5_Automating_Websites_Using_Selenium4.py Extracting Specific Elements Using XPath:

Extract specific elements from a website using XPath. Tasks: Use Selenium to navigate websites. Use XPath to extract specific elements (e.g., headlines, product details).



  6. T6_scrape_headlines.py Exporting Headlines to CSV Using Pandas:

Extract headlines from a website and export them to a CSV file. Tasks: Scrape headlines using Beautiful Soup or Selenium. Store the data in a pandas DataFrame and export it as a CSV.
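A self-contained sketch of this step, using the stdlib HTML parser as a stand-in for Beautiful Soup and a static snippet in place of a live page:

```python
import pandas as pd
from html.parser import HTMLParser

# Static HTML standing in for a fetched news page.
HTML = """<html><body>
<h2 class="headline">Market rallies</h2>
<h2 class="headline">New library released</h2>
</body></html>"""

class HeadlineParser(HTMLParser):
    """Collect the text content of every <h2> tag."""
    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headlines = []
    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False
    def handle_data(self, data):
        if self.in_h2:
            self.headlines.append(data.strip())

parser = HeadlineParser()
parser.feed(HTML)

# Store in a DataFrame and export as CSV.
df = pd.DataFrame({"headline": parser.headlines})
df.to_csv("headlines.csv", index=False)
print(df)
```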



  7. T7_main_script.py Preparing Script for Executable File:

Prepare a Python script for conversion into an executable file. Tasks: Use tools like PyInstaller to convert scripts to executables.
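The PyInstaller commands involved are short (the `report_tool` name is a made-up example; `T7_main_script.py` is the script from this task):

```shell
# Install PyInstaller and bundle the script into a single executable.
pip install pyinstaller
pyinstaller --onefile --name report_tool T7_main_script.py
# The result appears under dist/ (dist/report_tool, or dist\report_tool.exe on Windows).
```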


  8. T8_file_name_with_f_string.py Customizing File Names Using F-strings:

Customize file names using f-strings and concatenate variables. Tasks: Learn to use f-strings for dynamic file names. Implement this in your script.
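For example, an f-string can concatenate a report name, a region, and today's date into one file name (the variable names here are illustrative):

```python
from datetime import date

report_name = "sales_summary"
region = "EU"
# The f-string builds a dynamic, timestamped file name in one expression.
filename = f"{report_name}_{region}_{date.today():%Y-%m-%d}.csv"
print(filename)  # e.g. sales_summary_EU_2024-05-01.csv
```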



  9. T9_schedule_task.py Scheduling an Executable File to Run at Any Time:

Schedule an executable file to run using tools like Cron (Linux) or Task Scheduler (Windows). Tasks: Set up a scheduled task for your executable.
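Two scheduling fragments, one per platform (the paths and the task name are placeholders for your own executable):

```shell
# Linux (crontab -e): run the executable every day at 08:00
#   0 8 * * * /home/user/dist/report_tool

# Windows (Task Scheduler from the command line):
schtasks /Create /SC DAILY /ST 08:00 /TN "ReportTool" /TR "C:\dist\report_tool.exe"
```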



  10. T10_pivot_table_to_excel.py Creating a Pivot Table and Exporting It to an Excel File:

Create a pivot table and export it to an Excel file. Tasks: Use pandas to create a pivot table. Export the pivot table to an Excel file.
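A pivot-table sketch with a small in-memory table standing in for real sales data (Excel export needs the openpyxl backend, so it is guarded here):

```python
import pandas as pd

# Sample sales data standing in for scraped or imported records.
sales = pd.DataFrame({
    "region":  ["East", "East", "West", "West"],
    "product": ["A", "B", "A", "B"],
    "revenue": [100, 150, 200, 50],
})

# Pivot: regions as rows, products as columns, summed revenue as values.
pivot = pd.pivot_table(sales, index="region", columns="product",
                       values="revenue", aggfunc="sum")
print(pivot)

# Export requires an Excel writer backend (pip install openpyxl).
try:
    pivot.to_excel("sales_pivot.xlsx")
except ImportError:
    print("openpyxl not installed; skipping Excel export")
```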



  11. T11_bar_chart_sales.py Creating a Bar Chart for Sales by Product Line:

Create a bar chart for sales by product line. Tasks: Use matplotlib or seaborn to create bar charts. Integrate this into your report generation script.
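A matplotlib sketch of the chart (the product lines and figures are invented sample data; the Agg backend keeps it runnable without a display):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample data standing in for the report's sales table.
sales = pd.DataFrame({
    "product_line": ["Bikes", "Helmets", "Locks"],
    "sales": [12000, 3000, 1500],
})

fig, ax = plt.subplots()
ax.bar(sales["product_line"], sales["sales"], color="steelblue")
ax.set_xlabel("Product line")
ax.set_ylabel("Sales")
ax.set_title("Sales by Product Line")
fig.savefig("sales_by_product_line.png")
```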



  12. T12_create_excel_with_charts.py Creating Bar Charts and Formulas Using OpenPyXL in Python:

Create bar charts and formulas in a spreadsheet using OpenPyXL. Tasks: Learn to use OpenPyXL to manipulate Excel files. Add charts and formulas programmatically.
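An OpenPyXL sketch that writes a small table, a SUM formula, and a bar chart into one workbook (the data and file name are illustrative):

```python
from openpyxl import Workbook  # pip install openpyxl
from openpyxl.chart import BarChart, Reference

wb = Workbook()
ws = wb.active
ws.append(["Product", "Sales"])
for row in [("Bikes", 12000), ("Helmets", 3000), ("Locks", 1500)]:
    ws.append(row)

# A formula cell: Excel evaluates it when the file is opened.
ws["B6"] = "=SUM(B2:B4)"

# Bar chart over the sales column, labelled by product name.
chart = BarChart()
data = Reference(ws, min_col=2, min_row=1, max_row=4)
labels = Reference(ws, min_col=1, min_row=2, max_row=4)
chart.add_data(data, titles_from_data=True)
chart.set_categories(labels)
ws.add_chart(chart, "D2")

wb.save("sales_report.xlsx")
```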



  13. T13_combined_script_for_charts.py Converting Python Script to Executable:

Convert a Python script to an executable file using tools like PyInstaller. Tasks: Package your script with PyInstaller.



  14. T14_automate_reports_and_whatsapp.py Automating Excel Reports and Sending WhatsApp Messages Using Python:

Automate the creation of Excel reports and send WhatsApp messages. Tasks: Use pandas and OpenPyXL for Excel report automation. Use a library like Twilio or PyWhatKit for WhatsApp automation.



  15. T15_send_whatsapp_messages_to_ind_or_grp.py Sending Messages in WhatsApp to Contacts and Groups Using PyWhatKit:

Send WhatsApp messages to contacts and groups using PyWhatKit. Tasks: Install and configure PyWhatKit. Write scripts to send messages.
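A PyWhatKit sketch (the import is deferred because sending opens WhatsApp Web in a browser; the phone number format and group id are whatever your account provides):

```python
def send_whatsapp(phone, group_id, message):
    """Send a WhatsApp message instantly to a contact and to a group."""
    import pywhatkit  # pip install pywhatkit; opens WhatsApp Web in a browser
    # The phone number must include the country code, e.g. "+441234567890".
    pywhatkit.sendwhatmsg_instantly(phone, message)
    # The group id comes from the group's invite link.
    pywhatkit.sendwhatmsg_to_group_instantly(group_id, message)
```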



  16. T16_Project.py Final Project:

Combine all the skills learned to create a comprehensive web scraping project. Tasks: Use Beautiful Soup or Scrapy to scrape book details from a website. Use Selenium for browser automation. Store the scraped data in a database (Firebase, MSSQL, MySQL).

Next Steps

Let's start with the first task: Python Automation with Pandas Library. Here, we'll automate the creation of reports and the extraction of tables from websites.
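The final project's storage step can be sketched with stdlib sqlite3 standing in for the databases listed above (the book rows are placeholders for Beautiful Soup/Scrapy output):

```python
import sqlite3

# Placeholder records standing in for scraped book details.
books = [
    ("A Light in the Attic", 51.77, 3),
    ("Tipping the Velvet", 53.74, 1),
]

conn = sqlite3.connect(":memory:")  # stdlib stand-in for Firebase/MSSQL/MySQL
conn.execute("""CREATE TABLE books (
    title TEXT, price REAL, rating INTEGER)""")
conn.executemany("INSERT INTO books VALUES (?, ?, ?)", books)
conn.commit()

# A quick summary query over the stored data.
count, avg_price = conn.execute(
    "SELECT COUNT(*), AVG(price) FROM books").fetchone()
print(count, avg_price)
```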

Task 1: Python Automation with Pandas Library

Fetch Data from Websites:

Identify a target website with tabular data. Use requests or Beautiful Soup to fetch the data.

Process and Clean Data Using Pandas:

Load the data into a pandas DataFrame. Perform any necessary data cleaning and transformation.

Generate Reports:

Create summary reports using pandas. Export the reports to CSV or Excel.
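The three steps above can be sketched end to end, with a small in-memory table standing in for the fetched web data (since the target site is unspecified):

```python
import pandas as pd

# In a live run this table would come from the web, e.g.:
#   tables = pd.read_html("https://example.com/data")  # needs lxml installed
# Here a small in-memory table stands in for the fetched data.
raw = pd.DataFrame({
    "Product": [" widget ", "gadget", "widget"],
    "Revenue": ["100", "250", "not available"],
})

# Clean: trim whitespace, coerce revenue to numbers, drop unusable rows.
clean = raw.assign(
    Product=raw["Product"].str.strip(),
    Revenue=pd.to_numeric(raw["Revenue"], errors="coerce"),
).dropna()

# Report: total revenue per product, exported to CSV.
report = clean.groupby("Product", as_index=False)["Revenue"].sum()
report.to_csv("revenue_report.csv", index=False)
print(report)
```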

About

This project involves a series of tasks related to web scraping, data extraction, and automation using Python.
