This project involves analyzing the employee data of Pewlett Hackard, a fictional company, from the 1980s and 1990s. The available dataset consists of six CSV files containing employee, department, and salary information. The project is structured into three key phases:
- Data Modeling – Designing an Entity Relationship Diagram (ERD) and defining the database schema.
- Data Engineering – Creating the database and tables, then importing the CSV data.
- Data Analysis – Writing SQL queries to extract insights from the dataset.
- Primary Keys:
  - Each table has a uniquely identifying column (`emp_no`, `dept_no`, `title_id`).
- Foreign Keys:
  - `emp_title_id` in `employees` references `titles` to ensure employees have valid job titles.
  - `emp_no` in `salaries` references `employees`, ensuring every salary entry belongs to a valid employee.
  - `dept_no` in `dept_emp` and `dept_manager` ensures valid department assignments.
- Many-to-Many Relationships:
  - `dept_emp` enables employees to be associated with multiple departments.
  - `dept_manager` tracks managers of different departments.
- Cascade Deletion (`ON DELETE CASCADE`):
  - When an employee is removed, their salary, department assignments, and management roles are also deleted automatically.
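
The design points above can be sketched as DDL. This is a minimal illustration, not the contents of `employees_schema.sql`: it assumes PostgreSQL, and the data types, the composite keys on the junction tables, and the columns `title`, `dept_name`, `first_name`, `last_name`, `hire_date`, and `salary` are assumptions rather than details taken from the project files.

```sql
-- Parent tables come first so that later foreign keys have a target.
CREATE TABLE titles (
    title_id VARCHAR(10) PRIMARY KEY,
    title    VARCHAR(50) NOT NULL              -- assumed column
);

CREATE TABLE departments (
    dept_no   VARCHAR(10) PRIMARY KEY,
    dept_name VARCHAR(50) NOT NULL             -- assumed column
);

CREATE TABLE employees (
    emp_no       INT PRIMARY KEY,
    emp_title_id VARCHAR(10) NOT NULL REFERENCES titles (title_id),
    first_name   VARCHAR(50) NOT NULL,         -- assumed column
    last_name    VARCHAR(50) NOT NULL,         -- assumed column
    hire_date    DATE NOT NULL                 -- assumed column
    -- any remaining employee columns are omitted from this sketch
);

-- Junction table: one row per (employee, department) pair.
CREATE TABLE dept_emp (
    emp_no  INT NOT NULL REFERENCES employees (emp_no) ON DELETE CASCADE,
    dept_no VARCHAR(10) NOT NULL REFERENCES departments (dept_no),
    PRIMARY KEY (emp_no, dept_no)              -- assumed composite key
);

-- Junction table: which employees manage which departments.
CREATE TABLE dept_manager (
    dept_no VARCHAR(10) NOT NULL REFERENCES departments (dept_no),
    emp_no  INT NOT NULL REFERENCES employees (emp_no) ON DELETE CASCADE,
    PRIMARY KEY (dept_no, emp_no)              -- assumed composite key
);

CREATE TABLE salaries (
    emp_no INT PRIMARY KEY REFERENCES employees (emp_no) ON DELETE CASCADE,
    salary INT NOT NULL                        -- assumed column
);
```

With these constraints, deleting a row from `employees` also removes the matching rows in `dept_emp`, `dept_manager`, and `salaries`, while an insert that names a nonexistent `title_id` or `dept_no` is rejected.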
The ERD shows how the tables in the database relate to one another.
- The employees_schema.sql file contains the SQL commands to create tables with appropriate data types, primary keys, foreign keys, and constraints.
- Tables were created in a structured order to prevent foreign key conflicts.
- The CSV files are located in the EmployeeSQL folder.
- Each CSV file was imported into its respective table in the correct order to maintain data integrity.
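
As a rough sketch of an import order that respects the foreign keys, the statements below load parent tables before the tables that reference them. They assume PostgreSQL's `COPY` command, a header row in every CSV, and CSV columns in the same order as the table columns. The file names and the `/path/to/` prefix are placeholders, not the project's actual paths.

```sql
-- Tables with no foreign keys first.
COPY titles      FROM '/path/to/EmployeeSQL/titles.csv'       WITH (FORMAT csv, HEADER true);
COPY departments FROM '/path/to/EmployeeSQL/departments.csv'  WITH (FORMAT csv, HEADER true);

-- employees references titles, so it comes next.
COPY employees   FROM '/path/to/EmployeeSQL/employees.csv'    WITH (FORMAT csv, HEADER true);

-- These tables reference employees (and departments), so they load last.
COPY dept_emp     FROM '/path/to/EmployeeSQL/dept_emp.csv'     WITH (FORMAT csv, HEADER true);
COPY dept_manager FROM '/path/to/EmployeeSQL/dept_manager.csv' WITH (FORMAT csv, HEADER true);
COPY salaries     FROM '/path/to/EmployeeSQL/salaries.csv'     WITH (FORMAT csv, HEADER true);
```

`COPY` reads files from the database server's filesystem. When the files live on the client machine, psql's `\copy` meta-command or the pgAdmin import dialog serves the same purpose.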
After successfully importing the data, SQL queries were executed to answer the following business questions:
- List all employees with their salaries.
- Employees hired in 1986.
- Department managers with employee details.
- Department details for each employee.
- Employees named 'Hercules' with a last name starting with 'B'.
- Employees working in the Sales department.
- Employees working in the Sales and Development departments.
- Frequency counts of common last names.
The employees_queries.sql file contains all the queries used to analyze the data.
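
For illustration, a few of the questions above might be answered along these lines. These are sketches, not the queries from `employees_queries.sql`; apart from `emp_no` and `dept_no`, the column names (`first_name`, `last_name`, `hire_date`, `salary`, `dept_name`) and the `departments` table are assumed.

```sql
-- Employees with their salaries.
SELECT e.emp_no, e.last_name, e.first_name, s.salary
FROM employees AS e
JOIN salaries AS s ON s.emp_no = e.emp_no;

-- Employees hired in 1986 (half-open date range).
SELECT first_name, last_name, hire_date
FROM employees
WHERE hire_date >= DATE '1986-01-01'
  AND hire_date <  DATE '1987-01-01';

-- Employees named 'Hercules' whose last name starts with 'B'.
SELECT first_name, last_name
FROM employees
WHERE first_name = 'Hercules'
  AND last_name LIKE 'B%';

-- Employees in the Sales and Development departments.
SELECT e.emp_no, e.last_name, e.first_name, d.dept_name
FROM employees AS e
JOIN dept_emp AS de ON de.emp_no = e.emp_no
JOIN departments AS d ON d.dept_no = de.dept_no
WHERE d.dept_name IN ('Sales', 'Development');

-- How often each last name occurs, most common first.
SELECT last_name, COUNT(*) AS frequency
FROM employees
GROUP BY last_name
ORDER BY frequency DESC;
```

The department query goes through `dept_emp` because the department name is not stored on the employee row, so an employee assigned to both departments appears once per assignment.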
