This repository contains the final project for the IBM Applied Data Science Capstone. The full presentation of the work is in capstone_presentation.pdf. This README is a concise summary of the steps taken during the project.
The data was collected in two main ways. The first was SpaceX API calls, which can be found in:
2_spacex_data_collecton_webscraping.ipynb
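As a rough illustration of the API step, here is a minimal sketch using only the Python standard library. The endpoint URL, the selected fields, and the helper names `fetch_launches` and `to_rows` are assumptions made for this example. The notebook may use a different endpoint, library, or set of columns.

```python
import json
import urllib.request

# Assumption: the public SpaceX REST API (v4) "past launches" endpoint.
# The notebook may query a different URL.
SPACEX_URL = "https://api.spacexdata.com/v4/launches/past"


def fetch_launches(url=SPACEX_URL, timeout=30):
    """Download the launch records as a list of dictionaries."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def to_rows(launches):
    """Keep a few top-level fields per launch (hypothetical selection)."""
    rows = []
    for launch in launches:
        rows.append(
            {
                "flight_number": launch.get("flight_number"),
                "date_utc": launch.get("date_utc"),
                "launchpad": launch.get("launchpad"),
                "success": launch.get("success"),
            }
        )
    return rows


# Usage (needs network access):
# rows = to_rows(fetch_launches())
```

Keeping the download separate from the flattening step means the flattening logic can be checked against a small hand-written record without touching the network.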
The second was web scraping, which can be seen in:
2_spacex_data_collecton_webscraping.ipynb
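The scraping step boils down to pulling table rows out of an HTML page. The sketch below does this with the standard library `html.parser` module. It is not taken from the notebook, which may well use a dedicated scraping library instead. `TableRowParser` and `extract_rows` are hypothetical names.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return all table rows in the HTML as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
```

The resulting list of rows can be passed to `pandas.DataFrame` with the first row used as column names.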
In this section of the work, the data was explored using pandas, SQL queries, and initial visualizations made with matplotlib and seaborn.
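To give a flavor of the SQL side of the EDA, the sketch below loads a few launch records into an in-memory SQLite database and computes the success rate per launch site. The table name, column names, and the function `success_rate_by_site` are assumptions for illustration. The notebooks may use a different database and schema.

```python
import sqlite3


def success_rate_by_site(rows):
    """rows: iterable of (flight_number, launch_site, success) tuples,
    where success is 1 for a successful launch and 0 otherwise."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE launches ("
            "flight_number INTEGER, launch_site TEXT, success INTEGER)"
        )
        conn.executemany("INSERT INTO launches VALUES (?, ?, ?)", rows)
        query = """
            SELECT launch_site,
                   COUNT(*)     AS n_launches,
                   AVG(success) AS success_rate
            FROM launches
            GROUP BY launch_site
            ORDER BY launch_site
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

Each returned tuple holds the site, the number of launches, and the fraction of those launches that succeeded.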
Several plots were made during this EDA to gain a quick understanding of the data. One example is launch site vs. flight number, with the points colored by launch success. The plot shows that successful launches become more common as time goes on.
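A plot of this kind can be sketched with matplotlib alone. The records passed in are assumed to be `(flight_number, launch_site, success)` tuples, and `plot_site_vs_flight` is a hypothetical helper. The notebook's own column names and plotting calls (for example seaborn) may differ.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_site_vs_flight(launches):
    """Scatter launch site against flight number, colored by outcome.

    launches: list of (flight_number, launch_site, success) tuples,
    with success equal to 1 or 0.
    """
    fig, ax = plt.subplots(figsize=(8, 3))
    sites = sorted({site for _, site, _ in launches})
    y_pos = {site: i for i, site in enumerate(sites)}

    # One scatter call per outcome so each gets its own color and legend entry.
    for outcome, color, label in [(0, "tab:red", "failure"), (1, "tab:green", "success")]:
        xs = [f for f, _, ok in launches if ok == outcome]
        ys = [y_pos[s] for _, s, ok in launches if ok == outcome]
        ax.scatter(xs, ys, c=color, label=label)

    ax.set_yticks(list(range(len(sites))))
    ax.set_yticklabels(sites)
    ax.set_xlabel("Flight number")
    ax.set_ylabel("Launch site")
    ax.legend()
    return fig, ax
```

Calling `fig.savefig("site_vs_flight.png")` on the returned figure writes the plot to disk.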
Following EDA, more advanced ways of presenting the data were explored: interactive maps and an interactive dashboard.
One example is a folium map showing the number of successful and failed launches for a single site, as can be seen below.
Finally, using the data provided, four machine learning models were investigated to see how well the success of a launch can be modeled and predicted from the available features.
8_predictive_models.ipynb
The results of the models were compared against each other. One example is the confusion matrix for a simple logistic regression model.
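For readers unfamiliar with this evaluation step, the sketch below fits a logistic regression with scikit-learn and computes its confusion matrix on a held-out test set. It runs on synthetic data from `make_classification`, not the project's launch data. The split size, scaling step, and the helper name `fit_and_score` are assumptions, and the notebook's actual setup may differ.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_and_score(X, y, seed=0):
    """Fit a scaled logistic regression and evaluate it on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # Rows are true classes, columns are predicted classes, in order [0, 1].
    cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
    return cm, model.score(X_test, y_test)


# Synthetic stand-in for the launch features (not real data).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
cm, accuracy = fit_and_score(X, y)
```

The diagonal of `cm` counts correct predictions, so `cm.trace() / cm.sum()` equals the test accuracy. The same function can be reused with other estimators to compare models on an identical split.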
