Project information
- Category: Machine Learning, Regression
- Role: Collaborator
- Project date: April 2023
- Project URL: Paper
Predicting Covid-19 cases
Developed an XGBoost model to predict Covid-19 cases in Ohio using country-level Covid-19 case and death counts together with topic-awareness variable scores.
Performed exploratory data analysis, PCA, correlation-matrix analysis, and feature engineering to identify meaningful patterns and trends in the data, which served as the starting point for the modeling process.
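As an illustration of the correlation-matrix step, the following is a minimal sketch in plain Python. The column names (`cases`, `deaths`, `awareness_score`) and the toy values are hypothetical and are not taken from the project's dataset; the project itself may well have used a library routine for this.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Build a nested dict {col_a: {col_b: r}} for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy data, for illustration only.
data = {
    "cases": [10, 20, 30, 40, 50],
    "deaths": [1, 2, 3, 4, 5],
    "awareness_score": [5, 4, 3, 2, 1],
}

corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Feature pairs with a coefficient close to +1 or -1 carry largely redundant information, which is one common reason to follow this step with PCA or feature selection.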
A variety of models were compared for predicting Covid-19 cases: Linear, Logistic, Polynomial, Random Forest, SVM, Gradient Boosting, Bagging, XGBoost, Lasso, and ExtraTreesRegressor.
In addition, a weighted-average ensemble of Logistic Regression, Decision Tree, and Random Forest was built in an attempt to improve on the individual models.
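A weighted-average ensemble of the kind described above can be sketched as follows. This is only an illustration: the prediction values and the weights are hypothetical, and the weights actually used in the project are not stated here.

```python
def weighted_average(predictions, weights):
    """Combine per-model predictions into a single prediction per sample.

    predictions: list of lists, one inner list of predictions per model.
    weights: one non-negative weight per model (need not sum to 1).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_samples)
    ]


# Hypothetical predictions from three already-fitted models on three samples.
model_preds = [
    [100.0, 200.0, 300.0],  # model A
    [110.0, 190.0, 310.0],  # model B
    [90.0, 210.0, 320.0],   # model C
]
weights = [1.0, 1.0, 2.0]  # assumed weights; model C counts double

ensemble = weighted_average(model_preds, weights)
print(ensemble)  # [97.5, 202.5, 312.5]
```

In practice the weights are often chosen from each model's validation score, so that stronger models contribute more to the combined prediction.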
The best R² score, 0.91927 (91.927%), was achieved by the XGBoost model with 100 estimators, so XGBoost was used to generate the final published predictions.
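For reference, the R² (coefficient of determination) used to rank the models is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it in plain Python; the `y_true` and `y_pred` values are hypothetical and unrelated to the project's results.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted case counts, for illustration only.
y_true = [10, 20, 30, 40]
y_pred = [12, 18, 33, 39]

r2 = r2_score(y_true, y_pred)
print(f"R^2 = {r2:.3f} ({r2:.1%})")
```

A score of 1.0 means the predictions match the observations exactly, while a score of 0 means the model does no better than always predicting the mean of the observed values.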