Western Kentucky University

Grade Level at Time of Presentation

Senior

Major

Business Data Analytics

Institution 24-25

Western Kentucky University

KY House District #

2

KY Senate District #

32

Department

Analytics & Information Systems

Abstract

This research develops a predictive model to forecast college student dropout using RapidMiner™, with the goal of identifying at-risk students early enough to allow timely interventions. Using a Kaggle dataset of 4,424 student records with 36 academic, demographic, and socioeconomic features, the study applies machine learning techniques to classify students as graduates, dropouts, or currently enrolled. The model highlights critical factors influencing dropout, such as tuition fees, scholarship status, and second-semester grades. The Logistic Regression model, chosen based on accuracy and other performance metrics, shows promising results, with a precision of 82.64% and a recall of 79.55% for predicting dropouts.

This research is particularly relevant for Kentucky senators as it addresses the growing issue of student retention, which directly impacts both the individual student’s future and the state’s economic growth. By identifying at-risk students early, the model enables Kentucky’s higher education institutions to implement targeted interventions that can reduce dropout rates, improve graduation rates, and ultimately create a more educated workforce.

Given that nearly 40% of students in the U.S. drop out before graduation, the findings of this study provide a valuable tool for improving retention in Kentucky, ensuring that state resources are utilized effectively, and supporting the state’s long-term economic development.

This model can aid policymakers in developing more effective strategies for student success, helping to reduce the financial burden on the state while promoting a more educated, productive workforce for Kentucky’s future.
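For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how precision and recall for the dropout class can be computed in a three-class setting. It is not the study's RapidMiner™ workflow: the function name `precision_recall`, the class label strings, and the eight toy records are all hypothetical and chosen only for illustration.

```python
def precision_recall(actual, predicted, positive):
    """Return (precision, recall) for one class, treating all other classes as negative."""
    pairs = list(zip(actual, predicted))
    # Records correctly predicted as the positive class
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    # Records predicted as positive that actually belong to another class
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    # Records of the positive class that the model assigned elsewhere
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy labels, NOT taken from the study's dataset
actual = ["Dropout", "Dropout", "Dropout", "Dropout",
          "Graduate", "Graduate", "Enrolled", "Enrolled"]
predicted = ["Dropout", "Dropout", "Dropout", "Graduate",
             "Graduate", "Dropout", "Enrolled", "Enrolled"]

precision, recall = precision_recall(actual, predicted, "Dropout")
print(f"Dropout precision: {precision:.2%}, recall: {recall:.2%}")
```

Under this reading, precision is the share of students flagged as dropouts who actually dropped out, and recall is the share of actual dropouts that the model flagged.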


Predicting College Student Dropout Rates Using Predictive Analytics
