Western Kentucky University
Grade Level at Time of Presentation
Senior
Major
Business Data Analytics
Institution 24-25
Western Kentucky University
KY House District #
2
KY Senate District #
32
Faculty Advisor/Mentor
Lily Popova Zhuhadar, PhD
Department
Analytics & Information Systems
Abstract
This research develops a predictive model in RapidMiner™ to forecast college student dropout, with the goal of identifying at-risk students early enough for timely intervention. Using a Kaggle dataset of 4,424 student records with 36 academic, demographic, and socioeconomic features, the study applies machine learning techniques to classify each student as a graduate, a dropout, or currently enrolled. The model reveals critical factors influencing dropout, such as tuition fees, scholarship status, and second-semester grades. The Logistic Regression model, chosen based on accuracy and other performance metrics, shows promising results, with a precision of 82.64% and a recall of 79.55% for predicting dropouts.
This research is particularly relevant for Kentucky senators as it addresses the growing issue of student retention, which directly impacts both the individual student’s future and the state’s economic growth. By identifying at-risk students early, the model enables Kentucky’s higher education institutions to implement targeted interventions that can reduce dropout rates, improve graduation rates, and ultimately create a more educated workforce.
Given that nearly 40% of students in the U.S. drop out before graduation, the findings of this study provide a valuable tool for improving retention in Kentucky, ensuring that state resources are utilized effectively, and supporting the state’s long-term economic development.
This model can aid policymakers in developing more effective strategies for student success, helping to reduce the financial burden on the state while promoting a more educated, productive workforce for Kentucky’s future.
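To make the precision and recall figures reported in the abstract concrete, the following minimal Python sketch shows how the two metrics are computed for a single class (here, dropouts) in a three-class problem. It is an illustration only, not the study's RapidMiner™ workflow: the function name `precision_recall`, the label strings, and the toy labels are all assumptions made for this example and are not taken from the dataset.

```python
from collections import Counter


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treating it as positive (one-vs-rest)."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            counts["tp"] += 1  # correctly flagged as positive
        elif predicted == positive:
            counts["fp"] += 1  # flagged as positive, but actually another class
        elif actual == positive:
            counts["fn"] += 1  # actual positive that the model missed
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    precision = counts["tp"] / predicted_pos if predicted_pos else 0.0
    recall = counts["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy labels invented for illustration only (NOT from the study's data).
y_true = ["Dropout", "Dropout", "Dropout", "Graduate", "Enrolled",
          "Graduate", "Dropout", "Enrolled", "Dropout"]
y_pred = ["Dropout", "Dropout", "Graduate", "Graduate", "Dropout",
          "Graduate", "Dropout", "Enrolled", "Enrolled"]

precision, recall = precision_recall(y_true, y_pred, positive="Dropout")
print(f"precision={precision:.2%} recall={recall:.2%}")
```

Read this way, precision is the share of students flagged as dropouts who actually dropped out, and recall is the share of actual dropouts the model flagged. For an early-warning tool, recall indicates how many at-risk students would be reached, while precision indicates how much intervention effort would go to students who were not at risk.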
Predicting College Student Dropout Rates Using Predictive Analytics