USA Used Car Price Prediction Using Machine Learning
Institution
Eastern Kentucky University
KY House District #
6
KY Senate District #
22
Faculty Advisor/ Mentor
Dae Wook Kim, PhD
Department
Computer Science and Information Technology
Abstract
The used car market is growing due to many economic factors. New car prices are set by the manufacturer, so they are consistent with actual market value. Prices in the used car market, however, are set by the dealer, so they are not necessarily consistent with market value. A model that can accurately predict the actual market value of a used car would therefore be useful to both the seller and the buyer. Our study analyzes and models a dataset of online vehicle auction sales in the United States on the site AuctionExports.com in order to predict the price of a vehicle sold in an online auction. With machine learning and statistical analysis, we can train a model to account for as many variables as the dataset contains. We used visualization tools to examine how the other variables interacted with the auction sale price, and statistical tests to find which variables had a significant effect on the price. We found that factors such as brand, age, title status, base color, state, remaining hours, and mileage were the best predictors of price. We fit models to predict the price from the selected factors, using simple linear models, stepwise functions, and normalization techniques, as well as random forests and neural networks. Comparing metrics such as root mean square error (RMSE) and R-squared showed that the most effective models were random forest and neural network. Both produced satisfactory predictions, which we define as an R-squared value of 0.5 or more and a lower RMSE than the other models. An approach to predicting used car sale prices would therefore benefit from machine learning.
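The model comparison described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the AuctionExports.com dataset), the function names `rmse`, `r_squared`, and `is_satisfactory` are hypothetical, and an ordinary least-squares fit stands in for the models used in the study. It shows only how RMSE and R-squared are computed and how the criterion of an R-squared of 0.5 or more with a lower RMSE than competing models could be checked.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted prices."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def is_satisfactory(r2, model_rmse, other_rmses, r2_threshold=0.5):
    """Hypothetical helper: R-squared at or above the threshold and
    an RMSE lower than every competing model's RMSE."""
    return r2 >= r2_threshold and all(model_rmse < o for o in other_rmses)


# Synthetic data for illustration only (NOT the study's dataset).
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(0, 15, n)            # years
mileage = rng.uniform(0, 150_000, n)   # miles
price = 30_000 - 1_200 * age - 0.05 * mileage + rng.normal(0, 2_000, n)

# Simple train/test split: first 150 rows train, last 50 rows test.
X = np.column_stack([np.ones(n), age, mileage])
X_train, X_test = X[:150], X[150:]
y_train, y_test = price[:150], price[150:]

# Model 1: baseline that always predicts the mean training price.
pred_baseline = np.full_like(y_test, y_train.mean())

# Model 2: ordinary least-squares linear regression.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred_linear = X_test @ coef

baseline_rmse = rmse(y_test, pred_baseline)
linear_rmse = rmse(y_test, pred_linear)
linear_r2 = r_squared(y_test, pred_linear)

print(f"baseline RMSE: {baseline_rmse:.1f}")
print(f"linear   RMSE: {linear_rmse:.1f}, R-squared: {linear_r2:.3f}")
print("linear model satisfactory:",
      is_satisfactory(linear_r2, linear_rmse, [baseline_rmse]))
```

The same two metric functions could be applied to the predictions of any other model, such as a random forest or a neural network, so that all candidates are ranked on the same held-out data.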