Skip to content

Latest commit

 

History

History
46 lines (25 loc) · 1.5 KB

README.md

File metadata and controls

46 lines (25 loc) · 1.5 KB

allstate

DataMining-AllState For our project we entered the Kaggle Allstate Purchase Prediction Challenge. This challenge required us to create a model that predicts the exact insurance policy that a customer will purchase. The leading team is currently at 54.5% accuracy. Our team is at 53.8%. We used Naives Bayes classifier to get this far, but we are now experimenting with SVM and Random Forests to try to improve accuracy.

File List:

.. config: JSON configuration files for NB features

data data in different formats with submttions and additional accuracy logs

RF_output_bucket.csv Random Forest output

change_over.py Initial data tranformation script to see how people change options over time

change_over_first.py permutation of the above

change_over_no_demographic_change.py Final transformation script that flattens data based on internal options

flatten.py first flattening script using Wrangler

naive_bayse.py Working version of NB Classifier

playtest100000_v2_first_0_last_1.csv data input for RF

playtest10000_v2_first_0_last_1.csv data input for RF

playtest1000_v2_first_0_last_1.csvdata input for RF

playtest_v2_first_0_last_1.csv data input for RF

playtrain100000_first_0_last_1.csv data input for RF

playtrain10000_first_0_last_1.csv data input for RF

playtrain1000_first_0_last_1.csv data input for RF

playtrain_first_0_last_1.csv data input for RF

predictions_for_A_1000.csv data input for RF

predictions_for_A_10000.csv data input for RF

renu_forest.py Random Forest Classifer