Load Package And Data

Training

1. How to train the model and interpret the results?

Once you have chosen an algorithm, building the model is fairly easy with caret's train() function.

Besides fitting the model, train() can also take care of:

  1. Cross-validating the model
  2. Tuning the hyperparameters for optimal model performance
  3. Choosing the optimal model based on a given evaluation metric
  4. Preprocessing the predictors (what we have done so far using preProcess())
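The four points above map onto arguments of train(). The following is a minimal sketch, not the model used in this walkthrough: it assumes the built-in iris data and a k-nearest-neighbours model (method = "knn") purely for illustration, and the object names ctrl and model_knn are hypothetical.

```r
library(caret)

set.seed(100)  # make the resampling reproducible

# Point 1: 5-fold cross-validation
ctrl <- trainControl(method = "cv", number = 5)

# Illustrative model only: kNN on the built-in iris data (an assumption,
# chosen because it needs no extra packages beyond caret)
model_knn <- train(
  Species ~ .,
  data       = iris,
  method     = "knn",
  trControl  = ctrl,
  tuneLength = 5,                    # Point 2: try 5 candidate values of k
  metric     = "Accuracy",           # Point 3: select the best k by accuracy
  preProcess = c("center", "scale")  # Point 4: preprocessing inside train()
)

print(model_knn)     # resampled performance for each candidate k
model_knn$bestTune   # the value of k that was selected
```

Printing the fitted object lists the cross-validated metric for every hyperparameter value tried, which is the main thing to read when interpreting the results.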

2. How to compute variable importance?

Which variables came out to be useful?
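One way to answer this is caret's varImp() function, which works on a fitted train object. The sketch below is illustrative only: it assumes the iris data and a decision tree (method = "rpart"), neither of which comes from this walkthrough, and the names model_rpart and imp are hypothetical.

```r
library(caret)

set.seed(100)

# Illustrative model only: a decision tree on iris (an assumption)
model_rpart <- train(
  Species ~ .,
  data      = iris,
  method    = "rpart",
  trControl = trainControl(method = "cv", number = 5)
)

# Variable importance; by default the scores are scaled to a 0-100 range
imp <- varImp(model_rpart)
print(imp)

# Plot of the same scores, most important variable at the top
plot(imp, main = "Variable importance (rpart on iris)")
```

How the scores are computed depends on the algorithm, so importances from different model types are not directly comparable.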

Tuning

1. Preprocess the test dataset and predict

The pre-processing must be applied in the following sequence:

Missing value imputation -> One-hot encoding -> Range normalization

All the information required for pre-processing is stored in the respective preProcess models and the dummyVars model.

Pass the testData through these models in the same sequence:

preProcess_missingdata_model -> dummies_model -> preProcess_range_model
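The sequence above might look like the sketch below. It rests on several assumptions: the data frame is a small synthetic one invented for illustration (columns x1, x2, store and y are hypothetical), and median imputation is used because the imputation method is not specified here. Only the three model names and testData follow the text.

```r
library(caret)

set.seed(100)

# Hypothetical toy data: two numeric predictors, one factor, a binary outcome
n  <- 200
df <- data.frame(
  x1    = rnorm(n, mean = 50, sd = 10),
  x2    = runif(n, min = 0, max = 5),
  store = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  y     = factor(sample(c("no", "yes"), n, replace = TRUE))
)
df$x1[sample(n, 15)] <- NA            # inject some missing values

idx        <- createDataPartition(df$y, p = 0.8, list = FALSE)
trainData  <- df[idx, ]
testData   <- df[-idx, ]
predictors <- c("x1", "x2", "store")  # hypothetical predictor names

# Fit all three models on the TRAINING predictors only
# (median imputation is an assumption; non-numeric columns are ignored)
preProcess_missingdata_model <- preProcess(trainData[, predictors],
                                           method = "medianImpute")
train_imputed <- predict(preProcess_missingdata_model, trainData[, predictors])

dummies_model <- dummyVars(~ ., data = train_imputed)
train_dummies <- data.frame(predict(dummies_model, newdata = train_imputed))

preProcess_range_model <- preProcess(train_dummies, method = "range")

# Pass testData through the same models, in the same order
testData2 <- predict(preProcess_missingdata_model, testData[, predictors])
testData3 <- data.frame(predict(dummies_model, newdata = testData2))
testData4 <- predict(preProcess_range_model, testData3)

head(testData4)
```

Because the range model stores the minimum and maximum seen in the training data, a transformed test value can fall slightly outside [0, 1]. That is expected and does not mean the test set should be rescaled on its own.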

2. Predict on testData and compute the confusion matrix
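A minimal sketch of this step uses predict() followed by caret's confusionMatrix(). The iris data, the kNN model and the object names below are assumptions made for illustration. Preprocessing is passed to train() here, so predict() reapplies it to the test rows automatically.

```r
library(caret)

set.seed(100)

# Illustrative split of the built-in iris data (an assumption)
idx       <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
trainData <- iris[idx, ]
testData  <- iris[-idx, ]

model <- train(
  Species ~ .,
  data       = trainData,
  method     = "knn",
  preProcess = c("center", "scale"),
  trControl  = trainControl(method = "cv", number = 5)
)

# Predicted class for every row of the held-out test set
predicted <- predict(model, newdata = testData)

# Rows of the table are predictions, columns are the true classes
cm <- confusionMatrix(data = predicted, reference = testData$Species)
print(cm)

cm$overall["Accuracy"]   # overall test-set accuracy
```

For a two-class problem, confusionMatrix() also accepts a positive argument naming the class to treat as the positive outcome, which determines how sensitivity and specificity are reported.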