Exercise 8.6

The algorithm to fit a regression tree is as follows:

  1. Use recursive binary splitting to grow a large tree on the training data, stopping only when each terminal node would fall below a minimum number of observations.
  2. Apply cost complexity pruning to the large tree to obtain a sequence of best subtrees, as a function of \alpha. The parameter \alpha acts as a tuning parameter that trades off the subtree's fit to the training data against its complexity.
  3. Choose \alpha via K-fold cross-validation. Divide the training observations into K folds; for each fold, grow a tree by recursive binary splitting on the remaining K-1 folds and apply cost complexity pruning to it, then, for a grid of candidate values of \alpha, evaluate the mean squared prediction error on the held-out fold. Average the errors over the K folds for each \alpha, and select the \alpha that minimizes the average error.
  4. Once the best \alpha is chosen, plug it back into the cost complexity criterion of Step 2 to recover the optimal subtree.
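
The criterion minimized in Step 2 can be written out explicitly (notation follows the standard regression-tree setup): for each value of \alpha, pruning selects the subtree T of the large tree that minimizes

    \sum_{m=1}^{|T|} \sum_{i:\, x_i \in R_m} (y_i - \hat{y}_{R_m})^2 + \alpha |T|

where |T| is the number of terminal nodes of T, R_m is the region corresponding to the mth terminal node, and \hat{y}_{R_m} is the mean of the training responses in R_m. When \alpha = 0 the large tree itself is optimal; as \alpha grows, the penalty \alpha |T| favors smaller subtrees.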
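The four steps above can be sketched with scikit-learn, which implements cost complexity pruning through the `ccp_alpha` parameter of `DecisionTreeRegressor` (the exercise itself is library-agnostic, and the synthetic data and parameter values here are illustrative assumptions):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

# Toy synthetic regression data (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0, 0.1, size=200)

# Step 1: grow a large tree; min_samples_leaf is the stopping criterion.
big_tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=0).fit(X, y)

# Step 2: the pruning path gives the candidate values of alpha, each
# corresponding to one subtree in the cost-complexity sequence.
path = big_tree.cost_complexity_pruning_path(X, y)
alphas = path.ccp_alphas

# Step 3: K-fold cross-validation over the candidate alphas,
# averaging the held-out mean squared error across folds.
cv_mse = []
for a in alphas:
    tree = DecisionTreeRegressor(min_samples_leaf=5, ccp_alpha=a, random_state=0)
    scores = cross_val_score(tree, X, y, cv=5, scoring="neg_mean_squared_error")
    cv_mse.append(-scores.mean())
best_alpha = alphas[int(np.argmin(cv_mse))]

# Step 4: refit on the full training data with the chosen alpha
# to obtain the optimal subtree.
final_tree = DecisionTreeRegressor(
    min_samples_leaf=5, ccp_alpha=best_alpha, random_state=0
).fit(X, y)
```

Since `ccp_alpha >= 0`, the final tree can never have more terminal nodes than the unpruned tree from Step 1.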

Some important definitions: