Prediction Algorithms & Settings
The prediction algorithms are listed in order from highest to lowest with respect to correlation and accuracy. These algorithms are off-the-shelf, meaning that they are widely available and do not require adjustment. Explaining the details of each algorithm is beyond the scope of this document; however, general descriptions are provided below. Each prediction algorithm has its own set of advanced algorithm settings. The defaults will generally yield the best results.
Gradient boosting tree
Gradient boosting tree is an ensemble of weak prediction models, typically decision trees. It builds the model in a stage-wise fashion, like other boosting methods, and generalizes them by allowing optimization of an arbitrary differentiable loss function. At each learning iteration, the model tries to correct the mistakes made in the previous iteration. The Tweedie loss function allows additional weight to be placed on zero values.
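As an illustration of how a gradient boosting tree model is trained and evaluated, here is a minimal sketch using scikit-learn; the synthetic dataset and hyperparameter values are invented for this example and are not the product's defaults.

```python
# Illustrative sketch only; dataset and hyperparameters are invented.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each boosting stage fits a shallow tree to the residual errors
# left by the previous stages.
model = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                  max_depth=3, random_state=0)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # R^2 on held-out data
```

Note that scikit-learn's `GradientBoostingRegressor` does not expose a Tweedie loss at the time of writing; libraries such as LightGBM offer one (`objective='tweedie'`) for targets with many zeros.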
Random forest
Like gradient boosting tree, random forest is an ensemble learning method for classification and regression. It operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees. Random decision forests correct for decision trees' habit of overfitting to their training set, and have been shown to perform very well in many machine learning applications.
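The majority-vote behavior described above can be sketched with scikit-learn (an assumed library choice; the dataset is synthetic and invented for illustration):

```python
# Illustrative sketch only; dataset and settings are invented.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree votes; the forest predicts the majority class (the mode).
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))  # accuracy on held-out data
```

Because each tree is trained on a bootstrap sample with a random subset of features considered at each split, the individual trees' overfitting tends to average out.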
Linear regression
Two linear regression algorithms are available.
Assuming the relationship between the predicted curve and the input curves is linear, linear regression algorithms try to fit the model such that the cost function is minimized. Normally, least squares error is used as the cost function.
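The least-squares idea can be shown directly with NumPy, without any machine learning library; the true slope and intercept below are invented for illustration:

```python
# Minimal least-squares sketch; the generating values are invented.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=100)  # true slope 3, intercept 2

# Fit y = a*x + b by minimizing the sum of squared errors.
A = np.column_stack([x, np.ones_like(x)])
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)  # recovered coefficients, close to 3 and 2
```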
The algorithm settings vary depending on the prediction algorithm. In general, the defaults yield the best results. However, if you have knowledge of machine learning, you can adjust these settings.
When you are fitting a model to training data, you must strike a balance between underfitting and overfitting.
If the model overfits, it performs extraordinarily well on the training data but does not generalize well to new data. If it underfits, it does not give accurate or useful predictions for any data set.
The following definitions explain these concepts:
Fitting a model
Fitting a model means making your algorithm learn the relationship between the predictors and the outcome so that you can predict future values of the outcome. The best-fitted model has a specific set of parameter values that best describes the problem at hand.
Overfitting
Overfitting occurs when a statistical model or machine learning algorithm captures the noise of the data. In other words, the model or the algorithm fits the training data too well.
Underfitting
Underfitting occurs when a statistical model or machine learning algorithm cannot capture the underlying trend of the data. In other words, the model or the algorithm does not fit the data well enough.
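A quick way to see underfitting and overfitting side by side is to vary a decision tree's depth on synthetic data; this sketch assumes scikit-learn, and the dataset and depth values are invented for illustration:

```python
# Sketch of underfitting vs. overfitting; data and depths are invented.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=400, n_features=6, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for depth in (1, 6, None):  # too shallow, moderate, unlimited
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # R^2 on the data the tree saw versus data it did not.
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    print(depth, scores[depth])
```

The depth-1 stump typically scores poorly on both sets (underfitting), while the unlimited tree fits the noisy training data almost perfectly but drops noticeably on the test data (overfitting); a moderate depth strikes the balance.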