Data table and column selection

To get started, select the following in the Machine Learning Predictions dialog box:

Data table

This is the table on which you are going to run the calculations. Supported tables are Log Data, Grids, Zone, Horizon, and Interval.

By default, all data in the selected table is used. To limit the data to rows that meet some criterion, for example a specific producer, check Limit data using markings. The predictions are then based only on the marked data.
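The effect of Limit data using markings can be pictured as a simple row filter. The sketch below is only an illustration and not the application's implementation; the row layout, the "marked" flag, and the function name rows_for_prediction are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical rows; the field names are assumptions, not the product's schema.
rows: List[Dict] = [
    {"well": "W-1", "producer": "A", "marked": True},
    {"well": "W-2", "producer": "B", "marked": False},
    {"well": "W-3", "producer": "A", "marked": True},
]


def rows_for_prediction(table: List[Dict], limit_to_marked: bool) -> List[Dict]:
    """Return every row by default, or only the marked rows when limiting."""
    if not limit_to_marked:
        return list(table)
    return [row for row in table if row["marked"]]


all_rows = rows_for_prediction(rows, limit_to_marked=False)
marked_rows = rows_for_prediction(rows, limit_to_marked=True)
```

With the option off, all three rows are used; with it on, only the two marked rows for producer A remain.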

When the calculations are finished, two columns are written to the Data table:

Filled <column name>(prediction algorithm): keeps any existing data in the column being calculated and fills the remaining rows with predicted values.

Predicted <column name>(prediction algorithm): replaces all data with predicted values.

Comparing the original values with the predicted values can help you judge the accuracy and validity of the predictions.
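The relationship between the two output columns can be sketched in a few lines. This is an illustration under stated assumptions, not the application's code: missing values are represented here as None, and the function name filled_column and the sample numbers are invented for the example.

```python
from typing import List, Optional, Sequence


def filled_column(
    original: Sequence[Optional[float]], predicted: Sequence[float]
) -> List[float]:
    """Keep each original value where one exists; otherwise use the prediction."""
    if len(original) != len(predicted):
        raise ValueError("original and predicted must have the same length")
    return [p if o is None else o for o, p in zip(original, predicted)]


# Hypothetical data: None marks a row with no original value.
original = [10.0, None, 30.0, None]
predicted = [11.0, 19.5, 28.0, 41.0]  # the "Predicted" column: a value for every row

filled = filled_column(original, predicted)  # the "Filled" column
```

Rows 1 and 3 keep their original values in the Filled column, while rows 2 and 4 take the predicted values. The Predicted column holds a prediction for every row, which is what makes a row-by-row comparison with the original values possible.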

Attribute to calculate

Available attributes are listed. Note that you cannot run calculations on certain columns.

Input attributes

Select the attributes (columns) to use in the calculations. In general, select columns that are relevant to the output column and have enough values to contribute to the model. You can use columns with text values, but a text column with more than 50 unique strings is not used in the calculations.
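To check in advance whether a text column would be excluded, the 50-unique-strings rule can be approximated as below. This is a rough sketch: how the application treats empty values, letter case, or surrounding whitespace is not stated here, so ignoring None values is an assumption, as is the function name is_usable_text_column.

```python
from typing import Iterable, Optional

MAX_UNIQUE_STRINGS = 50  # limit stated above


def is_usable_text_column(values: Iterable[Optional[str]]) -> bool:
    """Return True when the column has at most 50 distinct strings.

    Assumption: missing values (None) do not count as a distinct string.
    """
    distinct = {value for value in values if value is not None}
    return len(distinct) <= MAX_UNIQUE_STRINGS


# Hypothetical columns for illustration.
formation = ["Upper", "Lower", "Upper", None, "Middle"]
well_names = [f"Well-{i}" for i in range(51)]  # 51 distinct strings

formation_ok = is_usable_text_column(formation)
well_names_ok = is_usable_text_column(well_names)
```

A low-cardinality column such as a formation name passes the check, while a near-unique identifier such as a well name does not.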

You can save your selections as a template for use in another run.

Attribute analysis helps you determine the attributes required to yield the lowest mean absolute error (MAE).
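For reference, MAE is conventionally computed as the mean of the absolute differences between known and predicted values. The sketch below shows that standard calculation; it is not taken from the application, and skipping rows without a known value (None) is an assumption made for the example.

```python
from typing import Optional, Sequence


def mae(actual: Sequence[Optional[float]], predicted: Sequence[float]) -> float:
    """Mean of |actual - predicted| over rows that have a known actual value."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a is not None]
    if not pairs:
        raise ValueError("no rows with a known actual value")
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


# Hypothetical values: absolute errors are 2, 2, 3 and 1; the None row is skipped.
actual = [10.0, 20.0, None, 30.0, 40.0]
predicted = [12.0, 18.0, 25.0, 33.0, 39.0]

error = mae(actual, predicted)  # (2 + 2 + 3 + 1) / 4 = 2.0
```

A lower MAE means the predictions are, on average, closer to the known values, in the same units as the attribute being calculated.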

These are the only selections you need to make. The Prediction Algorithm & Settings defaults generally produce the best results. The remaining options are not required, but they can help minimize the mean absolute error and the number of required input attributes.

Other options include:

Hyperparameter Tuning

Attribute Analysis

Impact Plots

Optimal Training Dataset