Visualizations created from Machine Learning Predictions
In addition to adding the calculated column to the data table, the application creates the following templates for you:
Model Analysis
When the calculations are complete, Analytics Explorer opens on the Model Analysis tab. This tab has two charts:
- Input Distributions: Displays a histogram for each input variable showing the number of rows of input data for each value in the variable.
- Predicted Optimal Input Summary: Displays a box chart for each input column showing key statistical measures (the Q1, median, and Q3 values). The actual values are listed below each chart.
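As a rough illustration of what these two charts summarize, the sketch below computes row counts per value (the histogram) and the Q1, median, and Q3 values (the box chart) for a single input column. The column values and function names are hypothetical, and the quartile method the application uses is not stated here; this example assumes the inclusive method of Python's standard `statistics.quantiles`.

```python
from collections import Counter
from statistics import quantiles


def value_counts(values):
    """Count rows per distinct value, as a histogram over discrete values would."""
    return dict(sorted(Counter(values).items()))


def quartile_summary(values):
    """Return (Q1, median, Q3). The inclusive quantile method is an assumption."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return q1, median, q3


# Hypothetical input column (for example, stages per well).
stage_counts = [20, 22, 22, 25, 25, 25, 28, 30, 30]

print(value_counts(stage_counts))      # rows per value
print(quartile_summary(stage_counts))  # Q1, median, Q3
```

Different quartile conventions (inclusive versus exclusive) can give slightly different Q1 and Q3 values on small samples, so numbers computed this way may not match the chart exactly.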
Note that each template produces its own visualizations tailored to specific workflows. Online resources and templates include the following:
- Analytics Explorer 2022.1
- Production Prediction Workflow
- Geometric Well Spacing
- Log Coverage template
Predicted Optimal Input Distributions
The idea behind the predicted optimal distributions is that you select an output column that you want to maximize (for example, crude output over a well’s first 12 months). If you choose to perform the feature analysis, the algorithm gives you a numeric range (on the box chart) and a visual distribution for each important variable (defined as being inside the top 90% of total importance). These distributions show the input values that are predicted to lead to the highest output values for the selected label column.
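One plausible reading of "inside the top 90% of total importance" is a cumulative cutoff: rank the variables by importance and keep them until their combined share reaches 90%. The sketch below shows that interpretation only. The function name, the feature names, and the scores are hypothetical, and the application's actual selection rule may differ.

```python
def important_features(importances, threshold=0.90):
    """Keep features, most important first, until their cumulative share
    of total importance reaches `threshold` (an assumed selection rule)."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        if cumulative >= threshold * total:
            break  # the features kept so far already cover the threshold
        selected.append(name)
        cumulative += score
    return selected


# Hypothetical importance scores for five input variables.
scores = {
    "proppant_per_ft": 0.40,
    "lateral_length": 0.30,
    "fluid_per_ft": 0.15,
    "stage_spacing": 0.10,
    "azimuth": 0.05,
}

print(important_features(scores))
```

With these scores, the first three variables cover 85% of the total, so the fourth is included to pass 90% and the fifth is dropped.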
The numeric range is useful at a glance, but the distributions help show how strong the relationship is. The tighter the distribution, the smaller the confidence interval, and the more confident you can be that those values will produce high output values. If the distribution is very wide or multimodal, the underlying relationship might not be as strong.
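To make "tight" versus "wide" concrete, the sketch below compares the interquartile range (Q3 minus Q1) of two hypothetical predicted-optimal distributions. The interquartile range is used here only as a simple stand-in for spread. It is not the application's own measure, and it does not detect multimodal shapes, which still need a visual check.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1), using the inclusive quantile method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical predicted-optimal values for two input variables.
tight = [98, 99, 100, 100, 100, 101, 102]
wide = [60, 75, 90, 100, 110, 125, 140]

print(iqr(tight))  # small spread: values cluster around one optimum
print(iqr(wide))   # large spread: the optimum is less well defined
```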