1. Introduction
The findings of this study offer a more accurate representation of orebodies based on MWD data, resulting in an order of magnitude increase in spatial resolution compared to RC and diamond drill hole-based geological models. This advancement has been achieved without the need for additional exploration drilling. The proposed approach holds promise for mine technical services personnel seeking cost-effective and high-resolution delineation of subsurface rock conditions, thereby improving the efficiency and productivity of mining production.
2. Methods
2.1. Geological Setting
The current work investigates a single pit within the Brockman Formation (BR). A combination of 12 diamond core drill holes and 211 RC drill holes was used to characterize the pit’s subsurface geological conditions. The diamond and RC holes totaled 1089 and 16,880 drill meters, with average depths of approximately 90 m and 80 m per hole, respectively. Field observations were used to log rock type, weathering profile, rock strength, stratigraphic unit, and Geological Strength Index (GSI). No further data engineering was needed on the resource-definition data because these datasets had already been scrutinized through the mining company’s internal assurance procedures.
2.2. Geotechnical Field Observation Categories
2.3. MWD Systems
Multiple drilling variables were recorded by the MWD system, including the penetration rate (rop; m/s), the torque (tor; Nm), the force on bit (fob; kgf), the bit air pressure (bap; kgf/cm²), and rotations per minute (rpm). The rpm data were available for only about a quarter of the sample points due to inconsistencies in the onboard sensor and were therefore excluded from the investigated drilling measurements. Both manually operated rigs and semi-autonomous drills collected MWD data at approximately 0.1 m intervals along the depth of each drill hole.
This study analyzed the MWD dataset from the BR pit, encompassing 75,470 blast holes with a combined depth of 844,855 m. The analysis focused on MWD data from 2 m below the hole collars to the bottom of the blast holes, as the uppermost 2 m may not reliably represent in situ rock conditions due to potential toe charge effects from the blasting of the previous bench.
MWD Feature Engineering
Feature engineering of the MWD data was required in this investigation. To limit the influence of collaring effects at the top of each hole and of potential blast damage from previous holes on the representation of the in situ rock, the first 2 m of each drill hole were excluded from the initial MWD dataset.
Negative drilling values, which arise from sensor calibration issues, temporary signal loss, or data logging errors rather than from actual negative drilling responses, were eliminated. Such anomalies can also occur due to sudden rig stoppages, incorrect zeroing of sensors, or transient fluctuations in the onboard MWD data acquisition system.
Anomalous values were identified with a quartile-based detection technique using a threshold factor of 1.5, and the resulting gaps were filled by linear interpolation. A Gaussian filter with a smoothing factor of 0.3 was then applied to the drilling data to reduce the local effects of noise.
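The cleaning steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `clean_mwd_channel` and the parameter `sigma_samples` are hypothetical, the quartile rule is assumed to be the usual Q1 - 1.5·IQR / Q3 + 1.5·IQR fence, and the Gaussian width used here is not a mapping of the paper's smoothing factor of 0.3.

```python
import numpy as np

def clean_mwd_channel(values, iqr_factor=1.5, sigma_samples=2.0):
    """Clean one MWD channel logged along a single blast hole (hypothetical helper)."""
    x = np.asarray(values, dtype=float).copy()

    # 1. Negative readings are treated as sensor or logging artefacts.
    x[x < 0] = np.nan

    # 2. Flag values outside the quartile fence [Q1 - k*IQR, Q3 + k*IQR].
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    outlier = (x < q1 - iqr_factor * iqr) | (x > q3 + iqr_factor * iqr)
    x[outlier] = np.nan

    # 3. Fill the removed samples by linear interpolation along the hole.
    idx = np.arange(x.size)
    good = ~np.isnan(x)
    x = np.interp(idx, idx[good], x[good])

    # 4. Gaussian smoothing; sigma_samples is an assumed width in samples.
    radius = int(np.ceil(3 * sigma_samples))
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma_samples) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(x, radius, mode="edge")  # repeat end values to avoid edge droop
    return np.convolve(padded, kernel, mode="valid")
```

Applied per hole and per channel (rop, tor, fob, bap), the function returns an array of the same length as its input, so the 0.1 m sample spacing is preserved.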
The interval-based data of the MWD and exploration drilling datasets were transformed into point data, incorporating geospatial coordinates along with the corresponding dataset values for each data point. For exploration holes, point data were derived from downhole wireline-logged desurvey data, which recorded the azimuth and dip of each hole at 10 m intervals down to the final depth. In contrast, blast hole MWD data were not desurveyed due to the production-oriented nature of the holes; instead, each point’s location was estimated by assuming a straight trajectory from the hole collar to its bottom. To fuse these datasets, the K-Nearest Neighbors (KNN) distance-based search method was applied to match each exploration data point to its closest MWD data point, producing labeled samples for supervised ML. The accuracy of the dataset alignment was further refined by implementing distance thresholds.
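The two geometric steps in this paragraph, placing MWD samples along an assumed straight hole and pairing each labeled exploration point with its nearest MWD sample, can be illustrated as follows. This is a sketch under assumptions: the function names, the brute-force distance search, and the `max_dist` value are illustrative and are not taken from the study.

```python
import numpy as np

def straight_hole_points(collar, toe, depths):
    """XYZ positions at given along-hole depths, assuming a straight collar-to-toe path."""
    collar = np.asarray(collar, dtype=float)
    toe = np.asarray(toe, dtype=float)
    length = np.linalg.norm(toe - collar)
    frac = np.asarray(depths, dtype=float) / length
    return collar + frac[:, None] * (toe - collar)

def match_nearest(label_xyz, mwd_xyz, max_dist):
    """Index of the nearest MWD point for each labeled point, or -1 if beyond max_dist."""
    label_xyz = np.asarray(label_xyz, dtype=float)
    mwd_xyz = np.asarray(mwd_xyz, dtype=float)
    # Brute-force pairwise Euclidean distances, shape (n_labels, n_mwd).
    diff = label_xyz[:, None, :] - mwd_xyz[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(len(label_xyz)), nearest]
    nearest[nearest_dist > max_dist] = -1  # apply the distance threshold
    return nearest, nearest_dist
```

The brute-force search is only practical for small examples; for a dataset of the size described here, a spatial index such as a k-d tree would normally replace the pairwise distance matrix.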
2.4. Feature Selection Algorithms
Previous work on determining the most important features in MWD data has relied solely on PCA. This research instead applies dedicated feature selection to establish the importance of the drilling variables for the following geotechnical categories: rock type, weathering intensity, stratigraphic unit, Geological Strength Index, and rock strength. For this purpose, two non-parametric approaches, MRMR and ReliefF, were applied to the pre-processed BR dataset. These techniques rank features in different ways and do so without making assumptions about the relationships between the variables.
where the mutual information, I, quantifies the relationship between the two variables, x and y. This relationship is defined in terms of their joint probability distribution, p(xi,yj), and the corresponding marginal probabilities, p(xi) and p(yj). Mutual information provides a measure of how much each MWD variable shares with a geotechnical classification, and with every other MWD variable. In addition, the principle of minimum redundancy aims to select features that are maximally dissimilar to each other. Minimal redundancy makes the selected features a better representation of the full dataset and also determines the relative importance among MWD variables.
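For discrete variables, the standard definition consistent with the symbols above is I(x, y) = Σi Σj p(xi,yj) log[p(xi,yj) / (p(xi) p(yj))]. The sketch below computes it from counts and then ranks features greedily by relevance minus mean redundancy. It is an illustration only: the difference form of the MRMR score is an assumption (some implementations use a quotient instead), continuous MWD channels would first need to be binned, and the function names are hypothetical.

```python
import numpy as np

def mutual_information(x, y):
    """I(x; y) in nats for two discrete (integer-coded or pre-binned) 1-D arrays."""
    _, xi = np.unique(np.ravel(x), return_inverse=True)
    _, yi = np.unique(np.ravel(y), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # contingency counts
    joint /= joint.sum()                     # joint probabilities p(xi, yj)
    px = joint.sum(axis=1, keepdims=True)    # marginal p(xi)
    py = joint.sum(axis=0, keepdims=True)    # marginal p(yj)
    nz = joint > 0                           # skip empty cells (0 * log 0 = 0)
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mrmr_rank(features, target):
    """Greedy MRMR ordering of a {name: array} dict (difference criterion assumed)."""
    names = list(features)
    relevance = {n: mutual_information(features[n], target) for n in names}
    selected = [max(names, key=relevance.get)]
    while len(selected) < len(names):
        def score(n):
            redundancy = np.mean(
                [mutual_information(features[n], features[s]) for s in selected]
            )
            return relevance[n] - redundancy
        remaining = [n for n in names if n not in selected]
        selected.append(max(remaining, key=score))
    return selected
```

A feature that duplicates an already selected one receives a high redundancy penalty and drops down the ranking, even if it is as relevant to the target as the original.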
where Wji represents the weight of predictor Fj at the i-th iteration, while pyr and pyq represent the prior probabilities of the classes to which xr and xq belong, respectively. The variable m indicates the number of iterations, Δj(xr,xq) measures the difference in predictor Fj between observations xr and xq, and xrj and xqj correspond to the values of predictor j for observations xr and xq, respectively.
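The weight update described here can be illustrated with a simplified ReliefF pass. The sketch assumes the commonly used form of the algorithm: nearest neighbors from the same class (hits) lower a predictor's weight by its range-scaled difference, while neighbors from other classes (misses) raise it, scaled by pyq / (1 − pyr). Uniform weighting of the k neighbors, the Manhattan distance, and the choice to iterate once over every observation (m = n) are assumptions of this example, and `relieff_weights` is a hypothetical name.

```python
import numpy as np

def relieff_weights(X, y, k=3):
    """Simplified ReliefF weights for numeric predictors (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                      # constant columns get zero difference
    Xs = (X - X.min(axis=0)) / span            # |difference| now equals Delta_j
    classes, y_idx, counts = np.unique(
        np.ravel(y), return_inverse=True, return_counts=True
    )
    prior = counts / n                         # class prior probabilities
    w = np.zeros(p)
    for r in range(n):
        delta = np.abs(Xs - Xs[r])             # per-predictor differences to x_r
        dist = delta.sum(axis=1)               # Manhattan distance to x_r
        for c in range(len(classes)):
            members = np.flatnonzero(y_idx == c)
            members = members[members != r]    # never compare x_r with itself
            if members.size == 0:
                continue
            nearest = members[np.argsort(dist[members])[:k]]
            mean_delta = delta[nearest].mean(axis=0)
            if c == y_idx[r]:
                w -= mean_delta / n            # hits: penalise within-class spread
            else:                              # misses: reward between-class spread
                w += prior[c] / (1.0 - prior[y_idx[r]]) * mean_delta / n
    return w
```

Predictors that separate the classes end with clearly positive weights, while uninformative predictors stay near zero.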
2.5. Classification-Based ML Methods
The effectiveness of the various models was compared using three specific measures: Accuracy, Overall Misclassification Cost (OMC), and Training Duration (TD).
- i. Accuracy: this measure indicates the proportion of successful predictions made by the classification model. It is determined by dividing the number of correct predictions by the total number of predictions made.
- ii. OMC: this is the total cost accumulated from incorrect predictions made by the model, computed by combining the misclassification cost matrix with the corresponding confusion matrix.
- iii. TD: this denotes the length of time it takes for the model to complete the training phase.
where TN (True Negatives) represents instances correctly identified as not belonging to the class, while TP (True Positives) refers to instances accurately classified as part of the positive class. Conversely, FP (False Positives) denotes incorrect classifications where non-class instances are mistakenly labeled as positive, and FN (False Negatives) represents cases where positive instances are incorrectly predicted as negative.
where CostMi is the misclassification cost matrix and ConfMi is the confusion matrix for the respective model.
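Both measures can be computed directly from a confusion matrix. The sketch below assumes the standard definition of accuracy, (TP + TN) / (TP + TN + FP + FN), which for a multi-class problem is the diagonal sum divided by the total count, and it assumes that "combining" the two matrices means summing their element-wise product. The example counts and the unit cost for every error are invented for illustration.

```python
import numpy as np

def accuracy(conf):
    """Correct predictions (diagonal) divided by all predictions."""
    conf = np.asarray(conf, dtype=float)
    return float(np.trace(conf) / conf.sum())

def overall_misclassification_cost(cost, conf):
    """Sum of the element-wise product of the cost and confusion matrices (assumed form)."""
    return float((np.asarray(cost, dtype=float) * np.asarray(conf, dtype=float)).sum())

# Hypothetical 3-class confusion matrix: rows are true classes, columns are predictions.
conf = np.array([[50, 3, 2],
                 [4, 40, 1],
                 [0, 5, 45]])
cost = 1 - np.eye(3)  # assumed cost matrix: 0 for correct, 1 for any error

print(accuracy(conf))                              # 0.9
print(overall_misclassification_cost(cost, conf))  # 15.0
```

With unit costs the OMC is simply the number of misclassified samples; a non-uniform cost matrix would penalise some confusions, for example between very different strength classes, more heavily than others.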
4. Discussion
This study highlights the effectiveness of classification-based machine learning techniques in predicting geotechnical property classes from MWD data. With these methods, the spatial resolution of rock mechanics characterization improves by more than an order of magnitude compared with resource development drilling techniques. While this study focused on five geotechnical data categories (stratigraphic unit, rock or soil strength, rock type, GSI, and weathering properties), the approach has the potential to be expanded to other categorical orebody knowledge datasets. For example, a higher-resolution understanding of grade, trace contaminants, alteration intensity, and mineralogy, as well as of other rock mass classification systems, including rock mass rating, rock quality designation, or Q, would reduce uncertainty and increase mining confidence.
This study also evaluated the performance of models in predicting geotechnical categorical properties. The selection of the machine learning analytical model significantly influenced prediction results. This was evident through improved validation and testing accuracy, reduced training time, and lower validation and testing OMCs. DTs, LDA, and NB performed the weakest across the five geotechnical datasets while KNN and RFs displayed the strongest results, consistently above 90% for validation and testing accuracy for correct class identifications. Furthermore, KNN was quicker to train than RFs. For example, KNN, at 3 s, was over 20 times faster than RFs, at 64 s, for rock type. These results indicate that KNN is both the strongest and most computationally efficient model to predict geotechnical classification properties.
This study focuses on conventional ML approaches because of their interpretability and practical application in mining operations, but future research may explore deep learning methods to enhance classification performance. Deep learning models can capture complex, nonlinear relationships in datasets, which may further refine classification accuracy; however, they often function as “black boxes,” limiting their practical use in mining operations where explainability is critical. The trade-off between accuracy and interpretability therefore remains a key consideration for real-time geotechnical decision-making.
This study assumes that MWD data are of sufficient quality and reliability for geotechnical classification, with sensor calibration and data preprocessing adequately mitigating noise and inconsistencies. The approach is most applicable to structured iron ore deposits with well-characterized geological formations, and additional validation may be required for different lithologies. Furthermore, MRMR and ReliefF identified the most influential MWD variables, but their importance may vary based on site-specific conditions.
This machine learning approach is intended to complement, rather than replace, traditional geotechnical testing, which remains essential for geotechnical validation and compliance. While the models can improve spatial resolution and provide real-time insights, they should be used and interpreted in conjunction with conventional methods, such as laboratory strength tests, geophysical wireline logging, and geological mapping. Balancing AI-driven insights with field validation is crucial for robust geotechnical characterization and a comprehensive understanding of subsurface conditions.
This study demonstrated the success of a classification-based ML technique for geotechnical classification problems but also supports the valuable role of subject matter expert oversight in complementing ML studies regarding instances of misclassification, especially concerning materials with close or overlapping properties.
5. Conclusions
The application of classification-based ML techniques in conjunction with innovative datasets, such as MWD data, has introduced fresh opportunities in the field of rock mechanics characterization. This work provides evidence for the efficacy of ML techniques in estimating geotechnical conditions. Additionally, it highlights the improvements in the characterization of rock mechanics properties beyond the scale achieved by the traditional resource development methods. The MRMR and ReliefF feature selection methods support a balanced integration of the drilling features in multivariate analysis instead of depending solely on a single feature.
Moreover, a comprehensive assessment of diverse machine learning models revealed clear differences in their predictive capabilities. The KNN and RF algorithms demonstrated superior performance, routinely obtaining validation and testing accuracies exceeding 90%. The short training duration of KNN compared with that of RFs highlights its computational efficiency. Nevertheless, it is important to acknowledge that these results are closely linked to the underlying data distributions within the geotechnical classifications.
Daniel Goldstein www.mdpi.com