Phase segmentation of X-ray computer tomography rock images using machine learning techniques: an accuracy and performance study

Chauhan, Swarup; Rühaak, Wolfram; Anbergen, Hauke; Kabdenov, Alen; Freise, Marcus; Wille, Thorsten; Sass, Ingo

doi:https://doi.org/10.5194/se-7-1125-2016

Articles | Volume 7, issue 4

© Author(s) 2016. This work is distributed under
the Creative Commons Attribution 3.0 License.

Special issue:

Pore-scale tomography & imaging - applications, techniques...

Research article | 19 Jul 2016

Abstract. The performance and accuracy of machine learning techniques for segmenting rock grains, matrix and pore voxels from a 3-D volume of X-ray tomographic (XCT) grayscale rock images were evaluated. The segmentation and classification capabilities of unsupervised (k-means, fuzzy c-means, self-organizing maps), supervised (artificial neural networks, least-squares support vector machines) and ensemble (bagging and boosting) classifiers were tested using XCT images of andesite volcanic rock, Berea sandstone, Rotliegend sandstone and a synthetic sample. The averaged porosities obtained for andesite (15.8 ± 2.5 %), Berea sandstone (16.3 ± 2.6 %), Rotliegend sandstone (13.4 ± 7.4 %) and the synthetic sample (48.3 ± 13.3 %) are in very good agreement with the respective laboratory measurements and vary by a factor of 0.2. The k-means algorithm is the fastest of all the machine learning algorithms tested, whereas the least-squares support vector machine is the most computationally expensive. The metrics entropy, purity, root mean square error, receiver operating characteristic curve and 10-fold cross-validation were used to determine the accuracy of the unsupervised, supervised and ensemble classifier techniques. In general, the accuracy was found to be largely affected by the feature vector selection scheme. Because there is always a trade-off between performance and accuracy, it is difficult to isolate one particular machine learning algorithm that is best suited for the complex phase segmentation problem. Our investigation therefore provides parameters that can help in selecting the appropriate machine learning technique for phase segmentation.
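To illustrate the kind of workflow the abstract describes, the following is a minimal sketch of k-means phase segmentation followed by a porosity estimate. It is not the authors' implementation: the function names (`kmeans_1d`, `make_synthetic_volume`, `porosity_from_kmeans`), the use of raw grayscale intensity as the only feature, the two-phase synthetic volume and the assumption that pores form the darkest cluster are all illustrative choices made here.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=50):
    """Plain Lloyd's k-means on scalar intensities; returns (labels, centers)."""
    values = np.asarray(values, dtype=float).ravel()
    # Initialise the centers at evenly spaced quantiles of the intensities.
    centers = np.quantile(values, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        # Assign every voxel to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Recompute each center as the mean of its members (keep empty ones).
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment so that labels are consistent with the returned centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers


def make_synthetic_volume(shape=(20, 20, 20), pore_fraction=0.3, seed=0):
    """Hypothetical two-phase grayscale volume: dark pores, bright solid."""
    rng = np.random.default_rng(seed)
    truth = rng.random(shape) < pore_fraction  # True marks a pore voxel
    volume = np.where(truth, 40.0, 160.0) + rng.normal(0.0, 10.0, shape)
    return volume, truth


def porosity_from_kmeans(volume, k=2):
    """Porosity as the fraction of voxels in the darkest k-means cluster."""
    labels, centers = kmeans_1d(volume, k)
    pore_label = np.argmin(centers)  # assumption: pores are the darkest phase
    return float(np.mean(labels == pore_label))


if __name__ == "__main__":
    volume, truth = make_synthetic_volume()
    print("estimated porosity:", porosity_from_kmeans(volume))
    print("true pore fraction:", float(truth.mean()))
```

Segmenting grains, matrix and pores as in the study would need `k=3` and, most likely, a richer feature vector than intensity alone, since the abstract notes that accuracy depends strongly on the feature vector selection scheme.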

Received: 29 Feb 2016 – Discussion started: 01 Apr 2016 – Revised: 14 Jun 2016 – Accepted: 24 Jun 2016 – Published: 19 Jul 2016