Research article 31 Mar 2020
Research article  31 Mar 2020
Towards plausible lithological classification from geophysical inversion: honouring geological principles in subsurface imaging
 ^{1}Centre for Exploration Targeting (School of Earth Sciences), University of Western Australia, 35 Stirling Highway, 6009 Crawley, Australia
 ^{2}The International Centre for Radio Astronomy Research, University of Western Australia, 7 Fairway, 6009 Crawley, Australia
 ^{3}ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D)
 ^{1}Centre for Exploration Targeting (School of Earth Sciences), University of Western Australia, 35 Stirling Highway, 6009 Crawley, Australia
 ^{2}The International Centre for Radio Astronomy Research, University of Western Australia, 7 Fairway, 6009 Crawley, Australia
 ^{3}ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D)
Correspondence: Jérémie Giraud (jeremie.giraud@uwa.edu.au)
Hide author detailsCorrespondence: Jérémie Giraud (jeremie.giraud@uwa.edu.au)
We propose a methodology for the recovery of lithologies from geological and geophysical modelling results and apply it to field data. Our technique relies on classification using selforganizing maps (SOMs) paired with geoscientific consistency checks and uncertainty analysis. In the procedure we develop, the SOM is trained using prior geological information in the form of geological uncertainty, the expected spatial distribution of petrophysical properties and constrained geophysical inversion results. We ensure local geological plausibility in the lithological model recovered from classification by enforcing basic topological rules through a process called “postregularization”. This prevents the threedimensional recovered lithological model from violating elementary geological principles while maintaining geophysical consistency. Interpretation of the resulting lithologies is complemented by the estimation of the uncertainty associated with the different nodes of the trained SOM. The application case we investigate uses data and models from the Yerrida Basin (Western Australia). Our results generally corroborate previous models of the region but they also suggest that the structural setting in some areas needs to be updated. In particular, our results suggest the thinning of one of the greenstone belts in the area may be related to a deep structure not sampled by surface geological measurements and which was absent in previous geological models.
The idea of geoscientific integration is not new and has been advocated since the inception of quantitative geoscientific studies involving geophysics and during the advances of geophysics as a discipline (see, for instance, Wegener, 1920; Nettleton, 1949; Towles, 1952; Jupp and Vozoff, 1975; Lines et al., 1988; Li and Oldenburg, 2000). In the natural resource sector, the exploitation of the fundamental complementarity between geology and geophysics in modelling the same object (the Earth) has been recognized as one of the prerequisites to exploration success as early as the 1940s during the early years of the Society of Exploration Geophysicists (Eckhardt, 1940; Green, 1948). Numerous authors have since tackled the issue of integrating petrophysical and geological information to model geophysical quantities (seismic velocities, mass density, etc.) through inversion, with an increasing trend in the past 15 years or so (see, for instance, references reviewed in Lelièvre and Farquharson 2016; Meju and Gallardo 2016; Moorkamp et al., 2016; Giraud et al., 2017). In contrast, the recovery of geological quantities from geophysical inversion has seen much less effort. Recent studies have started to rectify this by proposing the idea of lithological differentiation of inversion results (Paasche et al., 2010; Sun and Li, 2015; Paasche and Tronicke, 2007), which consists of the identification of lithologies from inversion results. While lithological differentiation is expected to hold much potential in mineral exploration, it still remains underexplored (Li et al., 2019).
In oil and gas exploration scenarios, seismic facies analyses and classification using techniques developed for what is commonly called machine learning (neural networks, support vector machine algorithms, etc.) have become popular in recent years (Zhao et al., 2015; Chopra and Marfurt, 2018; Wrona et al., 2018; Zhang et al., 2018). This was driven by the need for quantitative interpretation methods in the geosciences and by the “renaissance” phase machine learning went through after 2006 (Goodfellow et al., 2016, chap. 5). Once recovered, spatial facies distribution can be used for geological interpretation and downstream decision making. However, like all modelling results, the identification of facies or lithologies using machine learning relies upon statistical models and is affected by ambiguity and uncertainty. One reason for this is that validation datasets are usually treated as the ground “truth”, while they are fraught with uncertainty. For instance, the interpretation of borehole data or outcrops with their uncertainty can lead to significantly different models honouring geological measurements equally well (Wellmann et al., 2010; de la Varga et al., 2019; PakyuzCharrier et al., 2018a, b, c).
Lithologies (or facies) can be characterized by a broad range of rock properties that are the result of geological processes, such as weathering, compaction, metamorphism and deformation. These physical processes are usually nonlinear, especially when in combination, and produce complex representations of different lithotypes which are difficult to discriminate from geophysical data. In this context, one possible solution is to use neural networks for lithological classification, as they are “universal approximators” (van der Baan and Jutten, 2000). As a consequence, however, lithological classification is affected by uncertainty from the data used to train and validate the algorithm. Such uncertainty is difficult to quantify and is rarely estimated or even considered.
To date, whether it be in oil and gas or mining exploration, uncertainty in recovered lithologies is a research avenue, which, to the best of our knowledge, only a few authors have addressed. Sun and Li (2019) assess uncertainty by varying the number of clusters in their lithological differentiation scheme, and Bauer et al. (2003) classify lithologies and estimate the resolution of their results using synthetic data. As a result of the lack of comprehensive uncertainty analyses, practitioners often lack quantitative, robust uncertainty modelling necessary to inform interpretation or risk evaluation (Jessell et al., 2018). In addition, apart from Zhao et al. (2017), who account for seismic datadriven stratigraphy in their seismic facies classification, established workflows relying on neural networks to identify facies or lithologies in three dimensions give little to no consideration of geological information and rules for their classification.
To complement existing methodologies, we propose a solution that partially addresses the lack of consideration given to geological information during classification. We introduce a general postprocessing (i.e. postinversion) workflow for the recovery of lithologies from geoscientific modelling results and the estimation of the related uncertainty. For this purpose, we complement existing classification techniques by ensuring the geological and geophysical consistency of, and estimating the confidence in, the recovered lithologies. Using an artificial neural network trained in a fully controlled environment (all variables in the model used for training being perfectly known) with attributes characterizing the inversion results, we perform lithological classification applying plausibility filters relying on geological principles, which we refer to as “geological postregularization”. The application of geological postregularization is to reduce the nongeological character of models obtained through classification. After classification, we calculate the frequentist probability of the different lithologies (i.e. apparition frequency relative to all lithologies) associated with each unit of the selforganizing map (SOM) and report it in each model cell discretizing the studied area for interpretation.
The methodology we propose can serve two main objectives. Our first objective is to introduce a methodology that is made efficient by leveraging existing geoscientific inputs and prior information, and costeffective by imposing requirements that do not exceed the computational power available on a personal computer. Secondly, our aim is to complement inversion workflows by providing a general, automated method to derive a nondeterministic lithological interpretation of inversion results. Thirdly, we propose a realworld application based on a case study in the Yerrida Basin (Western Australia), where we build upon recent work by Giraud et al. (2019a) and Lindsay et al. (2018), who performed the geophysical inversion and geological modelling of data collected in the area, respectively.
In this work, lithologies are identified though a classification technique relying on a simple artificial neural network. We chose SOMs (Kohonen, 1982a, b), a wellestablished algorithm that has been successfully applied to seismic facies classification and geological mapping purposes (Chang et al., 2002; Klose, 2006; Köhler et al., 2010; Bauer et al., 2012; Carneiro et al., 2012; Du et al., 2015; Roden et al., 2015). We first train and test the SOMs using data extracted from a semisynthetic dataset (i.e. a geophysical inversion feasibility study based on geological and petrophysical field data) assumed to represent the geophysical characteristics of the studied area. We utilize this controlled environment to estimate the accuracy our predictions for each class identified in the studied volume without the errors associated with well positioning or lithology interpretation errors. We then use the trained network to perform classification using field data only. We obtain, for each model cell, a suite of frequentist probabilities for each lithology observed in the area. In both cases, geological postregularization is applied before the calculation of uncertainty metrics to ensure the geological consistency of the results.
The rest of this paper develops as follows. Section 2 provides the theoretical background necessary to reproduce the work presented. It first briefly describes the geophysical and geological modelling schemes (Sect. 2.1) used to obtain the models that are used as input for classification using SOM (Sect. 2.2). Postregularization as applied to such classified lithologies (Sect. 2.3) and the related uncertainty analysis in terms of prediction accuracy and geophysical consistency is then detailed (Sect. 2.4). Following this, Sect. 3 presents an application case using data from the Yerrida Basin (Western Australia), which was investigated using gravity data, petrophysical and geological information. Geological and geophysical modelling results are first summarized and the rules defining the postregularization operator in the area are introduced (Sect. 3.1). The classification of results from geological and geological modelling and postregularization is then presented alongside the related uncertainty analysis, supporting a potential reinterpretation of the geological model of the area (Sect. 3.2). The discussion and conclusion sections follow and complete this contribution.
In this section, we first introduce essential information about the geophysical and geological modelling used as a prerequisite to this study. We then introduce the utilization of SOM and the tools we developed in sufficient detail to allow the reproducibility of the procedure.
2.1 Geophysical and geological modelling
Inverse geophysical modelling was performed using the leastsquare inversion TOMOFASTX platform. This inversion platform enables the use of a series of constraints as detailed in Martin et al. (2018), Giraud et al. (2019a, b). Constraints are enforced through a minimumstructure gradient regularization approach where weights vary locally accordingly with geological uncertainty (Giraud et al., 2019a). The cost function to minimize is given as
where d represents observed data and m is the model; g is the forward operator calculating the predicted data m produces; m_{p} is the prior model. In Eq. (1), subscripts d, m and s refer to data, model and smoothness, respectively. In this contribution, W_{d} is a diagonal matrix where each element is equal to the inverse of the sum of squares of the geophysical measurements; W_{m} and W_{s} are diagonal covariance matrices; here, W_{m} is the identity matrix. The scalars α_{m} and α_{s} are weights controlling the relative importance of the different terms in the equation; ∇ is the spatial gradient operator. The last term of the Eq. (1), the smoothness term, constrains the structural features of the inverted model. The values in diagonal matrix W_{s} are determined from prior information. In the presented work, W_{s} is obtained from geological modelling results and is a proxy for geological uncertainty
The matrix W_{s} is calculated following Giraud et al. (2019a), who use the probabilistic geological modelling approach described in PakyuzCharrier et al. (2018b, c, 2019). In the case of gravity inversion as presented here, the complete Bouguer anomaly of density contrast model m is calculated as the product of the Jacobian matrix G with model m. Therefore, we have g(m)=Gm.
Geological uncertainty is estimated from probabilistic geological modelling. During this process, an ensemble of geological models is generated using the Monte Carlo uncertainty estimator (MCUE) of PakyuzCharrier et al. (2018a, b, c, 2019). MCUE relies on the perturbation of orientation measurements (interfaces and foliations) defining structures of a reference geological model accordingly with their uncertainty. From this series of models, geological uncertainty can be estimated (Wellmann and RegenauerLieb, 2012) through calculation of Shannon's entropy (Shannon, 1948) for the simulated geological models. Shannon's entropy, which can be used as a proxy for geological uncertainty, indicates how well geological information constrains the model locally. It can be used to constrain inversion in a structural sense when integrated in W_{s} as per Eq. (1) (Giraud et al., 2019a).
More detailed information about the usage of MCUE results in geophysical inversion can be found in Giraud et al. (2017, 2018a, 2019a, b).
2.2 Classification using SOM
The SOM artificial neural network relies on competitive, nonsupervised learning. The relative simplicity and the efficiency of the SOM algorithm has made it a popular tool for classification, data imputation, visualization and dimensionality reduction (Vesanto and Alhoniemi, 2000; Kalteh et al., 2008; Miljkovic, 2017; Klose, 2006; Kohonen, 1998, 2013; Roden et al., 2015; Martin and Obermayer, 2009). In essence, it consists in the projection of the SOM's latent space onto a manifold of superior dimension (i.e. our dataset). This map, which can be 2D or 3D, is made of a predefined number of interconnected neurons (also referred to as “nodes” or “units”) that have a fixed network configuration. Projection occurs during the training phase, where the locations of the neurons in the manifold are iteratively adjusted so approximation is optimal.
In this study, we follow common practice by training twodimensional (2D) maps using a hexagonal lattice topology and applying a Gaussianshaped neighbourhood function. We chose to use a 2D map for the sake of simplicity after our testing revealed that other configurations did not improve results significantly. The hexagonal lattice topology seemed to provide better results than square lattice topology using the dataset we present here.
Ideally, the SOM should be trained in a controlled environment where all the variables used are perfectly known, which motivates the utilization of synthetic geophysical data. We calculate such data from a geological structural framework derived from realworld field geological and petrophysical field measurement data in the same fashion as for a geophysical feasibility study. The training and tests datasets are comprised of the following variables:

starting model for inversion m_{start};

inverted model m_{inv};

geological uncertainty W_{s};

spatial gradient in the inverted model ${\u2225\mathrm{\nabla}{m}_{\mathrm{inv}}\u2225}_{\mathrm{2}}$; and

most likely lithology obtained from geological modelling (training lithological model).
The starting model is obtained from prior information. Here, it is the expected petrophysical property model from geological modelling. Each datum from the training and tests datasets is a vector $x\phantom{\rule{0.125em}{0ex}}\mathit{\u03f5}\phantom{\rule{0.125em}{0ex}}{R}^{{n}_{\mathrm{v}}}$, with a number of variables n_{v}=5, where x(5) is the lithology assigned to this unit. This choice of training variables was motivated by the necessity to account for the available information in terms of geological modelling and measurements, uncertainty and structural setting. The starting model for inversion encapsulates the preinversion state of knowledge. The inverted model comprises the update of this model using information extracted from geophysical measurements and translated into a 3D model. The lithological model refers directly to the interpreted geological observations of the area. The spatial gradients of the inverted models provide structural information about the location of the geological units that can be recovered by interpretation from the inverted density contrast model.
During training, we examine SOM quality using quantization error Q and lithology prediction accuracy. Quantization error “measures the average distance between each data vector and its best matching unit [BMU]” (Uriarte and Martín, 2005), thereby indicating how well the different BMUs approximate the dataset. It can be interpreted as analogous to a misfit between calculated and observed data. The mean quantization error Q of the SOM is expressed as follows for n data vectors x_{i}:
where BMU_{i} is the BMU of x_{i}. In the application of SOM to field data, the trained map is used for the classification of inversion results, where inputs 1 through 4 listed above are obtained from previous modelling and lithologies are the quantity sought for. In our case, the utilization of SOM for partitioning the input models allows the recovery of lithology, which is a geological quantity reflective of all input data. It is also useful in that, as we will see later, the consistency of the recovered lithological model can be analysed from a geophysical point of view.
In the approach we follow, the optimum number of neurons (or units) is determined using the elbow curve of the mean quantization error Q (Eq. 2) of the trained SOM. Note that we apply the same principle as the wellknown Lcurve principle (Hansen and O'Leary, 1993; Hansen and Johnston, 2001; Santos and Bassrei, 2007) for the determination of optimum weights in leastsquare geophysical inversion. Here, we train the SOM using functions from the SOM Matlab toolbox implemented by Vatanen et al. (2015).
2.3 Geological postregularization
This subsection introduces the postregularization scheme used in this work and details its implementation and usage in the workflow introduced here.
2.3.1 Motivations
Geological rules have the potential to provide an important constraint on the classification of lithologies recovered from inversion. Such rules, like adjacency (Egenhofer and Herring, 1990), define which rock bodies can be in contact with each other and which cannot. These rules are typically expressed in geological terms as stratigraphy, where the relative age and event classification of geological units are stated. For example, a sedimentary depositional event of five separate units may define a simple subhorizontal layer cake configuration, where the oldest unit is never adjacent (or in contact) with the youngest unit. A magmatic event that follows may result in a vertical dyke that intrudes all sedimentary layers adjacent to all other rock units. Using geological rules as a constraint relies on finding those that are restrictive (such as the youngest unit never being in contact with the oldest) rather than permissive (such as the intruding dyke). Thiele et al. (2016), Pellerin et al. (2017) and Anquez et al. (2019) show how these can constrain parametric geological modelling. It is therefore important to honour geological rules if known and include them in classification schemes such as those to ensure that geological plausibility is not compromised in pursuit of an otherwise petrophysically and geophysically consistent model.
The process of postregularization, which consists in the application of spatial–contextual filters to the classification results to eliminate geologically unrealistic features, has been shown to increase prediction accuracy in surface (2D) geological mapping (Tarabalka et al., 2009; Stavrakoudis et al., 2014; Cracknell and Reading, 2015).
2.3.2 Implementation
The postregularization scheme we develop for the recovery of lithologies in 3D relies on two hypotheses. Firstly, we assume that the presence of isolated lithologies contradicts the geological principle of continuity. Although such postregularization has been used mostly in 2D or shallow 3D, there is no theoretical obstacle to the extension of this methodology to the purely 3D classification case we present here. Secondly, we introduce the utilization of adjacency relationships between the different lithologies in postregularization to ensure that base topological rules are respected across the entirety of the threedimensional volume. This is particularly important for structural geological interpretation (Freeman et al., 2010; Godefroy et al., 2019). Here, we extend existing postregularization approaches (i.e. Tarabalka et al., 2009; Stavrakoudis et al., 2014; Cracknell and Reading, 2015) by integrating geological information in the classification analysis in the form of topological relationships (see Egenhofer and Herring, 1990; Zlatanova, 2000; Thiele et al., 2016, for the different topologies) defined by geological principles.
The general formulation of postregularization is as follows, for a given model cell:
where arg min returns the argument k satisfying the conditions it precedes. Here, we set
where 1 is n_{l}×1 column vector of ones, and ${\mathrm{0}}_{{n}_{\mathrm{l}},{n}_{\mathrm{l}}}$ is the n_{l}×n_{l} matrix of zeros; M^{f} and M^{k} are adjacency matrices, and M^{k}⋅M^{f} is the Hadamard product of M^{k} and M^{f}, such that ${\left({\mathbf{M}}^{k}\cdot {\mathbf{M}}^{f}\right)}_{ij}={\left({\mathbf{M}}^{k}\right)}_{ij}{\left({\mathbf{M}}^{f}\right)}_{ij}$, and ${\mathrm{0}}_{{n}_{\mathrm{l}},{n}_{\mathrm{l}}}$ is the zero matrix of dimension n_{l}×n_{l}; M^{f} encapsulates geological knowledge and principles about contacts between lithologies; M^{k} contains the adjacency relationships between the considered cell and its neighbourhood U. The derivation of M^{f} and M^{k} is detailed below.
The first part of the condition in Eq. (4) is enforced using morphological closing where isolated lithologies are replaced by the most prevalent one in their neighbourhood (see Benavent et al., 2012; AckoraPrah et al., 2015). In such cases, the BMU is updated as follows. Isolated cells (in terms of their lithology) are identified through examination of the 26cell 3D cubic Moore neighbourhood U of every model cell. A cell is considered isolated if, and only if, at least 25 cells of U have a lithology that differs from it. Once such a cell is identified, its BMU is updated using the closest neuron, ensuring continuity between adjacent cells in the neighbourhood U, where the lithology to be assigned is determined by a majority vote in U subject to adjacency conditions. These conditions are determined by geological knowledge (as explained below). This process is repeated for all locations until the lithological model stops changing. The general principle of postregularization is illustrated in Fig. 1.
We point out that in contrast to Tarabalka et al. (2009) and Stavrakoudis et al. (2014), who used the first and second Chamfer neighbourhoods in 2D around the considered model cell, we do not follow the same approach in 3D. Our implementation of the extension of their approach to 3D showed that, in our application case study, the adjustments of the recovered lithological model it imposes are detrimental to the consistency of the classification with geophysical measurements. That is, the perturbation of the corresponding geophysical response of the model it generates exceeds noise level and compromises the geophysical validity of the recovered model (see geophysical validation subsection below for more details). The same remark applies to the utilization of a mode filter with a $\mathrm{3}\times \mathrm{3}\times \mathrm{3}$ kernel.
The conditions relating to adjacency relationships forces the model to honour adjacency relationships extracted from surface geology (Burns, 1988; Thiele et al., 2016) in the recovered lithological model.
We determine lithological topology by identifying the contacts between adjacent model cells and represent the topological signature of lithological models using the adjacency representation of Godsil and Royle (2001). Let the adjacency matrix M of a given cell be defined as
where n_{ij} is the number of contacts between lithologies i and j. Similarly, geological laws and knowledge allow the derivation of a matrix M^{f} defined as follows:
From there, it is straightforward to identify occurrences of forbidden contacts by calculating M_{k}⋅M^{f}. Therefore, ${\mathrm{1}}^{\mathbf{T}}{\mathbf{M}}_{k}\cdot {\mathbf{M}}^{f}\mathrm{1}=\mathrm{0}$ (Eq. 4) indicates that no contact violating the condition imposed by M^{f} is observed. The last condition in Eq. (4) can be used to prevent the local stratigraphy (in the Moore neighbourhood U of the considered cell) from violating geological rules such as “lithology B must be in conformable sequence between lithologies A and C”.
After identification of configurations forbidden by the conditions set in Eq. (4), its BMU is updated using the closest neuron honouring the set of conditions (Fig. 2, box 4a).
The next stage of the methodology we introduce is the calculation of the apportionment of each neuron in terms of the lithologies of the testing data vectors (from the synthetic survey) they predict (Fig. 2, box 5a). For instance, in a twolithology scenario, a given node may be found to predict lithology A using the validation dataset correctly 80 % of the time (80 % accuracy) and lithology B correctly 20 % of the time (20 % accuracy). This process is described below.
The methodology is summarized in Fig. 2.
2.4 Uncertainty analysis
2.4.1 Prediction accuracy of the recovered lithologies
The prediction accuracy τ_{i} of lithology $i\mathit{\u03f5}[\mathrm{1},\mathrm{\dots},{n}_{\mathrm{l}}]$ (with n_{l} the total number of lithologies) is the ratio of correct predictions to the total number of predictions. It is obtained from the “matching matrix” of the recovered lithologies M^{c}. We remind that the “matching matrix” (or “confusion matrix” in supervised learning) is a matrical representation of the number of occurrences of true/false positives/negatives.
We use the prediction accuracy τ_{i} as a metric measuring the capability of the node i of the trained SOM to recover lithologies. Let τ_{i} be expressed as
which is a particular case of the overall accuracy τ:
From Eq. (7), it appears that τ_{i} is equivalent to the frequentist probability of the ith lithology over the entire SOM. When considering a specific node j, it becomes the relative frequentist probability of the lithology i for cells classified as having the jth node as their BMU, noted τ_{ij}.
2.4.2 Geophysical consistency
The consistency of the classification performed using SOM after application of postregularization with field geophysical measurements might be altered by both the classification itself and by postregularization. It is therefore necessary to ensure that the approximation of the dataset by values from the units of SOM is consistent with geophysical measurements. To this end, we verify that the geophysical response of the density contrast model m_{SOM} corresponding to the BMUs of each cell in the studied area fits the field measurement d within a certain tolerance assumed to approximate noise level. Consequently, we ensure that the difference between m_{inv} and m_{SOM}, $\mathrm{\Delta}m=\phantom{\rule{0.125em}{0ex}}{m}_{\mathrm{inv}}\phantom{\rule{0.125em}{0ex}}{m}_{\mathrm{SOM}}$, satisfies the following condition:
where tol is the threshold depending on noise levels in the data above which m_{inv} and m_{SOM} are not considered geophysically equivalent.
The implication of Eq. (9) is that the difference Δm belongs to the null space of the inverse problem considered. The null space is characteristic of geophysical inversion's nonuniqueness. It is defined as the ensemble of models that reproduce geophysical data with a comparable misfit. The models honouring Eq. (9) can therefore be considered equivalent from a geophysical data point of view (Muñoz and Rath, 2006; Chen et al., 2007; Deal and Nolet, 1996). For qualitative assessment of geophysical consistency, we complement the utilization of Eq. (9) with the visual comparison of data misfit maps corresponding to the model corresponding to m_{inv} and m_{SOM}.
3.1 Survey setting
This subsection introduces and summarizes the geological and geophysical context of the application case presented here. More details about the geology of the area and the initial geophysical inversion can be found in Giraud et al. (2019a) and Lindsay et al. (2018, 2020).
3.1.1 Geological and geophysical setting
The Paleoproterozoic Yerrida Basin is located in the southern part of the Capricorn Orogen (WA) and covers approximately 150 km N–S and 180 km E–W (Pirajno and Adamides, 2000) (Fig. 3a). The structures of interest in this work are Archean greenstone belts (Fig. 3), as they are prospective for Au and Ni and underlie the younger basin rocks. The basement to the Yerrida Basin is considered to be Archean granite–gneiss or greenstone rocks of the Yilgarn Craton. Lithospheric extension initiated the formation of the Yerrida Basin at approximately 2200 to 1990 Ma with deposition of the Windplain Group (Occhipinti et al., 2017; Pirajno and Adamides, 2000). The Goodin Inlier remains exposed in the central part of the basin and is in unconformable contact with the Windplain Group. A hiatus ensued, followed by deposition of the younger Mooloogool Group, which was then overlain in the east by the Tooloo Group of the Earaheedy Basin.
The density contrast of the lithologies observed in the area ranges between 0 and 330 kg m^{−3}, making it appropriate for gravity modelling and inversion (Giraud et al., 2019a; Lindsay et al., 2018). While basin rocks exhibit some density contrast, the greenstone is conspicuous in gravity data with a density contrast expected to lie between 190 and 270 kg m^{−3}, making it an attractive subject for gravity inversion. Field geological measurements (orientation data in the form of interfaces and foliations) and petrophysical data were used to build the reference geological model. Airborne geophysical data, Landsat and Aster 8 satellite data were also used to support the interpretation of geological measurements.
The gravity anomaly dataset we consider (Fig. 3b) is comprised of a total of 4882 measurement points. The model is discretized into $\mathrm{100}\times \mathrm{100}\times \mathrm{42}$ cells of dimensions 2.335 km × 1.875 km × 1.0475 km down to approximately 44 km depth. Weights and parameters used for the inversion of synthetic data follow the settings of Giraud et al. (2019a) on field data.
3.1.2 Geological modelling and synthetic geophysical survey
This subsection introduces the semisynthetic survey we performed for the training of SOM.
The volumes of most probable lithology, W_{S} and the starting model are shown in Fig. 4. Volumes shown in Fig. 4a, b and c are used for the training and validation dataset for SOM training as explained in Sect. 2.2.
We use the modelling results shown in Fig. 4 to calculate a synthetic geophysical dataset (Fig. 5d). The model used to generate the synthetic geophysical measurements is shown in Fig. 5a. The corresponding inverted model and its gradients are shown in Fig. 5b and c. Volumes shown in Fig. 5b and c are used for training and validation.
3.1.3 Field geophysical data inversion
The density contrast model obtained from inversion of field geophysical data and its gradients are shown in Fig. 6a and b. Visual comparison of inverted models shown in Figs. 6a and 5b reveals that mesoscale structures are similar with the exception of large structures presenting lowdensity contrasts at depth (darker shades of blue in Fig. 6a). This is reflected in Fig. 6b, which exhibits low gradient values in these areas.
The classification of lithologies using SOM is performed applying the trained network to volumes shown in Figs. 4b, c and 6a, b. The next subsections describe the geological laws used for postregularization (Sect. 3.1.4) and the classification process (Sect. 3.2.2).
3.1.4 Geological rules for postregularization
As mentioned above, for clarity in this demonstration, only adjacency relationships are considered. The utilization of simple relationships such as adjacency is also chosen because in areas of sparse data, a full description of geological rules (fault relationships with fault and stratigraphy) is often not known. Given the complexity of the Yerrida Basin and its magmatic and deformation history, several base geological rules can be derived to assess the plausibility of recovered lithological models. Using fundamental geological principles (such as uniformitarianism, superposition, Walther's law, crosscutting relationships and original horizontality), the two most likely restrictive adjacency rules are as follows. We assume that the mafic greenstone bodies cannot be in contact with the Killara Formation (in the Mooloogool Group) since our field data suggest that the Killara Formation is a volcanic unit that is restricted to the Yerrida Basin and thus not in contact with the mafic greenstone. In addition, we assume that the mafic greenstone cannot be in contact with the Goodin Inlier and background (or basement), as the mafic greenstone is modelled to be enveloped by the felsic component of the greenstone.
The matrix M^{f} defining the contacts forbidden by geology as described above is given in Fig. 7.
3.2 Geologically constrained SOM classification and uncertainty analysis
3.2.1 Training the neural network
In this work, we use approximately 500 neurons (units) for the training of SOM. This number is inferred from the analysis of the elbow curve we calculated using the validation datasets (see Fig. 8) and approximately matches the proposed value of $\mathrm{5}\sqrt{N}$ (with N the total number of observation) proposed by Vesanto and Alhoniemi (2000) and commonly used since (Shalaginov and Franke, 2015). The chosen number of units is corroborated by the lithology prediction accuracy (Fig. 8) as approximating the point of diminishing returns, i.e. the number of nodes beyond which additional nodes are becoming nearly redundant.
The trained map presents a mean quantization Q equal to 0.075, which indicates relatively good approximation of the datasets by the trained SOM. This is illustrated by Fig. 8, where all lithologies are recovered with a prediction accuracy superior to 90 %.
3.2.2 Classification and postregularization
After classification of the recovered model, we performed postregularization to remove geologically unrealistic features from the classified lithological volume. Figure 9 shows the classification results before and after postregularization, along with the associated adjacency matrix.
As can be inferred from the adjacency matrices plotted in Fig. 9, the number of contacts between units with indices 1 and 2, respectively, is reduced by the application of postregularization. One reason for this is the presence of a number of inclusions of lithology 1 in lithology 2, and vice versa; a total of 2561 such inclusions was identified. Overall, the number of contacts between lithologies 5 and 2 increased slightly due to postregularization because a large percentage of contacts between lithologies 5 and 1 has been reassigned as contacts between lithologies 5 and 2. The elimination of contacts between lithologies 5 and 6 is also visible in Fig. 9.
3.2.3 Estimated confidence in recovered lithologies
For each node of the SOM, we calculate the prediction accuracy τ_{i} (Eq. 3) for the different lithologies observed in the area using the crossvalidation dataset. After application of the trained SOM for the classification of inversion results obtained from the inversion of field geophysical data, we obtain a frequentist probability volume for each lithology. Figure 10 shows the resulting frequentist probability volumes for the six lithologies present in the Yerrida Basin.
Figure 10 exhibits probabilities between 0.3 and 0.6 in the area marked by the two arrows. This suggests that these zones are the least well constrained. Note that from Fig. 10, we can interpret the presence of mafic greenstone with confidence, as it shows high frequentist probability nearly everywhere classification suggests its presence. For completeness, assessment of the prediction accuracy of the different lithologies is shown by the corresponding box plot in Appendix B.
3.2.4 Geophysical null space validation and implications for geological interpretation
Applying Eq. (9), we obtain ${\u2225{W}_{\mathrm{d}}\mathbf{G}\mathrm{\Delta}m\phantom{\rule{0.125em}{0ex}}\u2225}_{\mathrm{2}}^{\mathrm{2}}=\mathrm{6}\times {\mathrm{10}}^{\mathrm{4}}$, indicating that the model can be considered geophysically equivalent overall. This is illustrated by the map of the corresponding data misfits (see Fig. B1 in Appendix B), which indicates that we can consider the recovered lithological model after application of postregularization as reflective of both geophysical and geological information. Focusing on the mafic greenstone belts of interest in the area (Fig. 11), the classification results allow us to propose the following geological interpretations.
Figure 11 shows that the northern portion of the greenstone belts A and B recovered by geophysical inversion and SOM classification is thinner in their northern part than was proposed by the initial geological model. Likewise, greenstone belt C seems to be much thinner near its centre than expected. Given the data density and lack of understanding we have about the depth of this greenstone belt, this observation is plausible. It confirms and refines considerably the crude, preliminary lithology differentiation of Giraud et al. (2019a) that was based only on density contrast value. The cause of the thinning of mafic greenstone belt C could be attributed to faulting, folding or the topography of the palaeoenvironment where the protoliths to the belt were formed, which are not captured directly by surface geology. The portion of the southern Merrie Greenstone Belt (mafic greenstone C) is shown to be thinner than expected, prompting a review of the structure of existing models. A plausible reason is the presence of structure that has not been identified from the initial interpretation of geophysical data. In addition, the potential presence of deeppenetrating faults or shear zones, as shown in Fig. 11b by the arrows around greenstone C, hints at a possible false assumption that the Merrie Greenstone Belt is a single and coherent geological body.
The application of the technique presented here is not restricted to usage of the particular geophysical or geological modelling schemes generating the modelling inputs to this study. The methodology we introduced is general and any different standalone geophysical and geological modelling schemes could also be used.
The work presented here relied on SOM, which can be seen as an extension of the kmeans and cmeans clustering algorithms used for lithological differentiation (Paasche and Tronicke, 2007; CarterMcAuslan et al., 2015; Sun and Li, 2015, 2016; Maag and Li, 2018; Ward et al., 2014; Singh and Sharma, 2018), with which it shares a number of characteristics. We can therefore assume that our findings may hold true for these techniques.
We have shown that the utilization of postregularization can be effective for increasing geological realism in the recovered lithological models while preserving the geophysical validity of the corresponding model. The geological principles we used to design our postregularization operator apply to lithological topology and focus on the adjacency relationship between cells. Ideally, postregularization should also consider the surface area of contacts and their topology. This could be followed by, for instance, a 3D extension of the geological modelediting approach of Anquez et al. (2019) to produce genuine geological models honouring age relationships, stratigraphic principles, etc. Provided that the resulting models honour Eq. (9), this approach would ensure that while they are geophysically valid, they can be readily used for interpretation or by commercial or noncommercial geological modelling engines, reservoir simulations, etc. without further processing.
We also believe that postregularization can be successfully applied to other clustering techniques. In addition, the implementation of postregularization presented here can be readily applied to existing classification, regardless of the classification algorithm used, as it only adjusts the classification using spatial–contextual features in the classified model and could assist the geological characterization of inversion results (Melo et al., 2017).
The example we have shown uses a covariance matrix W_{s} (Eq. 1) that results from geological modelling. It is used as a proxy for the uncertainty about our knowledge in terms of structural geology. Such prior information could also be derived from techniques other than geological modelling such as prior geophysical modelling, be it using the same or different geophysical methods. While we do not address the uncertainty in the density model directly, we assume that nonuniqueness and measurement uncertainty affect both field data and synthetic data in the same manner due to the noise component and parameterization of each being the same.
An important result produced here involves the identification of regions which do not adequately conform to the initial model parameters (Fig. 11). While this issue remains unresolved, the capability of our method to identify problematic regions is useful to drive reinterpretation of data, consideration of additional models and, eventually, increased geological knowledge of the target.
Future work may include the generation of multiple lithological models using the trained SOM and the frequentist probability volume associated with it. By selecting models belonging to the null space of the geophysical data (i.e. satisfying Eq. 9), we expect that this would allow the identification of a series of a few archetypes that would be representative of the various datasets used in the geoscientific modelling workflow.
We have introduced a postinversion classification technique relying on SOM that enables the recovery of lithologies, the corresponding frequentist probability voxet thereby remediating to some of the limitations of deterministic inversion. The proposed technique utilizes a postregularization scheme enforcing elementary geological principles to the recovered lithological model while maintaining geophysical validity. We have applied this new methodology to the Yerrida Basin (Western Australia) and shown how it improves the geological plausibility of the recovered model. Results allowed us to confirm previous results and bring new insights into possible reinterpretation of the geometry of prospective greenstone belts.
In Fig. A1, data misfit and the absolute data misfit difference (Fig. A1b and c, respectively) show values which, in places, are relatively high but which are in line with the gravity data inversions. Figure A1a and b show similar features to the exception of a patch in the central part of the model (northing $\sim \mathrm{7.09}\times {\mathrm{10}}^{\mathrm{6}}$ m – easting $\sim \mathrm{7.7}\times {\mathrm{10}}^{\mathrm{5}}$ m), where Fig. A1b shows values that are approximately 1.5 mGal higher. Figure A1c shows misfit differences generally on the order of, or lower than, 2 mGal. Also note that there are places where Fig. A1b shows lower misfit than Fig. A1a.
The input and output of the synthetic survey are made available at https://doi.org/10.5281/zenodo.3522841 by Giraud (2019). The field data from the Yerrida Basin are made available at https://doi.org/10.5281/zenodo.3522841 (Giraud et al., 2018b).
JG designed the methodology, adapted the SOM algorithm and performed all modelling except geological modelling. MG is the main writer of the manuscript, which was redacted with support from the rest of the authors. ML performed geological modelling and interpretation of the recovered lithologies. MJ provided guidance and supervision while the project was being carried out. VO assisted in the development of the parts of the methodology relating to geophysics and the writing of the paper on aspects relating to SOM.
The authors declare that they have no conflict of interest.
Vitaliy Ogarko acknowledges the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3D (ASTRO 3D) for supporting some of his research efforts. Finally, the authors thank Evren PakyuzCharrier and Roland Martin for interesting discussions relating to topics covered in this paper.
Mark Jessell was supported by a Western Australian fellowship. Mark Lindsay was supported by the Geological Survey of Western Australia and the Exploration Incentive Scheme and the Australia Research Council (grant no. DE190100431). Part of this work has been supported by an Australian Government International Postgraduate Research Scholarship. The authors acknowledge partial financial support from the MinEx Cooperative Research Centre. The research presented here has been supported, in part, by LP170100985: Loop – Enabling Stochastic 3D Geological Modelling, funded by the Australian Research Council and supported by Monash University, University of Western Australia, Geoscience Australia, the Geological Surveys of Western Australia, Northern Territory, South Australia and New South Wales, as well as the Research for Integrative Numerical Geology, the Université de Lorraine, RWTH Aachen, the Geological Survey of Canada, the British Geological Survey, the Bureau de Recherches Géologiques et Minières (French Geological Survey) and AuScope.
This paper was edited by Michal Malinowski and reviewed by Tom Horrocks and one anonymous referee.
AckoraPrah, J., Ayekple, Y. E., Acquah, R. K., Andam, P. S., Sakyi, E. A., and Gyamfi, D.: Revised Mathematical Morphological Concepts, Adv. Pure Math., 5, 155–161, https://doi.org/10.4236/apm.2015.54019, 2015.
Anquez, P., Pellerin, J., Irakarama, M., Cupillard, P., Lévy, B., and Caumon, G.: Automatic correction and simplification of geological maps and crosssections for numerical simulations, C. R. Geosci., 351, 48–58, https://doi.org/10.1016/j.crte.2018.12.001, 2019.
Bauer, K., Schulze, A., Ryberg, T., Sobolev, S. V., and Weber, M. H.: Classification of lithology from seismic tomography: A case study from the Messum igneous complex, Namibia, J. Geophys. Res.Sol. Ea., 108, 1–15, https://doi.org/10.1029/2001JB001073, 2003.
Bauer, K., Muñoz, G., and Moeck, I.: Pattern recognition and lithological interpretation of collocated seismic and magnetotelluric models using selforganizing maps, Geophys. J. Int., 189, 984–998, https://doi.org/10.1111/j.1365246X.2012.05402.x, 2012.
Benavent, X., Dura, E., Vegara, F., and Domingo, J.: Mathematical Morphology for Color Images: An ImageDependent Approach, Math. Probl. Eng., 2012, 1–18, https://doi.org/10.1155/2012/678326, 2012.
Burns, K.: Lithologic topology and structural vector fields applied to subsurface prediction in geology, in: Proceedings of GIS/LIS 88, ACSMASPRS, San Antonio, 25–34, 1988.
Carneiro, C. D. C., Fraser, S. J., Crósta, A. P., Silva, A. M., and de M. Barros, C. E.: Semiautomated geologic mapping using selforganizing maps and airborne geophysics in the Brazilian Amazon, Geophysics, 77, K17–K24, https://doi.org/10.1190/geo20110302.1, 2012.
CarterMcAuslan, A., Lelièvre, P. G., and Farquharson, C. G.: A study of fuzzy c means coupling for joint inversion, using seismic tomography and gravity data test scenarios, Geophysics, 80, W1–W15, https://doi.org/10.1190/geo20140056.1, 2015.
Chang, H.C., KopaskaMerkel, D. C., and Chen, H.C.: Identification of lithofacies using Kohonen selforganizing maps, Comput. Geosci., 28, 223–229, https://doi.org/10.1016/S00983004(01)00067X, 2002.
Chen, J., Hoversten, G. M., Vasco, D., Rubin, Y., and Hou, Z.: A Bayesian model for gas saturation estimation using marine seismic AVA and CSEM data, Geophysics, 72, WA85–WA95, https://doi.org/10.1190/1.2435082, 2007.
Chopra, S. and Marfurt, K. J.: Seismic facies classification using some unsupervised machinelearning methods, in SEG Technical Program Expanded Abstracts 2018, Society of Exploration Geophysicists, 2056–2060, 2018.
Cracknell, M. J. and Reading, A. M.: SpatialContextual Supervised Classifiers Explored: A Challenging Example of Lithostratigraphy Classification, IEEE J. Sel. Top. Appl. Earth Obs. Remote, 8, 1–14, https://doi.org/10.1109/JSTARS.2014.2382760, 2015.
Deal, M. M. and Nolet, G.: Nullspace shuttles, Geophys. J. Int., 124, 372–380, https://doi.org/10.1111/j.1365246X.1996.tb07027.x, 1996.
de la Varga, M., Schaaf, A., and Wellmann, F.: GemPy 1.0: opensource stochastic geological modeling and inversion, Geosci. Model Dev., 12, 1–32, https://doi.org/10.5194/gmd1212019, 2019.
Du, H., Cao, J., Xue, Y., and Wang, X.: Seismic facies analysis based on selforganizing map and empirical mode decomposition, J. Appl. Geophys., 112, 52–61, https://doi.org/10.1016/j.jappgeo.2014.11.007, 2015.
Eckhardt, E. A.: Partnership between geology and geophysics in prospecting for oil, Geophysics, 5, 209–214, https://doi.org/10.1190/1.1441804, 1940.
Egenhofer, M. and Herring, J.: Categorizing binary topological relations between regions, lines, and points in geographic databases, The, 1–28, available at: https://pdfs.semanticscholar.org/b303/39af3f0be6074f7e6ac0263e9ab34eb84271.pdf (last access: 18 March 2020), 1990.
Freeman, B., Boult, P. J., Yielding, G., and Menpes, S.: Using empirical geological rules to reduce structural uncertainty in seismic interpretation of faults, J. Struct. Geol., 32, 1668–1676, https://doi.org/10.1016/j.jsg.2009.11.001, 2010.
Giraud, J.: Synthetic geophysical survey using geological modelling from the Yerrida Basin (Western Australia), https://doi.org/10.5281/zenodo.3522841, 2019.
Giraud, J., PakyuzCharrier, E., Jessell, M., Lindsay, M., Martin, R., and Ogarko, V.: Uncertainty reduction through geologically conditioned petrophysical constraints in joint inversion, Geophysics, 82, ID19–ID34, https://doi.org/10.1190/geo20160615.1, 2017.
Giraud, J., PakyuzCharrier, E., Ogarko, V., Jessell, M., Lindsay, M., and Martin, R.: Impact of uncertain geology in constrained geophysical inversion, ASEG Ext. Abstr., 2018, 1–6, https://doi.org/10.1071/ASEG2018abM1_2F, 2018a.
Giraud, J., Lindsay, M., and Ogarko, V.: Yerrida Basin Geophysical Modeling – Input data and inverted models, https://doi.org/10.5281/zenodo.1238216, 2018b.
Giraud, J., Lindsay, M., Ogarko, V., Jessell, M., Martin, R., and PakyuzCharrier, E.: Integration of geoscientific uncertainty into geophysical inversion by means of local gradient regularization, Solid Earth, 10, 193–210, https://doi.org/10.5194/se101932019, 2019a.
Giraud, J. J., Ogarko, V., Lindsay, M., PakyuzCharrier, E., Jessell, M., and Martin, R.: Sensitivity of constrained joint inversions to geological and petrophysical input data uncertainties with posterior geological analysis, Geophys. J. Int., 218, 666–688, https://doi.org/10.1093/gji/ggz152, 2019b.
Godefroy, G., Caumon, G., Laurent, G., and Bonneau, F.: Structural Interpretation of Sparse Fault Data Using Graph Theory and Geological Rules, Math. Geosci., 51, 1091–1107, https://doi.org/10.1007/s11004019098000, 2019.
Godsil, C. and Royle, G.: Algebraic Graph Theory, 1–18, available at: http://link.springer.com/10.1007/9781461301639_1 (last access: 18 March 2020), 2001.
Goodfellow, I., Bengio, Y., and Courville, A.: Deep Learning, MIT Press, Cambridge, MA, UWA, available at: http://www.deeplearningbook.org (last access: 18 March 2020), 2016.
Green, C. H.: Integration in exploration, Geophysics, 13, 365–370, https://doi.org/10.1190/1.1437404, 1948.
Hansen, P. C. and Johnston, P. R.: The LCurve and its Use in the Numerical Treatment of Inverse Problems, in Computational Inverse Problems in Electrocardiography, 119–142, available at: https://www.sintef.no/globalassets/project/evitameeting/2005/lcurve.pdf (last access: 18 March 2020), 2001.
Hansen, P. C. and O'Leary, D. P.: The Use of the LCurve in the Regularization of Discrete IllPosed Problems, SIAM J. Sci. Comput., 14, 1487–1503, https://doi.org/10.1137/0914086, 1993.
Jessell, M., Pakyuzcharrier, E., Lindsay, M., Giraud, J., and de Kemp, E.: Assessing and Mitigating Uncertainty in ThreeDimensional Geologic Models in Contrasting Geologic Scenarios, chap. 4, in: Metals, Minerals, and Society, 63–74, https://doi.org/10.5382/SP.21.04, 2018.
Jupp, D. L. B. and Vozoff, K.: Joint inversion of geophysical data, Geophys. J. R. Astron. Soc., 42, 977–991, 1975.
Kalteh, A. M., Hjorth, P., and Berndtsson, R.: Review of the selforganizing map (SOM) approach in water resources: Analysis, modelling and application, Environ. Model. Softw., 23, 835–845, https://doi.org/10.1016/j.envsoft.2007.10.001, 2008.
Klose, C. D.: Selforganizing maps for geoscientific data analysis: geological interpretation of multidimensional geophysical data, Comput. Geosci., 10, 265–277, https://doi.org/10.1007/s105960069022x, 2006.
Köhler, A., Ohrnberger, M., and Scherbaum, F.: Unsupervised pattern recognition in continuous seismic wavefield records using SelfOrganizing Maps, Geophys. J. Int., 182, 1619–1630, https://doi.org/10.1111/j.1365246X.2010.04709.x, 2010.
Kohonen, T.: Analysis of a simple selforganizing process, Biol. Cybern., 44, 135–140, https://doi.org/10.1007/BF00317973, 1982a.
Kohonen, T.: Selforganized formation of topologically correct feature maps, Biol. Cybern., 43, 59–69, https://doi.org/10.1007/BF00337288, 1982b.
Kohonen, T.: The selforganizing map, Neurocomputing, 21, 1–6, https://doi.org/10.1016/S09252312(98)000307, 1998.
Kohonen, T.: Essentials of the selforganizing map, Neural Networks, 37, 52–65, https://doi.org/10.1016/j.neunet.2012.09.018, 2013.
Lelièvre, P. G. and Farquharson, C. G.: Integrated Imaging for Mineral Exploration, in: Integrated Imaging of the Earth: Theory and Applications, American Geophysical union, 137–166, 2016.
Li, Y. and Oldenburg, D. W.: Incorporating geological dip information into geophysical inversions, Geophysics, 65, 148–157, https://doi.org/10.1190/1.1444705, 2000.
Li, Y., Melo, A., Martinez, C., and Sun, J.: Geology differentiation: A new frontier in quantitative geophysical interpretation in mineral exploration, Lead. Edge, 38, 60–66, https://doi.org/10.1190/tle38010060.1, 2019.
Lindsay, M., Occhipinti, S., Ramos, L., Aitken, A., and Hilliard, P.: An integrated view of the Yerrida Basin with implications for its architecture and mineral prospectivity, Crawley, 2018.
Lindsay, M., Occhipinti, S., Laflamme, C., Aitken, A., and Ramos, L.: Mapping undercover: integrated geoscientific interpretation and 3D modelling of a Proterozoic basin, Solid Earth Discuss., https://doi.org/10.5194/se2019192, in review, 2020.
Lines, L. R., Schultz, A. K., and Treitel, S.: Cooperative inversion of geophysical data, Geophysics, 53, 8–20, https://doi.org/10.1190/1.1442403, 1988.
Maag, E. and Li, Y.: Discretevalued gravity inversion using the guided fuzzy c – means clustering technique, Geophysics, 83, G59–G77, https://doi.org/10.1190/geo20170594.1, 2018.
Martin, R. and Obermayer, K.: SelfOrganizing Maps, in Encyclopedia of Neuroscience, Elsevier, 551–560, 2009.
Martin, R., Ogarko, V., Komatitsch, D., and Jessell, M.: Parallel threedimensional electrical capacitance data imaging using a nonlinear inversion algorithm and Lp normbased model, Measurement, 128, 428–445, https://doi.org/10.1016/j.measurement.2018.05.099, 2018.
Meju, M. A. and Gallardo, L. A.: Structural Coupling Approaches in Integrated Geophysical Imaging, American Geophysical union, 49–67, 2016.
Melo, A. T., Sun, J., and Li, Y.: Geophysical inversions applied to 3D geology characterization of an iron oxide coppergold deposit in Brazil, Geophysics, 82, K1–K13, https://doi.org/10.1190/geo20160490.1, 2017.
Miljkovic, D.: Brief review of selforganizing maps, in 2017 40th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), IEEE, Opatija, Croatia, 1061–1066, 2017.
Moorkamp, M., Heincke, B., Jegen, M., Hobbs, R. W., and Roberts, A. W.: Joint Inversion in Hydrocarbon Exploration, in Integrated Imaging of the Earth: Theory and Applications, American Geophysical union, 167–189, 2016.
Muñoz, G. and Rath, V.: Beyond smooth inversion: the use of nullspace projection for the exploration of nonuniqueness in MT, Geophys. J. Int., 164, 301–311, https://doi.org/10.1111/j.1365246X.2005.02825.x, 2006.
Nettleton, L. L.: Geophysics, geology and oil finding, Geophysics, 14, 273–289, https://doi.org/10.1190/1.1437535, 1949.
Occhipinti, S., Hocking, R., Lindsay, M., Aitken, A., Copp, I., Jones, J., Sheppard, S., Pirajno, F., and Metelka, V.: Paleoproterozoic basin development on the northern Yilgarn Craton, Western Australia, Precambrian Res., 300, 121–140, https://doi.org/10.1016/j.precamres.2017.08.003, 2017.
Paasche, H. and Tronicke, J.: Cooperative inversion of 2D geophysical data sets: A zonal approach based on fuzzy c – means cluster analysis, Geophysics, 72, A35–A39, https://doi.org/10.1190/1.2670341, 2007.
Paasche, H., Tronicke, J., and Dietrich, P.: Automated integration of partially colocated models: Subsurface zonation using a modified fuzzy c means cluster analysis algorithm, Geophysics, 75, P11–P22, https://doi.org/10.1190/1.3374411, 2010.
PakyuzCharrier, E., Giraud, J., Lindsay, M., and Jessell, M.: Common Uncertainty Research Explorer Uncertainty Estimation in Geological 3D Modelling, ASEG Ext. Abstr., 2018, 1, https://doi.org/10.1071/ASEG2018abW10_2D, 2018a.
PakyuzCharrier, E., Giraud, J., Ogarko, V., Lindsay, M., and Jessell, M.: Drillhole uncertainty propagation for threedimensional geological modeling using Monte Carlo, Tectonophysics, 16–39, https://doi.org/10.1016/j.tecto.2018.09.005, 2018b.
PakyuzCharrier, E., Lindsay, M., Ogarko, V., Giraud, J., and Jessell, M.: Monte Carlo simulation for uncertainty estimation on structural data in implicit 3D geological modeling, a guide for disturbance distribution selection and parameterization, Solid Earth, 9, 385–402, https://doi.org/10.5194/se93852018, 2018c.
PakyuzCharrier, E., Jessell, M., Giraud, J., Lindsay, M., and Ogarko, V.: Topological analysis in Monte Carlo simulation for uncertainty propagation, Solid Earth, 10, 1663–1684, https://doi.org/10.5194/se1016632019, 2019.
Pellerin, J., Botella, A., Bonneau, F., Mazuyer, A., Chauvin, B., Lévy, B., and Caumon, G.: RINGMesh: A programming library for developing meshbased geomodeling applications, Comput. Geosci., 104, 93–100, https://doi.org/10.1016/j.cageo.2017.03.005, 2017.
Pirajno, F. and Adamides, N. G.: Geology and Mineralization of the Palaeoproterozoic Yerrida Basin, Western Australia, Perth, available at: https://catalogue.nla.gov.au/Record/524116 (last access: 18 March 2020), 2000.
Roden, R., Smith, T., and Sacrey, D.: Geologic pattern recognition from seismic attributes: Principal component analysis and selforganizing maps, Interpretation, 3, SAE59–SAE83, https://doi.org/10.1190/INT20150037.1, 2015.
Santos, E. T. F. and Bassrei, A.: L and Θcurve approaches for the selection of regularization parameter in geophysical diffraction tomography, Comput. Geosci., 33, 618–629, https://doi.org/10.1016/j.cageo.2006.08.013, 2007.
Shalaginov, A. and Franke, K.: A new method for an optimal SOM size determination in neurofuzzy for the digital forensics applications, in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Springer, Palma de Mallorca, 549–563, https://doi.org/10.1007/9783319192222_46, 2015.
Shannon, C. E. E.: A Mathematical Theory of Communication, Bell Syst. Tech. J., 27, 379–423, https://doi.org/10.1002/j.15387305.1948.tb01338.x, 1948.
Singh, A. and Sharma, S. P.: Identification of different geologic units using fuzzy constrained resistivity tomography, J. Appl. Geophys., 148, 127–138, https://doi.org/10.1016/j.jappgeo.2017.11.014, 2018.
Stavrakoudis, D., Dragozi, E., Gitas, I., and Karydas, C.: Decision Fusion Based on Hyperspectral and Multispectral Satellite Imagery for Accurate Forest Species Mapping, Remote Sens., 6, 6897–6928, https://doi.org/10.3390/rs6086897, 2014.
Sun, J. and Li, Y.: Multidomain petrophysically constrained inversion and geology differentiation using guided fuzzy cmeans clustering, Geophysics, 80, ID1–ID18, https://doi.org/10.1190/geo20140049.1, 2015.
Sun, J. and Li, Y.: Jointclustering inversion of gravity and magnetic data applied to the imaging of a gabbro intrusion, in SEG Technical Program Expanded Abstracts 2016, Society of Exploration Geophysicists, 2175–2179, 2016.
Sun, J. and Li, Y.: Magnetization clustering inversion Part II: Assessing the uncertainty of recovered magnetization directions, Geophysics, 8, 1–86, https://doi.org/10.1190/geo20180480.1, 2019.
Tarabalka, Y., Benediktsson, J. A., and Chanussot, J.: Spectral–Spatial Classification of Hyperspectral Imagery Based on Partitional Clustering Techniques, IEEE Trans. Geosci. Remote, 47, 2973–2987, https://doi.org/10.1109/TGRS.2009.2016214, 2009.
Thiele, S. T., Jessell, M. W., Lindsay, M., Ogarko, V., Wellmann, J. F., and PakyuzCharrier, E.: The topology of geology 1: Topological analysis, J. Struct. Geol., 91, 27–38, https://doi.org/10.1016/j.jsg.2016.08.009, 2016.
Towles, H. C.: A study in integration of geology and geophysics, Geophysics, 17, 876–899, https://doi.org/10.1190/1.1437821, 1952.
Uriarte, E. A. and Martín, F. D.: Topology Preservation in SOM, Int. J. Appl. Math. Comput. Sci., 1, 19–22, 2005.
van der Baan, M. and Jutten, C.: Neural networks in geophysical applications, Geophysics, 65, 1032–1047, https://doi.org/10.1190/1.1444797, 2000.
Vatanen, T., Osmala, M., Raiko, T., Lagus, K., SysiAho, M., Orešič, M., Honkela, T., and Lähdesmäki, H.: Selforganization and missing values in SOM and GTM, Neurocomputing, 147, 60–70, https://doi.org/10.1016/j.neucom.2014.02.061, 2015.
Vesanto, J. and Alhoniemi, E.: Clustering of the selforganizing map, IEEE Trans. Neural Networks, 11, 586–600, https://doi.org/10.1109/72.846731, 2000.
Ward, W. O. C., Wilkinson, P. B., Chambers, J. E., Oxby, L. S., and Bai, L.: Distributionbased fuzzy clustering of electrical resistivity tomography images for interface detection, Geophys. J. Int., 197, 310–321, https://doi.org/10.1093/gji/ggu006, 2014.
Wegener, A.: Die Entstehung der Kontinente und Ozeane, Braunschweig, Braunschweig, 1920.
Wellmann, J. F. and RegenauerLieb, K.: Uncertainties have a meaning: Information entropy as a quality measure for 3D geological models, Tectonophysics, 526–529, 207–216, https://doi.org/10.1016/j.tecto.2011.05.001, 2012.
Wellmann, J. F., Horowitz, F. G., Schill, E., and RegenauerLieb, K.: Towards incorporating uncertainty of structural data in 3D geological inversion, Tectonophysics, 490, 141–151, https://doi.org/10.1016/j.tecto.2010.04.022, 2010.
Wrona, T., Pan, I., Gawthorpe, R. L., and Fossen, H.: Seismic facies analysis using machine learning, Geophysics, 83, O83–O95, https://doi.org/10.1190/geo20170595.1, 2018.
Zhang, G., Wang, Z., and Chen, Y.: Deep learning for Seismic Lithology Prediction, Geophys. J. Int., 215, 1368–1387, https://doi.org/10.1093/gji/ggy344, 2018.
Zhao, T., Jayaram, V., Roy, A., and Marfurt, K. J.: A comparison of classification techniques for seismic facies recognition, Interpretation, 3, SAE29–SAE58, https://doi.org/10.1190/INT20150044.1, 2015.
Zhao, T., Li, F., and Marfurt, K. J.: Constraining selforganizing map facies analysis with stratigraphy: An approach to increase the credibility in automatic seismic facies classification, Interpretation, 5, T163–T171, https://doi.org/10.1190/int20160132.1, 2017.
Zlatanova, S.: On 3D topological relationships, in Proceedings 11th International Workshop on Database and Expert Systems Applications, IEEE Comput. Soc., 2000, 913–919, 2000.
 Abstract
 Introduction
 Methodology for geoscientific modelling and classification
 Application case: Yerrida Basin
 Discussion
 Conclusions
 Appendix A: Data misfit generated by SOM classification and postregularization
 Appendix B: Data misfit generated by SOM classification and postregularization
 Data availability
 Author contributions
 Competing interests
 Acknowledgements
 Financial support
 Review statement
 References
 Abstract
 Introduction
 Methodology for geoscientific modelling and classification
 Application case: Yerrida Basin
 Discussion
 Conclusions
 Appendix A: Data misfit generated by SOM classification and postregularization
 Appendix B: Data misfit generated by SOM classification and postregularization
 Data availability
 Author contributions
 Competing interests
 Acknowledgements
 Financial support
 Review statement
 References
selforganizing maps, to which we add plausibility filters to ensure that the results respect base geological rules and geophysical measurements. Application in the Yerrida Basin (Western Australia) reveals that the thinning of prospective greenstone belts at depth could be due to deep structures not seen from surface.