The quality of a 3-D geological model strongly depends on the type of integrated geological data, their interpretation and associated uncertainties. In order to improve an existing geological model and effectively plan further site investigation, it is of paramount importance to identify existing uncertainties within the model space. Information entropy, a voxel-based measure, provides a method for assessing structural uncertainties, comparing multiple model interpretations and tracking changes across consecutively built models. The aim of this study is to evaluate the effect of data integration (i.e., update of an existing model through successive addition of different types of geological data) on model uncertainty, model geometry and overall structural understanding. Several geological 3-D models of increasing complexity, incorporating different input data categories, were built for the study site Staufen (Germany). We applied the concept of information entropy in order to visualize and quantify changes in uncertainty between these models. Furthermore, we propose two measures, the Jaccard and the city-block distance, to directly compare dissimilarities between the models. The study shows that different types of geological data have disparate effects on model uncertainty and model geometry. The presented approach using both information entropy and distance measures can be a major help in the optimization of 3-D geological models.

Three-dimensional (3-D) geological models have gained importance in
structural understanding of the subsurface and are increasingly used as a
basis for scientific investigation

In order to assess the quality and reliability of a 3-D geological model as
objectively as possible, it is essential to address underlying uncertainties.
Numerous methods have recently been proposed that enable estimates,
quantification and visualization of uncertainty

Study site and location of the model area and area of interest (AOI).

The city of Staufen suffers from dramatic ground heave that resulted in
serious damage to many houses (southwest Germany, Fig.

Staufen is located west of the Black Forest at the eastern margin of the
Upper Rhine Graben (URG). It is part of the
“Vorbergzone”

Three geological units play an important role for the swelling problem at the
site: the Triassic Gipskeuper (“Gypsum Keuper”) formation, which contains
the swelling zone, and the underlying Lettenkeuper formation and Upper
Muschelkalk formation, which are aquifers providing groundwater that accesses
the swelling zone via pathways along the BHE. The Gipskeuper formation
consists of marlstone and mudstone and contains the calcium-sulfate minerals
anhydrite (

Input data for the 3-D geological modeling include all available geological
data that indicate (1) boundaries between geological units, (2) the presence of
geological units and faults at certain positions, and (3) orientation (dip
and azimuth) of the strata. These data were classified into four
categories (Fig.

Data categories and geological input data used to build four initial
3-D geological models. The green square indicates the area of interest (AOI),
where data were extracted for further analysis. For geological formation color
code, see Fig.

The non-site-specific data category comprises geological data that are
generally available from published maps

Stratigraphic overview of the study area and modeled geological units with average thicknesses.

The 3-D geological models were constructed using the geo-modeling software
SKUA/GoCAD^{®} 15.5 by Paradigm. They
cover an area of about 0.44 km

The strata of the models cover 10 distinct geological units including
Quaternary sediments, Triassic and Jurassic bedrock, and crystalline basement
at the lower model boundary (Fig.

Four initial models were consecutively built, according to the four previously described data categories. Model 1 was constructed based only on non-site-specific data (maps, literature, etc.); Model 2 additionally considered site-specific data (drill logs of the seven geothermal drillings); Model 3 also included “direct” problem-specific data (exploration boreholes); and finally, Model 4 included “indirect” problem-specific data (seismic campaign). Through this approach, data density and structural model complexity increase from Models 1 to 4 and the models required successively higher efforts in data acquisition in the field.

First, an explicit modeling approach ^{®}
was used as the interpolation method ^{®}. The implicit modeling approach uses a
potential field interpolation considering the orientation of
strata

Our approach for assessing uncertainties in the 3-D geological models consists
of four distinct steps (Fig.

Building the initial 3-D geological models of increasing data density and structural complexity (see above).

The definition of fault and horizon uncertainties. Horizon uncertainties were specified in
SKUA^{®} by a maximum displacement parameter or by alternative surface
interpretations, resulting in a symmetric envelope of possible surface locations around the initial
surface. To constrain the shape of generated horizons, SKUA^{®} uses a
variogram that spatially correlates perturbations applied to the initial surfaces

The creation of 30 model realizations for each initial model based on the surface variations defined above, applying the Structural Uncertainty workflow of SKUA^{®}.

The extraction of the geological information from all model realizations for analysis,
comparison and visualization. For this purpose, the AOI was divided into a regular 3-D grid of
5 m cell size, resulting in 180 000 grid cells. The membership of a grid cell to a
geological unit was defined as a discrete property of each grid cell and extracted for all 30 model realizations. Based on these data, we calculated the probability of each geological unit being
present in a grid cell in order to derive the information entropy at the level of (1) a single grid
cell, (2) a subset representing the area of extent of a geological unit and (3) the overall AOI.
Furthermore, the fuzzy set entropy was calculated to determine the ambiguity of the targeted geological
units Gipskeuper (km1), Lettenkeuper (ku) and Upper Muschelkalk (mo) within the AOI. Calculations were
conducted using the statistics package R
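
The probability and entropy calculations in step 4 can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study. It assumes the standard Shannon entropy with a base-2 logarithm, and a fuzzy set entropy of the common form -[p log2 p + (1 - p) log2(1 - p)] averaged over all cells. The function names, the nested-list layout of the realizations and the toy labels are hypothetical.

```python
import math
from collections import Counter


def unit_probabilities(cell_labels):
    """Relative frequency of each geological unit in one grid cell."""
    n = len(cell_labels)
    return {unit: count / n for unit, count in Counter(cell_labels).items()}


def information_entropy(probs):
    """Shannon entropy (base 2, assumed) of one cell; 0 if all realizations agree."""
    return sum(-p * math.log2(p) for p in probs.values() if p > 0)


def fuzzy_set_entropy(memberships):
    """Mean fuzziness of one unit over all cells (assumed base-2 form)."""
    def term(p):
        # Cells that certainly do or do not belong to the unit add no fuzziness.
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return sum(term(p) for p in memberships) / len(memberships)


# Toy data (hypothetical): 4 realizations (rows) x 4 grid cells (columns).
# The study itself used 30 realizations per model.
realizations = [
    ["km1", "km1", "ku", "mo"],
    ["km1", "ku", "ku", "mo"],
    ["km1", "ku", "ku", "mo"],
    ["km1", "km1", "mo", "mo"],
]

# Transpose so that each entry holds the labels of one cell across all realizations.
cells = list(zip(*realizations))
cell_probs = [unit_probabilities(labels) for labels in cells]
cell_entropy = [information_entropy(p) for p in cell_probs]
mean_entropy = sum(cell_entropy) / len(cell_entropy)
km1_fuzziness = fuzzy_set_entropy([p.get("km1", 0.0) for p in cell_probs])
```

In this toy example, cells in which all realizations agree have zero entropy, whereas a cell split evenly between two units reaches an entropy of 1. Averaging the cell values over a subset of cells or over the whole grid gives the corresponding aggregate measure.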

Uncertainty assessment workflow with four distinct steps. This
workflow is applied to four initial models that are based on the different
data sets illustrated in Fig.

The concept of information entropy (or Shannon entropy) was first introduced
by

By subdividing the model domain

Applied to all

From the probabilities of occurrence

In a next step, information entropy

The stepwise addition of input data to the models (see
Sect.

The set of locations for which the probability

Distance measures used to calculate dissimilarities between models
(

A threshold value of

Accordingly, the dissimilarity between models can be expressed by the Jaccard
distance:
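
In its standard set form, the Jaccard distance is one minus the ratio of the size of the intersection to the size of the union of two sets. The sketch below applies this to two models by first converting the per-cell probabilities of a unit into binary membership with a threshold. The function name, the example probabilities and the default threshold of 0.5 are assumptions made for illustration; the threshold does not reproduce the value used in the study.

```python
def jaccard_distance(prob_a, prob_b, threshold=0.5):
    """Jaccard distance between the cell sets occupied by one unit in two models.

    prob_a, prob_b: per-cell probabilities of the unit on the same grid.
    threshold: hypothetical cut-off above which a cell counts as occupied.
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must be sampled on the same grid")
    in_a = {i for i, p in enumerate(prob_a) if p > threshold}
    in_b = {i for i, p in enumerate(prob_b) if p > threshold}
    union = in_a | in_b
    if not union:
        # The unit is absent from both models, so they are treated as identical.
        return 0.0
    return 1.0 - len(in_a & in_b) / len(union)


# Hypothetical probabilities of one unit in five grid cells of two models.
model_a = [1.0, 0.8, 0.6, 0.2, 0.0]
model_b = [0.0, 0.9, 0.4, 0.7, 0.0]
distance = jaccard_distance(model_a, model_b)  # 0.75
```

Here the unit occupies cells {0, 1, 2} in the first model and cells {1, 3} in the second, so one of four cells in the union is shared and the distance is 0.75.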

Even though the use of binary dissimilarities is straightforward and suitable
to quantify absolute changes in position of a geological unit between models,
it does not account for fuzziness (see Sect.

The four consecutively constructed initial models show a stepwise increase in
structural complexity (Fig.

Cross section through Models 1 and 4. The multiple lines show
30 model realizations with shifted faults and horizons (for the location of
the cross sections, see Fig.

In Model 2, horizon positions of the Schilfsandsteinkeuper (km2),
Gipskeuper (km1) and Lettenkeuper (ku) were locally constrained by
site-specific information provided by drill logs of the geothermal wells,
slightly impacting fault displacement and thickness of the formations.
However, changes in model geometry were minor, as no further information on
horizon orientations was available and no additional faults could be located.
By adding the direct problem-specific data from the exploration wells to
Model 3, a horst–graben structure was identified that entailed considerable
displacements of 120 and 70 m at two normal faults located between and to
the northwest of the wells, respectively. Furthermore,
the drill logs included orientation measurements of the strata, resulting in
a shift in position and inclination of layers, compared to the previous
models. Thus, large parts of the model domain within the AOI changed from
Model 2 to Model 3, and, as a consequence, dissimilarities between these
models are particularly high (see Sect.

3-D view of the AOI with a discretization of 5 m for

The multiple (30) model realizations created by the Structural Uncertainty
workflow of SKUA^{®} are illustrated in
Fig.

Information entropy, quantified at the level of individual grid cells, can be
visualized in 3-D to identify areas of uncertainty and evaluate changes in
geometry resulting from successive data integration. Figure

The overall distribution of uncertainty was clearly affected by additional
geological information from site- and problem-specific input data (Model 4).
This effect is highlighted by the changes in entropy between the models
(Fig.

The calculated average information entropy

Average entropy

Overall, comparing the pre- to post-site-investigation situations (Models 1–4), site- and problem-specific investigations were all equally successful in adding information to the model and reducing uncertainties in the area of the targeted horizons. While the benefits from the different data are equal, the costs of data acquisition (i.e., work, money and time required) may vary considerably, depending on the exploration method (e.g., drilling or seismic surveys). An economic evaluation was not within the scope of this study. Nevertheless, the approach presented could improve cost–benefit analyses by quantifying the gain in information through different exploration stages.

The fuzzy set entropy was calculated to indicate how well-defined a
geological unit is within the model space. Applied to the swelling problem of
our case study, a high degree of uncertainty remains with regard to the
position of the relevant geological units (km1, ku, mo) after full data
integration. We obtained fuzzy set entropy values (

Fuzzy set entropy

In the case of the Lettenkeuper formation (unit ku), boundaries are even slightly
less well-defined in Model 4 compared to Model 1. This is likely related to
the low thickness of the formation (5–10 m,
Fig.

A gain in structural information through newly acquired data usually not only
impacts model uncertainty but is also associated with a change in model
geometry. The calculated distances between models can identify the data
category with the strongest impact on model geometry and make it possible to
determine whether model geometry and uncertainty are related.
Figure

Calculated distances between models are rather high, with values of up to
0.78, indicating a pronounced shift in position of the geological
units after data were added. The addition of both direct and indirect problem-specific data to Model 3 had a strong impact on model geometry, which can be
seen by comparing the calculated distances between Models 2, 3 and 4 for both
Jaccard and city block (Fig.

Overall, the city-block distance, which considers the fuzziness of geological
boundaries, shows a similar trend to the Jaccard distance; however, changes
are much less pronounced, especially for unit ku. According to the low
city-block distance, absolute changes in probability

Dissimilarities between the different models expressed by

Nonetheless, both distance measures allow the quantification and assessment of
different aspects of dissimilarity and, therefore, of changes in geometry
across models. However, the city-block distance is preferable when sets of
multiple realizations are compared because it factors in the probability of
the occurrence of a geological unit at a discrete location. In recent years,
various distance measures have already been applied in other contexts to
create dissimilarity distance matrices and compare model realizations in
history matching and uncertainty analysis, particularly in reservoir
modeling
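
The city-block distance discussed above can be sketched as the sum of absolute differences between the per-cell probabilities of a unit in two models. The version below is a minimal illustration: the function name and the example probabilities are hypothetical, and dividing by the number of cells (so that the result lies between 0 and 1) is an assumption rather than the study's confirmed normalization.

```python
def city_block_distance(prob_a, prob_b, normalize=True):
    """City-block (L1) distance between two per-cell probability fields.

    prob_a, prob_b: probabilities of one unit on the same grid in two models.
    normalize: if True, divide by the number of cells (assumed convention).
    """
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must be sampled on the same grid")
    total = sum(abs(a - b) for a, b in zip(prob_a, prob_b))
    if normalize and prob_a:
        return total / len(prob_a)
    return total


# Hypothetical probabilities of one unit in five grid cells of two models.
model_a = [1.0, 0.8, 0.6, 0.2, 0.0]
model_b = [0.0, 0.9, 0.4, 0.7, 0.0]
d_norm = city_block_distance(model_a, model_b)
d_raw = city_block_distance(model_a, model_b, normalize=False)
```

Because the probabilities enter directly, a cell whose probability shifts from 0.8 to 0.9 contributes only 0.1 to the sum, whereas a binary measure treats every cell as either fully changed or unchanged.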

Prior work has demonstrated the effectiveness of information entropy in
assessing model uncertainties and providing valuable insight into the
geological information used to constrain a 3-D model.

We presented a new workflow and methods to describe the effect of data integration on model quality, overall structural understanding of the subsurface and model geometry. Our results provide a better understanding of how model quality can be assessed in terms of uncertainties during the data acquisition process of an exploration campaign, showing that information entropy and model dissimilarity are powerful tools to visualize and quantify uncertainties, even in complex geological settings. The main conclusions of this study are as follows:

Average and fuzzy set entropy can be used to evaluate uncertainties in 3-D geological modeling and, therefore, support model improvement during a consecutive data integration process. We suggest that the approach could also be used to perform a cost–benefit analysis of exploration campaigns.

The study confirms that 3-D visualization of information entropy can reveal hot spots and changes in the distribution of uncertainty through newly added data in real cases. The method provides insight into how additional data reduce uncertainties in some areas and how newly identified geological structures may create hot spots of uncertainty in others. Furthermore, the method stresses that parsimonious models can locally underestimate uncertainty, which is only revealed after new data become available and are considered.

Dissimilarities in model geometry across different sets of model realizations can effectively be quantified and evaluated by a single value using the city-block distance. A combination of the concepts of information entropy and model dissimilarity improves uncertainty assessment in 3-D geological modeling.

However, some limitations of the presented approach are noteworthy. Although it was designed to assess uncertainties in the position and thickness of horizons, uncertainty in orientation could only be included indirectly through perturbations based on alternative surface interpretations but not by explicit dip and azimuth parameter values indicated for this purpose. This may result in a systematic underestimation of uncertainties at greater depths of the model domain. Furthermore, our study site (Vorbergzone) is a highly fragmented geological entity, and epistemic uncertainties due to missing information about unidentified but existing geological structures are likely substantial.

Future work should therefore aim to integrate “fault block uncertainties”
more effectively into the workflow, for example by considering multiple fault
network interpretations

The underlying research data were collected and provided by
the state geological survey, LGRB. They are freely available in the form of two
extensive reports

Daniel Schweizer, Christoph Butscher and Philipp Blum designed the study and developed the methodology. Daniel Schweizer performed the 3-D geological modeling, implemented the approach for uncertainty assessment and analyzed the results. Daniel Schweizer prepared the paper with contributions from all coauthors.

The authors declare that they have no conflict of interest.

The financial support for Daniel Schweizer from the German Research Foundation (DFG) under grant number BU 2993/2-1 is gratefully acknowledged. We acknowledge support by the Deutsche Forschungsgemeinschaft and the Open Access Publishing Fund of the Karlsruhe Institute of Technology. Furthermore, we thank the Geological Survey of Baden-Württemberg (LGRB), especially Gunther Wirsing and Clemens Ruch, for data provision and the Paradigm support team for their technical support. Finally, we would like to thank Jan Behrmann (GEOMAR) and Thomas Bohlen (KIT) for discussions of the tectonic setting and seismic interpretations, respectively. The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.

Edited by: G. Peron-Pinvidic

Reviewed by: F. Wellmann and G. Caumon
