The Impact of Seismic Interpretation Methods on the Analysis of Faults: A Case Study from the Snøhvit Field, Barents Sea

10 Five seismic interpretation experiments were conducted on an area of interest containing a fault relay in the Snøhvit field, Barents Sea, Norway, to understand how interpretation method impacts the analysis of fault and horizon morphologies, fault lengths, and throw. The resulting horizon and fault interpretations from the least and most successful interpretation methods were further analysed to understand their impact on geological modelling and hydrocarbon volume 15 calculation. Generally, the least dense manual interpretation method of horizons (32 inlines (ILs) x 32 crosslines (XLs), 400m) and faults (32 ILs, 400m), resulted in inaccurate fault and horizon interpretations and underdeveloped relay morphologies and throw, which are inadequate for any detailed geological analysis. The densest fault interpretations (4 ILs, 50m) and 3D auto-tracked horizons (all ILs and XLs spaced 12.5 m), provided the most detailed interpretations, most 20 developed relay and fault morphologies, and geologically realistic throw distributions. Sparse interpretation grids generate significant issues in the model itself, which make it geologically inaccurate and lead to misunderstanding of the structural evolution of the relay. Despite significant differences between the two models, the calculated in-place petroleum reserves are broadly similar in the least and most dense experiments. However, when considered at field25

scale, the differences in volumes that are generated by the contrasting interpretation methodologies clearly demonstrate the importance of applying accurate interpretation strategies.

Introduction
An accurate understanding of faults in the subsurface is critical for many elements of the hydrocarbon exploration and production industry. For example, faults control sediment and 30 reservoir depositional systems, act either as conduits or baffles to fluid flow, are often the defining elements of structural traps, and impact the design of exploration and production wells (e.g. Athmer et al., 2010;Athmer and Luthi, 2011;Botter et al., 2017;Fachri et al., 2013a;Knipe, 1997;Manzocchi et al., 2008aManzocchi et al., , 2010. Subsurface faults are commonly interpreted on either reflection seismic data or attributes of that data by creating fault sticks on vertical cross 35 sections (e.g. inlines ILs or crosslines XLs), which are then used to generate fault surfaces (e.g. Yielding and Freeman, 2016). Fault displacement is analysed by studying the interaction between the displaced horizon reflectors and the fault surface (e.g. Dee et al., 2005;Freeman et al., 1990;Needham et al., 1996). Although this is a commonly used interpretation method, the impact of changing interpretation density (i.e. IL or XL spacing), interpretation on vertical vs 40 horizontal sections, and the effects of manual (2D line-by-line auto-tracking) vs 3D auto-tracking techniques have not been systematically investigated.
The interpretation of faults in seismic data has been the focus of many studies. Badley et al. (1990) were the first to publish a systematic approach to the seismic interpretation of faults using fault displacement analysis. Freeman et al. (1990) explained how fault displacement analysis can 45 be used in the quality control process of fault interpretation. The interpreted horizon-fault intersections, and subsequent fault displacement profiles in seismic data have also been described as ellipsoidal in isolated, single faults. When faults are not isolated, displacement profiles exhibit more complex geometries (i.e. multiple maxima), which can help to determine the structural history of fault linkage (Needham et al., 1996). A complete workflow for 3D-50 structural interpretation in seismic data using various attribute volumes, reflection data, rendered volumes, and an overview of structural framework building has been presented by Yielding and Freeman (2016). In addition, Solum et al (2016) recommended a combination of seismic interpretation, the analysis of structure maps, fieldwork, and geomodelling as the fundamentals of structural analysis. Interpreted fault surfaces can be quality controlled by projecting 55 3 longitudinal and shear strain (vertical and horizontal components of dip separation gradient) onto fault planes, and assigning realistic strain limits in order to identify interpretation errors (Freeman et al., 2010). The forementioned strain measurement are applied to determine the strain relationships of interpreted faults, assuming the data occur within a reasonable strain limit and after the quality control process is complete (Freeman et al., 2010). 60 Uncertainty in fault interpretation has also been readily analysed, and previous works have focused on how significant uncertainties and interpretation biases exist in 2D and 3D seismic interpretation (Bond, 2015;Bond et al., 2011Bond et al., , 2007Schaaf and Bond, 2019), and the impact of the image quality of seismic data on uncertainty in seismic interpretation (Alcalde et al., 2017).
Uncertainty pertaining to fault properties, and the effect fault properties have on fluid flow 65 simulations have also been analysed (Manzocchi et al., 2008b;Miocic et al., 2019). The impact of interpretation variability on structural trap definition, juxtaposition of hydrocarbon bearing reservoirs, and the subsequent implications for exploration/volume calculations were tested and prove the impact of seismic interpretation bias in structurally defined hydrocarbon systems (Richards et al., 2015). 70 Many techniques have extended basic fault interpretation techniques to better understand the link between faults in seismic and their properties in the subsurface. Dee et al. (2005) studied the application of structural geological analysis to a number of common industry based techniques and workflows (e.g. fault seal, fluid accumulation, migration, fault property modelling). Seismic attributes have been analysed to study fault architecture and investigate fault sealing potential 75 (Dutzer et al., 2010). Imber (2010, 2012) used interpreted seismic surfaces to measure regional dip changes, in order to map fault deformation in both a normal fault array and a relay ramp. Studies such as these, combined with the increasing availability of high-resolution 3D seismic data, have driven seismic structural analysis towards more detailed and quantitative studies. Iacopini and Butler (2011) and Iacopini et al. (2012) generated a workflow combining 80 seismic attribute visualization, opacity filtering, and frequency decomposition to characterize deep marine thrust faults. In a case study from the Snøhvit field, a linkage between unsupervised seismic fault facies and fault related deformation was established, and seismic amplitude was analysed to understand how folding near faults might influence near fault amplitudes (Cunningham et al., 2019). 85 Synthetic seismic modelling has shed important light on the impact of seismic frequency on fault imaging, the seismic amplitudes contained in and around faults, and their linkage to fault related deformation and fault illumination (Botter et al., 2014(Botter et al., , 2016b(Botter et al., , 2016a. A comparison of faults in the Snøhvit field with synthetic seismic, showed the importance of incidence angle, azimuthal separation, and frequency on fault imaging (Cunningham et al., in review). 90 Fluid flow across faults, through deformed bedding, and the sealing properties of faults have long been important topics in the petroleum industry (e.g. Bretan et al., 2011;Caine et al., 1996;Cerveny et al., 2004;Davatzes and Aydin, 2005;Edmundson et al., 2019;Fachri et al., 2013bFachri et al., , 2013aFachri et al., , 2016Fisher and Knipe, 1998;Knipe, 1997Knipe, , 1992Yielding et al., 1997). In addition, reservoir modelling techniques have been used to simulate this process (Fachri et al., 2013a), and 95 synthetic seismic modelling has been used to understand the impact of faulting and fluid flow on seismic images (Botter et al., 2017).
Fault interpretation in seismic data has formed the basis of many studies over the decades, but no single study has looked specifically into seismic interpretation methodologies. It would seem logical to assume that increased interpretation density will result in a higher resolution output 100 (i.e. fault and horizon interpretation), but at the expense of the increased time required to perform the interpretation. It has yet to be fully evaluated, whether these more detailed interpretations justify this increased time and effort, or whether the end results are comparable to much more efficient interpretation strategies. Similarly, auto-tracking algorithms would appear to offer a shortcut to high-resolution horizon and fault interpretations, but how do these algorithms 105 compare to the results of detailed manual interpretations? We address the impact of interpretation strategy on the quality of the final products, and whether it is possible to identify an optimum balance between interpretation density, time required to do the interpretation, and the accuracy of the end-result.
Our study tests the effect of interpretation method (faults and displaced horizons) on aspects of 110 fault analysis, with the aim to provide geoscientists with a better knowledge of seismic interpretation/analysis of faults, and an explanation of the implications of improper interpretation and best practice interpretation methods. We designed five fault and horizon interpretation experiments, which were conducted on a seismic volume from the Snøhvit field, Barents Sea.

Geologic Setting
The Snøhvit gas and condensate field is located in the centre of the Hammerfest Basin on the southwest margin of the Barents Sea (Fig. 1a, b: Linjordet and Olsen, 1992). The ENE-WSW trending Hammerfest basin is ~150 km long by 70 km wide and is bound in the north, southeast and west by the Loppa High, Finnmark Platform and Tromsø Basin, respectively. Rifting in the 125 basin initiated in the Late Carboniferous-Early Permian and drove the formation of the NE-SE trending basin bounding faults (Gudlaugsson et al., 1998). A second phase of rifting in the Late Jurassic-Early Cretaceous reactivated the basin bounding faults and caused the basin to undergo large amounts of subsidence on both the northern and southern margins (Doré, 1995;Linjordet and Olsen, 1992;Ostanin et al., 2012;Sund et al., 1984). Due to differential subsidence during 130 this period, the Hammerfest Basin widened and deepened westward, allowing for the accumulation of thicker sediment packages in the west (Linjordet and Olsen, 1992). A dome at the basin's central axis and a subsequent east-west trending fault system, formed during basin extension in the Early Jurassic-Barremian (Sund et al., 1984). These east-west trending faults define the structure of the Snøhvit field and divide the field into northern and southern petroleum 135 provinces (Sund et al., 1984). The main petroleum system components of the Snøhvit field are located within the Upper Triassic-Jurassic strata (Fig. 1c;Linjordet and Olsen, 1992). The focus of this study is in two of the east-west trending faults across the Snøhvit field (Fig. 1b, blue and red lines). These two faults dip to the north, offset the Jurassic strata, and form a relay ramp structure (Fig. 1d). The area was chosen because relays are structurally complex and require 140 special attention in their interpretation. Relays are also important in petroleum systems as they can create sediment distribution pathways, enable or disable fault seal (as all faults can), act as fluid flow pathways, and finally can be a part of trap definitions (Athmer et al., 2010;Athmer and Luthi, 2011;Botter et al., 2017;Fachri et al., 2013a;Fossen and Rotevatn, 2016;Gupta et al., 1999;Knipe, 1997;Peacock and Sanderson, 1994;Rotevatn et al., 2007).

Methodology:
Five interpretation experiments (Exps 1-5) were designed to test the impact of different seismic interpretation methods on the analysis of faults (Fig. 2). Each of these experiments (Fig. 2a) was completed on a chosen 5 x 5 km area covering the relay ramp (orange rectangle in Fig. 1b), and a fault analysis workflow was applied to the interpreted seismic horizon and fault surfaces from 150 each experiment (Fig. 2b). The fault analysis workflow (Fig. 2b) integrated a comparison of seismic interpretation results and analyses of fault length, throw, dip separation gradients (longitudinal and shear strain), juxtaposed lithology, geological modelling, and calculation of hydrocarbon volumes. While the individual components of the fault analysis workflow have been applied previously (e.g. Elliott et al., 2012;Fachri et al., 2013a;Long and Imber, 2010, 155 2012; Rippon, 1985;Townsend et al., 1998;Wilson et al., 2009Wilson et al., , 2013, no earlier studies have considered the impact of the seismic interpretation strategy on the outcomes of the fault analysis workflow. The computer programs Petrel™ and T7™ (formerly TrapTester™) were used in the seismic interpretation and fault analysis workflows, respectively. The seismic dataset used in this study 160 was survey ST15M04, a merge of five 3D seismic streamer surveys that was provided by Equinor ASA and their partners (Petoro AS, Total E&P Norge AS, Neptune Energy Norge AS, and Wintershall Dea Norge AS) in the Snøhvit field, Norwegian Barents Sea. The ST15M04 volume was zero phase, pre-stack depth migrated (PSDM, Kirchhoff), and both partial and full offset stacks were available. It was assumed that the velocity model used in the PSDM was 165 correct and that the vertical scale of the processed volume (in depth) represents depth in meters.
The inlines (ILs) and crosslines (XLs) are spaced at 12.5 m, and an increase in acoustic impedance is represented by a red peak (blue-red-blue). The interpretation was performed in depth to give the most representative view of the geological and structural relationships, and to avoid re-stretching the data back into time. All five interpretation experiments were conducted 170 on the near stack data (5-20°), as this dataset has been proven to give the most consistent fault imaging and best reflector continuity (i.e. Shuey 1985). As the data are a merge of multiple datasets and vintages, the acquisition orientation geometries could not be considered although they are known to impact fault imaging (Cunningham et al., in review).

7
Two east-west trending, north dipping faults that form the relay ramp were interpreted (Fig. 1b,   d). These two faults are termed the western and eastern faults ( Fig. 1b and d, blue and red faults, respectively). Two faulted seismic reflectors (top Fuglen and Fruholmen formations; were also interpreted. These reflectors were chosen because the top Fuglen is a very strong, easily interpreted reflector while the top Fruholmen is poorly imaged and is more challenging to 180 interpret. Both the top Fuglen and top Fruholmen are peaks (increases in acoustic impedance).
The Stø Formation, which falls between the tops Fuglen and Fruholmen, is a prolific petroleum reservoir. Five different seismic interpretation methods (Exps 1-5) were used with the aim to systematically study how seismic interpretation techniques (Fig. 2a) influence the fault analysis workflow (Fig. 2b). The first three experiments are manual 2-D auto-tracking horizon 185 interpretation techniques with different IL and XL spacing (from every 8 to 32 lines), while the fourth and fifth experiments are a combination of automated (3D auto-tracked horizons) and manual fault interpretations. In all experiments, the faults were interpreted first, followed by the horizons. 2D and 3D auto-tracking of all horizons used seed confidence of 30% and a basic 3 x 3 seed expansion value, which pushed the interpretation to the nearest 8 seed points of the 190 interpreted seed point on the peak. In 2D auto-tracking, seed expansion only occurs in 2D on the IL or XL being interpreted, while in 3D auto-tracking, the seed points extend in both X and Y directions from the interpreted seed point to the 8 nearest seed locations. In both 2D and 3D, if a fault was encountered, the interpretation stopped and needed to be guided to the correct horizon on the other side of fault. In all experiments, faults were interpreted as simple planar features in 195 the seismic data. Although faults are complex 3D bodies in the subsurface, due to the seismic resolution of the data (i.e. Wood et al., 2015), this detail was not captured and has therefore not been considered further.

Exp 1: 32 x 32
The top Fuglen and top Fruholmen reflectors were interpreted on every 32 nd IL (north-south) and 200 XL (east-west) using 2D auto-tracking (Fig. 3a, columns 1 and 2). Fault sticks were interpreted perpendicular to the average strike of the faults on every 32 nd IL, as largely planar features (Fig.   3a, column 3). The IL/XL spacing of interpretation in this experiment was equal to 400 m (every 32 ILs/XLs x 12.5 m IL/XL spacing).

8
The interpretation of the two horizons and the two faults took the least amount of time when 205 compared to all other experiments because of the large IL/XL spacing (Fig. 3a, column 4).
Overall, this experiment was the quickest but sparsest interpretation method. Since the interpretation was manually conducted on an IL and XL basis, there was no QC needed for the top Fuglen due to the high quality of this reflector. In particularly dim areas, 2D auto-tracking of the top Fruholmen required more manual input and some QC. The two horizons were interpreted on every 16 th IL and XL using 2D auto-tracking of the peaks for each reflector (Fig. 3b, columns 1 and 2). Fault sticks were interpreted on every 16 th IL and are largely planar (Fig. 3b, column 3). The IL/XL spacing in this experiment was equal to an interpretation spacing of 200 m (every 16 ILs/XLs x 12.5 m).

215
The interpretation of both the horizons and faults in this experiment took twice the amount of time of Exp 1, since the IL/XL spacing was halved. This experiment was ranked the second most time consuming, and the second sparsest overall (Fig. 3b, column 4). Since the interpretation in this experiment was manual, a similar level of QC was needed. There was high to lower confidence in the interpretation quality of the top Fuglen and top Fruholmen reflectors, as 220 described in Exp 1.

Exp 3: 8 x 8
The two horizons were interpreted on every 8 th IL and XL (Fig. 3c, columns 1 and 2). Fault sticks were interpreted on every 8 th IL (Fig 3c, column 3). The IL/XL spacing in this experiment is equal to an interpretation spacing of 100 m (every 8 IL/XL x 12.5 m).

225
The horizons and faults in this experiment took approximately three times longer to interpret than Exp 1. This experiment was the densest of the manual interpretation methods (experiments 1-3) and was therefore the most time consuming (Fig. 3c, column 4). The quality control and interpretation confidence of the two reflectors is as described for Exps 1 and 2.

Exp 4: 3D tracked method with dip-parallel fault sticks 230
Horizons were tracked using the 3D auto-tracking algorithm in Petrel™, which resulted in complete interpretation coverage (all ILs and XLs interpreted) for the top Fuglen compared to 9 almost complete coverage for the top Fruholmen (Fig 3d, columns 1 and 2). Initially, we planned to apply a 3D automated fault interpretation method (Adaptive Fault Interpretation;Cader, 2018) for this experiment, but the algorithms currently available do not provide geologically realistic 235 fault sticks that could be used in our workflow. As a result, fault sticks were interpreted on every 4 th IL to capture the densest and most geologically realistic morphologies possible (Fig 3d, column 3). The IL/XL spacings of horizon and fault interpretations in this experiment are 12.5 m (every IL/XL x 12.5 m spacing) and 50 m (every 4 ILs x 12.4 IL spacing), respectively. A 30% seed confidence and a basic 3 x 3 seed point expansion were set in the auto-tracking of these 240 surfaces.
The 3D auto-tracked interpretation of the top Fuglen was the fastest method as the reflector is well imaged and therefore easily auto-tracked ( Fig. 3d, column 1). The top Fruholmen was a little slower to run through the auto-track due to its poor seismic imaging (Fig. 3d, column 2). As a result, the top Fruholmen required more manual guidance for the auto-track to be successful, 245 but it was still faster than all three manual interpretation methods (Exps 1-3). The fault interpretation for this experiment was the most time consuming as the spacing of fault sticks was the densest (Fig 3d, column 3). Overall, Exp 4 was tied for the second fastest to interpret (Fig 3d, column 4), but it also contains the highest density of interpretation lines for both the horizons and faults. The QC of the top Fuglen was completely unnecessary in this small study area as the 250 reflector was strong and easily auto-tracked. The QC of the top Fruholmen was more important since the reflector imaging is quite poor in some areas. The interpretation confidence for this case is high to moderately high for the top Fuglen and top Fruholmen, respectively.

Exp 5: 3D Auto-tracked horizons with horizontal (strike parallel) fault sticks
This experiment used the same 3D auto-tracked horizons as discussed in Exp 4 ( Fig. 3e, columns 255 1 and 2, interpretation). However, faults were manually interpreted horizontally on depth slices spaced every 50 m, using the tensor attribute to guide the interpretation (e.g. Fig. 3e, columns 1 and 2, tensor slices). The tensor attribute is generated using a symmetric and structurallyoriented tensor, which detects the localized reflector orientation and is sensitive to changes in both the amplitude and continuity of the seismic reflectors (Bakker, 2002). This attribute was

265
The fault interpretation for this experiment was time consuming as it required the generation of a tensor attribute prior to interpretation (Fig. 3e, column 4). Once the attribute was produced, the time to generate the fault interpretation was in the middle range of the time used for the other experiments. The interpretation confidence of the two reflectors are as described in Exp 4. To understand the relative differences between the horizons from each experiment, thickness maps were generated between the most densely interpreted 3D auto-tracked horizons (Exps 4 and 5), and the horizons generated from each of the manual based experiments (Exps 1-3).

280
Anywhere where there was a good correlation between the auto-tracked and manual surfaces, there is very little or no thickness change, while in the case of a poor correlation, a greater range in thickness may result.

Fault length and morphology
Fault length (Fig. 4a) is defined as the maximum horizontal distance of a fault in three 285 dimensions (Peacock et al., 2016;Walsh and Watterson, 1988). An analysis of fault length was conducted on the western and eastern faults (Fig. 1b, d) using the gridded fault surfaces. These data were extracted from the edge of the study area to the fault tipline for both faults. The data were graphically compared to understand the impact of interpretation method on fault length.

11
To analyse fault morphology, the horizon surfaces described in Sect. 3.1.6 were used. In creating 290 the surfaces, all horizon interpretations that fall within the fault polygons were removed, leaving behind a gap in the surface where the faults' extent and morphology through that horizon are clear. These fault polygons were generated using patch and trim distances; this is explained in detail in Sect. 3.3 Fault Throw. The analysis of morphology considers these voids in the horizon surfaces. The graphical representations of fault throw (Sect. 3.3) can also be used to understand 295 fault length.

Dip separation gradient and strain
The dip separation gradient, and the longitudinal and shear strains are useful tools for QC 315 seismic interpretations (Freeman et al., 2010). The dip separation gradient was calculated using the top Kolje, top Fuglen and top Fruholmen cutoff-lines. The longitudinal strain (also known as the vertical gradient) is the dip separation gradient in the direction of fault dip, while shear strain (horizontal gradient) is the dip separation gradient along the strike of the fault (Freeman et al.,12 2010; Walsh and Watterson, 1989). In this study, we use the principles introduced in Freeman et 320 al. (2010) to analyse these measurements. This can help us to understand how the different seismic interpretations produce results that differ from what is considered geologically realistic, and to compare how the different methods affect the value of these properties.

Juxtaposed lithology
Juxtaposed lithology (a.k.a. Allan diagram) is a representation of the hanging wall and footwall 325 lithologies and their juxtaposition on the fault plane (Allan, 1989;Knipe, 1997). To calculate juxtaposed lithology (JL), horizons, faults and a well (NO 7120/6-1, Fig. 1b, d) containing lithological information were used. JL was calculated using the resulting horizon and fault surfaces from the five experiments. The key lithological units were defined in the well using a combination of logs, core photographs, information from the NPD Fact pages, and post-well 330 reports. Sonic and density logs were used to generate a well synthetic seismogram, which was tied to the seismic. Using the same hanging wall and footwall cutoff-lines as in the fault throw analysis, and the interpreted horizons as guiding surfaces, the well lithologies were projected onto the faults and used to generate a JL (Allan) diagram.

335
The geological modelling and volume calculations were conducted on the least and most densely interpreted experiments (Exps 1 and 4). This analysis was completed using a combination of structural and property modelling workflows in Petrel™, and the 5 x 5 km study area was considered to represent the limits of the hydrocarbon field. Firstly, fault and horizon surfaces from Sect. 3.1 were used to create a structural model for each experiment (Fig. 5a). A 3D corner-340 point grid was generated, and the cells were then populated between the top Fuglen and top Fruholmen horizons using a grid cell size of 12.5 x 12.5 x 1 m (i, j, k direction), matching the resolution of the original horizon surfaces (Fig. 5b). These two horizons define the main reservoir interval (Fig. 5e; Linjordet and Olsen, 1992;Ostanin et al., 2012). In the depth (k) direction, the cells were divided using the proportional method with an approximate thickness of 345 1 m (~250 cells in total between the Fuglen and Fruholmen top surfaces). The grid follows the shape of the interpreted horizons precisely and the grid pillars align with the fault dip, making an accurate geological representation (Fig. 5b). The faults were included into the grid as zig-zag faults, meaning they were not precisely represented in i and j, but the detailed grid resolution 13 cancelled out most of this effect. Facies and porosity data (Fig. 5c) were upscaled from the logs 350 of a single well (NO 7120/6-1) to the grid cells at the well locations, and then they were populated across the structural models for each experiment. The facies were extrapolated using the sequential indicator simulation method (Fig. 5d). For simplicity, all sands were considered to be net reservoir. A constant oil saturation of 0.9 was used over the whole model for cells located inside the oil-leg. Finally, an area wide oil-water contact (OWC) was placed at a depth of 2420 355 m, the deepest point of the top Fuglen surface within the model area, to simulate a spill point with a footwall trap. Volumes were calculated, including gross rock volume, pore volume, and in-place hydrocarbon volume (STOIIP) for both Exps 1 and 4 (Fig. 5f). This simplified modelling was used to quantify the effects of interpretation methodology on the hydrocarbon related volume calculations.

360
For the volume calculations, there was a concern that any differences between Exps 1 and 4 might be caused, or at least exaggerated, by the stochastic facies and porosity modelling.
Different facies and porosity realizations will result in different volumes. We needed to be certain that any variations in volumetric were caused by the different interpretation methods, and not by the stochastic property modelling. Several options were examined to negate this 365 possibility. As the grids are identical in their i, j, k dimensions, it was expected that Petrel™ would produce the same realization in the two grids when the same seed number was selected; this proved to be an incorrect assumption. The method selected to make sure that the same realizations were being used, and to ensure that an extreme case was not being selected, was to 1) generate 100 realizations on the Exp 1 grid, 2) copy all 100 realizations to Exp 4 grid and 3) 370 run the volumetric analysis on all realizations for both grids. Once the volumes had been calculated for 100 realizations on each grid, they were analysed to determine the average volumes. This negated the possibility of selecting an extreme case. Using the same set of realizations in the two experiments, meant that the differences in volumes could be assigned, with certainty, to the differences in interpretation methods used.

Seismic interpretation
Five seismic interpretation experiments (Fig. 3) were analysed to understand the effect that the interpretation methodology has on the resulting fault and horizon surfaces.
14 Firstly, it is important to consider the areal coverage and visible patterns contained in the 380 interpretation before it is converted into surfaces (Fig. 3). When analysing the interpretation of the top Fuglen and top Fruholmen, Exps 1 to 4 have an increase in interpretation density (the horizon interpretation of Exps 4 and 5 is the same; Fig.3 a-d). All the horizon interpretations show the same general trends in topography, but as expected, the topography is more detailed and most sharply defined on the most densely interpreted data (Exps 4 and 5; Fig. 3d, e). The top 385 Fuglen is the most clearly imaged reflector, which resulted in complete interpretation coverage in all experiments (i.e. no gaps in the interpreted lines; Fig. 3). The clear imaging of this reflector is especially evident in the auto-tracked horizon in Exps 4 and 5 (Fig. 3,

top Fuglen). The top
Fruholmen is a poorly imaged reflector, which consequently resulted in gaps in the interpreted lines (Fig. 3, Fig. 6a-c). In these cases, it is possible to identify topographic features near the faults, which are clearly artefacts (Fig. 6a-b; Exps 1-2).

405
To better visualize the surface anomalies, thickness difference maps were generated between the surfaces of Exps 1-3 and the most densely surfaces of Exps 4-5. Visual inspection indicates that surfaces 1-3 all contain interpretation anomalies. The difference maps show a decrease in thickness difference with increasing interpretation density (Exps 1 to 3). The maps also show 15 that the top Fuglen surfaces are a closer match to the auto-tracked horizon than the top 410 Fruholmen (Fig. 7).
Exp 1 shows the most significant differences from the 3D auto-tracked horizons due to a sparse interpretation grid and the introduction of gridding anomalies (Fig. 7a)

Fault length and morphology
16 Fault polygons were displayed on structure maps (Fig. 6) and plotted graphically (Fig. 8, 9) to 440 show how fault length and morphology changes with the interpretation method. Generally, fault length on the interpreted horizons increases with interpretation density from Exp 1 (shortest faults) to Exp 4/5 (longest faults). These observations are clear for both the top Fuglen (Fig. 8a,   b) and the top Fruholmen (Fig. 8c, d). In Exp 5 (horizontal fault sticks), the eastern fault is longer than the fault interpreted by vertical fault sticks in Exp 4, while the western fault is shorter than 445 in Exp 4 (Fig. 8).
The morphology of the faults also changes with interpretation. In Exp 1, there is a minimal amount of interaction between the two very straight faults forming the relay (Fig. 6a). In Exp 2, the faults are also straight and do not appear to interact (Fig. 6b). In experiments 3 to 5, the northward curvature and lengthening of the eastern faults towards the western fault increases, 450 which suggests that the relay is close to breaching or may even be breached (Fig. 6c-e). This near breach relay is evident in the top Fuglen for Exps 4 and 5, but it is less prominent in the top Fruholmen (Fig. 6d, e).
The effect of interpretation method on fault length is clearly seen in the graph of fault trace distance versus fault throw (Fig. 9). The data in these graphs were sampled on the interpreted 455 fault sticks and show that in Exp 1 there is minimal overlap between the two faults, and the amount of overlap increases towards Exp 4 ( Fig. 9a-d). For Exp 5, fault trace distance versus throw shows that the eastern fault is longer, while the western fault is shorter than Exp 4 ( Fig.   9e), which confirms our observations from Fig. 8.

Fault throw 460
Fault throw contours from all five interpretation experiments exhibit in general consistent patterns (Fig. 4a) on the eastern and western faults, but also some bullseye patterns (Fig. 10a).
The western fault has similar throw magnitudes across all experiments. The lowest throws occur on the eastern margin and the highest throws (up to 100 m) on the western side. With increasing interpretation density, the throw results for this fault appear smoother and more laterally 465 extensive. For example, in Exp 1 the western fault shows three separate bullseye patterns, while in Exps 2 to 4 it shows a progressively smoother throw distribution (Fig. 10a). For the eastern fault, the throw patterns are similar between experiments, but the throw magnitudes increase with increasing interpretation density (Fig. 10a). In Exp 1, fault throw reaches a maximum of ~150 m on the eastern side of the fault. For Exps 2 and 3, the results have slightly higher 470 maximum throw (~175 m), but they are segmented into geologically unrealistic bullseye patterns (Fig. 10a). In Exp 4, the maximum throw of the eastern fault is up to 200 m, and the results are more concentric, smoother, and geologically realistic than in Exps 1-3. Exp 5 (fault sticks interpreted on depth slices) shows similar patterns to those observed in Exp 4 but with more irregularities.

475
Fault trace distance versus throw also illustrates how fault displacement is influenced by the interpretation method (Fig. 9). As discussed before, the fault throw of all experiments is greater on the edges of the study area than near the relay (centre of the graphs in Fig. 9). For the western fault, the top Fruholmen is always displaced more than the top Fuglen. For the eastern fault, the top Fruholmen is displaced more than the top Fuglen in Exp 4/5 (Fig. 9d,

Juxtaposed lithology 495
Lithology data projected on to the fault planes can help us to understand how interpretation methods influence the evaluation of reservoir juxtaposition and the potential for fault sealing. All experiments were populated with the same lithological data from well NO 7120/6-1 (Fig. 1b, yellow dot), the only variation is the interpretation method. On a broad scale, the juxtaposition 18 diagrams for the five experiments look very similar on both the eastern and western faults (Fig   500   10b). The uppermost section of the faults is characterized by shale-shale juxtaposition (dark grey, western fault), or it has not been characterized due to a lack of conformable top Kolje distribution on the eastern side of the study area (light grey, eastern fault). The next unit down is a homogenous sand-sand interval, followed by a shale-shale section at the fault centres, which is segmented by thin sand-sand units. Finally, the deepest lithology juxtaposition is another 505 homogeneous sand-sand. On closer examination, however, comparison of the different experiments reveal that the lateral extent and definition of the intra-shale sand overlaps improve with increasing interpretation density (Fig. 10b). This is especially true when comparing the least dense seismic interpretation (Exp 1) to the densest one (Exp 4). Exp 5 follows the same pattern as Exp 4 in areas where the juxtaposed lithology ran smoothly, but there are some issues with the 510 juxtaposition (light grey triangle at base of eastern fault, Fig. 10b). This anomaly is caused by a limitation in the software, where the horizontal interpretation of the fault on depth slices results in some sections of the fault having vertical dips. It is not possible to generate juxtaposition diagrams in these vertical fault areas. The longitudinal strain (LS) patterns are similar to those observed in the DSG results (Fig. 11b).

Dip Slip Gradients (Longitudinal and Shear Strain)
The colour bar for longitudinal strain is set so any values outside a geologically realistic threshold (Freeman et al., 2010)  1-4 are within the acceptable threshold (Fig. 10b). These high values coincide with the area 530 between the top Kolje and top Fuglen. The eastern fault has the same LS bullseyes across its centre as observed in DSG, but they are mostly within the established threshold. In Exps 4 and 5, there are two above thresholds high (red) LS areas at the top Fruholmen level (Fig. 11b, black asterisks). All areas above threshold LS values (red, pink) are less than 250 m across.
For the shear strain (SS), the colour bar is also set to display geologically unrealistic values (+/-535 0.05; red and pink, Fig. 10c) (Freeman et al., 2010). Although SS highlights more problematic areas and places more stringent constraints to the interpretation, it indicates extreme highs and lows of SS at the overlap of the western and eastern faults, respectively (Fig. 10c, black arrows).
The overlapping sections of the faults are more laterally extensive from Exp 1 through to 5, which is reflected in the lateral extent of extreme SS. Localized (> 250 m) SS bullseyes highlight 540 some slight interpretation problems discussed before in relation to LS (Fig. 11c, black asterisks).
Due to the high degree of similarity between the experiments, no attempt has been made to analyse SS variations any further.

Reservoir modelling and hydrocarbon volume calculations
In order to test the implications of interpretation techniques on hydrocarbon volume calculations, 545 the least and most densely populated experiments (Exps 1 and 4) were input through a geological modelling workflow (Fig. 5). A 5 x 5 km geological model was generated for each experiment ( Fig. 12a, b) and used to calculate the bulk rock volume, pore volume and STOIIP (Fig. 9c, d).
There are significant differences in fault morphology, horizon resolution, and lithology distribution between the two geological models. In Exp 1, the surface anomalies observed in the 550 structure maps (Sect. 4.1, Fig. 6 arrows) are also evident in the 3D grid at the top and base of the gridded interval (Fig. 12a, Inset a, label 1). Since the top Fuglen and Fruholmen are used as the input to define the top and base of the gridded interval and the cells within, the surface anomalies also greatly impact the facies distribution in Exp 1, which undulate to match these anomalies.
These facies undulations can be observed on the exposed footwall of the eastern fault and on the 555 eastern geological model boundary, as the facies pull upwards towards the footwall (Fig. 12a, Inset a label 1). In Exp 1, there are also some problems with respect to the exposed fault planes, where some shale cells have bled up and down the fault planes creating unrealistic peaks ( Fig. 12 20 a, Inset label 2). This results in poor modelling of the relay ramp structure, although the exposed footwall and hanging wall blocks appear relatively smooth (Fig. 12a, Inset a label 3). 560 In Exp 4, the facies distributions do not have the same undulations that are observed in Exp 1. This result is more or less expected since these anomalies were not evident in the top Fuglen and Fruholmen, which define the grid. Flat, more geologically representative facies distributions are clear on the uplifted footwall of the eastern and western faults, and on the exposed eastern boundary of the model (Fig. 12b, Inset b label 1). A 'bleeding of facies' occurs on the margins of 565 the model and slightly on the edges of the faults (Fig. 12b). The relay ramp is much more clearly defined in this experiment than in Exp 1 (Fig. 12b, yellow arrow). The faults are better defined with respect to length and morphology, but the high density of interpreted fault sticks means that the fault planes have vertical jumps between grid cells in the 3D grid (Fig. 12b, Inset b label 3).
Bulk rock volume, pore volume, and oil (STOIIP) were calculated for both geological models, 570 using an oil water contact of 2420 m (Fig 12a, b; OWC). This contact was chosen to mimic a spill point at the lowest point of the top Fuglen. The volumetric analysis was run on each of the 100 realizations, the results presented are given as their average. The stochastic facies and porosity realizations used in these calculations were identical for the 2 experiments, which allowed any volume differences to be assigned to the impact of the resolution of the 575 interpretation. The volumetric calculations for Exp 4 were always slightly larger than Exp 1.
The bulk volumes for Exps 1 and 4 are 1548.7 and 1554.2 x 10 6 m 3 , respectively (a difference of 0.36%). For pore volume, values of 136.8 and 137.4 x10 6 m 3 were calculated from Exps 1 and 4 respectively, which is a difference of 0.46%. Finally, the calculation of oil in place (STOIIP) resulted in 123.1 x10 6 m 3 for Exp 1 and 123.7 x10 6 m 3 for Exp 4 (a difference of 0.46%).

580
The volumes in Exp 4 are slightly larger than in Exp 1, with the increase in the bulk rock volume carried through the pore-volume and STOIIP calculations. However, the percentage differences are very small, less than 0.5% for all metrics.

Implications on horizons and faults morphologies 585
The seismic interpretation method had a significant impact on all aspects of the fault analysis workflow. We found that both Exps 4 and 5 provided the most geologically accurate 21 representation of the morphologies of horizons, faults, and their intersections. The eastern fault was longest in Exp 5, while the western was longest in Exp 4, which suggests a combination of the methods (i.e. vertical and horizontal interpretation) would be the most rigorous approach to 590 fault interpretation. The horizons in Exps 4 and 5 were quick to interpret because of 3D autotracking, and they were also the most detailed. When interpreting the top Fuglen there was no need for a QC process since the imaging of this reflector was clear and the final surface did not contain any artefacts in the interpretation (Fig. 7, top Fuglen columns). The top Fruholmen needed some manual guidance/QC and did have some interpretation artefacts, but this was 595 unavoidable due to the poor seismic quality (Fig. 7, top Fruholmen columns). The interpretation of faults was slightly more time consuming for Exp 5 relative to 4, but the attribute volume increased the understanding of fault morphology and length compared to Exp 4 (see Fig. 8, fault lengths). Exp 1 is here considered to be a failure with respect to observed geological morphologies, and this methodology cannot be recommended as a method for fault 600 interpretation, even though it was very time efficient. The sparsity of the horizons and fault interpretations led to inaccuracies and gridding anomalies proportional to the spacing of the interpreted inlines (400 m), reduced fault length (Fig. 8, up to 400 m difference between Exps 1 and 4, western fault), and incomplete understanding of the relay morphology. Exps 2 and 3 were an improvement on Exp 1, as expected. They captured some important information but not as 605 much as Exp 4/5. The differences between Exps 2 and 3 were much less significant than those between Exps 1 and 2. As such, if manual interpretation of faults is required, Exp 2 should be considered as the minimum acceptable interpretation density for performing a detailed fault analysis workflow.
The two aspects of the fault analysis workflow that were the most effected by the interpretation 610 method were fault length and throw. Both the length and throw of the faults differed dramatically depending on interpretation density, which in turn had a large influence on the apparent morphologies of the faults and of the relay ramp (Figs 8-10). The knock-on effects of these are because the fault lengths and throws impact all other aspects of the workflow. Overall, comparison of the most and least densely interpreted datasets (Exps 4/5, 1 respectively) show 615 that the length, morphology and throws were different at both the top Fuglen and Fruholmen level (Figs 7-10).

22
The impact of the interpretation method on the length, morphology, and throw profiles in the relay is critical to understand its formation. Fault displacement/throw relationships in relay ramps are dependent on the stage of relay development in question (Fig. 13). In the first stage of 620 relay development, the faults do not overlap and therefore exhibit isolated fault throw profiles (Barnett et al., 1987; Fig. 13 a-b). Stage 2 of relay development is defined by the propagation of faults to form a relay ramp (Fig. 13c). Fractures break up the ramp (that in our case are subseismic resolution) and accommodate some of the strain of the relay (Larsen, 1988;Peacock and Sanderson, 1994). The throw profiles of the faults interact and the total throw of the overlapping 625 fault segments is accommodated by the relay ramp (Peacock and Sanderson, 1994 ; Fig 13d). The fault extents and throw profiles for Exp 1 (Fig. 9a, 10a) fall somewhere between stage 1 and 2, where there is a slight overlap of the faults, but a relay is only just starting to form (Fig. 6a). This is because Exp 1 does not properly capture the full length of the fault. Stage 3 of relay development is defined as when the faults have continued to propagate and fractures have begun 630 to spread through the relay structure, as it is near the maximum amount of strain it can accommodate (Long and Imber, 2012;Peacock and Sanderson, 1994). The propagation of the fault tips toward the relay and increased fault overlap are evident (Fig. 13e-f). Stage 4 of relay development defines the destruction (breaching) of the relay ramp and the formation of branch lines between the two relay forming faults (Peacock and Sanderson, 1994). The original tiplines 635 of the fault are no longer active, and the faults are now joined along branch-lines formed in the weakened and sheared ramp margins (Fig 13g-h). When analysing Exp 4/5, the morphologies are comparable to those observed in stage 3 of the relay formation. The northward propagation and curvature of the eastern fault tipline is clear, and there are likely fractures forming in the relay that are below the resolution of the seismic data. The relay in Exp 4/5 has not breached on either 640 the top Fuglen or Fruholmen level, although it is very close to breaching in Exp 5 at the top Fuglen ( Fig. 6d/e, 9d/e, 10a). The potential impact of a relay on a working hydrocarbon system and the implications of misinterpreting the relay are discussed in the upcoming Sect. 5.2.
A study of longitudinal and shear strain was completed to test the accuracy of the interpretation methods (Freeman et al., 2010). According to Freeman et al. (2010), longitudinal and shear strain 645 values in isolated faults should remain inside their defined threshold values (+/-0.1 and +/-0.05, respectively), in order for the interpretation to be deemed accurate. High and low values of longitudinal and shear strain were observed across all experiments, some of which are outside 23 these defined thresholds (Fig. 11 b, c). There is a high and low shear strain accumulation in all experiments on the western and eastern faults, respectively, and particularly in the parts of the 650 faults exhibiting overlap (Fig. 11 c). Freeman et al. (2010) stated that in the event of overlapping faults, higher shear strains (above their defined limit) are to be expected in the overlapping segments of the fault. The shear strain limits in this case are higher than could be expected from an isolated fault (Freeman et al., 2010). These highs and lows appear to change with interpretation density and align with the increased overlapping of the faults (Fig. 11 c, double 655 ended black arrows). There were some bullseye patterns (longitudinal and shear strain plots) which were outside of the fault overlap and outside of the defined threshold strains, these are interpreted to be artefacts produced by incorrect fault stick interpretations (Fig. 11 b, c black asterisks). It is possible that some of the bullseye patterns observed, which did not align with interpretation spacing are real and linked to the coalescence of faults during their formation (e.g. 660 Lohr et al., 2008), but this was outside the scope of this study and was not investigated further. It is important to note that interpretation accuracy with respect to longitudinal strain and shear strain was not the aim when running the initial interpretations, and therefore it is expected that some inconsistencies are present.

Interpretation and aspects of the petroleum industry
Relay ramps and the faults that define them have significant impact on sediment distribution pathways (deposition of reservoirs), fluid flow/ migration pathways, fault seal/juxtaposition and trap definition (e.g. Athmer et al., 2010;Athmer and Luthi, 2011;Botter et al., 2017;Fachri et al., 2013a;Knipe, 1997;Manzocchi et al., 2008aManzocchi et al., , 2010. By under interpreting the relay with 670 respect to fault length and throw (as discussed in Sect. 5.1, Exp 1), there is a clear misunderstanding of the stage of relay development, and therefore a misunderstanding of fault interactions. Exp 1 exhibits shorter faults with less throw and therefore a less defined relay (Fig.   14, left column). This under-interpretation of the relay will also have implications on our understanding of sediment distribution pathways (Fig. 14a). Compared with the relay interpreted 675 from Exp 4 (Fig. 14, right column), the results of Exp 1 (left column) also show: less laterally continuous extent of juxtaposed sand-on-sand resulting in different fault sealing (Fig. 14b), an unsuccessful fluid flow schematic where petroleum does not migrate towards the producer well 24 (Fig. 14c), and an under estimation of trap size because of the incorrect trap geometry (Fig. 14d).
These results are specific for our field area / relay morphology and of course, may differ with 680 changing field parameters. The important thing, however, is that significant differences can be generated by applying an interpretation method that is unsuitable for the scale of the structures that are being analysed.

The effect of interpretation on geological modelling
A geological modelling workflow was run on the least and most successful interpretation The bleeding of facies on the fault planes is caused by the low interpretation density and is easily avoided with a denser interpretation. Exp. 4 had more realistic horizon morphologies, more geologically realistic facies distributions, and much less facies bleed. The only problem with this interpretation was that the inline fault stick spacing resulted in linear cell anomalies and 705 unsmooth fault planes (Fig. 12b). Therefore, we suggest that when modelling, the removal of fault sticks in the fault's centre may provide clearer results. Deleting fault sticks is likely to result in some loss of detail in the faults structural morphology such as undulations or corrugations 25 (e.g. Needham et al., 1996;Resor and Meer, 2009;Ziesch et al., 2017). However, it is currently not possible to model these intricacies (and high density fault sticks) in a geologically realistic 710 manner using the current modelling softwares. The optimum interpretation strategy is therefore a balance between maintaining an adequate level of geological detail, and being able to produce a realistic and functioning geomodel.
Volumetric calculations using the two models revealed that the gross rock volumes were 0.35% larger in Exp 4 when compared to Exp 1, and both the in-place hydrocarbon volume (STOIIP) 715 and pore volume calculations of Exp 4 were 0.46% greater than Exp 1. These differences are small (and certainly much less than the normal uncertainty values considered in the industry), which suggest that for preliminary field analysis/ petroleum calculations, a detailed seismic interpretation is not all that important. However, this result has significant implications when upscaled to the fields dimensions -in this case the Snøhvit field. For simplicity in the 720 calculations, we take the values from the Norwegian Petroleum Directorate for field size, and the STOIIP in the entire Snøhvit area to be referencing an oil only field. In reality, the field contains gas, condensate and a small oil column (NPD, 2020). According to the Norwegian Petroleum Directorate, the Snøhvit field holds in place volumes of ~400 x10 6 m 3 oil equivalent (NPD, 2020). A STOIIP difference of 0.46% between Exps 1 and 4 on this field size is equal to ~1.84 725 x10 6 m 3 oil in place. This is equal to an underestimation of ~11.6 million barrels (1 m 3 oil = 6.29 bls) of in place oil in Exp 1 versus 4. The NPD lists the recovery factor of the Snøhvit field to be 64% (NPD, 2020), so only 7.4 million barrels can be considered recoverable. Assuming an oil price of 50 USD per barrel, this difference in interpretation method is equivalent to c. 370 million USD. Although this value is relatively small in the industry, it is staggering to see how 730 the inaccuracy in the calculation of petroleum reserves can be solely based on poor interpretation strategies, which are mistakes that are completely avoidable.

Horizons and horizon-fault intersections
The results showed that 3D auto-tracking (1 x 1 density) gave the best results in terms of detail in 735 the structure of horizons, horizon-fault intersections (cutoff-lines, throw, etc.), and it was the most time efficient option assuming relatively high-quality data. In the case of high-quality data and well-defined continuous strong seismic reflectors (e.g. top Fuglen), little manual quality 26 control of the interpretation is required. If the seismic data are of poorer quality, or the reflector in question is poorly imaged, discontinuous or changes seismic polarity, or there is significant 740 structural complexity and ambiguity, then it is important to reflect on the task at hand. This is because auto-tracking algorithms may fail or generate artefacts or erroneous results that require significant manual adjustment to correct. If fault seal or juxtaposed lithologies are critical to the field analysis, then a denser manual/ 2D auto-tracked method might be necessary and worth the significant time commitment (i.e. 8 x 8). If detailed structural analysis is not required, then a less 745 dense (i.e. 16 x 16) grid will give sufficient results for geological interpretation. A sparse interpretation spacing (i.e. 32 x 32) can give a geologically unrealistic and inaccurate representation of the subsurface, which could lead to critical errors in prospect or field evaluations and, as such, it cannot be recommended except for broad-scale regional understanding. These results assume a 12.5 m IL and XL spacing and may need to be adjusted in 750 the event of a different spacing.

Faults
The results of Exps 4 and 5 are very similar and give the most accurate picture with respect to fault extent, throw and morphology of the relay. When considering our experiments, it was difficult to capture the entire fault length if using less than a 4 IL interpretation spacing, but we 755 also found interpretation on horizontal time/depth slices to be a useful tool to accurately capture fault length. Therefore, the recommendations are to interpret faults on a minimum of 8 or 16 IL spacing for the main body of the fault, and on approaching tip lines or complex fault intersections, to decrease the line spacing in order to capture the full length, morphologies and relationships. We also recommend the combination of horizontal fault sticks and attributes to 760 understand fault morphology, fault extent, and to keep track of fault locations in 3D when interpreting horizons. The results shown here demonstrate that less than 16 IL spacing was insufficient to capture critical details required when performing fault interpretations, and as such should be avoided for critical prospect or field scale mapping. These results are also assuming an IL/XL spacing of 12.5 m and may need adjustment if the data differ.      Barnett et al., 1987;Elliott et al., 2012;Rippon, 1985;Watterson, 1987, 1988;Watterson, 1986;Wilson et al., 2009;Yielding and Freeman, 2016.