Tectonic inheritance controls nappe detachment, transport and stacking in the Helvetic Nappe System, Switzerland: insights from thermo-mechanical simulations

. Tectonic nappes have been investigated for more than a hundred years. Although geological studies often refer to a “nappe theory”, the physical mechanisms of nappe formation are still disputed. We apply two-dimensional numerical simulations of shortening of a passive margin, to investigate the thermo-mechanical processes of detachment (or shearing off), transport and stacking of nappes. We use a visco-elasto-plastic model with standard creep ﬂow laws, Drucker-Prager and von Mises 5 yield criteria. We consider tectonic inheritance with two initial mechanical heterogeneities: (1) lateral heterogeneity of the basement-cover interface due to half-grabens and horsts and (2) vertical heterogeneities due to layering of mechanically strong and weak sedimentary units. The model shows detachment and horizontal transport of a thrust nappe that gets stacked on a fold nappe. The detachment of the thrust sheet is triggered by stress concentrations around the sediment-basement contact and the resulting brittle-plastic shear band that shears off the sedimentary units from the sediment-basement contact. Horizontal 10 transport is facilitated by a basal shear zone just above the basement-cover contact, composed of thin, weak sediments that act as a décollement. Fold nappe formation occurs by a dominantly ductile closure of a half-graben and the associated extrusion of the half-graben ﬁll. We apply our model to the Helvetic nappe system in Western Switzerland, which is characterized by stacking of the Wildhorn thrust nappe above the Morcles fold nappe. The modeled structures, the deformation rates and the temperature ﬁeld agree with data from the Helvetic nappe system. Mechanical heterogeneities must locally generate effective viscosity contrast (i.e. ratio of stress to visco-plastic strain rate) of about three orders of magnitude to model nappe structures similar to the ones of the Helvetic nappe system. Our results indicate that the structural evolution of the Helvetic nappe system was controlled by tectonic inheritance and that material softening mechanisms are not essential to reproduce the ﬁrst order nappe structures. our results suggest that tectonic inheritance in the form of half-grabens and horsts has a strong impact on the development of fold and thrust nappes during crustal deformation. Our results indicate that two types of tectonic inheritance are important, namely the geometry and the magnitude of mechanical heterogeneities. The geometry of half-grabens and horsts controls the location of nappe initiation (Bauville and Schmalholz, 2017). The basement and sediments must, of course, have different mechanical strength, otherwise the geometry of the base- 25 ment would be unimportant. Our results suggest that tectonic inheritance was necessary to the evolution of the Helvetic nappe system, but not sufﬁcient. The results show that speciﬁc strength-contrast between basement and sediments and within to model nappe structures resembling Helvetic nappe effective viscosity basement order 24 Pa basement layer yield right Formation of the Morcles fold nappe is associated to the closing of a half-graben, bounded by basement massifs that form now the Aiguilles-Rouges and Mont-Blanc massifs, and the associated squeezing-out of sediments from this half-graben. Closing of the half-graben occurred by an overall distributed deformation without major reactivation of earlier half-graben normal faults and without major localized thrusting in the basement.


Introduction
Tectonic nappes were discovered more than a hundred years ago and are considered as typical tectonic features of orogenic belts (e.g. Price and McClay, 1981), particularly in the Alps (e.g. Lugeon, 1902;Termier, 1906;Argand, 1916;Tollmann, 1973;Trümpy, 1980;Escher et al., 1993;Pfiffner, 2014). Several definitions of a nappe have been proposed (see discussion in Price and McClay, 1981), for example, after Termier (1922): "A nappe is a rock packet not in its place, resting on a substratum 5 that is not its original one". Two end-member types of nappes are commonly distinguished, namely fold nappes and thrust nappes, or thrust sheets (e.g. Termier, 1906;Price and McClay, 1981;Epard and Escher, 1996). Fold nappes are recumbent folds exhibiting large-scale stratigraphic inversion, typically with amplitudes that are exceeding several kilometers. In contrast, thrust sheets are allochtonous sheets with a prominent shear zone or thrust at their base, but without a prominent overturned limb. The importance of tectonic nappes for orogeny, especially for collisional orogens, is nowadays well established, however, 10 the physical mechanisms of nappe detachment (or shearing off), transport and stacking are still disputed.
We focus here on the Helvetic nappe system in Western Switzerland (see next section for a geological overview), which is one of the birthplaces of the concept of tectonic nappes. In 1841, Arnold Escher mentioned a nappe (he used "Decke" in german) and a colossal overthrust ("colossale Überschiebung") during the presentation of a geological map of the canton Glarus, Eastern Switzerland (Escher von der Linth, 1841). Escher did not dare to publish his interpretation, but explained it in the 15 field to Roderick Murchinson, who published the overthrust interpretation in 1849 (Murchison, 1849), crediting Escher for the original observation. Bertrand (1884) argued also convincingly for an overthrust nappe (he used "masse de recouvrement" and "lambeaux de recouvrement" instead of nappe) in the Glarus region so that finally also Heim (1906) accepted the overthrust interpretation instead of the earlier preferred double-fold interpretation ("Überschiebungsfalte" instead of "Doppelfalte"). Although the important controversies and observations supporting tectonic nappes are related to the Glarus region, which is part 20 of the Helvetic nappe system, the true birth date of the nappe concept in the Alps, according to Trümpy (1991), is the publication by Schardt (1893) who worked in the Prealps, belonging to the Penninic domain (e.g. Escher et al., 1993). Schardt (1893) realized that Jurassic breccias have been thrust over Tertiary flysch and that large regions of the Prealps have been actually emplaced as a major overthrust. After decades of controversy, the existence of nappes was generally accepted approximately a century ago, revolutionizing Tectonics, Alpine Geology and orogeny in general (for historical reviews see Bailey 1935;Masson 25 1976;Merle 1998;Trümpy 1991;Dal Piaz 2001;Schaer 2010).
Since then, a considerable effort has been made in mapping the present-day structure of the Helvetic nappe system (e.g. Steck, 1999;Pfiffner et al., 2011). Structural and paleogeographic reconstructions have provided a valuable insight into the kinematics of nappe formation (e.g. Gillcrist et al., 1987;Epard and Escher, 1996;Herwegh and Pfiffner, 2005;Bellahsen et al., 2012;Boutoux et al., 2014). Therefore, the geometrical structure and kinematic evolution of the Helvetic nappe system 30 is reasonably well understood. There are also theoretical and analogue modeling studies investigating the formation of foldand-thrust belts and nappes (e.g. Bucher, 1956;Rubey and King Hubbert, 1959;Dietrich and Casey, 1989;Merle, 1989;Casey and Dietrich, 1997;Wissing and Pfiffner, 2003;Bauville et al., 2013;Poulet et al., 2014;Erdős et al., 2014;Jaquet et al., 2014;Ruh et al., 2014;Bauville and Schmalholz, 2017). However, the controlling physical processes of nappe detachment, transport and stacking, and the associated dominant rock deformation mechanism are still disputed. An overview of suggested driving forces and deformation mechanisms for nappe formation is given in Merle (1998, his chapter 3). For example, for fold nappes, many interpretations favor distributed shearing and dominantly ductile deformation mechanisms, such as dislocation or grainsize sensitive diffusion creep (e.g. Ramsay et al., 1983;Gillcrist et al., 1987;Ebert et al., 2008;Bauville et al., 2013). However, there are also interpretations arguing for localized thrusting and dominantly brittle-plastic deformation mechanisms, such as 5 fracturing related to fluid pressure (e.g. Boyer and Elliott, 1982;Granado and Ruh, 2019). Furthermore, the presumed driving forces of nappe transport are either external surface forces, due to tectonic compression, or internal body forces, due to gravity.
Heterogeneous shearing due to buttressing in a general compressional regime is an example of deformation driven by external forces (e.g. Ramsay et al., 1983;Epard, 1990;Bauville et al., 2013;Boutoux et al., 2014). Gravity gliding and spreading is an example of deformation driven by body forces (e.g. Durney, 1982;Merle, 1989;Merle and Guillier, 1989). For thrust 10 sheets, the prominent low-angle thrust planes are likely controlled by mechanical heterogeneities, such as the orientation of the basement-cover interface and of mechanically weak shale-rich or evaporite layers, as has been suggested for the Helvetic nappe system (e.g. Pfiffner, 1993;Steck, 1999;Pfiffner et al., 2011;Bauville and Schmalholz, 2017). Several softening mechanisms have been proposed to localize deformation at the base of the thrust sheet, such as reduction of the effective stress due to pore fluid pressure causing a reduction of the effective friction angle (e.g. King Rubey and King Hubbert, 15 1959) or a dominantly ductile deformation mechanism (e.g. Smoluchowski, 1909;Goguel, 1948;Voight, 1976), presumably in combination with thermally-, chemically-or mechanically-activated softening mechanisms (e.g. Poirier, 1980;Ebert et al., 2008;Poulet et al., 2014).
To make another step towards understanding the physical process of nappe formation, we investigate the detachment, transport and stacking of nappes with two-dimensional (2D) numerical simulations based on continuum mechanics. To keep the 20 model relatively simple, we focus here on thermo-mechanical processes on the macro-scale, larger than the typical size of mineral grains. Hence, we do not consider hydro-chemical couplings, such as fluid release by carbonate decomposition (e.g. Poulet et al., 2014), and micro-scale processes, such as micro-structural grain size evolution with secondary phases (e.g. Herwegh et al., 2011). The numerical algorithm is based on the finite difference method. We consider a standard visco-elasto-plastic deformation behavior, heat transfer and thermo-mechanical coupling by shear heating and temperature-dependent viscosities. 25 We also apply velocity boundary conditions that are standard for modeling accretionary or orogenic wedges (e.g. Buiter et al., 2006). For the comparison between model results and natural observations, we consider a geological section across the Helvetic Nappe System in Western Switzerland. This section is characterized by two deformed basement massifs, the Aiguilles-Rouges and Mont-Blanc massifs, a fold nappe, the Morcles nappe, and a thrust nappe, the Wildhorn super-nappe, that has been overthrust, or stacked, above the underlying fold nappe (Fig. 1). In our models, we consider the tectonic inheritance of the Mesozoic 30 passive margin formation in the form of simple half-grabens and horsts, because the Helvetic nappe system resulted from the inversion of the pre-Alpine European passive margin (e.g. Trümpy, 1980). We consider two main orientations of inherited mechanical heterogeneities: (1) a lateral variation of mechanical strength due to the lateral alternation of basement and sediments associated with the half-graben structure and (2) a vertical variation of strength due to (i) the basement-cover interface, (ii) the alternation of strong carbonate with weak shale-rich units (so-called mechanical stratigraphy after Pfiffner (1993)) and (iii) the pressure and temperature sensitivity of rock strength and effective viscosity, respectively.
The main aim of this study is to show that a thermo-mechanical model based on the theory of continuum mechanics (i) with a well established visco-elasto-plastic rheology based on standard flow laws, (ii) with mechanical heterogeneities mimicking pre-Alpine extensional heritage and stratigraphic layering and (iii) with a wedge-type compressional configuration can self-5 consistently explain the first-order features of nappe detachment, transport and stacking in the Helvetic nappe system.

Short overview of the Helvetic Nappe System in Western Switzerland
The Helvetic nappe system is commonly subdivided into Infrahelvetic, Helvetic and Ultrahelvetic units (Fig. 1c) (e.g. Masson et al., 1980;Escher et al., 1993;Pfiffner et al., 2011). The nappes consist mainly of Jurrasic to Paleogen sediments that were deposited on the Mesozoic European passive margin before the Alpine orogeny (Fig. 1a). This passive margin inherited 10 half-grabens and horsts from the Mesozoic, pre-Alpine extensional phase (e.g. Masson et al., 1980;Escher et al., 1993). The stratigraphy of the nappes is generally characterized by shale-rich units, totaling several kilometers in thickness, and two major units of massive platform carbonates, the so-called Quinten (Malm) and Urgonian (Lower Cretaceous) limestones, with a thickness of several hundred meters (e.g. Masson et al., 1980;Pfiffner, 1993;Pfiffner et al., 2011).
In the studied section, along the Rhone-valley near Martigny (Switzerland), the Infrahelvetic units form the Morcles fold 15 nappe (e.g. Steck, 1999). This recumbent fold nappe is strongly deformed, but is still connected to its original position of deposition, the Mesozoic half-graben between the Aiguilles-Rouges and the Mont-Blanc massifs (Fig. 1a). Therefore, the Morcles nappe is considered as a parautochtonous unit and its root zone, is termed the Chamonix zone (Fig. 1c). The sediments forming the Helvetic nappes have been deposited on more distal regions of the European passive margin than the units forming the Morcles nappe. The original regions of deposition of the Infrahelvetic and the Helvetic units have been presumably separated 20 by a horst, or basement high (Fig. 1a). The Helvetic nappes have been thrust above the Infrahelvetic units. In the studied region, the Helvetic nappe is termed the Wildhorn super-nappe, because it can be subdivided into the Diablerets, Mont Gond and Sublage nappes (Fig. 1c;Escher et al. (1993)). Due to the Rhone valley associated with the Rhone-Simplon fault, the Helvetic nappes cannot be continuously traced back to their original position of deposition (Fig. 1c). The Ultrahelvetic units have been depositied on more distal regions than the Helvetic units (Fig. 1a). Today, the Ultrahelvetic units are found in front and between 25 the Morcles and Wildhorn nappes (Fig. 1c).
During the Alpine continental collision, the Ultrahevetic units and the Penninic nappes, originating from more distal positions, have been thrust above the original deposition regions of the sediments forming today the Morcles and Wildhorn nappes ( Fig. 1b) (e.g. Epard and Escher, 1996). These sediments were subsequently sheared off from their original position of deposition and were transported several tens of kilometers towards the foreland, along a northwest transport direction (e.g. Epard 30 and Escher, 1996;Ebert et al., 2007). The present day nappe structure represents a thick-skinned tectonic style because the crystalline basement of the Aiguilles-Rouges and Mont-Blanc massifs exhibits significant deformation (Fig. 1c).
The above tectonic scenario is supported by peak metamorphic temperatures of the Helvetic nappe system, which range between 250-385 o C (Kirschner et al., 1996(Kirschner et al., , 1995Ebert et al., 2007Ebert et al., , 2008 increasing downwards and towards the root zone. The maximal depth of burial of the Morcles nappe has most likely exceeded 10 km and was achieved between 29 Ma and 24 Ma ( Fig. 1b) (Kirschner et al., 1996(Kirschner et al., , 1995. In the studied section, the carbonate layers are strongly folded indicating significant internal ductile deformation of the nappes (Fig. 1c). The Morcles fold nappe is characterized by strong parasitic folding in its 5 frontal part and by a ca. 20 km long, highly stretched inverse limb. The Wildhorn super-nappe also exhibits significant internal deformation, such as the isoclinal fold separating the Diablerets and Mont Gond nappes (Fig. 1c). These observations indicate that in the studied region ductile deformation was significant during formation of the nappes.

Mathematical model 10
Our mathematical model is based on the concept of continuum mechanics (e.g. Mase and Mase, 1970;Turcotte and Schubert, 2014). We assume slow (no inertial forces), incompressible deformation under gravity. Heat production and transfer by conduction and advection are considered. Thermal evolution and deformation are coupled by temperature dependent viscosities and shear heating, that is, dissipative deformation is converted into heat to conserve energy. The governing system of partial differential equations is solved numerically. The applied equations are described in detail in Schmalholz et al. (2019). The 15 applied numerical algorithm is based on the finite-difference/marker-in-cell method (e.g. Gerya and Yuen, 2003). The diffusive terms in the force balance and heat transfer equations are discretized on an Eulerian staggered grid, with a resolution of 3001 × 1001 (width×height). Advection and rotation terms are treated explicitly using a set of Lagrangian markers and a 4th order in space / 1st order in time Runge-Kutta scheme. The topography in the model is a material interface defined by a contour line with Lagrangian coordinate points, which is advected with the computed velocity field (Duretz et al., 2016). With ongoing 20 deformation, the distance between neighboring points can increase and achieve the size of the numerical finite difference cells.
In such case this contour line is locally remeshed by adding additional points in the deficient contour line segments.
We consider a visco-elasto-plastic deformation behavior and assume a Maxwell viscoelastic model and Drucker-Prager and von Mises yield criteria (see details in Schmalholz et al., 2019). In the applied creep flow laws, we add a constant pre-factor f to the dislocation creep flow laws where the expression to the right of f corresponds to the effective viscosity from standard dislocation creep flow laws determined by rock deformation experiments. In equation (1),˙ disII is the square root of the second invariant of the dislocation creep strain rate [s −1 ], and T is temperature [K]. All other parameters are explained and listed in Table 1.

3.2 Model configuration
The applied model configuration mimics a 200 km long section of the upper crustal region of a simplified passive margin ( Fig. 2). We consider four model units with distinct mechanical properties, namely basement, cover, strong layer and weak unit. The basement unit represents the crystalline basement, the cover unit represents the Ultrahelvetic and Penninic nappes, the strong layer represents the main carbonate layers (Malm and Urgonian) and the weak unit represents the shale-rich units. 5 The initial geometry of the basement unit represents the crystalline upper crust of a passive continental margin with 15 km thickness, tapering down to 5 km thickness (Fig. 2). The Infrahelvetic basin is represented by an idealized half-graben that is 5 km deep and 25 km wide. The Infrahelvetic and the more distal (right side of the model) Helvetic basin are separated by an idealized horst structure. We cover the entire passive margin structure with sediments, to obtain a total (basement + sediments) model thickness of 25 km. The model stratigraphy consits of three units (cover, strong layer and weak units) and each unit has  Table 1.

Results
First, we present the main results of a reference simulation, for the configuration described above, and then results of simulations in which some parameters are varied. All simulations show some common, general features: With increasing bulk 30 shortening, the initially flat topography is increasing, mostly around the right model boundary, representing a "back-stop" (Fig.   3). The models develop a wedge shape with a topography tilting towards the left side of the model. With progressive shortening, the increasing topography reaches the left model boundary and the topographic slope diminishes, generating again a flatter topography. Also, basement deformation occurs initially around the bottom right corner and progressively propagates towards the left (Fig. 3). With progressive shortening, the model thickens and the sedimentary units above the basement become relatively thicker than the underlying basement because the sediments are thrust above the basement. The thickened sedimentary cover results in increasing basement temperatures, hence decreasing its viscosity. The temperature increase at the top of the basement is documented in figure (3) by the position of the 300 o C isotherm. Such basement temperature increase and the 5 related shift to thick-skinned deformation was also reported by Bauville and Schmalholz (2015) in their numerical models of fold-and-thrust belts. Basement deformation results in the partial or total closure of the half-graben and associated extrusion of the basin fill. The specific model evolution, however, depends on the applied flow laws and model stratigraphy, which will be discussed in comparison with the reference simulation.
4.1 Reference model 10 We apply the configuration and parameters described in the previous section and displayed in figure 2 to generate a reference simulation (Figs. 3 and 4). Initially, elastic stress builds up during a few hundred thousand years until the brittle-plastic yield stress and the steady-state ductile creep stress are reached. The brittle-ductile transition occurs at about 6-8 km depth. We quantify deviatoric stress magnitudes with the square root of the second invariant of the deviatoric stress tensor, τ II , and maximal deviatoric stresses reach ca. 250 MPa at the brittle-ductile transition (Fig. 4). Maximal strain rates in the developing 15 shear zones are between 10 −13 and 10 −12 s −1 , in broad agreement with strain rate estimates for natural shear zones (e.g. Pfiffner and Ramsay, 1982;Boutonnet et al., 2013;Fagereng and Biggs, 2018). The largest stresses develop around the brittleductile transition in the cover, whereas stresses in the basement and in the strong layer are significantly smaller (Fig. 4). During the initial stages of deformation, the strong layer of the right basin is gently folding, or buckling (Fig. 3a). Stress 25 becomes concentrated around the contact of this strong layer and around the basement horst ( Fig. 5m) causing increased strain rates in this region. With progressive deformation, a localized shear zone dominated by brittle-plastic deformation develops across the strong layer, eventually detaching it from the basement (Fig. 5j to l). This shear zone develops within the strong layer so that a small piece of the strong layer remains attached to the basement (Fig. 5t). The detachment of the strong layer causes a significant stress drop in the strong layer and the basement (Fig. 5m to p). Once detached, the strong layer and parts of the 30 underlying weak unit passively move sub-horizontally over the horst initiating the horizontal nappe transport. Quantification of elastic strain rates shows that elastic deformation is active during the detachment process and that, hence, elastic stresses are not completely relaxed through visco-plastic deformation ( Fig. 5e to h).
During the detachment, some parts of the weak cover, originally residing above the strong layer, are dragged below the detaching strong layer (Fig. 5). During the horizontal transport, the detached unit, consisting of the strong layer and some weak units, is displaced above the cover material. Significant horizontal transport is facilitated because the underlying basement and the strong layer of the left half-graben are significantly more competent than the weak units at the base of the overthrusting nappe. 5 While the detached unit from the basin on the right is overthrusting the fill of the half-graben left, this fill is also sheared out of the half-graben due to (i) shear stresses generated by the overthrusting unit and (ii) closure of the half-graben due to ductile deformation of basement units. During overthrusting, some parts of the cover units are incorporated between the overthrusting unit and the fill of the left half-graben. Finally, a nappe consisting of the fill from the basin on the right has been stacked above a fold nappe made of materials from the left half-graben. The entire process of nappe detachment, transport and stacking occurs 38 % after ca. 8 Myr.

Impact of varying strength contrast
We performed three simulations with the same initial geometry and final bulk shortening as the reference simulation, but with 15 modified pre-factors, f , in the applied flow laws. In a first simulation, we used a smaller effective viscosity for the basement only (f = 0.33). Here, the basement is weak enough to deform significantly from the onset of shortening. Yield stresses are not reached at the contact of the basement horst with the strong layer (Fig. 6a). The strong layer does not detach from the basement and overthrusting does not take place. Instead, a several km large fold nappe develops in the strong layer of the basin on the right. Due to the highly distributed basement deformation, the half-graben closes only partially, resulting in a moderate 20 buckling of the strong layer, but not in the formation of a fold nappe. Also, a nappe stack does not form in this simulation.
In a second simulation, we used a stronger cover (f = 0.5 instead of f = 0.1). The effective viscosities of basement and cover are similar, as a result a mostly evenly distributed thick-skinned deformation takes place ( Fig. 6c and d). A large scale fold develops above the horst, but the overturned limb eventually detaches from the basement by necking. Although the overthrusting stage results in a significant horizontal displacement, this displacement is only half that observed in the reference simulation 25 and not sufficient to form a nappe stack. Due to the stronger shear drag from the top, the strong layer of the half-graben on the left is almost entirely sheared out. The strong layer of the left half-graben forms an overthrust nappe with significant horizontal displacement and with significant internal extension.
In a third simulation, we used weaker strong layers (f = 0.33). The development of the sediment units of the basin is largely similar to that in the reference model (Fig. 6). The only notable difference is that before the strong layer is detached from the 30 basement, it forms a shear fold that is on the scale of a few km. The development of the units of the left half-graben is largely different compared to the reference simulation. Since the strong layer is weaker, the drag from the overriding units is sufficient to detach the strong layer from its left contact with the basement and displace it several tens of km to the left. Due to significant horizontal displacements, a nappe stack forms with two thrust sheets on top of each other. However, the strong layer from the left half-graben is displaced considerably further towards the left than the strong layer from the basin on the right.
The final result of the three simulations differs significantly from the result of the reference simulation, although the effective viscosities of individual model units have been modified by factors of only three to five (Fig. 6). The results indicate that the effective viscosity contrast between the model units has a first-order impact on the results. 5 At the onset of nappe formation, after ca. 5% bulk shortening, the reference simulation and the three simulations with different f factors exhibit different distributions and magnitudes of effective viscosity (Fig. 7). The effective viscosity at the top of the basement is in the order of 10 24 Pa.s and the viscosity of the cover directly above the basement is at least one order of magnitude smaller. The strong layers have locally similar maximal effective viscosities than the top basement in the order of 10 24 Pa.s. The effective viscosity contrast between strong layer and weak units in the basin on the right is ca. three orders of 10 magnitude (Fig. 7a). The above mentioned viscosity ratios between model units are required to generate the nappe detachment, transport and stacking in the reference simulation. In the three models with different f factors in some model units, one of these viscosity ratios is different and, hence, the final result differs from the one of the reference simulation (Fig. 7). In other terms, slight changes in effective viscosity distribution leads to different outcomes.

15
To test the impact of the vertical strength distribution in the basement, we performed four simulations with the same configuration as the reference model, but we limit the deviatoric stress in the basement to 25, 50, 75 and 100 MPa, respectively ( Fig.   8a-d). The stress limitation is achieved by setting, in the basement only, the cohesion (C) to the stress limit and the friction angle to zero. A zero friction angle simulates effectively a pressure-insensitive von Mises yield criterion, which can mimic low temperature plasticity, such as Peierls creep. In all four simulations, the detachment of the strong layers from the basement in 20 the distal basin is caused by a plastic shear zone that cuts through the edge of the horst and shears off a small basement sliver ( Fig. 8a-d). At the half-graben, however, the deformational style depends on the value of the cohesion (C). For C = 25 MPa, the strong layer in the half-graben is scratching off the top of the left basement and forms a thrust nappe (Fig. 8a). For C = 50 MPa, the strong layer still scratches off a bit of the basement top, but forms now a fold nappe. For increasing values of C, the amount of strain in the basement is decreasing, hence the shape of the nappe from the half-graben is becoming more similar to 25 the shape in the reference simulation ( Fig. 8b-d).

Impact of multilayers
We also run simulations in which we replaced the single strong layer in the reference model with two thinner ones that are separated by weak units. We run three simulations with different initial thickness distributions of the two strong layers and alternating weak units. The initial thickness configuration is displayed on the right of the three panels in Fig. 9. The material 30 parameters of every unit are the same as in the reference model. The basement deformation agrees with the one in the reference model. The deformation of the strong layers is different.
On the right side, the strong layers form initially shorter wavelength (due to their smaller thickness) buckle folds, in agreement with the dominant wavelength theory (e.g. Biot, 1961;Schmalholz and Mancktelow, 2016). In the simulations, where the upper strong layer rests directly below the cover (Fig. 9b, c), the top layer is being detached and transported in a similar fashion as in the reference model. The lower layer, on the other hand, forms a fold nappe first, with an extremely thinned inverse limb. Eventually this inverse limb develops boudinage, and once necking takes place it detaches from the basement. In 5 the simulation, in which a weak unit is located between the upper strong layer and the cover (Fig. 9a), both layers form folds and detach from the basement horst by necking in the inverse limb.
Around the half-graben on the left, the deformation of the units is similar to the reference model, when weak units are located between the upper strong layer and the cover (Fig. 9a). The main difference to the reference simulation is that the weak unit located on top of the half-graben is sheared out, and both strong layers form a fold nappe with a more intensely stretched 10 inverse limb. In the models, in which the upper strong layer is in direct contact with the cover (Fig. 9b, c), the drag from the overriding unit is sufficient to displace this layer considerably horizontally. Drag from the top shears the upper strong layer of the left half-graben above the basement to the left and it detaches the layer from the half-graben. As a result, buckle and shear folds form around the left tip of the layer (Fig. 9b). The upper strong layer starts moving sub-horizontally without considerable internal deformation, and eventually forms a rootless nappe. The lower strong layer of the half-graben stays mostly in place, 15 until the weak units are extruded from the half-graben due to its closure. Then, the lower strong layer forms a fold nappe, with a highly stretched inverse limb (Fig. 9b, c).

Impact of softening mechanisms
We also test the impact of two different softening mechanisms, activated in the reference simulation, that can enhance strain localization (Fig. 10). The first mechanism is thermal softening by shear heating due to the conversion of mechanical work 20 into heat and the resulting decrease of the temperature-dependent viscosity (e.g. Yuen et al., 1978;Kaus and Podladchikov, 2006;Jaquet and Schmalholz, 2017;Kiss et al., 2019). Although this mechanism is activated in all simulations, for the applied 1 cm.yr −1 convergence velocity its impact on structure development is negligible. However, with 5 cm.yr −1 convergence velocity, thermal softening is sufficient to cause spontaneous shear zone formation (Kiss et al., 2019). In the higher velocity simulation, prominent ductile shear zones are formed in the cover that also promote more localized brittle deformation zones 25 (Fig. 10a). Heat production in the ductile shear zone raises the temperature of the units close to the "back-stop" on the right side of the model. Thus, at this location, the basement deformation is more intense, whereas the left half-graben in the basement is not being closed and the sediment fill is not being squeezed out (Fig. 10a).
The other considered softening mechanism is frictional-plastic strain softening. Such softening is frequently applied in numerical models of crustal deformation in order to enforce highly-localized brittle deformation by decreasing the friction 30 angle as a function of accumulated plastic strain (e.g. Buiter et al., 2006). Such softening algorithm induces mesh dependence of load bearing capacities, but we apply it here for comparison (e.g. Buiter et al., 2006;Erdős et al., 2014;Ruh et al., 2014). We used two different parameter sets to model strain softening. In the first case (Fig. 10b), we start with a friction angle of 30 o that we linearly decrease to 5 o between an accumulated plastic strain of 0.5 and 1.5, which is a strain interval typically considered in geodynamic models with frictional strain softening (e.g. Erdős et al., 2014). Compared to the reference simulation, we observe strongly localized brittle deformation that is characterized by high angle ( 0 o from horizontal) and small displacement (< 10 km) overthrusts. This is the only simulation, where a strong back-thrust forms over the right basin that also deforms the strong layer. In the second case (Fig. 10c), we start with a friction angle of 15 o that we linearly decrease to 5 o between accumulated plastic strain of 0.5 and 1.5. Such initially lower friction angle is often suggested to mimic fluid-pressure reduced effective 5 friction angles (e.g. Erdős et al., 2014). In this simulation, the detachment mechanism of the strong layer from the right side of the horst is different than in the reference simulation. The initial buckling and folding phase is entirely missing, and plastic yielding dominates from the beginning of deformation. Initially, the angle of thrusting is ca. 35 o from the horizontal. Once a sufficient amount of weak units are sheared on top of the horst, the transport direction is sub-horizontal. Similarly to the other simulations with significant softening mechanisms, the basement around the half-graben is only deformed to a small degree 10 and the half-graben is not closed.

Numerical robustness
We investigated the impact of different numerical resolutions on the model results to test the robustness of these results. Such test is important, because of weak and thin material between strong material can cause mechanical decoupling but only when 15 resolved numerically. We compare the reference model with an original resolution of 3001×1001 (width×height) numerical grid points (initially 66×25 m grid spacing) with two simulations having identical configuration and parameters, but with smaller resolutions of 1501×501 (initially 133×50 m) and 751×251 (initially 267×100 m). The resulting structures after 38% of bulk shortening are essentially identical (Fig. 11). Similarly, the strain rate fields below the brittle-ductile transition are similar too. However, the strain rate distribution in the brittle part and around the brittle-ductile transition is resolution 20 dependent (Fig. 11). This is typical for the applied non-associated plasticity scheme with the Drucker-Prager yield criterion, that is merely a stress limiter, inhibiting the stresses to exceed the failure limit. Thus the exact geometry of the brittle-plastic shear bands is resolution dependent, but the effective load bearing capacity of the brittle layer converges with increasing resolution (Yamato et al., 2019, their appendix). Keeping in mind these limitations regarding the brittle-plastic deformation, the results in our main area of interest, that is the ductile nappe stacking, are essentially independent on the resolution within 25 the studied range. Hence, our results are numerically robust concerning the detachment, transport and stacking of nappes under dominantly ductile deformation.

Comparison of the model results with the geological observations
There are several features of the Helvetic Nappe System that we could successfully reproduce in our thermo-mechanical model. Similarly to Bauville and Schmalholz (2015), a structure resembling a fold nappe has been formed by the extrusion of the sedimentary fill from a half-graben. During formation of this fold nappe, the half-graben has been closed and the sediments squeezed between the two basements resemble the structure of the Chamonix zone located between the Aiguilles-Rouges and Mont-Blanc massifs (Fig. 1c). Hence, our model generated the first-order structural features (i.e. features larger than the applied numerical resolution of 66×25 m) of the Infrahelvetic complex in W-Switzerland, namely a recumbent fold nappe with a root located between two deformed basement massifs. Additionally, our model reproduced the detachment and sub-horizontal transport of sedimentray units from a passive margin. The thrust nappe, which originates from the basin on the right in our 5 model, resembles the Wildhorn super-nappe. The horizontal transport of this thrust nappe is in the order of 30 km in the model.
Furthermore, in the model this thrust sheet is stacked above the fold nappe and the final model structure resembles a thrust nappe that is stacked above a fold nappe, as observed in the Helvetic nappe system. Moreover, there is a considerable amount of cover units entrapped between the fold nappe and the thrust sheet. The entrapped lower region of the cover unit resembles the Ultrahelvetic units so that our model can explain how these Ultrahelvetic units have been entrapped between the Morcles 10 fold nappe and the Wildhorn super-nappe (Fig. 1c). At the end of the simulated formation of the nappe system, the maximum temperature in the nappe system ranges between 250 o C and 350 o C. The Wildhorn nappe exhibits peak temperatures between 250 o C and 300 o C and the underlying Morcles nappe hotter peak temperatures between 290 o C and 350 o C, which is in broad agreement with the metamorphic peak temperatures of the Helvetic nappe system reported by Kirschner et al. (1996) and Ebert et al. (2007). In the simulations, the nappe stack is formed within ca. 8 Myr, which is also in broad agreement 15 with the estimated time span of main formation of the Morcles fold nappe from ca. 28 to 17 Ma (Kirschner et al., 1995). The simulations with two thin strong layers, separated by weak units, can explain the significant parasitic, or second order, folding of the two main carbonate units (Quinten and Urgonian limestone formations) which is observed in the Wildhorn super-nappe.
Some features of the Helvetic Nappe System are not reproduced by the simulations. In the frontal part of the fold nappe, originating from the left half-graben, the front is first thrust out of the half-graben and the overturned limb develops subsequently. 20 This deformation generates a "nose-like" structure in the frontal part of the fold nappe, which is not observed. However, a stress limiter in the basement favours the formation of a fold nappe without the "nose-like" structure, indicating that the stress contrast between the basement top and the strong layer has a first-order control on the developing fold nappe geometry. Also, in all simulations the fold nappe has only a minor second order folding, in contrast with the prominent parasitic folds of the Morcles nappe. In the numerical models, we likely overestimated the amount of shale-rich sediments in the basin on the right, 25 mimicking the Helvetic basin, as the total volume of the Wildhorn super-nappe south of the Morcles nappes is much thinner than in the simulations. There was also a likely significant amount of vertical flattening, and presumably pressure solution related volume decrease, after the main phase of nappe formation and during the exhumation of the nappe system, which is not modelled in our simulations. Moreover, several basement shear zones have been mapped in the Aiguilles-Rouges and in the Mont-Blanc massifs, which are not present in the simulations. This is likely because (i) the straight bottom boundary of 30 the model may prohibit any significant vertical displacement of the basement units and hence inhibit significant shear zone formation, (ii) the model basement is mechanically homogeneous and there are no heterogeneities that can trigger shear zone localization and (iii) the amount of brittle-plastic deformation is underestimated in the basement. We considered a horizontal model base while during natural nappe formation the overall basement-cover interface was likely dipping, or tilting, in the direction of subduction (i.e. direction of basal velocity), so that a model base gently dipping towards the subduction direction zone be more realistic. The deformation at the base of our model is viscous and the surface slope for evolving crustal wedges with a viscous base depends on the viscous shear stress at the base, whereby larger shear stresses are related to higher surface slopes (e.g. Ruh et al., 2012). Keeping the basal viscous shear stress the same, a tilting of the model base towards the subduction zone would reduce the surface slope. Therefore, in our models the surface slopes towards the foreland (left) region represent high end-member surface slopes so that effects of gravity-related forces directed towards the foreland region are on the higher can be tested eventually with larger scale models.

Tectonic inheritance, mechanical heterogeneities and potential softening mechanisms
Geological reconstructions of the Helvetic nappe system showed the correlation of the nappes with their original positions along the pre-Alpine European passive margin, which was characterized by half-grabens and horsts (e.g. Epard, 1990;Boutoux et al., 2014). In agreement with previous modelling studies (e.g. Beaumont et al., 2000;Wissing and Pfiffner, 2003;Bellahsen 20 et al., 2012;Lafosse et al., 2016;Bauville and Schmalholz, 2017), our results suggest that tectonic inheritance in the form of half-grabens and horsts has a strong impact on the development of fold and thrust nappes during crustal deformation. Our results indicate that two types of tectonic inheritance are important, namely the geometry and the magnitude of mechanical heterogeneities. The geometry of half-grabens and horsts controls the location of nappe initiation (Bauville and Schmalholz, 2017). The basement and sediments must, of course, have different mechanical strength, otherwise the geometry of the base-25 ment would be unimportant. Our results suggest that tectonic inheritance was necessary to the evolution of the Helvetic nappe system, but not sufficient. The results show that specific strength-contrast between basement and sediments and within the sediments are required to model nappe structures resembling those of the Helvetic nappe system (Figs. 6 and 7). The reference simulation exhibits an effective viscosity contrast between weak units and strong layer and basement in the order of three to four orders of magnitude (Fig. 7a). Although the effective viscosity in the basement and strong layer is in the order of 10 24

30
Pa.s, the stresses in the basement and strong layer are far below the brittle-plastic yield stress and typically smaller than 100 MPa (Fig. 4). If the effective viscosity contrast between strong layer and basement is not large enough, then the sediments of the basin on the right side are not detached in the manner of a thrust sheet (Fig. 6a). One possibility to enforce detachment for smaller viscosity contrasts is the application of plastic strain softening and/or initially reduced friction angles ( Fig. 10b and c).
Application of strain softening favors the formation of thrust sheets in the models, but prohibits the formation of fold nappes ( Fig. 10b and c). The importance of tectonic inheritance and the pre-Alpine configuration on the nappe formation underlines the importance of considering geological field work and associated geological reconstructions for the model configuration, because only such field based reconstructions can provide estimates for the pre-Alpine configurations. Our results are consistent with those of Duretz et al. (2011) which showed that inherited mechanical heterogeneities, promoting large lateral strength 5 contrast, are essential to trigger exhumation of lower crustal granulites as observed in the Bohemian Massif. Generally, our results are consistent with a variety of studies, which show the importance of structural inversion of extensional systems during compressional deformation and are based on geological field observation, analogue deformation experiments and numerical models (e.g. Gillcrist et al., 1987;Buiter and Adrian Pfiffner, 2003;Buiter et al., 2009;Bellahsen et al., 2012;Bonini et al., 2012;Lafosse et al., 2016;Granado and Ruh, 2019).
For the model configuration, a significant localization due to thermal softening does not occur for a convergence velocity of 1 cm.yr −1 , but it does for 5 cm.yr −1 . Average convergence velocities during the Alpine orogeny are typically estimated to be in the order of 1 cm.yr −1 (Schmid et al., 1996). However, some short periods with higher convergence velocities cannot be excluded. So if there were short periods during the formation of the Helvetic nappe system with convergence velocities larger than ca. 5 cm.yr −1 , then thermal softening might have been important.

15
There is field evidence for grain size reduction associated with mylonitic shear zones at the base of nappes in the Helvetic nappe system (e.g. Ebert et al., 2007Ebert et al., , 2008. We did not consider the microscale grain size reduction in our models for several reasons: First, the major mylonitic shear zones with significant grain size reduction have a thickness in the order of 10 m. Although we use high resolution models we have a numerical grid size of ca. 66 × 25 m, hence, this resolution is still not large enough to resolve the internal deformation within shear zones having thickness of 10 m. Second, recent numerical simulations 20 including grain size reduction and combined diffusion and dislocation creep flow laws suggest that grain size reduction does not have a dramatic impact on strain localization (Schmalholz and Duretz, 2017), which is in agreement with theoretical results of Montési and Zuber (2002). The reason is that a piezometer-type stress to grain size relation, when subsituted into a grain-size-sensitive diffusion creep flow law, generates a power-law type flow law with stress exponents similar to the one of the corresponding dislocation creep flow law (e.g. Montési and Zuber, 2002). However, other studies argue that microscale 25 processes such as coupled grain evolution and damage mechanisms can generate significant strain localization and that these mechanisms have been responsible for generating subduction and plate tectonics (e.g. Bercovici and Ricard, 2014). Therefore, future simulations should consider such coupled microscale processes in order to quantify their importance on the first order tectonic nappe detachment, overthrusting and stacking.

30
Our 2D thermo-mechanical simulations support geological interpretations arguing that tectonic and structural inheritance controlled the tectonic evolution and resulting first order structures in the Helvetic nappe system. We show that both the geometry and magnitude of mechanical heterogeneities, representing the tectonic and structural inheritance, control the nappe formation.
The two main heterogeneities are caused by the laterally varying basement-cover interface, that is characterized by half-grabens and horsts, and by the vertical alternation of sedimentary layers with different mechanical strength. If these heterogeneities are included in numerical simulations, based on continuum thermo-mechanics, with standard creep flow laws and with Drucker-Prager and von Mises yield criteria, then the simulations can produce the first order features of nappe detachment, transport and stacking in the Helvetic nappe system. In our models, an effective viscosity contrast of approximately three to four orders 5 of magnitude between weak sediments and strong sediments and basement is essential to reproduce the first order tectonic features. Furthermore, our results suggest that it is not essential to consider additional rock mechanic processes, in addition to the applied creep and brittle-plastic deformation, such as grain size evolution, frictional strain softening or pore fluid pressurization in order to explain these first order features, although all these additional processes most likely occured during the formation of the Helvetic nappe system. Therefore, our model requires likely the least amount of assumptions concerning rock 10 deformation mechanisms for the explanation of the first order features of the Helvetic nappe system.
Based on the first-order agreement between our model results and natural data, we propose a thermo-mechanics based interpretation for the tectonic evolution of the Helvetic nappe system of Western Switzerland: The pre-Alpine European passive margin was characterized by significant mechanical heterogeneities due to a basement-cover contact with half-grabens and horst, and due to the alternation of mechanically strong and weak sediment units. During the continental collision, the pas-15 sive margin was deformed by external compressive stresses. During deformation, mechanical heterogeneities, and not material softening mechanisms, control the detachment, transport and stacking of nappes. Detachment of sedimentary units of the Wildhorn super-nappe is caused by stress concentrations around the contact between basement and sediments. The transport of the Wildhorn super-nappe was facilitated by weak shale-rich units and weak Ultrahelvetic units, which have been entrapped from above to below the Wildhorn super-nappe. Formation of the Morcles fold nappe is associated to the closing of a half-graben, 20 bounded by basement massifs that form now the Aiguilles-Rouges and Mont-Blanc massifs, and the associated squeezing-out of sediments from this half-graben. Closing of the half-graben occurred by an overall distributed deformation without major reactivation of earlier half-graben normal faults and without major localized thrusting in the basement.
Code availability. Not publicly available.
Data availability. Not publicly available.
Duretz, T., Kaus, B., Schulmann, K., Gapais, D., and Kermarrec, J.-J.: Indentation as an extrusion mechanism of lower crustal Duretz, T., May, D., and Yamato, P.: A free surface capturing discretization for the staggered grid finite difference scheme, Geophysical      13.9 14.25 15 Figure 9. The final geometry of three simulations with two strong layers with the isotherms of the corresponding temperature field. The initial model stratigraphy around the upper region of the half-graben and basin is displayed on the right of each panel. The model stratigraphy is laterally homogenous, so the overall initial configuration is similar to that in Figure (2).
30 Figure 10. The geometry and the strain-rate field of three simulations after ca. 30% bulk shortening, with various softening mechanisms.
Panels a) and b) show results of a simulation with a convergence rate of 5 cm.yr −1 , in which thermal softening has a significant impact.
Panels c) and d) show results of a simulation with strain softening that reduces friction angle from the initial 30 degrees to 5 degrees. Panels e) and f) show results of a simulation with strain softening that reduces friction angle from the initial 15 degrees to 5 degrees. Figure 11. The geometry and the strain-rate field of three simulations after ca. 38% bulk shortening for different numerical resolutions. Table 1. The list of the reference model parameters, where f is a custom pre-factor, A is the pre-exponential factor, n is the power-law exponent, Q is the activation energy, λ is the thermal conductivity, ρ ref is the density at reference pressure (P ref = 0 Pa) and temperature Qr is the radioactive heat production, C is the cohesion and φ is the friction angle. Some parameters have constant values: Cp = 1050 J.K −1 is the heat capacity, G = 10 10 Pa is the shear modulus, α = 3 × 10 5 K −1 is the thermal expansion coefficient, β = 10 −11 Pa −1 is the compressibility and F = 2 (1−n)/n 3 −(n+1)/(2n) is a geometry factor (needed to convert flow law parameters from axial compression experiments into an invariant form). The creep flow law parameters (A, n and Q) are: 1 Westerly granite (Hansen et al., 1983), 2 calcite (Schmid et al., 1977) and 3 mica. (Kronenberg et al., 1990).