Time-lapse gravity and levelling surveys reveal mass loss and ongoing subsidence in the urban subrosion prone area of Bad Frankenhausen/Germany

. We present results of a sophisticated, high-precision time-lapse gravity monitoring that was conducted over four years in Bad Frankenhausen (Germany). To our knowledge, this is the ﬁrst successful attempt to monitor subrosion-induced mass changes in urban areas with repeated gravimetry. The method provides an approach to estimate the mass of dissolved rocks in the subsurface. Subrosion, i.e. leaching and transfer of soluble rocks, occurs worldwide. Mainly in urban areas, any resulting ground subsi-5

Abstract. We present results of sophisticated, high-precision time-lapse gravity monitoring that was conducted over 4 years in Bad Frankenhausen (Germany). To our knowledge, this is the first successful attempt to monitor subrosioninduced mass changes in urban areas with repeated gravimetry. The method provides an approach to estimate the mass of dissolved rocks in the subsurface.
Subrosion, i.e. leaching and transfer of soluble rocks, occurs worldwide. Mainly in urban areas, any resulting ground subsidence can cause severe damage, especially if catastrophic events, i.e. collapse sinkholes, occur. Monitoring strategies typically make use of established geodetic methods, such as levelling, and therefore focus on the associated deformation processes.
In this study, we combine levelling and highly precise time-lapse gravity observations. Our investigation area is the urban area of Bad Frankenhausen in central Germany, which is prone to subrosion, as many subsidence and sinkhole features on the surface reveal. The city and the surrounding areas are underlain by soluble Permian deposits, which are continuously dissolved by meteoric water and groundwater in a strongly fractured environment. Between 2014 and 2018, a total of 17 high-precision time-lapse gravimetry and 18 levelling campaigns were carried out in quarterly intervals within a local monitoring network. This network covers historical sinkhole areas but also areas that are considered to be stable. Our results reveal ongoing subsidence of up to 30.4 mm a −1 locally, with distinct spatiotemporal variations. Furthermore, we observe a significant time-variable gravity decrease on the order of 8 µGal over 4 years at several measurement points.
In the processing workflow, after the application of all required corrections and least squares adjustment to our gravity observations, a significant effect of varying soil water content on the adjusted gravity differences was figured out. Therefore, we place special focus on the correlation of these observations and the correction of the adjusted gravity differences for soil water variations using the Global Land Data Assimilation System (GLDAS) Noah model to separate these effects from subrosion-induced gravity changes.
Our investigations demonstrate the feasibility of highprecision time-lapse gravity monitoring in urban areas for sinkhole investigations. Although the observed rates of gravity decrease of 1-2 µGal a −1 are small, we suggest that it is significantly associated with subterranean mass loss due to subrosion processes. We discuss limitations and implications of our approach, as well as give a first quantitative estimation of mass transfer at different depths and for different densities of dissolved rocks. free continental surface. Thus, the capability for solution and mass transfer by meteoric or groundwater exists. Two main categories of sinkholes have been distinguished: solution and subsidence sinkholes (e.g. Waltham and Fookes, 2003;Waltham et al., 2005;Beck, 2012;Gutiérrez, 2016). The first group results from differential dissolutional weakening of exposed or merely soil-covered karst rocks. The subsequent slow subsidence forms sagging or suffosion sinkholes and is considered to be less hazardous from an engineering point of view (Gutiérrez et al., 2014). The second group represents a wide spectrum of dolines generated by subsurface chemical dissolution or mechanical erosion, termed subrosion in the following. It is further classified by the affected material (cover, cap rock, or bedrock), the process of subsidence mechanism (collapse, suffosion, or sagging), and the dissolution rate (Cooper, 1986;Beck, 1988). Subrosion and the development of sinkholes may be influenced by numerous anthropogenic factors such as mining (Brady and Brown, 2006;Mesescu, 2011), tunnelling (Song et al., 2012), water abstraction (Bell, 1988;Aurit et al., 2013), water impoundment (Hunt et al., 2013), and other large-scale hydrological projects in karst regions, which enhance the natural process of dissolution (Milanovic, 2002;Gutiérrez and Lizaga, 2016). In karst environments, collapse sinkholes are often related to gravitational subsurface cavity collapse (Parise and Lollino, 2011;Waltham, 2016), where stress conditions exceed the material stability of the surrounding rocks, which may be related to sudden water-level changes (e.g. Lollino et al., 2013) or seismic activity (e.g. Kawashima et al., 2010).
If very fast suffosion takes place or sinkholes suddenly collapse in urban areas, they are a severe hazard for residents, economical and residential buildings, and infrastructure in general (e.g. Brinkmann et al., 2008;Lei et al., 2013;Gutiérrez, 2016;Wadas et al., 2017). Hence, ongoing urbanization and the growth of the world's population increase the requirement for detailed investigation of subrosion processes and sinkhole development for risk assessment.
In contrast, the time-lapse gravity method can deliver enhanced information about the variable local gravity field over time. It is non-invasive and directly sensitive to temporal mass movements on different spatial and temporal scales. Time-lapse gravity observations were successfully applied to monitor, e.g. subsurface water storage changes in general (Naujoks et al., 2008;Pfeffer et al., 2013) and in karst regions Jacob et al., 2010;Champollion et al., 2018), CO 2 storage changes (Nooner et al., 2007;Wilkinson et al., 2017), or withdrawal or intrusions beneath volcanic edifices (Jentzsch et al., 2004;Hautmann et al., 2014;Carbone et al., 2017).
We present a study that reveals potential subrosion-related mass transfer in the subsurface using spring gravity meters for a time-lapse survey over 4 years. A special focus is placed on the hydrological correction, which strongly means soil water content and changes in groundwater level. Both are challenging impacts on gravity variations (e.g. Bonatz, 1967;Mäkinen and Tattari, 1988), especially in urban areas and where no special hydrological monitoring sites exist. We introduce the survey area of Bad Frankenhausen, located in Thuringia in Germany (Sect. 2), and the monitoring concept, including data processing and a hydrological soil water correction approach (Sect. 3). The results show constant gravity decrease, as well as continuous subsidence at specific points inside of our measurement network (Sect. 4), which is assumed to be caused by underground mass redistribution. This is also discussed in Sect. 5, such as the feasibility of the timelapse gravity method for sinkhole monitoring under urban conditions.
2 Geology of the survey area Sinkholes in Germany occur over the whole country due to the dissolution of various soluble rocks in the subsurface (Fig. 1a). These are mainly near-surface salt diapirs in northern Germany (Krawczyk et al., 2012;Kaufmann et al., 2018), and carbonates and sulfates in the central and southern parts of the country (Kaufmann, 2014;Wadas et al., 2017).
Our study area is located in the centre of Bad Frankenhausen (BF), a small city in northern Thuringia in central Germany, which is located in the transition zone of soluble rock packages ( Fig. 1a; Wadas et al., 2016). It is bounded by the Kyffhäuser hill range to the north (Fig. 1b) and the Thuringian Basin to the south (an extensive geological overview of Thuringia is given by Seidel, 2003). Geological units in the working area were deposited in the Permian and the Triassic, and are divided by the W-E-trending and northwards-dipping Kyffhäuser Southern Margin Fault (KSM Fault) (Wadas et al., 2016).
The sediments to the north of the KSM Fault are mainly Zechstein evaporites developed from seven evaporationtransgression cycles of the epicontinental Zechstein sea in the Upper Permian (258-250 Ma). These are alternating layers of conglomerates, carbonates, sulfates, and rock salt (Richter and Bernburg, 1953). The main occurrent marine formations in the research area are termed Werra, Staßfurt, and Leine ( Fig. 1b). Extensive units of anhydrites, carbonates, copper shales, and conglomerates, mainly from the Werra and Staßfurt formations, can be found in the Kyffhäuser hill range. Scattered Leine Formation deposits consisting of salt clays, anhydrites, and carbonates cover the region to the northwest of BF (Schriel and Bülow, 1926a, b).
The sediments to the south of the KSM Fault are mainly sandstones, claystones, and shales that were deposited during the Triassic terrestrial sedimentation phase after the marine sedimentation phase of the Permian. Triassic Buntsandstein, Muschelkalk, and Keuper overlay the Permian evaporites (for thickness values of the rock units, see Schriel and Bülow, 1926a). Quaternary deposits are floodplain sediments, claystones, and siltstones, as well as glacial gravels and aeolian silt deposits.
The whole region is prone to subrosion, as proven by many features on the surface (Fig. 2). This is predominant along the KSM Fault (Fig. 2a) and part of an about 250 km long tourist trail -the "Karst Trail" -along the southern Harz hill range. Several studies show that the Upper Permian in this region is strongly fractured, and therefore the mechanical integrity of the subsurface is disturbed. Kaufmann (2014) used a combination of different geophysical methods and joint inversion to show that the fractured zones can serve as pathways for meteoric and groundwater, and hence accelerate the underground dissolution. Proven by salt springs and about 20 000 subrosion structures, which shape the landscape south of the Harz Mountains (Knolle et al., 2017), the Upper Permian provides the solvable material in the near-surface area (Kugler, 1958), especially along the KSM Fault, where the southward-draining groundwater from the Kyffhäuser hill range ascends (Reuter, unpublished data). The different types and ages of the subrosion features document the ongoing subrosion processes over time. Underground cave growth ( Fig. 2b; a description of the Barbarossa Cave is given by Kupetz and Mucke, 1989) and weakening of the rock units lead and led to the development of collapse ( Fig. 2c, d) and sagging sinkholes, which strongly affect urban constructions in and around BF. The most famous subrosion feature in the area is the leaning church tower of BF that currently has an inclination of 4.93 • (Fig. 2e) and has been stabilized by a steel pylon construction. Since the last collapse of the Quellgrund sinkhole in 1908, the tower's inclination has been increasing rapidly due to disturbances within the drainage system beneath the building (Sven Schmidt, Thuringian State Institute for Environment and Geology (TLUG), personal communication, 2016). Several cavities and disrupted zones were investigated by three research core drillings (depth: 100-458 m) between 2013 and 2015 around the leaning church tower and mainly in the upper 100 m of the cap rock, which mainly consists of Zechstein anhydrites and gypsum (Sven Schmidt, TLUG, personal communication, 2016). Other investigations show similar results; e.g. the company SOCON Sonar Control found and surveyed a large cavity (volume: 95.5 m 3 , depth: 14.5-20.5 m, radius: 8 m) directly beneath the tower by using an ultrasonic sound method through an older drill hole beside the tower walls. In addition, the bigger part of infrastructure and buildings shows cracks and damage, which led, e.g. to the necessity to rebuild the leaking swimming pool in the city centre, several building renovations, and reconstruction work.
Sinkhole development and ongoing subsidence in urban regions such as BF represent severe hazard. Therefore, we aim to detect mass redistribution caused by subrosion by applying time-lapse gravity monitoring to improve the understanding of subrosion processes.

Conception for measurement and data analysis
Our general approach is the high-precision monitoring of subrosion-induced time-variable gravity and height changes using regular repeated measurement campaigns. Including March 2014, when a 2-week long reference campaign was performed, a total of 17 time-lapse gravimetry (1 week of measurement time) and 18 levelling campaigns were carried out quarterly. The frequency of the campaigns has been reduced to half-yearly intervals since the beginning of 2018. The observed temporal components provide the possibility to derive evidence for ongoing subrosion and afford a quantification of the mass relocation in the subsurface. Therefore, the time-lapse measurements have to be close mesh and of high quality. Potential error sources have to be avoided as much as possible. Realistic error estimations are required when considering the results with respect to significance. All this must be taken into account during planning, measurement, and data analysis. The key requirements in this context are appropriate instrumentation, the local stability of the individual measurement locations, their long-term availability in variable infrastructural surroundings, and a method (measurement and data analysis), which must be robust against inner-city noise (pedestrians, cars, construction work) and systematic errors. In the following, we describe our implemented conception for the monitoring and the data analysis.

Measurement: monitoring network
A local combined geodetic-gravimetric measurement network was established in Bad Frankenhausen in March 2014 ( Fig. 3), based on previous studies (gravimetry for structural investigations, levelling), site inspection, experiences from similar studies (Naujoks et al., 2008), and information about future construction work in the measurement area. The north-ern part of the city centre on the edge of the KSM Fault ( Fig. 2a) is subject to subsidence of up to several millimetres per year (Fig. 3c) as determined by levelling surveys from 2000 to 2010 carried out by Scholte (Glückauf Vermessung Sondershausen, unpublished data). Furthermore, the Bouguer anomaly in this area, which is trend reduced by a second-order polynomial (Fig. 3b), correlates qualitatively with the sinkhole areas I-III (cf. Fig. 3c). The negative gravity anomalies are considered to be the first evidence for subsurface subrosion-induced density contrasts. Hence, our extensive measurement network (125 levelling points -15 of them for time-lapse gravity monitoring) covers the northern city with the focus on known sinkhole areas and zones of gravimetric minima in the medieval centre. The connection to superior reference systems provides stability control of the whole network. Our levelling network is tied to the second-order trigonometric benchmark RP1 (RP: reference point, Fig. 3a, Fig. 4c) and to LRP (reference point for levelling) of third order at both ends of a 2.75 km long E-Worientated profile (Fig. 3a). In combination with the levelling network, we defined 12 points as a gravimetric network in a close-mesh arrangement (Fig. 3, blue dots: G01-G12) and three points as gravity reference points (Fig. 3, red dots: RP1, RP2, RP3). Most of the measurement points were installed on infrastructure such as cobblestoned footpaths and marked by synthetic survey markers, which cover steel piles in 30 cm deep-drilled holes (Fig. 4a). Gravity points on meadows (G02, G11, G12) are self-made concrete pedestals of 80 cm depth to reduce noise and soil freezing effects. They are marked by brass survey markers (Fig. 4b).
3.2 Measurement: devices and data acquisition

Instrumental array
Up to four different gravity meters of various manufacturers were used per campaign (Fig. 5). The gravimetrical setup consisted of astasized relative metal spring gravity meters of LaCoste & Romberg G-type with feedback (LCR-G, acc 1 : ≤ 10 µGal; 1 µGal = 10 nm s −2 ) and ZLS Burris (acc: ≤ 5 µGal), as well as non-astasized quartz spring Scintrex CG3 (acc: ≤ 8 µGal) and CG5 (acc: ≤ 5 µGal) gravity meters (detailed description of the instruments -LCR: Torge, 1989;Scintrex: Scintrex, 1995ZLS Burris: Jentzsch et al., 2018). The mentioned accuracies are dependent on the noise level, the measurement conditions, and (especially concerning the Scintrex instruments) the age of the instrument itself. Instrument heights above point label were controlled for each observation. The gravity measurements were accompanied by levelling using a Leica DNA03 digital level (standard deviation per kilometre double run: 0.3 mm; see Leica Geosystems AG, 2006) with two Invar bars to provide height references for gravity height reductions and to conduct subsidence monitoring. Additional equipment consisted of tripods, parasol to avoid effects of sunshine and rain, and sensors for air pressure.

Calibration setup for gravity meters
In preparation for the single measurement campaigns, instrumental error sources were reduced by determination of instrument-specific calibration factors and their stability control to 10 −4 . This means that the inaccuracies due to calibration are, within the effective gravity range of 6 mGal, on the order of 0.6 µGal. The gravity meters were calibrated using the calibration line in the university tower building in Hanover, Germany (Kanngieser et al., 1983) and the Harz mountain calibration line (Torge, 1989). The estimated accuracies of their gravity differences are in the range of 1 µGal for Hanover and 2 µGal for Harz (Timmen, 2010;Timmen et al., 2018). Frequent calibrations have shown stable cali- bration factors for all used instruments. Besides, regular laboratory tests with respect to dependencies on instrumental air pressure effects, drift behaviour, and tilt were performed for accurate instrument modulation.

Time-lapse gravity monitoring
Evidence of mass loss due to underground leaching requires a measurement concept, which focuses on both accuracy and efficiency. The most convenient procedure is the measurement of gravity differences between the points in a network applying the step method for optimal drift control (Torge, 1989). Therefore, the gravimetric measurement network ( Fig. 3) was subdivided into polygons consisting of four to six points each. In a four-point polygon, this resulted in a total amount of 13 measurements; i.e. four differences were each measured three times (Fig. 5c). The advantage of this method is an optimal drift determination and the possibility of statistical validation of each measured gravity difference. At a measurement point, and for statistical and accuracy reasons, different settings were used depending on the type of gravity meter: Scintrex -10 measurements in cycles of 60 s (45 s registration, 15 s break); LCR-G -three measurements at three spindle positions (±0.1 scale units) using feedback; ZLS Burris -five measurements at a fixed spindle position using only feedback (range: ±25 mGal).

Data analysis: processing steps
Measured height changes obtained by levelling campaigns are processed using Nigra, a special software for the analysis of levelling (TrukkSoft, 2018). Here, the height differences are not adjusted, but discrepancies are distributed along a profile or in loops after the averaging of all double-observed height differences.
Gravimetric time series contain numerous effects that superimpose the signal of interest. Therefore, and to compare the observed gravity differences between network points for different campaigns, as well as to identify a potential subrosion signal due to mass redistribution, several processing steps had to be applied (Table 1). Highly sensitive gravity meters are affected by shocks and tilt, e.g. by passing cars or pedestrians, which can produce errors like jumps or spikes in the datasets. Especially, Scintrex gravity meters are sensitive to transportation effects, i.e. to long-lasting run-in periods due to the relaxation of accumulated tension in the sensor (Reudink et al., 2014;Klees et al., 2017). These effects mainly occur after tilting a Scintrex instrument by ≥ 8 • for several minutes during transportation. Additionally, random and systematic errors occur. Hence, preprocessing was applied to correct the data per point for outliers, jumps, spikes, and running-in behaviour, and finally, to average them.
The pre-corrected mean gravity records contain different instrumental and environmental signal content (explained in detail, e.g. by Torge, 1989or Timmen, 2010, which varies continuously with time or occurs irregularly, and superimposes the potential subrosion signal. Thus, the observations of each single gravity meter and all gravity meters combined were analysed by a least squares adjustment (Wolf, 1975) using the Fortran-based programme package GNLSA 1.01 (Wenzel, 1985, unpublished data). It executes the correction of data for Earth tides and ocean loading tides, gravitative and instrumental air pressure effects, height reduction of the gravity meter mass suspension relating to the point label by using the vertical standard gravity gradient on Earth's surface of −0.3086 mGal m −1 , as well as the application of all calibration factors as described in Sect. 3.2. It also includes the adjustment of linear drifts. If drift effects appeared as highly non-linear on some days, we divided these days and introduced "additional" gravity meters instead. Another important requirement was the weighting of gravity meters during the least squares adjustment according to their precision level. The results of running GNLSA are adjusted gravity differences plus SD for every possible difference in the network and adjusted linear drift parameters for each campaign. Furthermore, gravity values plus SD for each single measurement point are calculated from the adjusted gravity differences based on given absolute levels of reference points.
Subsequently, a post-processing takes places. Firstly, temporal height variations of gravity points as derived from the levelling campaigns were taken into account using the vertical standard gravity gradient on Earth's surface. Secondly, the gravity differences show seasonal effects (Sect. 4.2,Figs. 8,9), which correlate well with temporal variations in the soil water content. These were considered within data post-processing as described in the following section.

Data analysis: hydrological effects
So far, the gravimetric potentially subrosion-induced signal of interest is still superimposed by hydrological effects like changes in groundwater table and soil water content (Bonatz, 1967;Mäkinen and Tattari, 1988). These are characterized by spatial point-to-point and temporal campaign-to-campaign variations. Hydrological effects are also dependent on the topography around a measurement point (Naujoks et al., 2008;Figure 5. Overview of the gravity meters and the method that was used in this study. The accuracies (acc) were obtained from least squares adjustments per campaign and are valid for the single measurement of a gravity difference.  Deville et al., 2013). Unfortunately, no recordings of groundwater level and soil water content are available in or near the measurement area.
Soil water content can vary in two ways: irregularly and seasonally. To compute its effect on gravity data, several models on different scales and resolution are available (e.g. Meng and Quiring, 2008;Ford and Quiring, 2013). In this study, the Global Land Data Assimilation System (GLDAS -model type: Noah) from NASA was chosen (Rodell et al., 2004), because no local soil water models are available for our measurement area in that quality. It includes extensive regional climate data (temperature, air pressure, humidity, long-wave radiation) on a 0.25 • × 0.25 • grid over Europe and provides monthly soil water content that covers the upper 2 m of depth. GLDAS Noah is available online and open source (Beaudoing and Rodell, 2016). The computed varying soil water content has the dimension of "mm water column" and was interpolated for the first day of each measurement campaign. The correction of gravity differences is based on the determination of respectively associated regression coefficients in the dimension of µGal mm −1 . Here, a regression coefficient symbolizes the differences in the time-variable soil water content between two points of a gravity difference and further discrepancies in point conditions such as topography, porosity, or sealing. We multiplied a single coefficient with the soil water content and reduced the result from the associated time-variable gravity difference. The remaining signal in the gravity differences contains mainly long-periodic subrosion-related gravity variations, in case subrosion is taking place, plus short-periodic noise, and location-dependent non-computable groundwater effects.
With regard to groundwater variations, it is known from some drillings that the depth of groundwater in Bad Frankenhausen varies between 2.5 m in the southern and up to 10-20 m in the northern part of our monitoring network. However, no specific evidence of the local groundwater level depth is available. A few gauges of groundwater level exist at a distance of 5-10 km. The groundwater level only par-tially correlates with the soil water content from the GLDAS model in a seasonal range. As a first proxy, and under the assumption that these groundwater variations are roughly valid for the measurement area as well, the previously determined regression coefficient for soil water content could partly include the effect of groundwater changes. Also it has to be considered that the groundwater signal has a phase shift relative to the soil water content. If local groundwater data were available, a second regression coefficient for groundwater correlation could be calculated.

Levelling
The results over 4 years of levelling are shown in Fig. 6 as an overview. Achieved accuracies over all campaigns are in the range of ±1.5 mm for the 2.75 km long E-W-orientated profile and ±1.0 mm for the loops in the northern city centre (Fig. 6b). The reference points for levelling (RP1, LRP; see Within the medieval centre of BF, two areas of continuous height changes are remarkable (yellow-orange-red dots in Fig. 6b). The first of these is located around gravity points G04 and G09 and covers the western-southwestern slope area of the ancient Quellgrund sinkhole. Subsidence rates are 3.5-5.4 mm a −1 with the local maximum of 21.7 mm in total over 4 years at point G04. The second area includes gravity points G05, G07, and G08 and is located northwestnortheast of the leaning church tower. Subsidence rates are 3.0-30.4 mm a −1 with the local maximum of 121.7 mm in total over 4 years at point G07.  (Wenzel, 1985) Postprocessing • point subsidence • hydrological effects • height changes on gravity points • soil water content • self-written Python-based software • global model GLDAS (Rodell et al., 2004) Within these areas, two profiles covering the local sinkhole territory were defined to show the time-variable development of the subsidence (blue lines in Fig. 6b, profiles shown in Fig. 7). Curves are smoothed by a weighted moving average of width three values (weighting factors: w n−1 = 0.1, w n = 0.8, w n+1 = 0.1). Additionally, levelling observations were affected by seasonal variations, e.g. due to ground frost or drying. We reduced these seasonal and area-wide effects by using a constant offset based on seasonally constant levelling points to emphasize the evolving subrosion patterns within instable zones. Figure 7 displays yearly variations relative to March 2014 and, additionally, the last measurement campaign in April 2018. The church profile (gravity point G07 is not included) reveals irregular subsidence from northwest to northeast of the leaning church tower, i.e. between points G05 and G08, which correlates qualitatively with the results of Scholte (Glückauf Vermessung Sondershausen, unpublished data). Subsidence rates from 2014 to 2016 were 2-6 mm a −1 and have decreased since 2017 to 1-4 mm a −1 with the exception of, e.g. point G08. To the south and west of the leaning church tower, no significant height changes can be detected. The Quellgrund profile shows a similar pattern. The subsidence rates from 2014 to 2016 were 0-4 mm a −1 and decreased abruptly between 2017 and 2018 to 0-0.5 mm a −1 .
All height changes at gravity points were used to correct time-lapse gravity results for subsidence and to refer them to the reference campaign in March 2014.

Reference points
Gravity reference points provide stability control of the whole network and the possibility to calculate gravity values plus SD for each single measurement point from the adjusted gravity differences ( g). After least squares adjustment (LSA), the g between predefined reference points RP1 and RP2 (cf. Fig. 3) for time-lapse gravity observation shows small but significant gravity changes, and therefore RP1 and RP2 were not used as reference points. Reference point RP3 was defined later, in June 2015, and is located at the town hall site in BF. Close by, in the cellar of the town hall, three absolute gravity campaigns were carried out in June  In the next step, we considered the temporal variations of all g between our network points and RP3 before correcting the hydrological effect of soil water content, to define new stable reference points for further data analysis. Figure 8a shows the variations of the g between RP3 and gravity points G01, G03, and G05, which scatter around zero (including seasonal effects of ±2-3 µGal), and thus they were chosen as new reference points within our measurement network. The algebraic sign of a g is dependent on the order of the sequence of calculation ( g xy = − g yx ). Based on the results, we assume that the new reference points were stable before August 2015 as well.

Gravity differences
In the following, all shown and discussed results are related to our reference campaign in March 2014. Gravity differences ( g) were determined in the LSA for each possible g between the 12 measurement points in the northern city centre (66 overall; see Fig. 3c). Some of them could not be measured in a few campaigns (column 3 in Table 2) caused by, e.g. construction work. The SDs of adjusted g vary between 1.1 and 3.6 µGal depending on the season, the combination of used gravity meters, and the number of adjusted g per campaign (Table 2). From the beginning of the study until now, the accuracy of the results of the LSA could be succes-   Results of time-lapse gravity monitoring shown by the adjusted gravity differences ( g) between the stable reference point RP3 and points G01, G03, and G05, as well as, exemplarily shown, by the adjusted gravity difference between points G01 and G12 ( g xy = − g yx ). (a) g scatter around zero and reveal that gravity points G01, G03, and G05 are stable. In addition, the best linear fit shows an insignificant trend of < 1 µGal over 2 years for the three shown g. The g values are based on the absolute gravity measurements at point RP3, carried out by Leibniz University Hanover (LUH), which prove that RP3 is stable (Ludger Timmen, LUH, personal communication, 4 December 2018). (b) g (blue) compared to the soil water content from GLDAS (brown) and the calculated regression coefficient between both in the dimension of µGal mm −1 . (c) g (solid blue) corrected for soil water using the regression coefficient. Dotted lines show the previous state. Regression lines show the g over 4 years before and after the hydrological correction, as well as the decrease of the trend. sively increased, i.a. due to the purchase of the high-precision gravity meters (Scintrex CG5 and ZLS Burris). These instruments perform their own correction of the gravity observations for tilt and temperature effects.
The results of the LSA are presented as an example with the temporal variations over 4 years of the g and their SD between the stable point G01 (Fig. 8a) and point G12, which shows the highest gravity decrease in our measurement network (Fig. 8b, c; values of G12 subtracted from the values of G01). In addition, the seasonal content is obviously significant. The SDs are shown as error bars that get smaller from campaign to campaign, mostly because the LCR gravity meters of lower precision are no longer part of the instrumental setup. Assuming that point G01 is stable, the variations define a trend in g of 3.0 µGal a −1 (dotted line in Fig. 8c), which has displayed an overall gravity decrease at G12 of 12.0 µGal since March 2014. The temporal variations of g in Fig. 8b show seasonal signals of 2-6 µGal oscillating between minima mainly in the winter months, e.g. December 2014 and February 2016, and maxima mainly in the summer months, e.g. July 2016 and July 2017. These g variations are compared to the varying soil water content obtained from the global hydrological model GLDAS (Sect. 3.4).
The g G01-G12 correlates very well with the hydrological model (brown curve in Fig. 8b), except for the time span between February 2015 and November 2015, which is not clearly understood. Here, the variations in soil water content show a maximum. These peaks can also be affected by other gravitational mass changes, e.g. groundwater variations, which cannot be considered here (Sect. 3.4). For this example of g variations, a regression coefficient of −0.110 µGal mm −1 was determined between g and the soil water content, and used to correct g for the soil water variations (Fig. 8c, solid blue curve). The hydrological correction smoothes the curve for seasonal variations in soil water con-tent mainly in the second period of our observation, starting from February 2016, and decreases the 4-year trend in gravity by 4.2 µGal. Between February and November 2015, the hydrological correction seems to produce an increase in the seasonal signal, which could also be induced by a minimum or phase shift of groundwater level, that is not corrected here. However, the remaining signal reveals a significant trend of gravity decrease over 4 years at point G12.
A hydrological correction applying an individual regression coefficient is done for each g in our network. A selection of g variations arranged in a rectangle between four measurement points is shown in Fig. 9. In Fig. 9d, the above discussed g G01-G12 is displayed like in Fig. 8b. Firstly, the correction of seasonal variations in the other gravity differences shows the same quality in the displayed graphics as described for g G01-G12. Furthermore, the remaining gravity decrease of 7.8 µGal over 4 years in the g G01-G12 is related to a gravity decrease at G12. Following the variations of g anticlockwise results in a gravity decrease at G10 by 4.2 µGal over 4 years, which is the difference of the gravity decrease of 7.8 µGal at G12 and the remaining gravity variation of −3.6 µGal over 4 years of the g G12-G10 (Fig. 9b). The same procedure reveals a gravity decrease of 0.8 µGal over 4 years at point G09, which is insignificant, and thus G09 and its difference to G01 are stable within the SD as well (Fig. 9c). The remaining trends over 4 years of the four gravity differences in Fig. 9 add up to 0.1 µGal over 4 years, which may result from rounding during data processing and trend fitting. However, the results show the feasibility of the applied hydrological correction using individual regression coefficients for each g without producing any loop errors and thus the possibility of identification of stable or instable gravity differences in the network. Table 2. Abstract of time-lapse gravimetric field campaigns. Shown are the dates of the campaigns, unusable gravity points, the gravity meters used for the observations per campaign (instruments are LaCoste & Romberg G-Type: LCR-G, Scintrex: CG3 and CG5, ZLS Burris: ZLS), and their number (no.). Furthermore shown are the standard deviations of adjusted gravity differences (SD of g), the standard deviations of gravity values on measurement points (SD per point), and the total number of adjusted gravity differences (no. of g) per campaign.

Gravity values
For gravity investigations, it is not mandatory that gravity values on installed benchmarks are stable over time . The careful selection of the benchmark locations prior to the installation of the monitoring network is based upon the expectation of favourable noise and environmental conditions at these points. However, during the operation time of the network, these assumptions can turn out to be wrong or not sustainable, e.g. due to unknown/changing hydrological conditions of soil or groundwater, or construction work. Hence, the possibility of temporal gravity changes at these benchmarks must be properly considered during interpretation and discussion of results. Under the assumption that G01, G03, and G05 are stable relative to the stable point RP3 (Fig. 8a) and relative to each other in all LSA, the absolute gravity values were set fixed for the two points G01 and G03, according to the adjusted g G01-G03 of the reference campaign. Then, we derived gravity values for each point in our measurement network from the LSA. Now, instead of g between measurement points, the time-variable changes of gravity on individual gravity points are considered. Mean gravity values, each over 1 year of observation starting from the first campaign in March 2014, are displayed in Fig. 10 as bar charts per point. They are corrected for hydrology using the changes in the gravity trends obtained from the hydrological correction in Sect. 3.4 for each gravity difference related to the stable point G01. The results reveal areas of gravitational stability and of significant gravity de-crease over 4 years (Fig. 10). Two patterns are discernible: (i) invariable points (G01, G03, G05, G06, G08, G09) within the SD; (ii) continuous gravity decrease, which is obvious for points G07, G11, and G12. Points G02, G04, and G10 show a gravity increase that appears relative to the reference date (March 2014) in the first year (brown bar) and then also ongoing gravity decrease -this could be an effect of discrepancy between the mean of the first year and the reference value from the first campaign, respectively. It is conspicuous that a gravity decrease mainly is taking place within or very close to the known historical sinkhole areas (see Fig. 3). Furthermore, it is apparent that the gravity decrease on a single point (G * * ) and the time-variable g between stable point G01 and G * * show similar results, which is a matter of course.

Discussion
The approach presented here combines repeated levelling and time-lapse gravimetric surveys in a local measurement network in BF, Thuringia, central Germany. The aim is to identify surface deformation and mass transfer in the subsurface, in an area that is prone to subrosion, as several features on the surface reveal. After 4 years (17 gravimetry and 18 levelling campaigns) of regular quarterly measurements, we have obtained convincing and meaningful results, implications, and limits. Plausible surface deformation (subsidence) and gravity changes (gravity decrease) were detected in subrosion-prone areas of BF, which means mass trans- Figure 9. Selected adjusted gravity differences ( g, g xy = − g yx ) in the measurement network arranged in a rectangle consisting of gravity points G01, G12, G10, and G09, and corrected for soil water variations. (a) g G10-G09 reveals gravity decrease at G10, because G09 is stable within the SD. (b) g G12-G10 reveals faster gravity decrease at G12 than at G10, relative to each other. (c) g G09-G01 shows insignificant variations and reveals that also G09 is stable within the SD. (d) g G01-G12 shows a significant gravity decrease at G12 relative to G01, which is located next to the leaning church tower, because G01 is stable within the SD (cf. Fig. 8a). port in the subsurface. Although the temporal changes are small, i.e. a few µGal in gravity over 4 years, the surveys and later data analysis were realized at high effort to gain gravity observations with high precision. This includes a suitable measurement concept and data analysis (LSA), error estimation and propagation, and finally, a new, simple, but effective, method applied for hydrological correction of gravity differences (Sect. 3). The results discussed below were achieved in spite of challenging urban area conditions. Levelling. Vertical deformation of several mm a −1 was found around the historical sinkhole areas in the measurement network. The deformation rates vary with time and strongly decrease for the campaigns in 2017, which was a rainy year. The origin of non-linear surface deformation in BF is not well understood and requires continuous observation, further research, and comparison with other data. The seasonal height variations due to ground frost would be reduced by installing holes drilled below the depth of frost and filled with longer steel piles than we did to mark levelling points (Sect. 3.1).
The sources of subsidence in BF can be a combination of geogenic and anthropogenic origin. Evidence of leaching of soluble rocks is the occurrence of two natural brine sources inside of Quellgrund, which extract approximately 250 t of salt per day from the subsurface (Sven Schmidt, TLUG, personal communication, 2017). Thus, subrosion processes and the corresponding subsidence could be affected by human activities due to water extraction. As another example, we found extensive subsidence of 7.5-30.5 mm a −1 at three points northwards of the leaning church tower of BF, including gravity point G07 (Fig. 6b). These subsidence rates are significantly higher than those displayed by the levelling profiles in Fig. 7. In the church area, several cavities were found at depths up to 100 m through several research core drillings (one of these was surveyed using borehole-based ultrasonic sonar; see Sect. 2), which also reveals evidence for underground leaching. Partially, the high subsidence rates could be affected by compaction loading due to intensive construction and recultivation work, which were carried out northwards of the leaning church tower between 2014 and 2016. The load of the northeastward sagging tower itself has an effect on the subsidence rates in this area as well. However, the ongoing subsidence seems to show slightly lower rates after the reconstruction work ended.
Time-lapse gravity. Besides stable gravity differences in obviously stable areas, several gravity differences and certain points in the network have been detected to show significant gravity decrease of up to about 2 µGal a −1 with SD of 1-2 µGal over the whole period of 4 years. This is an indication of local mass redistribution and has been observed mainly in and close to known historical sinkhole areas, which could be evidence for ongoing subrosion processes in the subsurface. The order of gravity decrease of 1-2 µGal a −1 is small compared to the results of other studies that have investigated underground mass transfer, e.g. greater two-digit µGal range observed by Hautmann et al. (2014) in a volcanic environ-ment. However, the significance of the results is given by the high precision level of our gravity results (SD ≈ 1 µGal) after LSA and error propagation in our local network, as shown in Sect. 4.2 and Table 2, and by rather continuous rates of found gravity decrease. This has been proven to be due to the use of a highly accurate instrumentation and its appropriate use at measurement site conditions with a sophisticated concept of network configuration and high number of observations, as well as the regular control of their calibration stability. The SD level of the adjusted gravity differences was improved by an extensive preprocessing for data quality improvement and the high sophisticated application of LSA (e.g. drift control, required corrections, and weighting of gravity meters; see Sect. 3.3). Here, the accuracies of the gravity meters indeed have an effect as the used combinations of instruments show (Sect. 3.2 and Table 2). Hence, the data quality was further increased by using exclusively the newer Scintrex CG5 and ZLS Burris gravity meters, which achieve SD in the range of 3-4 µGal for one observed gravity difference.
We were able to emphasize and interpret subrosioninduced signals for several gravity differences on several survey points by applying a first attempt of correction for seasonal hydrological variations. We have used the global GLDAS Noah model to estimate the effect of a varying soil water content to each single gravity difference in our monitoring network. A regression coefficient for each gravity difference was determined and used to correct it for local soil water content. The regression coefficient represents temporal and spatial deviations in hydrology between two different measurement points, dependent on their geological, topographical, and infrastructural conditions. Although this correction works successfully, the remaining seasonal signals suggest that groundwater has to be considered additionally and the global soil water model may not be completely compatible for local studies. For future and similar studies, we propose to test repeated in situ measurements of the soil water content using ground-penetrating radar or nuclear magnetic resonance, which is especially appropriate to determine local soil water content within the vadose zone (de Pasquale and Mohnke, 2014). Both methods provide spatial information for soil water distribution. In this context, time-domain reflectometry in drill holes for point-specific 1-D profiles of water content can only be a first approximation because of the heterogeneity of soils, and furthermore drill holes are needed. Additionally, the remaining signal contains groundwater changes. In particular, in 2015, where soil water had an extreme minimum, the signals suggest further "mass loss" (Fig. 9). Here, the gravity minimum also can hint at very low groundwater levels and hydrological mass deficits in the aquifer layers. Unfortunately, groundwater gauges are not available in the measurement area; i.e. due to the lack of information about groundwater mechanisms, it is currently not possible to make any assumption about the effect of groundwater on the time-lapse gravity records and thus on the subrosion processes. Therefore, groundwater recordings at high precision level or hydrological models at best should be taken into account for future and similar studies.
An idea, in which way our results could be explained by underground mass transfer, is given in Fig. 11 as a simple approximation. The gravity effect of a 1 m thick layer of different densities with horizontal dimensions of 10 m in both the x and y directions, which is located and centred beneath a single surface point, is displayed. For example, a cavity similar to the one surveyed beneath the leaning church tower (Sect. 2) would have to propagate for 1 m towards Earth's surface, with a density of disappearing rocks of 2600 kg m −3 (density measured by LIAG Hanover using drill cores from the 458 m deep research drilling at the leaning church tower of BF), to explain a gravity decrease of 6.5 µGal (red curve). This could fit to the found gravity decrease at gravity points G07 and G12 (Fig. 9), which are located very close to the leaning church tower, after about 3-4 years. Erosion rates known from literature can vary from 0.1 mm a −1 for limestone (Waltham et al., 2005;Gabrovšek and Stepišnik, 2011) Figure 11. Expected changes in gravity against density for a 1 m thick dissolved layer located beneath a survey point and at different depths, with a spatial dimension of dx = 10 m, dy = 10 m, and dz = 1 m. to 0.5-1.0 m a −1 for gypsum under favourable subrosion conditions (Waltham et al., 2005). The erosion rates determined by the approximation of our simple model are on the order of about 0.25 m a −1 . This fits to favourable geological subrosion conditions in our measurement area, in which the caprock mostly consists of anhydrite and gypsum. The erosion itself is proven by several small cavities (Sect. 2). Also, it can be discussed if the mass loss is only according to dissolution processes or if, e.g. a roof collapse of a cavity, i.e. breccia, can result in larger rate of mass loss. However, such estimations are indeed subject to the principle of ambiguity in gravity investigations, which has to be considered during interpretation. For an extensive modelling of subrosion processes based on gravity data, very long gravity time series as well as a detailed geological model of the study area are required, which is subject to future work. In addition, and for similar studies, we propose to dense up the measurement network for areas that are demonstrably not gravitationally stable (in our case, around the leaning church tower) and thus to give up few points in stable areas due to high measurement effort. Especially in local studies, which investigate shallow structures, it is required to derive detailed information about the spatial dimension of areas that are prone to gravity decrease due to underground mass redistribution.

Conclusion and future work
In this study, we show the feasibility and the success of an approach that combines levelling and time-lapse gravimetric surveys in the urban area of Bad Frankenhausen in Germany, which is intensely prone to subrosion. We used ground-based time-lapse gravimetry, which is a non-invasive and power-ful geophysical tool to monitor mass movements in the subsurface, and precise levelling to investigate the accompanied subsidence. A total of 17 time-lapse gravimetry and 18 levelling campaigns were carried out over 4 years in a local combined network. Despite challenging measurement conditions and the lack of permanent monitoring sites for environmental or hydrological parameters, we identified subrosion-related signal content in our measurements. To our knowledge, the presented field study is the first long-term, high-resolution time-lapse gravity study in a subrosion-prone area. Furthermore, it is the first attempt to quantify mass movements related to underground leaching, as well as to correct timelapse gravity records for varying soil water content using a regression coefficient. The main results and modifications for future work based on the findings can be summarized as follows: 1. Subsidence in Bad Frankenhausen, on the order of millimetres up to several centimetres per year, is an ongoing process with possible non-linear periods, which is evidence of ongoing leaching.
2. A significant gravity decrease of 1-2 µGal a −1 has been observed on several points, which is additional evidence of subsurface mass relocation.
3. Survey points should be stable with respect to seasonal elevation effects (ground frost) and inner-city noise. Well suited are poured concrete pedestals for time-lapse gravimetric surveys and drilled holes filled by steel piles for levelling, both below frost depth.
4. Survey points should be connected to a superior reference system for stability control of the whole network and significant results.
5. Correction of gravity observations for varying soil water content using the global GLDAS Noah model is an effective approach to reduce the seasonal signal but requires further investigation. We propose in situ measurements using ground-penetrating radar (GPR) or nuclear magnetic resonance (NMR). Additionally, we recommend monitoring of the groundwater level.
6. If time-lapse gravity monitoring is conducted in areas of large subsidence, a more precise determination of the vertical gravity gradient should be considered to improve the correction of height changes in the time-lapse gravity records. Even more, the vertical gravity gradient can be considered as a separate parameter to be monitored. The second or even third derivative of the gravity potential is more sensitive to near-surface density variations, i.e. subrosion-induced mass changes (e.g. Eppelbaum, 2009). However, a precise determination of the gradient requires much effort and is of course also sensitive to near-surface hydrological mass variations.
7. For similar studies, we recommend to exclusively use highly precise gravity meters (Scintrex CG5 or CG6, ZLS Burris) and observe gravity differences at least three times each with three to four instruments in order to improve the quality and validity of gravity differences significantly.
8. After 4 years of observation, it becomes possible to model or quantify the amount of weakening of soluble layers by leaching or subsurface cavity growth. However, longer time series, i.e. additional data acquisition, are required to increase the resilience of the approach presented and to detect non-linearities in gravity variations.
9. In support of point 8, due to ambiguity in gravimetry, determination of the spatial extent of local areas, in which gravity is changing, requires local densification of the measurement network.
Besides ongoing data acquisition, the next step is to create a geological model of the study area based on close-mesh Bouguer anomalies, physical rock parameters derived from other studies in this area (e.g. Wadas et al., 2016), and borehole measurements. On the basis of this model, we aim to identify depth and thickness of potential sinkhole areas in BF and to investigate additional structures which are not being resolved by our time-lapse observations yet. The timevariable subrosion-induced mass transports will be then investigated related to their depth and extent by adapting the geological model for each survey period.
In the future and for similar studies, it is recommendable to focus on the investigation of the role of hydrology and its parameters (e.g. flow intensity, flow path, flow direction) for sinkhole development.
Data availability. Geodetic-gravimetric data are property of LIAG Hanover and will be used for further investigations and publications by the authors. The data are available from the authors upon request. Please contact the first author for details. The specific soil water GLDAS Noah model used here is provided by Beaudoing and Rodell (2016).
Author contributions. MK analysed the gravity data, developed software (pre-processing, hydrological correction), produced figures, and wrote the paper. GG developed the overall scientific concept of the project and established the monitoring network. AW and MK developed the new hydrological correction concept. DV performed the processing and validation of levelling data. All authors contributed to the fieldwork, the scientific discussion and interpretation, and reviewing and editing the paper.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. Sincere thanks are given to the technical staff of LIAG Hanover for extensive measurement work despite difficult weather circumstances: Jan Bergmann-Barrocas, Jan Bayerle, Dieter Eppig, Eckhardt Großmann, Norman Krause, Sven Wedig, and Erwin Wagner. Furthermore, we thank Ludger Timmen and the University of Hanover, Germany, for the frequent loan of a Scintrex CG3 gravity meter. We appreciate David Tanner for his advice and help with improving the manuscript as well as the English grammar and spelling. We acknowledge the valuable reviews of the reviewers, Carla Braitenberg and Lev Eppelbaum, that have contributed to the improvement of this article.
The work is part of the project SIMULTAN, funded by the Federal Ministry for Education and Research (BMBF, grant no. 03G0843A), which is gratefully acknowledged.
Review statement. This paper was edited by Ulrike Werban and reviewed by Carla Braitenberg and Lev Eppelbaum.