Surface-wave tomography for mineral exploration: a successful combination of passive and active data (Siilinjärvi phosphorus mine, Finland)

Surface wave (SW) methods are ideal candidates for an effective and sustainable development of seismic exploration, but still remain under-exploited in hard rock sites. We present a successful application of active and passive surface wave tomography for the characterization of the southern continuation of the Siilinjärvi phosphate deposit (Finland). A semi-automatic workflow for the extraction of the path-average dispersion curves (DCs) from ambient seismic noise data is proposed, including identification of time windows with strong coherent SW signal, azimuth analysis and two-station method 15 for DC picking. DCs retrieved from passive data are compared with active SW tomography results recently obtained at the site. Passive data are found to carry information at longer wavelengths, thus extending the investigation depth. Active and passive DCs are consequently inverted together to retrieve a deep pseudo-3D shear-wave velocity model for the site, with improved resolution. The seismic results are compared with the latest available geological models to both validate the proposed workflow and improve the interpretation of the geometry, extent and contacts of the mineralization. Important large-scale 20 geological boundaries and structural discontinuities are recognized from the results, demonstrating the effectiveness and advantages of the methods for mineral exploration perspectives.


Introduction
Challenges in mineral exploration are growing since current targets are typically located at increasing depths (Decrée and Robb, 2019) and within the proximity of existing mine infrastructures, where noise, accessibility and environmental impact 25 issues exist. Exploitation of deep mineral resources and within the proximity of existing mine infrastructures pose challenges in technical capabilities, environmental impacts and safety concerns (Decrée and Robb, 2019). These challenges also affect the exploration stage. To be successful, modern mineral exploration methods should progress technologically to ensure improved efficiency, economic viability and environmental sustainability. In this framework, high resolution seismic velocity models can assist the construction of high quality, reliable geological models. These models carry comprising information 30 about the composition and structural geometry of the bedrock, which are both crucial for successful exploration. However, mineral exploration sites are typically characterized by intricate hard rock geology, inherently related to complex wave propagation and challenging seismic prospection (Eaton et al., 2003), both exploiting body waves (BWs) and surface waves (SWs). The sensitivity of SWs to near-surface properties and lateral variations (Colombero et al., 2019), their high energy in seismic recordings and lower attenuation with respect to BWs make them attractive for subsurface geological reconstruction 35 in mining areas . In addition, , as well as the high velocity field expected at hard rock sites makes them ideal candidatessuitable for high-resolution deep mineral exploration. In both active and passive SW tomography (SWT), dispersion curves (DCs) can be retrieved between pairs of receivers and then inverted in a tomographic approach to estimate 2D/3D (depending on the receiver geometry) shear-wave velocity distributions at several grid points (Kennett and Yoshizawa, 2002). Depending on the spatial and wavelength coverage and the superposition between different receiver-pair paths, high 40 resolution can be achieved (Yin et al., 2016). SWT is even more attractive with the use of passive data, since ambient seismic noise recorded at mining sites is typically dominated by these waves. The SW signal contained in ambient noise is usually of lower frequency (i.e. longer wavelength), with respect to the one produced by the active-seismic explosive or mechanical sources of exploration, potentially meaning higher investigation depths (Park et al., 2005).
However, the use of SWs remains under-exploited in hard rock sites, while in other fields of applications, such as global 45 seismology, passive SWT, exploiting ambient seismic noise or earthquakes, is widely used to map the laterally varying structure of the Earth's crust and mantle (e.g. Ritzwoller and Levshin, 1998;Kennett and Yoshizawa, 2002;Shapiro et al., 2005;Sabra et al., 2005).
The extraction of DCs form active data is generally undertaken converting a common shot gather of traces recorded by a receiver array (space-time domain) into a different domain (e.g. frequency-wavenumber or frequency-phase 50 velocity/slowness). In the spectral domain, the DC can be easily identified and picked as the coordinates of the spectral maxima thanks to the high energy and characteristic dispersive pattern of SWs (Socco and Strobbia, 2004).
CCompared to active SWT, the extraction of high-quality DCs from passive-source data is not straightforward. Indeed, the direction of propagation of SW with respect to the receivers' location is unknown and likely variable over time. To ensure that useful SW information is contained in the data, the field measurements are usually performed over long time periods, leading 55 to large datasets which are difficult to process, especially when the processing scheme is not automated. Finally, ambient noise usually lacks high-frequency SW information, impeding the retrieval of accurate velocity estimates at shallow depths.
In the field of natural-resource exploration, SWT is consequently mainly used for the characterization of deep geothermal reservoirs (e.g. Lehujeur et al., 2017;Martins et al., 2020;Planès et al., 2020). Promising examples, though, have shown its potential to be used as a low-cost and environmentally friendly investigation method, also in hydrocarbon (e.g. Bussat and 60 Kugler, 2009) and mineral (Hollis et al., 2018;Lynch et al., 2019;Da Col et al., 2020) exploration.
Here, weWe present a successful integration of active and passive SWT for the characterization of the southern tail of the Siilinjärvi phosphate deposit (Laukansalo, Siilinjärvi, Finland), which can be the target of future mining activitieswhere a new open pit mine is planned to be opened in 15 years. Data acquisition, processing and methodological implementation were friendly methods for the sustainable exploration of mineral resources. We designed and tested a semi-automatic workflow for the extraction of coherent SW signal from ambient seismic noise recordings, azimuth analysis and picking of DCs for the tomographic inversion. Active SWT results from the same area of investigation, down to a depth of approximately 210 m, display a low-velocity area compatible with the presence of the carbonatite complex and high-velocity anomalies matching the position of known diabase dykes . In this work, we exploit the lower frequencies contained within the 70 passive data to improve the azimuthal coverage and investigation depth. Active and passive DCs are then inverted together to retrieve a high-resolution shear-wave velocity (VS) model down to a depth of 480 m. We finally compare the results with the overall structural framework of the site, the latest geological models of the owning company (Yara) and the available diamond drill holes (DDH), typically extending to the depth of 200 m below the ground level. The results of this investigation comprise: i) improved understanding of the structural patterns of the site, including the contacts of the mineralization, ii) validation of 75 surface wave tomography (SWT) as a reliable, effective and sustainable tool for mineral exploration, and iii) guidelines for future mine planning at the site.

Test site and passive seismic data set
Siilinjärvi phosphorus mine, located in the municipality of Siilinjärvi, 20 km north of the city of Kuopio (Finland, Fig. 1a), is the only operating mine producing phosphorus within the EU. The phosphorus-bearing mineral is apatite, currently extracted 80 from two open pits exploiting one of the oldest carbonatites on Earth (2.61 Ga; e.g. Tichomirowa et al., 2006;O'Brien et al., 2015). The ore occurs as a sub-vertically dipping, N-S trending sheeted body measuring approximately 1 km in width and 15 km in length (Fig. 1b), continuing south of the current exploitation area (Fig. 1c). The ore body is known to extend down to 800 800-m depth (Malehmir et al., 2017).
The Siilinjärvi deposit is comprised of carbonatite-glimmerite rocks intruded into Archaean granitic and gneissic host rocks. 85 Alkali metasomatised fenite halo encircles the carbonatite-glimmerite ore zone. Evenly spaced, predominantly NW-SE to N-S striking diabase (i.e. waste rock) dykes crosscut the entire ore body. The thickness of the dykes varies within a range of a few centimetres to several tens of meters (Puustinen, 1971), and most of the known dykes show steep dips, but a significant group of moderately to gently dipping dykes is also known to exist (Mattsson et al., 2019). The dykes intersect each another creating a complex structural pattern, which is later disrupted by N-S trending shearing along the ore boundaries and by 90 Proterozoic (1.89 Ga) dioritic-tonalitic intrusion in the SW border of the deposit (e.g. Lukkarinen, 2008;Fig. 1b). For future mine planning, beside the more complete characterization of the intricate geo-structural setting, further identification and location of sub-horizontally oriented diabase dykes at depth would be crucial. As their locations and extent can`t be reliably predicted based on surface and borehole geological observations, geophysical methods are needed to aid the mapping.
Active and passive seismic data were acquired at Siilinjärvi phosphorus mine in 2018, with a 3D array of randomly-distributed 95 stations (Fig. 1c). The deployment design was chosen to respect site and instrumentation logistical constraints and to optimize subsurface illumination (in terms of wavelength and azimuth coverage) in the areas of interest, as further described in ). The seismic array consisted of 578 vertical geophones (10 Hz) connected to wireless stations, and almost continuously recorded for 13 days (September 24 -October 6). Without any a-priori information on the ambient seismic noise spatio-temporal distribution on site, a long acquisition time was selected to maximize the possibility of recording a large variety 100 of useful noise sources. Data gaps occurred mainly during night hours, due to battery discharge. Passive data from all stations were recorded at 500 Hz and stored in 1-minute segy files. The deployment design was chosen to respect site and instrumentation logistical constraints and to optimize subsurface illumination (in terms of wavelength and azimuth coverage) in the areas of interest . Beside the seismic reflection lines (SM1 to SM3 in Fig. 1b), remaining seismic stations were deployed in three main areas, including the main pit to the north, the gypsum pile located SW, and the southern 105 forest area (Fig. 1c). The latter (Laukansalo, Siilinjärvi) is of crucial importance for future mining activities, since it is known that the mineralization extends to the south from the current main pit. In this area, the 3D random array included 273 stations (in red in Fig. 1c and Fig. 1d) covering approximately 0.8 km 2 , with an average distance of 50 m between the stations. Passive data recorded in this area were processed to increase depth and resolution of the shear-wave velocity model retrieved for the southern continuation of the mineralization by active SWT . 110

Passive seismic tomography workflow
The processing workflow adopted for passive SWT is summarized in Fig. 2. Before the tomographic inversion, a two-step semi-automatic workflow was designed to extract the path-average DCs from the passive data.
In the pre-processing step (I in Fig. 2), each 1-minute segy file is windowed in 2-s segments. On each window, the Frequency Domain Beam Forming (FDBF) method (Zywicki, 1999) is applied to retrieve frequency-wavenumber (fkx, ky) power 115 spectral density functions, used to identify dispersive events in the recordings and their direction of propagation. The result is a 3D matrix having kx-ky spectra as planes at the different investigated frequencies (3-20 Hz). This frequency band was chosen to lower the frequency content retrieved from active DCs, generally in the range 7-48 Hz . The position of the spectral maximum in each kx-ky plane provides wavenumbers and azimuth of the dominant SWs propagating at that frequency during the 2-s window (Foti et al., 2007). Once the wavenumber modulus |k| is retrieved from kx and ky values, the 120 phase-velocity (VR) of the DC is computed following: where |k|, is simply: (2).
dispersion region (Fig. 3a), defined on the basis of the velocity ranges of active dispersion curves at the lowest frequencies. If a significant number of DC points are located within the dispersion region (>70%), data quality is considered sufficient to continue the computations, otherwise, the time window is rejected. The threshold for window acceptance was chosen by visual inspection of the FDBF DCs retrieved from the 2-s windows of the first day of recording. Considering that all the computed DCs showed phase velocities higher than the active dispersion region at the lowest frequencies and additional noisy points 130 may exist at higher frequencies (e.g. Fig. 3a), 70% of points within the dispersion region was selected as a good compromise to keep the window for further processing. This threshold is strictly depending on the selection of the active dispersion region, the DC pattern at the lowest frequencies, the analyzed frequency band and the overall data quality, so it may be highly sitedependent and an initial training on the data is recommended for future applications.
For the high-quality time windows, the maxima of the kx-ky planes are automatically picked (Fig. 3b). For an automatic random 135 selection of peaks, the azimuth is computed based on its location in the related kx-ky plane (Fig. 3d to 3i;Okada, 2003;Foti et al., 2007). If there is a dominant azimuth, i.e. consistent among the different frequencies (Fig. 3c), the SW event is considered coherent and the time window is stored in a folder associated to that azimuth value.
This pre-processing procedure is automatically repeated for all the available records. Fig. 4a shows the entire recorded dataset of 1-minute segy files analyzed with the pre-processing step. Data gaps occurred predominantly during night hours due to 140 battery discharge. The average amplitude of the ambient seismic noise recorded at the site is variable over time, with higher amplitude in the morning hours of each day, potentially meaning a higher number of SW events recorded by the entire network in these time windows. The number of extracted 2-s time windows containing strong and coherent SW signal is shown in Fig.   4b, withand the related azimuth distribution (, Fig. 4c)retrieved from the pre-processing analyses. Approximately 7.6% of the recorded 2-s windows showed strong SW signal (Fig. 4a4b). To be consistent with the kx-ky plane orientation, the azimuth 145 distribution is shown in the reference system of Fig. 1d. Nearly 8000 SW events originated at an azimuth around 219° (Fig.   4c). More in general, a first cluster of events is located between 200° and 240°, probably related to the workshop activities located towards W-SW, where industrial machineries are produced and tested (Fig. 1c). A second cluster of events is centred around 50°, likely related to the factory area located to the NE of the forest, where apatite concentrate is processed on site to produce phosphoric acid and fertilizers (Fig. 1c). 150 In the second processing step (II in Fig. 2), for each azimuth (Fig. 4b4c), receiver pairs aligned along this direction, with a tolerance of 1°, are chosen randomly to maximize path length and consequently wavelength distribution .
A modified version of the two-station method (as implemented by Yao et al., 2006) is applied to each receiver pair (e.g. Fig.   5a). The receiver traces are cross-correlated frequency by frequency to extract cross-multiplication matrices. This procedure is repeated for the same receiver pair in all the available time windows having the same dominant azimuth. Individual cross-155 multiplication matrices are then stacked to increase the signal-to-noise ratio. DCs are automatically extracted as the amplitude maxima of the stacked cross-multiplication matrices (Fig. 5b), using the reference dispersion region of active data  to guide the picking at the highest frequencies (>10 Hz).
The comparisons between active and passive DCs for the same receiver pairs (e.g. Fig. 5c) showed matching phase-velocity trend and values in the overlapping frequency/wavelength range. The global passive DC dataset (1025 curves) retrieved from 160 the picking of the stacked cross-multiplication matrices is shown in Fig. 6 in comparison with the active DCs (433 curves; Da . The curves from passive data are generally characterized by longer wavelengths (Fig. 6b), thus enlarging the depth of investigation. The maximum wavelength of the passive DCs (approximately 1200 m) is nearly double the one of the active curves, even though the frequency content in the passive DCs was not significantly lower than the frequencies of the active curves (Fig. 6a). Thanks to the high seismic velocities characterizing the site, even the few additional lower frequency 165 components present in some of the passive curves resulted in significantly large wavelength increases. The two DC data set were therefore merged to improve the invesion results.
The distribution of the path length and azimuths of all the DCs is shown in Fig. 7. Although we observed peaks in the estimated azimuths around 200°-240° and 42°-60° (Fig. 4b4c) for the passive data, this did not translate to corresponding peaks in the global DC azimuth coverage. Instead, the distribution of DC azimuths is rather homogeneous, apart from a peak in the 150° 170 and 330° direction. This was due to the fact that the cross-correlation matrices, computed for the same receiver pair and different time windows, were stacked, and only one DC was extracted for each receiver pair. Moreover, due to the random geometry of the receivers, several receiver pairs with different path lengths existed at the same azimuth ( Fig. 7a) and, therefore, a wide range of path lengths was included in the DC set.
The spatial coverage obtained by adding the passive DCs to the active dataset is shown for different depth ranges (assuming a 175 pseudodepth of investigation = half wavelength; Abbiss, 1981). DC coverage is dense in the whole forest area and wavelengths and apparent phase velocities show consistent lateral variability at each wavelength. In the active data only, coverage was lower and limited to a depth of approximately 210 m (Da Col et al., 2020), while acceptable coverage is now retrieved until a depth of approximately 400-500 m.
For the tomographic inversion of the merged data set (III in Fig. 2), the same approach applied to active DCs was followed to 180 retrieve a pseudo three-dimensional shear-wave velocity (VS) model, as described in Boiero (2009). 1D profiles beneath grid points are estimated using damped weighted least square inversion. Horizontal and vertical constraints between neighbouring models are set in the regularization matrix to connect VS and thickness information from one model to the surrounding ones.
For the spatial discretization, we assumed 420 uniformly distributed grid points, at 44 m from each other. After testing different vertical-discretization dimensions, we concluded that the best DC fitting was achieved assuming 7 layers, reaching a maximum 185 depth of 480 m. The adopted initial model is reported in Table 1. A lateral VS constraint of 150 m/s for the first four layers and of 500 m/s for layers 5-7 was imposed. These were found to be the strongest constraints which did not increase the inversion misfit. The final VS model, obtained after 12 iterations, is shown in Fig. 9 as slices at different depths. In the shallow subsurface (30 190 m, Fig. 9a), low velocity values (2000-2800 m/s) are almost homogeneously displayed. Deeper, down to 240 m, the model presents a clear mapping of the carbonatites, characterized by shear-wave velocity values lower than the surrounding rock types, due to their petrophysical properties and fracturing conditions low-velocity values, which have been previously attributed to the presence of the carbonatites . In the same slices, the results highlight several local highvelocity anomalies, which have been earlier interpreted, depending on their location, as the intrusions of diabase dykes within 195 the Siilinjärvi deposit or as the host rock (Fig. 9b to d). These interpretations were already supported by the available surface geological mapping, drilling logs and geological model. The layers between 330 m and 480 m (Fig. 9e to g) present a greater extent of high velocity zones (3300-4000 m/s) and lower lateral variability.
A comprehensive geological interpretation of the final VS model is given in the Discussion section.
The histogram of Fig. 10a shows the misfit between the experimental and theoretical (computed for the inverted model) DC 200 points. The Normalized Root Mean Square Deviation of the DCs along the analyzed two-station paths is shown in Fig. 10b.
The misfit is lower than 12.5% for more than 75 % of the data points, and there are no zones over the investigated area with concentration of high misfits. The results are therefore acceptable and the estimated DCs are well fitting the experimental ones.
To evaluate the resolution of the final model, a checkerboard test was performed. The initial model velocities (Table 1) were perturbed positively and negatively (±8%) according to the regular pattern of Fig. 11a (squares of 200x200 m). The VS 205 perturbations obtained after the inversion are presented in Fig. 11b. Velocity perturbations at depths 30-330 m are well reconstructed, apart from the south-western portion of the slice at 90-m depth (red box in Fig. 11). At depths 330-480 m, the quality of the inverted checkerboard model reduces, especially at 330 m depth, where the perturbations in the southern portion of the model appear smoothened (blue box in Fig. 11). Nevertheless, the dimensions of the rest of the perturbation blocks in all the other layers are clearly identified in the inverted checkerboard model. Therefore, it can be concluded that the resolution 210 of the experimentally retrieved VS model of Fig. 9 equals, at least, the size of the perturbation blocks of Fig. 11, except for the two zones indicated by the coloured boxes.

Discussion
The combined use of active and passive data led to the reconstruction of a deeper VS model for the area south of the current Siilinjärvi phosphorus mine where a new open pit is planned. Compared to active SWT only, the final model ( Fig. 9) 215 highlighted more continuous features which geological interpretation may offer a valuable support for the characterization of the mineralization and future mining plans.
The carbonatite-glimmerite bodies show generally low-velocity signatures as already found in Da Col et al. (2020), but the final velocity distribution now better highlights the general NNE-SSW orientation of the mineralization down to the 240-m depth (Fig. 12b to Fi. 12d). Also the fenite bodies are found to be associated with low-velocity signatures, while surrounding 220 granite and gneiss show the highest VS values at shallow depths (<240 m). No sharp velocity contacts are depicted at the boundaries of the mineralization, but relatively clear arrays of NNE-SSW discontinuities coincide with diabase dykes and shear zones as modelled by Yara (Fig. 12a) and can be used to delineate the margins of the mineralization (I in Fig. 12c). In particular, the western contact is visible at all depths, and it is more distinct than the eastern margin. The contrasting character of the margins is in line with the observations on DDH data, where the eastern contact is found to be gradational and defined 225 by alternating layers of carbonatite-glimmerite, fenite and tonalite. In the eastern area of investigation, the NNE-SSW discontinuities and transitions in the seismic velocity signatures can be used to improve the delineation of the contacts between fenite and granite-gneiss (IV). Some structural embayments along the fenite-granite granite/gneiss contact are possible, as highlighted by the different dashed pattern along line IV at the depth of 150 m (Fig. 12c). S, and such geometry is compatible with the overall irregular thickness of the fenite within the ground surface level geological maps (Fig. 1b). A sharp increase in 230 velocity is seen from the depth of 240 m downwards, but with no direct linkage to known geological features. The increasing velocities are likely consistent with a reduction in weathering and fracturing from the shallow subsurface downwards.
However, this also matches the fabric of stronger reflectivity segments on the 2D reflection seismic profiles sections (SM1 and SM2 in Fig. 1c), possibly originating from contacts to diabase dykes.
In the model slices 90-240 m, discontinuous higher-velocity areas within the mineralization, and to a lesser degree in fenite, 235 loosely define a structural network that we infer to represent diabase dykes (II, III, V and VI). In particular, feature II does not directly correlate with any known diabase dyke set, but could result from the interplay of different dyke sets, comprising also the strictly horizontal dykes presently included in the gently ESE-dipping set (orange in Fig. 12b). Feature III may correlate with the sub-vertical NNW-SSE trending dyke set (magenta in Fig. 12b). The orientation of thin diabase dykes found in some DDH is compatible with V and VI at 240-m depth, as well as with the larger dykes bounding the mineralization. 240 At deeper levels, the NNE-SSW trending western margin of the high-velocity material remains the most visible feature, in agreement with the regional trend and the dominant population of diabase dykes, while no internal structures can be recognized in the mineralization from the seismic results. A parallel (NNE-SSW) but less distinct seismic "low" coincides with the fenite bounding the mineralization to the E.
In general, decametric geo-structural features, with a size compatible with the checkerboard perturbations of Fig. 11, are clearly 245 observable in the SWT results. Smaller-scale individual dykes and other local geological features shown by DDH data and observations in the Siilinjärvi main pit, further in the north, are below the expected resolution.

Conclusion
The application of combined active and passive SWT provided a high-resolution deep seismic velocity model for the southern continuation of the Siilinjärvi phosphate deposit, on which new mining activities are planned for after 2035can be potentially 250 focused in the future. We proposed a semi-automatic workflow for the extraction of the path-average DCs from ambient noise data recorded on site in 2018. The pre-processing steps allow the extraction of the time windows containing sufficient SW energy, minimizing the time and computational costs of the subsequent processing steps. The distribution of azimuths provides information about the noise sources and allows for the application of the two-station method, which requires the receivers to be in-line with the source. The method is almost fully automatic, apart from an initial calibration of the pre-processing criteria 255 (i.e., thresholds to compare passive DC points and the reference active dispersion region and to declare that a dominant azimuth is present in the window) and a manual quality control ofn the picked DCs on a training dataset. However, due to the large size of the data set (almost 800 GB), the pre-processing computations required around 3 weeks, running in parallel on a 10core workstation. Including only the accepted time windows, the data set was downsized to approximately 61.5 GB. The DCs obtained by the proposed workflow are comparable to the ones obtained from active data in the overlapping frequency range, 260 thus supporting the reliability of the results. Moreover, it was shown that passive data contain more information about the deeper subsurface portions, increasing the investigation depth and therefore, opening the possibility of mapping the target at depth. Combined with the active DCs, they provided high resolution SWT results down to 480-m depth. The time required for the two-station processing, stacking and semi-automatic DC picking was approximately 1 week, while the tomographic inversion required approximately 6.5 hours on the same 10-core workstation. The final velocity model was validated through 265 interpretation of the main seismic signatures in comparison with the available geo-structural data. Results confirm the southern extension of the carbonatite-glimmerite deposits, help in constraining the geometry of the mineralization and in unravelling the intricate relationships with the host rocks and the major intruding dikes. This model can be used to guide future drilling efforts for a detailed characterization of the mineralization planning the new open pit in the area.
This successful application has proven that the use of passive data within SWT is a promising, cost-effective and sustainable 270 tool for deep mineral exploration, overcoming the need for active seismic sources.

Code/Data availability
Code and data were developed and acquired within the EU H2020 project Smart Exploration (grant agreement No. 775971) and may be available by contacting the corresponding author.

Author contribution 275
CC and MP worked on SW passive data processing, with coordination and supervision of LVS. TK, PS, EK and MS worked on the geological interpretation of the seismic results. MP and EK contributed to seismic data acquisition. CC wrote the original paper draft, with contributions from all the authors.

Competing interest
The authors declare that they have no conflict of interest. 280