These authors contributed equally to this work.

We introduce the open-source tool

Cross-correlations of ambient seismic noise form the basis of many applications in seismology from site effects studies

Importantly, most ambient noise studies are based on the assumption that noise cross-correlations converge to inter-station Green's functions

Several state-of-the-art open-source tools for ambient noise data processing are freely available, e.g., MSnoise

Therefore, we present a tool named

One of the main challenges in modeling ambient noise cross-correlations is the adequate representation of seismic wave propagation from the noise
sources, which are in general globally distributed

Databases of pre-calculated Green's functions have recently been applied to a variety of seismological problems, such as source inversion of
earthquakes

Various examples fall within the range of possible applications of

Ambient seismic noise can be considered the superposition of elastic waves that have propagated from various traction sources

Hence noise cross-correlation modeling has to address how to parametrize the noise sources

For the evaluation of Eq. (

If an Earth model is assumed a priori, e.g., the Preliminary Reference Earth Model

Similar to the derivation of the forward model, the misfit gradient with respect to noise source parameters, which is needed for noise source
inversion, can be obtained. For one receiver pair and components

The

The core tasks of the tool are to evaluate Eqs. (

Below we describe the implementation in more detail, following one possible workflow for building a cross-correlation model and running a noise source inversion.

The discretized noise source grid that will be used throughout modeling and inversion is predefined and fixes the locations of possible noise
sources. For each evaluation of Eqs. (

Illustration of noise source model parametrization. The upper panels show spatial source distributions

Since the rectangle rule is used for spatial integration (Eq.

The grid only defines source longitude and latitude but does not specify elevation. The influence of any topography of the underlying wave
propagation model on the surface area of each grid cell is neglected. However, topography itself can be taken into account; the Green's functions

Instead of parametrizing the sources as fully sampled spectra at each grid point, their spectra are represented by a small number of Gaussian
functions of frequency in each grid location, which reduces the dimensionality of the model and inverse problem and ensures that the source PSD in
each location is smooth. This is illustrated in Fig.

Any number of such distributions can be superimposed to create a source model. Gaussian PSDs and their spatial weights at each grid point are stored
in HDF5 format as detailed in Appendix E. Examples of all input files are also provided in the GitHub repository. The setup parameters are the
geographic distribution (geographically homogeneous, ocean, or Gaussian “blob”) and the central frequency and variance of the Gaussian
spectra. Custom source models can be created by modifying the underlying HDF5 file (an example is shown in Sect.
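The parametrization described above can be sketched as follows. This is a minimal illustration, assuming that the source PSD at each grid point is a weighted sum of a few Gaussian functions of frequency, each defined by a central frequency and a variance. The function names, array shapes and numerical values are hypothetical and do not reproduce the tool's API or its HDF5 layout (see Appendix E for the latter).

```python
import numpy as np


def gaussian_spectrum(freq, central_frequency, variance):
    """Gaussian function of frequency with unit peak amplitude."""
    return np.exp(-((freq - central_frequency) ** 2) / (2.0 * variance))


def source_psd(spatial_weights, basis_spectra):
    """Combine spatial weights (n_locations, n_basis) with
    basis spectra (n_basis, n_freq) into a PSD per grid point."""
    return spatial_weights @ basis_spectra


# Hypothetical frequency axis in Hz and two Gaussian basis spectra
freq = np.linspace(0.0, 0.5, 501)
basis = np.vstack([
    gaussian_spectrum(freq, 0.05, 0.01 ** 2),
    gaussian_spectrum(freq, 0.15, 0.03 ** 2),
])

# Three grid points: first spectrum only, a mixture, second spectrum only
weights = np.array([[1.0, 0.0],
                    [0.5, 0.5],
                    [0.0, 2.0]])

psd = source_psd(weights, basis)  # shape (3, 501)
```

With this representation, the model holds one weight per grid point and basis spectrum instead of one value per grid point and frequency sample, and the PSD at every location is smooth by construction.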

Green's functions are stored in one HDF5 file per seismic receiver component. The format is specified in Appendix C. To prepare this
database, routines are provided that take a seismic station list as input; its format is specified in Appendix B. One may set up a
database for analytic far-field surface wave Green's functions for 2-D homogeneous media

Custom wavefields can be built by converting previously computed surface wavefields. As in the example of converting
AxiSEM3D output, output from any other wave propagation solver may be interpolated at the grid locations and stored in the HDF5 format detailed in
Appendix C for use with the

The tool evaluates correlations for all possible combinations of stations specified in the station list (see Appendix B) and the selected component, optionally including auto-correlations. If run on multiple processors, tasks are again distributed according to a simple embarrassingly parallel scheme.
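As an illustration of this step, the following sketch enumerates all station pairs, optionally including auto-correlations, and splits them among processes by round-robin slicing. The station identifiers and function names are hypothetical, and the slicing is only one simple way to realize an embarrassingly parallel scheme; it is not necessarily how the tool assigns tasks.

```python
from itertools import combinations, combinations_with_replacement


def station_pairs(stations, include_autocorrelations=False):
    """Return all unique station pairs; with auto-correlations,
    each station is also paired with itself."""
    if include_autocorrelations:
        return list(combinations_with_replacement(stations, 2))
    return list(combinations(stations, 2))


def tasks_for_rank(tasks, rank, size):
    """Round-robin share of the task list for one process.
    No communication between processes is needed."""
    return tasks[rank::size]


# Hypothetical station identifiers
stations = ["XX.STA1", "XX.STA2", "XX.STA3"]
pairs = station_pairs(stations, include_autocorrelations=True)

# Work assigned to each of two processes
shares = [tasks_for_rank(pairs, rank, 2) for rank in range(2)]
```

For N stations this yields N(N-1)/2 cross-correlations, plus N auto-correlations if requested.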

While the convolutions of Eqs. (

The resulting cross-correlations are saved in SAC format with essential metadata contained in the SAC header.

To run noise source inversion, observed auto- and/or cross-correlations must be provided as SAC files whose headers contain a fixed set of
metadata as specified in Appendix A

To the best of our knowledge, the only currently available open-source model of noise cross-correlations and their sensitivity kernels was provided by

To compare entirely independently computed ambient noise cross-correlations, we use AxiSEM3D

As source distribution for this example, we chose a homogeneous distribution of noise with a Gaussian spectrum peaking at a 20

Comparison between two implementations for simulating ambient noise cross-correlations with PREM

Upon close inspection, deviations of the correlations modeled by

Forward modeling of ambient noise auto- and cross-correlations has been employed in a number of studies, for example, to investigate noise sources

Simulation of cross-correlations due to hum sources modeled akin to

We illustrate a selection of the resulting correlations (selected to represent the variety of inter-station paths and distances) in
Fig.

In a further step, we compare the model to observed cross-correlations. Since the stacking duration was only 3 months for the noise source model
(July–September 2013), only a few of the modeled station pairs yield cross-correlations with an acceptable signal-to-noise ratio. These are pairs of
stations which are (i) exceptionally quiet in the hum band, according to probabilistic power spectral densities for the respective time period, and (ii) at moderate or near-antipodal distance to enhance station-to-station surface wave amplitude. These criteria are fulfilled by CAN, SSB, and TAM. We show
a comparison of their observational cross-correlations with the modeled ones in Fig.

Forward modeled and observed cross-correlations. No fitting or inversion was undertaken; the forward model is built upon the hum mechanism by

For better visibility, windows around the R1 wave are enlarged. Upon measuring the L2 waveform difference between observed and modeled cross-correlation within these windows, a slightly better overall fit is obtained by using a 3-D Earth model (this holds both for the three correlations selected here and the collection of all modeled correlations).
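A windowed L2 waveform difference of this kind can be sketched as below. The factor of 1/2, the scaling by the sampling interval and the function signature are assumptions made for illustration; the window bounds would in practice bracket the R1 arrival.

```python
import numpy as np


def windowed_l2_misfit(synthetic, observed, dt, t_start, t_end, t0=0.0):
    """Half the squared L2 norm of the residual inside [t_start, t_end].

    t0 is the time (or lag) of the first sample and dt the sampling
    interval, both in seconds.
    """
    syn = np.asarray(synthetic, dtype=float)
    obs = np.asarray(observed, dtype=float)
    if syn.shape != obs.shape:
        raise ValueError("synthetic and observed traces must have equal length")
    t = t0 + dt * np.arange(syn.size)
    mask = (t >= t_start) & (t <= t_end)
    residual = syn[mask] - obs[mask]
    return 0.5 * dt * float(np.sum(residual ** 2))


# Toy traces: the misfit only sees samples inside the window
obs = np.zeros(10)
syn = np.ones(10)
misfit = windowed_l2_misfit(syn, obs, dt=1.0, t_start=2.0, t_end=4.0)
```

Evaluating this measure for the same observed trace against correlations modeled with a 1-D and a 3-D Earth model gives two numbers that can be compared directly.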

The observed cross-correlations are noisy due to the relatively short stack (up to 92 d depending on data availability); cross-correlations in this
frequency band are expected to predominantly show direct, fundamental mode surface waves between two stations only after a stacking duration of one
year or more

Sensitivity kernels computed with

Illustration of sensitivity kernels.

As the first misfit function, we use the L2 norm of the synthetic (

An exemplary waveform sensitivity kernel for the

In contrast, Fig.

For simplicity, we will refer to this second measurement as asymmetry in the following. This second sensitivity kernel (Fig.

This illustrates that inversions using different strategies to measure data-model misfits (waveform, asymmetry, etc.) will produce different optimal models of the noise source distribution. For example, provided adequate coverage, one can expect a higher resolution to result from using the L2 waveform misfit, whose sensitivity kernel has more short-wavelength spatial features.
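To make the asymmetry measurement concrete, the sketch below assumes it is the natural logarithm of the ratio of signal energy in a window on the causal branch to that in the mirrored window on the acausal branch, with a squared-difference misfit between synthetic and observed values. This definition and all names are assumptions for illustration and may differ in detail from the tool's implementation.

```python
import numpy as np


def window_energy(trace, lags, t_start, t_end):
    """Sum of squared amplitudes for lags inside [t_start, t_end]."""
    mask = (lags >= t_start) & (lags <= t_end)
    return float(np.sum(trace[mask] ** 2))


def log_energy_ratio(trace, lags, t_start, t_end):
    """Asymmetry: ln(causal window energy / acausal window energy)."""
    causal = window_energy(trace, lags, t_start, t_end)
    acausal = window_energy(trace, lags, -t_end, -t_start)
    return float(np.log(causal / acausal))


def asymmetry_misfit(synthetic, observed, lags, t_start, t_end):
    """Half the squared difference of synthetic and observed asymmetry."""
    a_syn = log_energy_ratio(synthetic, lags, t_start, t_end)
    a_obs = log_energy_ratio(observed, lags, t_start, t_end)
    return 0.5 * (a_syn - a_obs) ** 2


# Toy correlation: causal arrival with twice the acausal amplitude
lags = np.arange(-100, 101, dtype=float)
pulse = lambda center: np.exp(-((lags - center) ** 2) / 50.0)
trace = 2.0 * pulse(50.0) + pulse(-50.0)
asymmetry = log_energy_ratio(trace, lags, 30.0, 70.0)
```

Because this measurement reduces each correlation to a single number per window, it discards the waveform detail that the L2 misfit retains, which is consistent with its smoother sensitivity kernel.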

This appears even more clearly once we conduct the inversion. We first construct a synthetic dataset by forward modeling cross-correlations from a
source distribution shown in Fig.

To treat the inversions with different measurements consistently, we proceed in the same manner concerning filtering and smoothing. The inversion
starts at a lower frequency, and a higher frequency band is added (taking two measurements after bandpass filtering in two different bands) after
20 iterations. Gaussian smoothing is applied in lieu of a more formal regularization, and smoothing length is decreased after 20 iterations. The
optimization itself is performed with the L-BFGS algorithm of the SciPy minimize module

As expected, the full waveform misfit performs better at recovering the perturbations. The recovery succeeds reasonably well for sources that are close to the array, whereas sources at a greater distance are more smeared both towards and away from the array. The sources close to the array suffer fairly little smoothing and demonstrate that it is possible to retrieve not only the direction but, in this case, also the approximate location of ambient noise sources that are predominantly imaged by fitting surface wave measurements.

The logarithmic signal energy ratio misfit shows stronger inversion artifacts and yields a rather crude image of the target model with stronger smearing effects. In addition, this inversion was terminated after 44 iterations because the misfit improvement fell below the minimum threshold, which might indicate that it is trapped in a local minimum or simply that convergence is very slow.

Figure

While the full waveform misfit produces a very satisfactory image in this synthetic case, it has a very low tolerance for errors in the seismic velocity
model

The

Disadvantages compared to implementations integrated into spectral element solvers, such as the ones by

The output of the wavefield at the Earth's surface either in full or sampled at particular predefined grid locations poses practical challenges for
input/output and storage in both types of applications. As an example, the retained wavefield utilized by SPECFEM3D_GLOBE for creating the
cross-correlations of a single reference station in Fig.

The following SAC headers on observed cross-correlation traces can be used with

Stations to be used in modeling need to be specified in a comma-separated list (with one example line) as follows.

The tool expects to find Green's functions organized as HDF5 files by seismic receiver channel with filenames NETWORK.STATION..CHANNEL.h5 for the networks and stations listed in the input file list (see above). Each HDF5 file needs to contain the following data structure. Both single and double precision floats may be used for the “data” and “sourcegrid” datasets. Single precision is used by default.
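The naming convention can be checked before a run with a few lines of Python. This sketch assumes that the empty field between the two dots is an empty location code; the helper names, station identifiers, channel code and directory are hypothetical.

```python
from pathlib import Path


def greens_function_filename(network, station, channel, location=""):
    """File name following the pattern NETWORK.STATION..CHANNEL.h5."""
    return f"{network}.{station}.{location}.{channel}.h5"


def missing_greens_files(directory, stations, channel):
    """List expected Green's function files not found in directory.

    stations is an iterable of (network, station) tuples.
    """
    directory = Path(directory)
    expected = [greens_function_filename(net, sta, channel)
                for net, sta in stations]
    return [name for name in expected if not (directory / name).is_file()]


# Hypothetical station list, channel code and directory
stations = [("XX", "STA1"), ("XX", "STA2")]
missing = missing_greens_files("greens_functions_directory", stations, "MXZ")
```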

The tool expects to find the noise source model as an HDF5 file named starting_model.h5 (one for each iteration) with the following data structure.

The Python code can be downloaded from GitHub (

The GitHub repository contains a set of basic test cases to be passed by further developments. It also provides a numerical test for the consistency of forward model and gradient, which can be employed for the development of additional misfit functions.
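A consistency test of this kind typically compares the directional derivative predicted by the gradient with a finite-difference estimate from the forward model. The sketch below shows the idea for a toy linear forward model with an L2 misfit; it is a generic illustration under these assumptions, not the test shipped in the repository, and all names are hypothetical.

```python
import numpy as np


def gradient_check(misfit, gradient, model, direction,
                   steps=(1e-2, 1e-3, 1e-4)):
    """Relative mismatch between the gradient-based directional
    derivative and centered finite differences for several step sizes."""
    predicted = float(np.dot(gradient(model), direction))
    results = []
    for h in steps:
        finite_difference = (misfit(model + h * direction)
                             - misfit(model - h * direction)) / (2.0 * h)
        error = abs(finite_difference - predicted) / max(abs(predicted), 1e-30)
        results.append((h, error))
    return results


# Toy problem: linear forward model G m and L2 misfit against data d
rng = np.random.default_rng(0)
G = rng.normal(size=(5, 3))
d = rng.normal(size=5)


def toy_misfit(m):
    return 0.5 * float(np.sum((G @ m - d) ** 2))


def toy_gradient(m):
    return G.T @ (G @ m - d)


model = rng.normal(size=3)
direction = rng.normal(size=3)
errors = gradient_check(toy_misfit, toy_gradient, model, direction)
```

For a new misfit function, `toy_misfit` and `toy_gradient` would be replaced by the forward evaluation and the gradient under development; small relative errors over a range of step sizes indicate that the two are consistent.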

All observed seismic data used to prepare this paper were downloaded from the IRIS Data Management Center.

The supplement related to this article is available online at:

LE implemented the current version of

The authors declare that they have no conflict of interest.

The authors thank Kees Wapenaar and Erdinc Saygin for their thoughtful, constructive reviews of the paper, and the topical and executive
editors of

This research has been supported by the Swiss National Science Foundation (grant no. 175124,

This paper was edited by Michal Malinowski and reviewed by Erdinc Saygin and Cornelis Wapenaar.