Observation and explanation of spurious seismic signals emerging in teleseismic noise correlations

Deep body waves have been reconstructed from seismic noise correlations in recent studies. Authors prospect their great potentials in deep-Earth imaging. In addition to the expected physical seismic phases, some spurious arrivals having no correspondence in earthquake seismograms are observed from the noise correlations. The origins of the noise-derived body waves have not been well understood. Traditionally, the reconstruction of seismic phases from inter-receiver noise correlations 10 is attributed to the interference between waves from noise sources in the stationary-phase regions. The interfering waves emanating from a stationary-phase location have a common ray path from the source to the first receiver. The correlation operator cancels the common path and extracts a signal corresponding to the inter-receiver ray path. In this study, with seismic noise records from two networks at teleseismic distance, we show that noise-derived spurious seismic signals without correspondence in real seismograms can arise from the interference between waves without common ray path or common 15 slowness. These noise-derived signals cannot be explained by the traditional stationary-phase arguments. Numerical experiments reproduce the observed spurious signals. These signals still emerge for uniformly distributed noise sources, and thus are not caused by localized sources. We interpret the presence of the spurious signals with a less restrictive condition of quasi-stationary phase: providing the time delays between interfering waves from spatially distributed noise sources are close enough, the stack of correlation functions over the distributed sources can still be constructive as an effect of finite frequencies, 20 and thereby noise-derived signals emerge from the source averaging.


Introduction
The technique of noise correlation is implemented simply via computation of correlation functions between ambient noise records at receivers. Theoretical and experimental studies (e.g., Lobkis and Weaver, 2001;Wapenaar, 2004) have shown that under restrictive conditions, the inter-receiver correlation function converges toward the response that would be recorded at 25 one receiver if a source was located at the other. This is, by definition, the Green's function of the medium between the two receivers. Great achievements have been realized with the introduction of the noise correlation technique into solid-Earth seismology (Campillo and Paul, 2003;Shapiro and Campillo, 2004). The most common applications are passive imaging (e.g., Sabra et al., 2005;Shapiro et al., 2005) and monitoring (e.g., Brenguier et al., 2008;Wegler et al., 2009) of subsurface using signals derived from seismic noise. We refer to Campillo and Roux (2015) for a systematic review on the recent progress in the theoretical and methodological aspects, and the various noise-based applications.
Both the surface-wave and body-wave parts of the Green's function can be reconstructed from noise correlations. Surface waves are easier to extract due to their dominance in the noise spectra. There are relatively fewer, yet promising, examples of 5 noise-derived body waves. Some authors have demonstrated that deep body-wave signals that propagate through the mantle and core can be extracted from the correlations of continuous seismograms recorded by seismic networks at regional to global scales (Boué et al., 2013b(Boué et al., , 2014Lin et al., 2013;Nishida, 2013). Several applications followed. For instance, Poli et al. (2015) imaged the core-mantle boundary (CMB) beneath Siberia with P-wave reflections derived from microseisms. Spica et al. (2017) used higher order correlations to retrieve the ScS arrivals and imaged the lateral variations of the CMB beneath Mexico. 10 Besides from ambient noise wavefields excited by ocean waves (Longuet-Higgins, 1950), another strategy to extract the deep body waves relies on the highly coherent late coda waves excited by large earthquakes (Lin and Tsai, 2013;Boué et al., 2014;Xia et al., 2016). The coda-derived deeply incident waves are dominantly related to reflected and refracted core phases (Boué et al., 2014;Phạm et al., 2018). By estimating the differential times between coda-derived core phases, Wang et al. (2015) and Wang and Song (2017) inferred the equatorial anisotropy of the Earth's inner core. The coda correlations contain normal 15 seismic phases as well as spurious phases. A seismic phase is termed normal if it is present in the Green's function of the medium, and spurious if it is not observed in real seismograms and does not obey the theory of seismic wave propagation. Poli et al. (2017) ascribed the vertically traveling ScS-like signals in the vertical-vertical component of coda correlations to the interference of high-order modes that have low attenuations. Using ray theory and stationary-phase arguments, Phạm et al. (2018) interpreted the spurious phases in global coda correlations as results of the interferences between specific ray paths. 20 Tkalčić and Phạm (2018) inferred the shear properties of the Earth's inner core by modelling a spurious phase related to the J waves.
In contrast to the fact that spurious body waves in earthquake coda correlations have been observed and explained, the observation and explanation of spurious phase in ambient noise correlations are lacking yet. In this study, we focus our analysis 25 on the interpretation of a spurious P-type phase that can be observed from the noise correlations between receivers separated at teleseismic distances. This paper is organized as follows. In Sect. 2, we describe the processing of seismic noise data. In Sect. 3, we correlate the processed noise records and show the observation of noise-derived spurious arrivals emerging earlier than the direct P waves. In Sect. 4, we develop a new double-array technique to estimate the slownesses of the interfering waves and analyse the origin of the observed spurious phase. In Sect. 5, we propose a mechanism to explain the generation of 30 the spurious phase. More elaborate numerical experiments are implemented in Sect. 6 to reproduce the observed signals and to investigate the sensitivity to the distribution of seismic noise sources.
Continuous seismograms recorded in 2008 by the broadband stations of the Full Range Seismograph Network of Japan (the FNET array) and the northern Fennoscandia POLENET/LAPNET seismic network in Finland (the LAPNET array), are used in this study to calculate the double-array noise correlations (Fig. 1). The aperture of the LAPNET array is ~700 km, and that of the FNET array is nearly 1,400 km. There are 1,558 FNET-LAPNET station pairs in all. The distance between the FNET 5 and LAPNET stations ranges from ~56° to 70°, with a center-to-center distance of 63°.
The processing of seismic noise data is segment-based, as demonstrated in Fig. 2. The processing is similar to that adopted by Poli et al. (2012) and Boué et al. (2013b). First, routine signal-processing operations are applied to the raw seismograms (including mean and linear trend removal, bandpass filtering, 5 Hz down-sampling, instrumental response deconvolution). 10 Then, continuous seismograms are divided into 4-h segments. The frequency spectra of the segments are whitened between periods of 1 s and 100 s. One may further clip the spectral-whitened waveform at several times of the standard deviation to reduce large transients (e.g., 3.8 times as in the above-mentioned studies). A selection filter is applied to the segments to detect and reject those containing transient impulses like earthquakes and electronic glitches.

15
In the previously mentioned studies, the selection filter was based on the energy ratio between segment and daily trace, which can be deemed as a coarse version of the classic STA/LTA method commonly used for earthquake detection (Allen, 1982).
Here, we adopt a new kurtosis-based selection filter. The kurtosis, or fourth-order normalized central moment, is defined as the expectation operator for arithmetic mean and the demeaned waveform. It can be deemed as a measurement of deviation from the Gaussian distribution that has a kurtosis value of zero and is generally expected 20 for stationary ambient seismic noise (Peterson, 1993). The kurtosis is highly sensitive to impulsiveness (Westfall, 2014), close to zero for stationary noise but increasing abruptly in the presence of transient impulses (see Fig. 3 for examples). The selection filter can be applied at any stages of the processing (raw, whitened, clipped waveforms). In this study, segments of kurtosis beyond an empirical threshold of 1.5 are discarded. The kurtosis has also been used in detecting earthquakes and picking phases (e.g., Baillard et al., 2014;Saragiotis et al., 2002) and is first used for noise data processing. Compared to the energy-25 based selection, the kurtosis-based selection depends on the statistics of the segment itself and is more robust when the strength of seismic noise changes rapidly.

Observation of spurious phase
We correlate all of the available pairs of processed noise segments and stack them to produce the correlation function of the year-long data for each FNET-LAPNET station pair. The computation of correlation function is diagrammatized in Fig. S1. 30 The correlation function contains a causal part and an acausal part that correspond to the positive and negative time lags, respectively. In this paper, the acausal correlations correspond to seismic waves that travel from FNET to LAPNET (causal: from LAPNET to FNET).
The noise correlations of all of the FNET-LAPNET station pairs are binned in an inter-station distance interval of 0.1°, to produce the waveform sections for the acausal and causal parts of the noise correlations. The filtered sections (periods of 5 s 5 to 10 s) of the vertical-vertical noise correlations are shown in Fig. 4 and the broadband sections in Fig. S2. The theoretical P and PcP waves marked on the panels are predicted using the Taup program (Crotwell et al., 1999) and the IASP91 Earth model (Kennett and Engdahl, 1991). A coherent arrival between 410 s and 450 s is clearly visible in the acausal section of noise correlations, about 200 s earlier than the direct P wave that should be the primary arrival. The early arrival has no correspondence in the true Green's function of the Earth medium, and thereby is undoubtedly a spurious phase. Spectral 10 analysis indicates that the spurious phase has a peak period of 6.2 s (Fig. S3), typical for secondary microseisms. As estimated from the acausal vespagram in Fig. 4, the emerging time and apparent slowness of the spurious phase at 63° distance are about 430 s and 4.6 s/deg, respectively. The spurious phase is only observed in the vertical-vertical noise correlations, indicating that it is likely a P-type phase. In the causal correlations, a corresponding spurious phase is hardly discriminable.

Origin of signals from P-PKPab correlations 15
In the previous section, a prominent spurious phase was observed in the FNET-LAPNET noise correlations, and its apparent slowness and emerging time were estimated. The double-array configuration offers the possibility to estimate the respective azimuths and magnitudes of the slownesses of the correlated wavefields that should be responsible for the spurious phase.
Given a wave with slowness at FNET and a wave with slowness at LAPNET, the time difference between the time delay between the ith FNET station and the jth LAPNET station and the center-to-center reference time can be determined 20 from Eq. (1): where are the local coordinates of the station with respect to the array center, and superscripts A and B distinguish between FNET and LAPNET. For a given pair of ( , ), the noise correlations of all station pairs are beamed by Eq. (2): Where 〈•〉 denotes ensemble average and is the correlation function between the ith FNET station and the jth LAPNET station. This delay-and-sum process for the double-array data is known as the double-beam method, which has been applied to earthquake data (e.g., Krüger et al., 1993;Rost and Thomas, 2002) and noise correlations (e.g., Boué et al., 2013a;Roux et al., 2008). Repeating the double-beamforming for a range of and , the power map of the double-beamed waveforms The results of the double-array slowness analysis for the observed spurious phase are shown in Fig. 5a. The azimuths of the correlated waves responsible for the spurious phase are confined to the great-circle direction across FNET and LAPNET, implying that the corresponding microseism noise source should be located on the great circle. The slowness at FNET is distinct from that at LAPNET. The 4.7 s/deg slowness at FNET is valid for deep mantle phases, while the 4.2 s/deg slowness at LAPNET is characteristic of core phases. To investigate the resolution capability of the double-array slowness analysis for 5 the FNET-LAPNET geometry, we make numerical experiments by presuming (a) the same slowness at FNET and LAPNET (4.6 s/deg), and (b) different slownesses at FNET (4.7 s/deg) and LAPNET (4.2 s/deg). Assuming plane waves passing through FNET and LAPNET, the time delays between FNET and LAPNET station pairs can be calculated from Eq. (1). The wavelet of the observed spurious phase (5 to 10 s bandpass filtered) is convolved with the time delays to synthesize the correlation functions. The synthesized correlations are beamed by Eq. (2) for various slownesses. The results are plotted in Fig. 6. In both 10 cases, the slownesses of the correlated waves at FNET and LAPNET are well resolved, justifying the reliability of our slowness discrepancy estimation in Fig. 5a. To investigate if this discrepancy is caused by the lateral heterogeneity of structure beneath the seismic networks, we apply the double-array slowness analysis to the acausal P waves in the FNET-LAPNET correlations as reference (Fig. S4). If lateral heterogeneity causes the slowness discrepancy for the spurious phase, one should also observe a similar phenomenon for the inter-receiver P wave. It is revealed that the slownesses of the interfering waves for the P wave 15 coincide with each other and are close to the predicted value (6.7 s/deg in IASP91 model). The P-wave results again justify the reliability of our slowness estimates, and indicate that lateral heterogeneity is not the reason for the slowness discrepancy observed from Fig. 5a.
The peak period of the spurious phase is a typical value for secondary microseisms, which are dominantly excited by ocean 20 wave-wave interactions (Hasselmann, 1963;Longuet-Higgins, 1950). It implies that the noise sources are mainly distributed on the global ocean surface. We propose a slowness-track method to identify the ray paths of the interfering waves from source to receivers that produce the spurious phase (Fig. 5b). Body-wave phases that are discernible in the vertical-component earthquake seismograms are considered as candidates (see labels in Fig. 7). For each seismic phase, we compute a distancetraveltime table and a distance-slowness table using the Taup program. The source-receiver distance can be derived from the 25 distance-slowness table, and subsequently, the traveltime can be obtained from the distance-traveltime table. A pair of seismic phases are rejected if the difference between their source-receiver distances differs from the reference FNET-LAPNET distance (63°) or if their time delay deviates from the emerging time of the spurious phase (430 s). For clarity, only several typical P-type phases (P, PcP, PP, and PKP branches) are shown in Fig. 5b. Finally, we find that only the correlation between the P wave at FNET and the PKPab wave at LAPNET, emitted from noise source on the great circle ~89° away from FNET 30 and ~152° from LAPNET, can satisfy all the constraints. As can be seen from Fig. 5b, at 89° distance, the PcP wave arrives almost simultaneously with the P wave, suggesting that PcP-PKPab also has a time delay of ~430 s at 63° inter-receiver distance. However, the PcP slowness (4.1 s/deg) is inconsistent with the estimated 4.7 s/deg slowness at FNET. We ascribe to that PcP is a faint phase, so that signals generated from PcP-PKPab correlations are much weaker than those generated from P-PKPab correlations. The slownesses estimates are crucial for the exclusive determination of the interfering waves. There are also other pairs of seismic phases meeting the constrains other than the slowness estimates. Another example of PcS-PcPPcP correlations that roughly meets the inter-receiver distance (63°) and time delay (430 s), is provided in Fig. S5. The correlated PcS and PcPPcP waves have a common slowness of 3.6 s/deg. The PcS-PcPPcP correlations, as shown by Phạm et al. (2018), may lead to conspicuous signals in coda correlations. However, compared with the direct P and PKP waves in the ambient 5 noise wavefields, the two correlated waves are too weak to have significant contributions to the spurious phase observed in this study. Furthermore, the difference in slowness estimates refutes the PcS-PcPPcP correlation as the dominant term in our case. This example reflects the discrepancies between ambient noise correlations and earthquake coda correlations.
It is logical that the spurious phase is observable in the vertical-vertical noise correlations only, as the correlated waves are

Explanation of quasi-stationary phase
The observed spurious phase originates from the correlation between teleseismic P waves and PKPab waves that emanate from the oceanic microseism noise source south of New Zealand. In this section, we explain how such spurious signals can be 20 generated from the interference between waves having distinct slownesses and no common path. Considering ambient noise wavefield as a superposition of waves from uncorrelated sources distributed on Earth's surface (Fig. 9a), the correlation function between the noise records at two receivers is equivalent to a stack of the correlation functions for individual sources (i.e., source averaging; see e.g., Ruigrok et al., 2008). We first simulate the source-wise correlation functions by convolving a wavelet of 6.2 s period with the time delays between the two correlated seismic phases. The final inter-receiver correlation 25 function is obtained from a stack over all sources. In this ray-based simulation, amplitude information is neglected for simplicity. The result for the source averaging of the P-PKPab correlations is shown in Fig. 9b, while that of the P-PKPbc correlations is shown in Fig. 9c for comparison.
The construction of seismic signals from noise correlations has been usually explained with the stationary-phase condition 30 (e.g., Wapenaar et al., 2010). We show such an example in Fig. S4, which is the reconstruction of the inter-receiver P wave from the correlation between the P and PP waves. The P-wave reconstruction is linked to the presence of the extreme (stationary) point on the curve of the P-PP time delay. The P and PP waves from the source at the stationary-phase location (Fig. S4, source A) have a common path and a common slowness. However, as for the spurious phase observed between FNET and LAPNET, the correlated P and PKPab waves have no slowness or ray path in common, and there is no stationary point on the curve of the P-PKPab time delay (Fig. 9b). The strict stationary-phase condition is not fulfilled, and thus the emergence of the spurious phase cannot be explained by this argument. Despite the missing of stationary points on the curve of the P-PKPab time delay, 5 the interference between finite-frequency P and PKPab waves is constructive over the shaded range in Fig. 9b. That leads to the presence of a pulsive signal in the final inter-receiver correlation by source averaging. In contrast to the strict condition of stationary phase, we propose to call this mechanism the condition of quasi-stationary phase, and refer to this range of sources as the quasi-stationary-phase region or effective source region. Recall that the spurious phase has a peak period of 6.2 s. We emphasize that this typical dominant period for secondary microseisms is resultant from the distribution of spectral content of 10 seismic noise sources ⎯ secondary microseisms correspond to the largest peak in the seismic noise spectra (Peterson, 1993).
Numerical tests of source averaging for the P-PKPab correlations indicate that, even at short periods (1 s or less), the spurious phase can be constructed, with narrower effective source region shrinking toward larger source-receiver distances.
Experiences from earthquake observations indicate that PKPbc waves are generally the dominant PKP branch at distances 15 from ~144° (the PKP-wave caustic) to ~153° (Kulhá nek, 2002). Microseism studies have also reported that PKPbc waves can be more prominent (e.g., Gerstoft et al., 2008;Landé s et al., 2010). However, our analysis reveals that the spurious phase originates from the interference of P waves with PKPab waves, rather than with PKPbc waves. From the source-averaging experiment for the P-PKPbc correlations (Fig. 9c), one can see that the P-PKPbc time delay varies almost linearly against the source-receiver distance, and that the dynamic range of the time delay is broad. Consequently, the signals in the source-wise 20 correlations are out of phase, which leads to a destructive source averaging.

Effect of source distribution
The above-described ray-based simulation is simple to implement and efficient to reveal the origin of the spurious phase. It uses only the phase information and neglects the effect of amplitude variations. This simplification ensures that the spurious phase is not likely to be caused by a strong localized source. To confirm, in this section, we implement a formal wave-based 25 simulation to show that the observed spurious arrivals can be well reproduced under ideally uniform source distribution. As shown in Fig. 8, the spatial variations of the power of global microseism sources are heavily fluctuated. The spurious phase is observable in the acausal correlations because the corresponding source is strong, and is hardly observable in the causal correlations because the responsible source is too weak. It is worth confirming whether the spurious phase can be eliminated with an ideally uniform source distribution or not.

8
We request the vertical components of the synthetic global broadband seismogram for the iasp91_2s model from the IRIS Syngine Data Service (Krischer et al., 2017) powered by the spectral-element program AxiSEM (Nissen-Meyer et al., 2014) and the Python packages Obspy  and Instaseis (Van Driel et al., 2015). A mask is applied to the full waveforms (Fig. 10a) to extract the P waves and the PKPab waves (Fig. 10b). Providing that the uncorrelated noise sources are distributed evenly on the global surface, we compute the source-wise correlations and stack them for each inter-station 5 distance, using the data in Fig. 10b. A global section of synthetic P-PKPab correlations is obtained accordingly (Fig. 10c). The spurious phase is clearly reproduced, which suggests that it is not caused by unevenly distributed noise sources.
Repeating the ray-based simulation in Fig. 9b for various inter-receiver distances, one can also obtain a full section of the P-PKPab correlations. Due to the neglect of amplitude information, the ray-based simulation over-estimates the observable range 10 of inter-receiver distances for the spurious phase. The wave-based simulation in this section is undoubtedly more realistic. The theoretical time-distance curve of the spurious phase can be picked from the synthetic sections. It is almost identical for the ray-and wave-based simulations in the observable distance range, and fits well with the observed spurious arrivals in Fig. 4.
We also compare the results of the wave-based simulations for two-dimensional plane model (sources along the great circle) and three-dimensional spherical model (sources on global surface). As shown in Fig. S6, the reduction of source dimension 15 leads to an overestimation of the observable distance range of the spurious phase. The time-distance curves picked from the synthetic P-PKPab correlations of two models are identical in the common distance range, while the amplitudes of signals are different. In the case of uniformly distributed sources, it is safe to implement two-dimensional simulation for efficiency.

Conclusions
We observe early spurious arrivals in the teleseismic noise correlations between the Japan and Finland stations. These signals 20 are prominent and isolated from other strong seismic phases, making it a good agency to unveil the generation mechanism of such spurious phases. In contrast to previous studies that observed a strong correlation between signal amplitudes and large earthquakes, the spurious signals in this study have no connections to seismicity. We provide evidence in support of that the observed signals origin from the interference between ballistic P waves and PKPab waves that emanate from oceanic microseism noise sources south of New Zealand. The interfering waves have no ray path or slowness in common, and do not 25 meet the traditional condition of stationary phase. The effective source location responsible for the spurious phase does not correspond to a stationary point. We propose a less restrictive condition of quasi-stationary phase, which explains our finitefrequency observations. The strict stationary-phase condition has been used by Phạm et al. (2018) to interpret the spurious phases in the earthquake coda correlations. We expect to see if the quasi-stationary phase arguments can be generalized to the explanation of the coda-derived spurious phases.

9
The interfering P and PKPab waves have deterministic ray paths that sample the deep mantle and the core. We expect a way to use the spurious phase to investigate the deep Earth structure. The strength of the spurious phase is linked to the power of microseism excitation in a definite, constrained region of noise sources. It is potential to use it to monitor the ocean wave activities and microseism excitations in that specific region. Ambient noise wavefields are dominated by the ballistic waves that persistently emanate from oceanic microseism events, while late coda wavefields after large earthquakes are dominated 5 by highly coherent deep reverberations / high-order core modes separable at intermediate periods of 20 to 50 s (Boué et al., 2014;Maeda et al., 2006;Poli et al., 2017). We appeal to distinguish between global noise correlations and global coda correlations. Multiple spurious arrivals have been observed from the global coda correlation wavefield (Boué et al., 2014;Lin and Tsai, 2013;Phạm et al., 2018). The P-PKPab correlation could not be the unique spurious phase emerging in the global noise correlation wavefield. The double-array slowness analysis and slowness-track method proposed in this paper are also 10 applicable to the analysis of other coda-or noise-derived signals.
Author contribution. LL performed the data processing and designed the synthetic experiments. The code for noise data processing was developed based on an early version by PB. All authors contributed to the analysis of the results. LL prepared the manuscript with contributions from all co-authors. MC wrote the proposal leading to this publication. 15 Competing interests. The authors declare that they have no conflict of interest.