Monitoring induced distributed double-couple sources using Marchenko-based virtual receivers

We aim to monitor and characterize signals in the subsurface by combining these passive signals with recorded reflection data at the surface of the Earth. To achieve this, we propose a method to create virtual receivers from reflection data using the Marchenko method. By applying homogeneous Green's function retrieval, these virtual receivers are then used to monitor the responses from subsurface sources. We consider monopole point sources with a symmetric source signal, where the full wavefield without artefacts in the subsurface can be obtained. Responses from more complex source mechanisms, such as double-couple sources, can also be used and provide results with comparable quality as the monopole responses. If the source signal is not symmetric in time, our technique that is based on homogeneous Green's function retrieval provides an incomplete signal, with additional artefacts. The duration of these artefacts is limited and they are only present when the source of the signal is located above the virtual receiver. For sources along a fault rupture, this limitation is also present and more severe due to the source activating over a longer period of time. Part of the correct signal is still retrieved, as well as the source location of the signal. These artefacts do not occur in another method which creates virtual sources as well as receivers from reflection data at the surface. This second method can be used to forecast responses to possible future induced seismicity sources (monopoles, double-couple sources and fault ruptures). This method is applied to field data, where similar results to synthetic data are achieved, which shows the potential for the application on real data signals.


Introduction
Seismic monitoring of processes in the subsurface has been an active field of research for many years. Traditionally, most recording setups are limited to the surface of the Earth, although boreholes can also be utilized. The latter approach is more expensive and complicated, however. In case of monitoring with active sources, the receivers in these recording setups measure valuable reflection data, which provide quantifiable information about processes in the subsurface. Some examples of using this information are monitoring time-shifts in seismic data to predict the velocity-strain relation for a depleting reservoir (Hatchell and Bourne, 2005) and the monitoring of geomechanics in the subsurface by using time-lapse data (Herwanger and Horne, 2009). When the source is not active, but rather passive, such as when caused by an induced earthquake, the resulting signal can be measured as well. These passive measurements are difficult to process due to the fact that the signal is complex and unknown (McClellan et al., 2018), however, the information content in these induced seismic signals is of great interest. Induced seismicity has had a large impact in countries such as the Netherlands (van Thienen-Visser and Breunese, 2015) and the USA (Magnani et al., 2017) and there is much discussion about the cause and the effects. To determine the cause of induced seismicity, the source of the signal is of particular interest and consequently, inversions for the source mechanism (Zhang and Eaton, 2018) as well as the location of the source (Eisner et al., 2010) are often performed. These methods can be carried out from surveys that are located at the surface of the Earth or inside boreholes, however, they are limited in accuracy. Ideally one would use a dense network of receivers around the source location to directly monitor the wavefield. Due to practical difficulties and expenses associated with placing a dense network of receivers in the subsurface, the wavefield can generally not be directly measured around the source location of the signal. An alternative to using physical receivers for these measurements is the use of virtual receivers. A virtual receiver is not physically present in the subsurface, rather, it is created through processing of measured signals at the surface. Virtual receivers can be created in a variety of ways. A mathematical basis for the retrieval of these virtual receivers is the so-called homogeneous Green's function representation. The classical form of this representation was proposed by Porter (1970) and extended for inverse source problems by Porter and Devaney (1982) and for inverse scattering methods by Oristaglio (1989). This representation states that if the responses from two signals are measured on an enclosing recording surface, the response between the two sources of the signals can be retrieved. It forms the basis for seismic interferometry to create virtual sources (Wapenaar et al., 2005) or virtual receivers (Curtis et al., 2009). All of these approaches require an enclosing boundary and introduce artefacts if this requirement is not met. Even though this limitation is well known, for many cases these approaches are still utilized. A novel approach that can be used when the acquisition boundary is not closed is the datadriven 3D Marchenko method. This method can create virtual sources and receivers in the subsurface Slob et al., 2014). In order to achieve this, the method requires a reflection response recorded at the surface of the Earth, and an estimation of the first arrival of the signal from a location in the subsurface to the receiver locations in the measurement array. This first arrival can be estimated from a background velocity model, which requires no detailed information about the subsurface. Through the Marchenko method, the Green's function with a virtual receiver in the subsurface can be retrieved. Using this method, many virtual receivers can be created in the subsurface, which can be used to monitor the wavefield from the virtual receiver locations to the receiver array. To obtain the signal between an induced signal from the subsurface and the virtual receiver locations, homogeneous Green's function retrieval can be employed, however, as pointed out before, the classical approach would include artefacts due to the open boundary of the recording. These artefacts can disturb the interpretation of the signal. An alternative retrieval scheme was developed by Wapenaar et al. (2016), who showed that if a focusing function is used in combination with a Green's function, an open boundary can be used for the retrieval instead of an enclosing one, without the artefacts of the classical method when applied to an open boundary. A focusing function is a wavefield that is designed to focus to a location in the subsurface and can be retrieved using the Marchenko method . This single-sided representation has been proven to succesfully work on both synthetic data and on field data . Using the single-sided method, two approaches for monitoring induced seismicity can be taken. First, virtual receivers can be used in combination with a virtual source. All the signals are created from the reflection data using the Marchenko method. This has the benefit that the virtual source can be created at any location in the subsurface, where one expects induced seismicity to happen and that the source signal can be controlled. This is the way that the method has been mostly applied in previous works. Another approach that can be taken is to create virtual receivers using the Marchenko method and to use a real induced seismic source signal instead of a virtual Green's function. This effectively allows for the monitoring of the actual signal in the subsurface, including the source location and mechanism. This could be a boon to induced seismicity monitoring, however, this approach does require some modifications. Induced seismicity often causes more complex source signals that evolve over a period of time and cover an extended location in the subsurface. These rupture planes or fault sources are the main topic of interest. In this work, we aim to apply the single-sided homogeneous Green's function retrieval on both synthetic and field data for a distribution of virtual double-couple sources. We first apply the method on synthetic data for point sources and show the principles of the representation. We then use the same synthetic data to apply the representation with modifications to the sources originating from a fault plane and show the results that can be achieved. Finally, we also apply the representation on field data for both types of sources.

Green's function and focusing function
In this paper, we present several representations for the retrieval of wavefields in the subsurface. First, we review the properties and quantities that are relevant for these representations. To this end, we consider a medium that is acoustic, lossless and inhomogenous with mass density ρ = ρ(x) and compressibility κ = κ(x), where x = (x 1 , x 2 , x 3 ) indicates the cartesian coordinate vector. We make use of a Green's function in this medium that obeys the following wave equation: where G = G(x, x A , t) indicates a Green's function that at time t describes the response of the medium at location x due to an unit impulsive point source of volume-injection rate density δ(x − x A δ(t) at source location x A . δ(·) is the Dirac delta function, ∂ t the temporal partial differential operator ∂ ∂t and ∂ i a component of a vector containing the spatial partial differential operators in the three principal directions ∂ ∂x 1 , ∂ ∂x 2 , ∂ ∂x 3 . The Green's function obeys sourcereceiver reciprocity, which allows the interchange of the source and receiver position, hence G(x B , x A , t) = G(x A , x B , t). We impose causality on the Green's function, G(x, x A , t) = 0 for t < 0, such that it is forward propagating, away from the source, and a causal solution to equation (1). A schematic illustration of the Green's function is shown in Figure 1-(a), where several possible raypaths have been drawn for a heterogeneous model. This includes the direct arrival, primary reflections and multiple reflections. We also consider the time-reversed Green's function G(x, x A , −t), which is the acausal solution (a) (b) Figure 1: (a) Schematic representation of the Green's function G(x, x A , t), with a source located at x A , which is measured at varying location x at the surface, defined in the physical medium. (b) Schematic representation of the focusing function f 1 (x, x A , t), where the wavefield propagates from x at the surface to the focal location x A , defined in the truncated medium. For both functions, several possible raypaths have been drawn. For the focusing function the downgoing waves have been marked with red arrows and the upgoing waves with blue arrows.
to equation (1), where the causality condition is G(x, x A , −t) = 0 for t > 0. Superposition of the causal and acausal Green's function yields the homogeneous Green's function: where G h (x, x A , t) obeys the homogeneous wave equation: Equation (3) is similar to equation (1), with the exception of the lack of a source singularity on the right hand side of the equation. Aside from the Green's function, we consider the focusing function f 1 (x, x A , t), which describes a wavefield, during time t, at location x, that converges to a focal location x A in the subsurface of a medium that is truncated below the focal location. The focusing function can be decomposed as, where f + 1 (x, x A , t) denotes the downgoing and f − 1 (x, x A , t) the upgoing component of the focusing function. A schematic representation of the focusing function can be found in Figure  1- (b). Similar to the Green's function, several possible raypaths have been drawn, however, to distinguish the decomposed wavefields, the downgoing focusing function has been marked with red rays and the upgoing focusing function with blue rays. The medium of the focusing function and the Green's function are identical until the focal depth, after which the medium of the focusing function becomes truncated. The physical and truncated medium can be used in reciprocity theorems in order to relate the focusing function to the Green's function, which is shown in section 2 of the supplementary information. The focusing function and Green's function can be separated from each other in time. The coda of the focusing function resides in the interval between the direct arrival of a related Green's function and its time reversal. The direct arrival of the focusing function coincides with the direct arrival of the time reversed Green's function. This difference in time intervals explains some of the effects that are present in the representations that are used in this paper. Both the focusing function and Green's function can be retrieved for a heterogeneous medium, through use of the Marchenko method. We will not consider this method in detail in this paper, instead we refer the reader to Wapenaar et al. (2014) for a more detailed overview. Due to the nature of some equations, we also make use of the frequency domain version of the time domain quantities. To obtain these transformation we make use of the Fourier transfrom. We define the Fourier transform of a space-and time-dependent function u(x, t) as where u(x, ω) is the Fourier transformed version of u(x, t) in the space-frequency domain, with ω as the angular frequency and ı the imaginary unit. By using equation (5) we obtain the space-frequency domain versions of equation (1), (2), (3) and (4), respectively: where indicates the real part of a complex function.

Homogeneous Green's function representation
The classical homogeneous Green's function representation was originally developed for a configuration where the Green's function was measured on a arbitrarily shaped boundary enclosing the medium of interest (Porter, 1970;Porter and Devaney, 1982;Oristaglio, 1989). The representation states that, if the responses from two sources inside the medium are recorded on the boundary, the response between the two source locations can be obtained. For seismic recording setups, the measurements are usually only available at the surface of the Earth, meaning that the boundary is single-sided instead of closed, which will introduce significant errors into the final result.
In recent years a new representation for homogeneous Green's function retrieval was developed x3 x1 x2 Homogeneous half-space Figure 2: Setup for the single-sided Green's function representation for (a) a case where the source of the Green's function is located below the focal location and (b) a case where the source of the Green's function is located above the focal location.
that is designed to work with the single-sided boundary, where a focusing function is used together with a Green's function (Wapenaar et al., 2016). Consider the setup in Figure 2, where a heterogeneous medium V A is bounded by two boundaries S 0 and S A on two different levels in vertical direction x 3 . The boundaries extend infinitely in the horizontal directions x 1 and x 2 . The medium above S 0 is homogeneous and the boundary itself is non-reflecting, while the medium below S A can be heterogeneous. The upper boundary S 0 corresponds to the surface where functions are available over an area that is covered by receivers at x. In this scenario, we assume that we have three functions available at the upper boundary, a Green's function B , t), that has a source location x (2) B inside medium V A and a focusing function f 1 (x, x A , t), that has a focal location x A , located at the depth of S A . The available functions can be used to obtain the response between two locations. To this end, we use the representation given by equation (35) of the supplementary information (for the derivation see section 2.3 of the supplementary material), where is the imaginary part of a complex function and χ(x B ) is the characteristic function, This representation states that, by applying the focusing function components to a Green's function at the upper boundary, the Green's function between the focal location of the focusing function and the source location of the Green's function can be obtained. The focal location will become the receiver of this new Green's function, and the source location of the original Green's function on the right hand side of equation (10) will become the source location of the new Green's function. However, contributions from the imaginary part of the focusing function between the source and receiver locations are present if the source location is located inside the medium V A , as is the case if the Green's function from Figure 2 B is used. Because they are related to a focusing function, these artefacts will be present between the direct arrival of the Green's function and its time reversal . In this case, the source location is present above the focal location. These contributions vanish if the source location is present outside V A , in other words if it is located below the focal location, such as when the Green's function from Figure 2-(a) with source location x (1) B is used. This would mean that we are limited in the correct application of the representation. To overcome this limitation, we substitute equation (10) into the right hand side of equation (7) to create the single-sided homogeneous Green's function representation: which corresponds to equation (33) from our companion paper . The additional contributions have vanished from this representation and the homogeneous Green's function will be obtained when it is evaluated, instead of the causal Green's function.

Virtual sources and receivers
Generally, the focusing function and Green's function are not directly available. These functions can be obtained through the use of the Marchenko method (Broggini et al., 2012;Wapenaar et al., 2014;van der Neut et al., 2015), which is a data-driven method that requires only reflection data at the surface of the Earth and an estimation of the first arrival of the wavefield at the location of interest inside the medium. The method handles the primaries of the reflection data in the same way as conventional methods, however, unlike those methods, the Marchenko method can also correctly handle the multiples in the data. The first arrival can be estimated through the use of a macro-velocity model. Recent developments have shown that the Marchenko method can be used independently from the velocity model to remove the multiples from the reflection data, however (Zhang and Slob, 2019). The method cannot handle attenuation on the reflection data and ignores evanescent waves. On field data, the data requires additional processing to account for these and other requirements. The Marchenko method has been applied succesfully on both synthetic and field data, for examples see Ravasi et al. (2016), Staring et al. (2018) and Brackenhoff et al. (2019) . The method can be used in the homogeneous Green's function retrieval scheme in two ways. The first approach is a two-step process, where, in the first step, the Marchenko method is used to retrieve a single Green's function to create a source location for the homogeneous Green's function. This source is called a virtual source because it is not physically present in the subsurface. In the second step, using the Marchenko method, many focusing functions are created at varying locations in the medium, that serve as the receiver locations for the homogeneous Green's function. Similar to the virtual source, these are called virtual receivers, again, because they are not physically present in the medium. Due to the bandlimited nature of the data, a source signal s(t) will be present on the Green's function, that changes its phase and amplitude: where p(x, x B , t) is a pressure wavefield in the medium and s(ω) is the source spectrum of the source signal. Equation (13b) can be directly substituted in equation (10) to create the wavefield equivalent representation, due to the fact that the source signal is not dependent on the integral over the receiver array: However, this representation can not generally be used to create the homogeneous representation as before. If the source spectrum is not strictly real-valued, the signal is not symmetric in time, because s(ω) = s * (ω), and therefore there will be a phase difference between the causal and acausal wavefield, making the superposition of the signal with its time-reverse incorrect.
Assuming that through processing of the signal, the type of wavelet that is applied to the data can be controlled, symmetry of the source signal can be ensured by using zero-phase wavelets. When this condition is fulfilled, equation (14) can be substituted in equation (7): . This is a similar representation to equation (39) for modified back propagation from our companion paper . In this representation, we assume that the focusing function has a broadband wavelet on it, so a convolution with the pressure wavefield will only leave the wavelet of the pressure wavefield.
The second way we can use the Marchenko method in the application of homogeneous Green's function retrieval is a one-step process, where the method is only used to retrieve focusing functions to create virtual receivers. In this case no virtual source is created, rather the actual response from a real source inside the medium is used. Monitoring real source signals is the eventual goal of this approach, such as for the case of induced seismicity. The boon of this method is that aside from the measured signal, no information about the source of the data is required. There are limitations to this approach as well, most pressing that to evaluate the integral, the signal needs to be recorded on the same receiver array that was used to record the reflection data. Similar to the two-step process, this approach also requires a symmetric source signal for the Green's function if we want to use equation (15) instead of equation (14), while the focusing function requires the broadband source signal. For the case of induced seismicity, the source signal can be more complex than just a single monopole point source. To include the mechanics for induced earthquakes more accurately, the double-couple source mechanism can be included in the representation. The double-couple source mechanism is accepted as representative for an earthquake response if the wavelength of the signal is at least of the same dimension as the size of the fault that originated the earthquake (Aki and Richards, 2002). It can be implemented through the use of a moment tensor, which is useful for the case of finite-difference modeling (Li et al., 2014). An example of the difference between the response of a monopole source and double-couple source can be seen in Figure 3. While the monopole source response has an uniform amplitude along the wavefront, the double-couple source response has a varying amplitude and polarity along the wavefront. Consequently, the orientation of the double-couple source affects the source signal, which is visible in the Figure 3-(b), while the orientation of the monopole source does not matter. Hence, the orientation of the fault is crucial to the characteristics of the double-couple source signal. To include this orientation in the representation, we introduce the operator D θ B , which acts on the wavefield and creates the double-couple source orientation from the monopole source signature. This operator is defined as where ∂ i,B is a component of the vector containing the partial derivatives acting on the monopole signal originating from source location x B , that turns it into a double-couple source mechanism, θ i is a component of the unit vector that orients one couple of the signal parallel to the fault plane and θ ⊥ i is a component of the vector that orients the other couple perpendicular to the fault plane. The operator can be applied to equation (14): and assuming that the source signal is symmetric in time, the operator is also applied to equation (15) In these two equations, the operator can be freely applied to both sides, because the integral is not evaluated over the source locations. Consequently, if the wavefield response used as a source for the homogeneous wavefield has a double-couple signature, the homogeneous wavefield will also have a double-couple signature. Note that the operator does not operate on the focusing functions, hence we can use the monopole responses for these signals.

Virtual receivers for extended faults
In case of induced seismicity, the fault or rupture plane that triggers the signal can be larger than the wavelength of the signal. In this case, the double-couple point source is no longer a valid approximation for the source of the signal. Studies of induced faults suggest that the signal develops over the fault during an extended period of time (Buijze et al., 2017). To approximate this type of source, a superposition of many point sources can be utilized. The total signal of the resulting superposition can be written as the superposition of the individual signals, where the superscript k, indicates the number of the source location x (k) B , that has the source spectrum s (k) (ω). The different source spectra determine the time at which the signal is triggered along the fault plane. P (x, ω) can be created in two different ways, similar as before. First, we consider the two-step process, where both the source and receiver are virtual. In this case, every source location can be treated separately to retrieve the homogeneous Green's function, and the superposition can be done after each signal has been retrieved through equation (18) and then shifted over t (k) , where H is the Heaviside step function and t (k) is the time at which point the k-th signal originates on the fault. The Heaviside in equation (20) selects the shifted causal signal from the shifted homogeneous (two-sided) signal before the superposition takes place, which is required to construct the correct signal. If the shifted homogeneous signals would be used instead, the shifted acausal part of later signals would overlap with the causal part of signals that originated earlier. Through use of equation (20) the correct signal can be retrieved.
In case the source signal is measured rather than virtually created, the same approach cannot be taken. This signal is measured after superposition, therefore each point source cannot be evaluated seperately. To represent this, equation (17) is adjusted to take the superposition into account, according to In this scenario, the sum is inside the integral and the entire signal is superposed before it is applied to the focusing function. This also results in a superposition of contributions of the focusing function between the virtual receiver location and the fault plane. Substituting equation (21) into equation (7) will not lead to a cancellation of the focusing function on the left-hand side, as the wavefield does not have a symmetric source signal, due to the time differences between all the sources. As such, equation (21) is the endpoint and we will not obtain a homogeneous Green's function, but rather a signal between the source and virtual receiver plus additional artefacts caused by the focusing funtion between the virtual receiver and the fault plane. Similar to the single source, each set of artefacts maps in between the shifted direct arrival of the Green's function and its time-reversal. However, due to the different shift of each signal, the artefacts overlap with the shifted causal and acausal parts of other signals and cannot be easily seperated. However, because of the limited duration of the artefacts, the signal at later times will be free from these artefacts. Additionally, due to the nature of the characteristic function, the artefacts also vanish when the source location x (k) B is outside the volume V A . In other words, if the virtual receiver location x A is above the shallowest source location, the correct signal can be retrieved for this virtual receiver.

Numerical point sources
To demonstrate the different approaches to homogeneous Green's function retrieval, we apply the methods first on synthetic data. Figure 4-(a) shows a density model and Figure 4-(b) shows the accompying P-wave velocity model. The model contains an area of faulting in the center of the model, which is highlighted with a black dashed line. To create the required reflection data, the model is used in a finite-difference modeling code for wavefield modeling (Thorbecke and Draganov, 2011). An example of an acoustic common-source record from the center of the model is shown in Figure 4-(c). This type of common-source records and a smoothed version of the velocity model in Figure 4-(b), are the only input that we will use for our applications. To retrieve the required Green's functions and focusing functions with the Marchenko method, we model the first arrival from a point in the subsurface to the surface of the medium using the smooth velocity model. This first arrival is then used to initiate the Marchenko method to retrieve focusing functions and a Green's function from the reflection response at the surface (i.e., from the common source records). The scheme that we use is based on the Marchenko code created by Thorbecke et al. (2017). This is a code for an acoustic wavefield Marchenko method, excluding free-surface multiples in the reflection data. Free-surface multiples could be included in the scheme as was shown by Singh et al. (2015). Figure 5 shows the results of the homogeneous Green's function retrieval. All snapshots show the same area in the subsurface, which is denoted by the white box in Figures 4-(a) and (b). Each pixel in the image is a receiver location and the source location for all images is exactly the same. The columns show snapshots of the wavefield in the subsurface at four different points in time, 0, 150, 300 and 450 ms. Each row corresponds to a specific way the wavefield in the subsurface was constructed. In the first row, the source and the receivers of the wavefield are placed inside the model and the wavefield is directly modeled. This is the benchmark that the other results will be compared to. All snapshots contain an overlay of black dashed lines, which indicate the locations of geological layer interfaces. As can be seen in the figure, the wavefield of the modeling scatters at these lines. The second row of Figure 5 shows the result of Green's function retrieval using the method described by equation (14). The Green's function and focusing functions that are required for this method are retrieved using the Marchenko method. This means that all the receivers and the source are virtual. When the result is compared to the benchmark, it is clear that there are some issues. The wavefield below the source location, as indicated by the green dashed line, contains numerous artefacts and the downgoing direct arrival of the wavefield is missing, however, the coda of the wavefield is present. This is caused by the fact that the focusing function between the virtual source and receiver is present and the lack of compensation for these contributions cause artefacts in the final result. When the virtual receivers are located above the virtual source location, the wavefield is comparable to the benchmark and the direct arrival is present. To make a more detailed comparison between the result, we extract the measurements from two receiver locations. These locations are indicated in Figure 5-(a), where the red dot is a receiver location above the source location and the blue dot a receiver location below the source location. Parts of these measurements are displayed in Figure 6, where the left column corresponds to the red dot and the right column to the blue dot. The results in the rows of Figure 6 correspond to the results of the rows in Figure 5. When the trace in Figure 6-(a) is compared to (c), the arrival times of the events match and there are no artefacts present, however there is a mismatch in amplitude. This is due to transmission losses in the reflection response, that the Marchenko method in its current form does not compensate. These effects have been partially compensated for through use of the method discussed by Brackenhoff (2016), although not all the effects have been compensated for. We ensure that the wavelet is zero-phase for the modeling and the Marchenko method to ensure the symmetric source signal requirement for the homogeneous Green's function representation. When the receiver location below the source is considered, the results are less accurate. The trace of the modeling contains no signal before the first arrival, whereas the trace for the Green's function retrieval contains numerous events and is lacking the first arrival. The coda of the traces shows a match that is comparable to the receiver location above the source. The arrival times of the events show a good match, while the amplitudes show errors. Because this receiver is located deeper inside the model, the transmission effects are stronger and therefore the error is larger. Next, the homogeneous Green's function retrieval using equation (15) is considered. The input for this approach is exactly the same as the one used for the previous approach using equation (14), however, this time, we expect to retrieve the correct result. Looking at Figures 5-(i)-(l), the result more closely matches the result of the benchmark. The improvement over the previous result for the deeper virtual receivers is clear. For some of the deeper receivers, part of the wavefield is still not completely present, however. This is the part of the wavefield that has a steep angle. The reason for this missing part is that the reflection response at the surface does not contain the reflections corresponding to the angles at larger depths, as they travel outside the aperture of the recording survey. Therefore, these steep angles cannot be reconstructed. As can be seen when the trace from Figure 6-(c) is compared to (e), the result between the two approaches is exactly the same if the virtual receiver is located above the source. The improvement is noticeable when the receiver is located below the source. Figure 6-(f) does contain the first arrival and lacks any signal before this arrival, and therefore shows a better match to Figure 6-(b). While the amplitude mismatch is still present, the arrival times of the events match and no artefacts are present. This also shows that the coda of Figure 6-(d) is correctly retrieved. We have indicated the moment that the correct coda is retrieved with a green line it this figure.
Finally, we consider the situation where the source mechanism is more complex, through the use of a double-couple signature. The retrieval in this case corresponds to the the approach in equation (18), using a virtual source. The double-couple is an elastic mechanism, however, as we only require the first arrival to initiate the Marchenko method, the coda of the wavefield is not of interest. The S-wave velocity used for the modeling of the first arrival is set to 500 m s , to ensure that all the S-wave events arrive after the first P-wave arrival. We incline the double-couple source at an angle of 45 degrees and use it to model the first arrival, which is used to initiate the Marchenko method to retrieve the wavefield response for the virtual source location. The focusing functions remain the same as the ones we used for the previous approaches. The result of this retrieval is shown in Figures 5-(m)-(p). As equation (18) states, because the Green's function contains a double-couple signature, the homogeneous Green's function contains the same signature, both in the direct arrival and in the coda of the wavefield. The double-couple signature affects the amplitude of the wavefield depending on the angle of the wavefront, however, the arrival times are similar to those when a monopole virtual source is used. This becomes clear when the traces from Figures 6-(g)-(h) are considered. The arrival times for the events are similar to the previous result, however, there are apparent amplitude and phase differences, caused by the different types of source signature. The differences are minor and the result shows that the double-couple signature can be succesfully integrated in the Marchenko method.

Numerical line sources
Up until now, we only considered single point sources that have a symmetric signal. To study the situation of induced seismicity, we simulate a source that evolves over time over a larger • • 1000 m x 1 x 1 x 1 x 1 x 3 x 3 x 3 x 3  area than a single point. We achieve this by placing a collection of sources along a line in the model. For this purpose, we place 131 sources along the fault plane that was indicated in Figure  4, starting at the bottom left corner, with a spacing of 7.07 m and a propagation speed along the fault of 589 m s . The fault is inclined at 45 degrees, therefore we make use of double-couple sources that are inclined at the same angle. We consider two scenarios, one where we have virtual sources and one where we have a measurement of a real source. For the first scenario, we approach the problem by considering each source position seperately. We do this by retrieving the homogeneous Green's function for each virtual source location separately and by shifting and superposing the results, similar to equation (20). Causality is applied to each individual wavefield before the superposition to avoid overlap between the causal and acausal part of the wavefields. Snapshots of the results are shown in Figures 7-(a)-(d), for 0, 500, 1000 and 1500 ms. The reason for the large timesteps is that all the sources along the fault have been activated during the final snapshot. The propagation of the source location along the fault is clear in these snapshots, however, a propagating wavefield appears to be largely absent, with only a few events and ringing effects present. The reason for this phenomenon is that the velocity at which the sources are activated along the faults is lower than  Figure 6: Traces from receivers in the subsurface at two locations. In the left column, the receiver is located above the source and corresponds to the red dot in Figure 5-(a) and in the right column it is located below the source and corresponds to the blue dot in Figure 5- the propagation velocity of the medium. This effectively means that the phase velocity of the combined wavefield along the fault is lower than the propagation velocity of the medium and the radiated wavefield therefore becomes evanescent. These evanescent waves do not propagate and are thus not visible. This effect can be seen more clearly by considering the traces from two receiver positions. Similar to Figure 6, we extract the same receiver locations to consider the individual traces, as shown in Figure 8. In Figures 8-(a)-(b), the trace for the receiver location above the shallowest source location shows a trace with few events, except for some high amplitude events. The receiver location below the deepest source shows a trace that contains more ringing effects with a uniform amplitude. Because the amplitudes are similar and the events located close together, litlle information can be gained from this trace. The evanescent problem can be avoided by changing the maximum amplitude between the wavefields. To this end, we apply random scaling to each wavefield before the superposition takes place. Because faults are extremely heterogeneous, the response of the wavefield can be approximated by such a simulation. The result of this approach is shown in Figures 7-(e)-(h). The propagation of the source location along the fault is similar to the uniform amplitude approach, however, the individual wavefields are visible due to the random amplitude approach. Both the first arrivals and the codas can be seen, although there is much overlap between all the wavefields which makes distinguishing individual events at later times challenging. When the two receiver traces in Figures 8-(c)-(d) are studied, this challenge is still present. The trace contains events, however, it is difficult to say whether these events correspond to the response of one source or another. To make an estimation for the arrival times of the homogeneous Green's function, we model the line source in the subsurface, using the same random amplitude distribution as in the previous case. As we cannot model the double-couple source acoustically, we make use of monopole point sources, instead of double-couple sources. As a result, the amplitudes of the events cannot be compared to the homogeneous Green's function, however, the arrival times can be compared. The wavefield in Figures 7-(i)-(l) shows that the arrival times are well comparable between the modeling result and the retrieved homogeneous Green's function. This is further proven when the traces in Figures 8-(e)-(f) are considered. The arrival times have a strong match, while the amplitudes are not comparable. This confirms that the correct events are retrieved through this approach. Next, we consider a different scenario, with a real source instead of a virtual one. Here, we once • • 1000 m x 1 x 1 x 1 x 1 x 3 x 3 x 3 x 3 Uniform virtual source 0 ms 500 ms 1000 ms 1500 ms  again retrieve the wavefield respone of each source seperately. However, instead of retrieving a seperate homogeneous Green's function for each of these responses and then superposing these results together, we superpose the responses before the homogeneous Green's function is retrieved, following equation (21). By using this approach we obtain a response record that matches the response of a real source recording in the subsurface. The same random amplitude is applied for this approach as well, to avoid the evanescent problem. The Green's function that is obtained is shown in Figures 7-(m)-(p), where we can see that the propagation of the source location along the fault is captured properly. There are issues with the approach due to the limitation of the representation that is used. The response to each source has artefacts that arrive before the first arrival when the virtual receiver is located below any of the source locations. These effects overlap with the causal wavefields of sources at other locations, and obscure the events that should be present. Additionally, the downgoing first arrival is missing for all source locations. These problems are inherent to the representation and cannot be easily avoided, however, the coda of the wavefield for later times will be correct, as we saw already for the point source in Figures 6-(e)-(h). When the traces for this approach from Figures 8-(g)-(h) are studied, we can see that if the receiver is located below the source locations, individual events belonging to the sources are impossible to distinguish. If the receiver is located above all the sources, however, the wavefield is retrieved correctly. The lower receiver does contain the correct coda at later time. We indicated this moment with a green line in Figure 8-(h), similar to Figure 6 -(d). This, combined with the fact that the source location of the signal can be clearly distinguished, shows that this approach has potential for field recordings.

Field data sources
To demonstrate that our approach is not limited to synthetic data, we also apply the method on field reflection data. The field data were recorded in the Vøring basin, in a marine setting by SAGA Petroleum A.S., which is currently part of Equinor. Due to the setting, the receivers only recorded P-waves. The data consist of 399 common-source records, an example of which is shown in Figure 9-(c). The data were preprocessed before the application of the homogeneous Green's function retrieval. Along with the reflection data, a smooth P-wave velocity model was also provided, which is shown in Figure 9-(a). We indicate the region of interest, where we will perform homogeneous Green's function retrieval, with a white dashed box. The reflection data and the velocity model are the only inputs that are available for the homogeneous Green's function retrieval. No direct information about the subsurface is available for this area, however, using the reflection data and the velocity model, an image of the subsurface was created, shown in Figure 9-(b), which we will use as a reference for where scattering is expected to take place. This imaging was done indepedently of the homogeneous Green's function retrieval and is only used as a reference. More information about imaging using the Marchenko can be found in Staring et al. (2018). The homogeneous Green's function retrieval for this dataset has been succesfully performed, as was shown in , however, in this work we will expand the results to include the line source configuration.
Because there is no information about the subsurface available, we cannot directly model in the subsurface and therefore have no benchmark, however, we have shown with the previous examples that the method is capable of retrieving the correct result. We perform homogeneous Green's function retrieval in the subsurface for both a virtual source and virtual receivers. The virtual source is a double-couple source, inclined at 20 degrees. The result is shown in Figures  10-(a)-(d) for 0, 300, 600 and 900 ms. The image of the subsurface from Figure 9-(b) is used as an overlay to help indicate the region where scattering of the wavefield is expected. The scattering takes place along regions where high amplitudes are present for the subsurface image, which indicates a match between the image and the homogeneous Green's function. Aside from the direct arrival, there is also a coda present, which contains several events. The result is not as clean as the synthetic data, however. This is due to the limitations of the field data. The data is attenuated, a problem that the Marchenko method cannot properly account for. The attenuation has been corrected for during the processing, however, this process is imperfect and will leave imperfections in the final result. There is also incoherent noise present in the field data, which has not been removed during the processing and will be present in the final result. Figure 10-(a) shows a red and blue dot, which indicate the location of traces that are extracted and are shown in the left and right column of Figure 11, respectively. No benchmark for these traces is available, and thus it cannot be directly validated. The results in Figures 11-(a)-(b) do show that the traces contain multiple well defined events, and that the noise on the trace is of a lower amplitude than these events. The amplitude of the first arrival is strong compared to the coda and the phase of all the events is similar. This shows that if the faults in the model are small compared to the wavelength, this approach can be useful for interpretation and characterisation of the source mechanism.
Next, we consider the two line source configurations for the virtual and the real source x 3 x 3 x 3 t Figure 10: Snapshots of the wavefield inside the white box in Figure 9 for the field data. (a)-(d) Homogeneous Green's function in the subsurface, retrieved for virtual receivers and a virtual double-couple source inclined at -20 degrees using equation (18). (e)-(h) Idem, for a line source of double couple sources with random amplitudes inclined at 67.6 degrees using equation (20).
(i)-(l) Idem, using a superposition of double-couple sources with random amplitudes using equation (21). The green line indicates the border between the area below and above the shallowest source. The images are overlain with the image of the subsurface from Figure 9-(b). All wavefields had their source wavelets reshaped to a 30 Hz Ricker wavelet.
configuration. As there is no clear fault present in the model, the fault line is placed in the center of the model, inclined at an angle of 67.6 degrees. 161 sources are used with a spacing of 6.99 m, with a propagation speed along the fault line of 583 m s . A random amplitude is assigned to each of the source locations to generate propagating waves. The first situation we consider is using equation (20), where homogeneous Green's function retrieval is performed for each location seperately and the results are superposed. The results of this approach are shown in Figures 10-(e)-(h), for 0, 1000, 2000 and 3000 ms. Similar to the synthetic data, the propagation of the source is well captured and the first arrival and the coda are present in the signal. Part of the wavefield is not present, which corresponds to high angles at deeper depths, which, as we explained before, are not present in the reflection response and can therefore not be reconstructed. The result is of a similar quality as the single double-couple source in Figures  10-(a)-(d) and the results on the synthetic data Figure 7.
There is no induced seismicity signal present for this area, so a real source signal cannot be used. Instead, similarly to the approach for the synthetic data, we use the Marchenko method to retrieve a wavefield response with a double-couple signature for each source location. These signals are then superposed to create a single source record, as a substitute for a real source signal. This approach follows equation (21), the results of which are shown in Figures 10-(i)-(l). Similar to the results for the synthetic data, the match between the two approaches above the shallowest source location is strong. This is proven further when the traces above the source from Figures 11-(c) and (e) are compared to each other. The traces are nearly identical. If we consider a location below the the deepest source location, the results are less comparable, again similar to the results that were achieved on the synthetic data. The traces for this location, shown in Figures 11-(d) and (f), support this conclusion. The match in this situation is nonexistent for earlier times, and the information is hard to appraise. At later times, as indicated by the green line, the coda of the two approaches match each other, similar as seen before. For both types of retrieval, the source locations are well-defined in both time and space and not obscured by artefacts that could cast doubt on the source locations. Using both types of approach shows potential for the determination of the source location and the coda and can help in the characterisation of the fault mechanism.  Figure 11: Traces of receivers in the subsurface at two locations. In the left column, the receiver is located above the source and corresponds to the red dot in Figure 10-(a) and in the right column it is located below the source and corresponds to the blue dot in Figure 10-(a). (a)-(b) Homogeneous Green's function in the subsurface, retrieved for virtual receivers and a virtual double-couple source inclined at -20 degrees using equation (18). (c)-(d) Idem, for a line source of double couple sources with random amplitudes inclined at 67.6 degrees using equation (20). (e)-(f ) Idem, using a superposition of double-couple sources with random amplitudes using equation (21). The green line in (f ) indicates the time after which the correct signal is retrieved. All wavefields had their source wavelets reshaped to a 30 Hz Ricker wavelet.

Conclusions
In this paper, we considered two methods to monitor wavefields in the subsurface using the Marchenko method. The first method is based on the creation of both virtual receivers and virtual sources in the subsurface. In this case, all the signals are created from the reflection data at the surface, and no response from a real subsurface source is used. For virtual point sources, we showed that we can assure that the source signal is symmetric and that therefore the full homogeneous Green's function can be retrieved without artefacts. The only limitation is that the steepest part of the wavefield at large depths cannot be retrieved. This approach works for virtual sources, both with a monopole signature and a more complex double-couple signature, the latter of which was used as a model for a small scale induced seismicity signal. Larger scale induced seismicity signals, emitted from a fault plane, were considered as well, simulated by a series of individual point sources with a double-couple signature. For this case, the homogeneous Green's function was retrieved for all the sources separately, after which the causal parts were isolated, shifted in time and superposed together. This produces a response from an extended fault rupture that is operating over a larger window of time, which produces a far more complex signal. All the source locations can be distinguished using this method. This method can be used to forecast in a data-driven way the response to possible future induced seismic events. The second method we considered creates virtual receivers in the subsurface that observe a real response from a subsurface source. To this end, we considered point sources where the source signal was not assumed to be symmetric in time. The causal Green's function that is retrieved in this case is missing part of the direct arrival and contains artefacts. These problems are only present when the virtual receiver is located below the source location, and the artefacts map exclusively in the time interval between the direct arrival of the homogeneous Green's function and its time reversal. The coda of the causal Green's function is retrieved in full, as well as the source location of the subsurface response. When considering the responses propagating from a fault, the artefacts are more severe. Unlike in the method with the virtual sources, to simulate the response to a real rupturing fault, we shifted and superposed the source responses before the homogeneous Green's function retrieval. Because of this, the artefacts are present for each point source, however, due to the time shift, the artefacts of one response coincided with the causal coda of other responses. As a result the coda of the retrieved Green's function is only partially obtained. The source locations of the fault response are retrieved correctly. This method can be used to monitor in a data-driven way the response to actual induced seismic events. We applied both methods to both synthetic and field data and showed that for both types of data, comparable results can be achieved. All responses that were used were created using the Marchenko method, because no real passive-source data were recorded with the receiver array that was used for the active-source reflection measurements. The results on the datasets show the potential for the application of the method on real source signals in the future.

Code availability
The modeling and processing software that has been used to generate the numerical examples in this paper can be downloaded from https://github.com/JanThorbecke/OpenSource.

Data availability
The seismic reflection data analysed in Figure 9, 10 and 11 are available from Equinor ASA, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Equinor ASA.

Videos
The videos of the snapshots of Figures 5, 7 and 10 can be found in https://github.com/ JanThorbecke/OpenSource/tree/master/movies.