Crustal-scale depth imaging via joint FWI of OBS data and PSDM of MCS data: a case study from the eastern Nankai Trough

of MCS data: a case study from the eastern Nankai Trough Andrzej Górszczyk1,2, Stephane Operto3, Laure Schenini3, and Yasuhiro Yamada4 1Institute of Geophysics, Polish Academy of Sciences, ul. Ks. Janusza 64, 01-452 Warsaw, Poland 2Univ. Grenoble Alpes, ISTerre, 38000 Grenoble, France 3Université Cote d’Azur, CNRS, OCA, Géoazur, Valbonne, France, 250 Rue Albert Einstein, 06560 Valbonne 4R&D Center for Ocean Drilling Science, JAMSTEC, 3173-25 Showa-machi, Yokohama, 236-0001, Japan Correspondence: Andrzej Górszczyk (agorszczyk@igf.edu.pl)


Introduction
Since decades seismic methods remain the primary source of information about the deep crust. The number of techniques related to seismic data acquisition and processing is continuously increasing stimulated mainly by the hydrocarbon exploration community. However, along with the impressive technological expansion made by Oil&Gas industry in terms of 3D depthimaging at the reservoir-scale, one observes stagnation in the corresponding development of seismic technologies like full 5 waveform inversion (FWI) (Tarantola, 1984;Pratt et al., 1998;Virieux et al., 2017) used by academia for the investigation of deep crustal targets.
The majority of regional seismic surveys conducted by the academic community still rely on sparse wide-angle 2D profiles carried out with a 5 km to 10 km receiver interval. While very few attempts of FWI application on crustal-scale ocean bottom seismometer (OBS) datasets were published (e.g., Dessa et al., 2004a;Operto et al., 2006;Kamei et al., 2013;Górszczyk et al., 10 2017), imaging approaches utilizing first-arrival traveltime tomography (FAT) in the high-frequency approximation are still the method of the first choice to process wide-angle data (e.g., Zelt and Barton, 1998). From one hand, this approach allows to significantly lower the costs of the field acquisition and does not impose large computational burden for processing. From the other hand, it leads to smooth velocity models which provide limited and uncertain insight into the underlying structure.
To mitigate this compromise, the development of up-to-date acquisition strategies and processing workflows dedicated to the 15 integration of wide-angle and short-spread reflection seismic data is needed. Indeed, during wide-angle offshore seismic experiment, MCS data can be collected with the aim of further processing reflection arrivals using migration techniques. To obtain high-resolution image of complex subsurface targets with strong lateral velocity variations, one may consider some variant of pre-stack depth migration (PSDM), e.g. Kirchhoff Depth Migration (Schneider, 1978), Reverse Time Migration (RTM, Baysal et al., 1983), Gaussian Beam Migration (Hill, 2001) or Wave-Equation Migration (e.g. Biondi and Palacharla, 1996). However, 20 the successful application of any type of PSDM inherently relies on the kinematic accuracy of the background velocity model.
In the typical scenario, one obtains initial interval velocity model V int via classical Velocity Analysis (VA) followed by the application of Dix's formula (Dix, 1955). In the presence of dipping events and lateral velocity variations, this initial model should be further updated to improve its accuracy. Among the decades various techniques were proposed to obtain more precise velocity models e.g. reflection tomography (Stork, 1992), CRP-scan (Audebert et al., 1997) or stereotomography (Lambaré, 25 2008) (see e.g. Jones et al., 2008, for detailed review). More recently FWI of the MCS data also became routinely applied by the Oil&Gas industry to build high-resolution background velocity models suitable for PSDM. The FWI velocity model can be even refined after PSDM by reflection traveltime tomography to optimize the reflector flattening in the prestack common-image gathers (CIGs). Applications of FWI on short-spread/narrow-azimuth MCS data have also fostered news developments of the FWI technology such as the so-called reflection waveform inversion (RWI) with the aim to better exploit reflected waves for 30 velocity model building by alternating reflection waveform tomography and least-squares migration (e.g. Brossier et al., 2015;Zhou et al., 2015). Therefore, without doubts, MCS data provide sufficient information to image targets of interest for hydrocarbon exploration.
However, finite length of the streamer hampers the procedure of picking sufficient moveouts in deep reflections, hence prevent-This paper is organized as follows. We start with a brief introduction of the study area and the presentation of the seismic acquisition. After that, we describe our integrated processing workflow followed by the presentation of the results. We compare the two velocity models inferred from the OBS (FAT+FWI processing) and the MCS data (standard velocity analysis), respectively, as well as the CIGs and the migrated sections built from these two velocity models. This comparison highlights how the integrated approach combining MCS and OBS data leads to more accurate and better resolved images of the crustal 15 structure than those inferred from the MCS data only. We further check the relevance of our imaging results by pointing out the correlations between geological features interpreted in both the migrated section and the FWI velocity model and well documented geological structures reported in the area. We finalize the article with the discussion followed by the summary of the study provided in the conclusion chapter.
2 Data 20 The Nankai Trough, offshore Japan, is one of the most complex subduction zone settings around the world. In this area the Philippine Sea plate subducts below the Eurasian plate towards the north-west direction developing large, sediment-dominated accretionary prism ( Fig. 1(a)). The whole area was divided into four segments ( Fig. 1(a)), according to the occurrence of the large earthquakes activating them each 100 to 200 years (Ando, 1975). In particular, due to the proximity of the collision zone between the Izu-Bonin Arc and central Japan, the Tokai segment (zone D in Fig. 1(a)) is characterized by high geodynamical 25 complexity and the deformation of the surrounding structures. The local tectonic regime implies formation and subduction of oceanic ridges, like the Zenisu Ridge (see Fig. 1(b)). Once such topography uplifts enter the subduction zone, they affect the local geodynamical setting and have potential to govern the nucleation and propagation of large earthquakes (Lallemand et al., 1992;Le Pichon et al., 1996;Mazzotti et al., 2002;Kodaira et al., 2003). In the eastern Nankai Trough, the existence of such a subducted ridge (so-called Paleo-Zenisu Ridge) was first investigated by Lallemand et al. (1992) based on experimental 30 sandbox modelling of oceanic ridge subduction and later supported by gravimetric and magnetic data analysis (Le Pichon et al., 1996). Interestingly the Tokai segment was not ruptured during last large Tonankai (1944) and Nankaido (1946) earthquakes, which implies gathering of compressive stress for more than 160 years now. This in turn rises speculations whether it will be the place of next megathrust earthquake (Le Pichon et al., 1996). Therefore, in order to investigate the seismogenic nature of the Nankai Trough, various data are collected in this area. Some measurements come from passive seismic monitoring of fault zones conducted both below and on the seafloor (e.g. within the framework of NanTroSEIZE drilling program or DONET (Dense Oceanfloor Network System for Earthquakes and Tsunamis)). Simultaneously, active seismic experiments are conducted for the purpose of geophysical imaging and reconstruction of physical parameters of the underlying geological 5 structure. In this study, we focus on the active seismic experiments. In particular we utilize two types of the seismic datanamely MCS reflection data and OBS wide-angle data -which combined with complementary processing techniques, shed new light on the deep structure of the eastern Nankai subduction zone.

Acquisition
In order to gain new structural insight into the eastern Nankai Trough, 2D MCS and dense OBS datasets were acquired in 2000 10 and 2001, as part of the Seize France Japan (SFJ) project. The profiles were roughly perpendicular to the trench axis, resulting in significant variations in bathymetry between approximately 500 m to 3750 m ( Fig. 1(b)).
During the 2001 SJF-OBS leg, wide-angle dataset was collected by 100 OBSs spaced 1 km apart and equipped with 4.5 Hz three-component geophones and hydrophones (Dessa et al., 2004a). Total number of 1404 shots were fired every 100 m along a 140 km long profile using an air-gun array with a total volume of 196 L. 15 A coincident towed-streamer profile was also acquired during the SFJ-MCS survey (Martin et al., 2000). Total air-gun array volume (GI GUN + BOLT) was 47 L and the shot interval was 50 m. Streamer cable with 360 channels and length of 4.5 km was towed at depth of 15 m. This acquisition setting translates to the CMP interval of 6.25 m and fold equal to 45.
Examples of MCS and OBS gathers are shown in Fig. 2. Both wavefileds have indeed fundamentally different anatomies, and hence carry out information on the subsurface of different resolution and depth penetration power. MCS data ( Fig. 2(a)) 20 typically record short-spread reflections that are sensitive to the sub-vertical impedance contrasts, which can be mapped by migration techniques. Additionally, smooth lateral velocity variations can be updated along the two one-way transmission paths connecting the reflectors mapped by migration with the sources and receivers at surface using either reflection traveltime (Woodward et al., 2008) or waveform  tomography. However, this velocity updating can be performed only at relatively shallow depths where the recorded reflection move out is significant ( 5 km according to the 4.5 km streamer 25 length). In contrast, anatomy of OBS data ( Fig. 2(b)) is dominated by diving waves and post-critical reflections, namely surface-to-surface wide-aperture arrivals, which provide an information on the long to intermediate sub-vertical wavelengths of the structure. Comparing to the towed-streamer surveys the stationary-receiver acquisitions have the necessary versatility to design long-offset regional acquisitions. At the expense of the receiver sampling they are able to record diving waves which undershoot the deepest targeted structure. For example, the first arrival shown in Fig. 2(b) beyond 60 km offset is the Pn wave 30 -namely the refracted wave in the upper mantle. This description highlights that the integrated imaging workflows, exploiting all the information from the wavefields which are caring different resolution power, should contribute to enrich the wavelength content of the crustal-scale images.

Processing
Integration of migration-and tomography-like techniques has been a common practice in seismic imaging for decades (e.g. Agudelo et al., 2009). However, when the recorded data lack low frequencies and the acquisition setup does not allow for the recording of a wide variety of wave types in terms of propagation regime (transmission versus reflection), there is a significant resolution gap between the velocity model built by tomography and the reflectivity mapped by migration (Claerbout,5 1985, see inset in Fig. 2(b)). Recent developments of the FWI technology associated with the acquisition of broadband wideazimuth/long-offset data aim at reducing this gap by pushing the tomography and the migration toward higher and lower wavenumber reconstructions, respectively (Biondi and Almomin, 2014). Although we use vintage MCS data in this study, which lack long offsets and low frequencies, the application of the FWI to the OBS data (after FAT) provides an illustration of the benefit from high-resolution waveform inversion techniques applied to long-offset data -namely, narrowing the resolution 10 gap between velocity models and reflectivity.
To summarize the motivation behind our seismic processing workflow for deep crustal imaging, the application of FWI to OBS data: (i) led to reliable velocity field for PSDM; (ii) provided high-resolution velocity model reducing the wavenumber gap; (iii) contributed significantly in geological interpretation of the imaging results; (iv) integrated wide-angle seismic data into the depth imaging workflow of reflection seismic data.

OBS data
The aim of the processing of the SFJ-OBS dataset was to obtain crustal-scale P-wave velocity model using FWI, such that it could provide insight into deep crustal targets with a wavelength-scale resolution. The dataset was recently reprocessed by Górszczyk et al. (2017).
The first step consisted of building an accurate starting model using FAT. In case of regional-scale datasets, long offsets imply 20 that a large number of wavelengths are propagated during which kinematic errors accumulate, hence making FWI prone to cycle skipping   Figure 7). To overcome the cycle skipping issue, one needs to carefully check that the starting model allows to match the picked first arrival traveltimes with an error that does not exceed half the period associated with the starting frequency (see green and red lines in Fig. 2(b)). In our case, we obtained average traveltime RMS below 50 ms through the iterative FAT, where a careful model-driven quality control (QC) of the traveltime picking was 25 performed. Resulting FAT model used as an initial model for FWI is presented in Fig. 3(a).
In the next step, acoustic Laplace-Fourier FWI (Shin and Cha, 2009;Brossier et al., 2009) combined with a layer-stripping approach was applied in the 1.5 -8.0 Hz frequency band. Applying strong exponential time damping of the data during the early stages of the FWI allowed us to start the inversion from frequency as low as 1.5 Hz, which together with the kinematic accuracy of our initial model contributes to mitigate the risk of cycle skipping. A multiscale inversion approach consisted of 30 proceeding from low to high frequency bands, which is combined with a continuation from wide to narrow scattering angles implemented by relaxation of the time damping of the Laplace-Fourier inversion. These two hierarchical data-driven strategies were complemented with a layer-stripping procedure implemented by progressive extension of the offset range involved in the inversion. Such combination allowed to mitigate the nonlinearity of the inverse problem and reduce the risk of cycle-skipping accordingly.
Comparison of the final FAT and FWI models in Fig. 3 shows the significant resolution improvement achieved by FWI.
Exhaustive validation, including quantitative evaluation of synthetic and real data-fitting with Dynamic Image Warping (DIW, Hale, 2013), wavelet estimation and checker-board tests, led to conclusion that the final FWI model explain the field data with 5 high accuracy.

MCS data -preprocessing
Processing of the SFJ-MCS data was focused on the best possible reflection imaging of the subduction system. The difficulty of the task relied not only on the complexity of the underlying structure but also on the low ratio between the streamer length and the deepest targeted structure as well as two-dimensional acquisition setting. The processing sequence is summarized in 10 Table 1.
At the preprocessing stage, we mainly focus on the noise attenuation, wavelet corrections and multiple elimination. To improve the signal-to-noise ratio (SNR), few filters aiming on different kinds of reverberations (e.g. swell noise, seismic interference noise, linear noise, double shots) were applied. The bubble effect was removed followed by the zero-phasing of the wavelet and reduction of the ghost on the source and receiver side. 15 In the following step, we focused on the attenuation of multiples. The lack of general demultiple approach imposes the need of tuning the processing workflow according to the limitations caused by the geological setting and the acquisition parameters. In our case, the deeply underlying complex structure causes a lot of triplications and discontinuities of the wavefield especially at later times. What is more, due to the relatively short streamer the ability to track the moveouts is limited. This in turn confines the robustness of the moveout-discrimination-based methods (e.g. utilizing Radon transform, Hampson, 1986;Kabir and 20 Marfurt, 1999) which can easily damage the primaries if aggressive parametrization is used. Alternatively, approaches based on multiples prediction and subtraction, like Surface-Related Multiple Elimination (SRME, Verschuur et al., 1992) could be utilized. While SRME is a data-driven technique, and therefore do not require an a prioi knowledge about the subsurface, its assumptions can be violated in different ways by the field data. In their overview, Dragoset and Jeričević (1998) discussed in details limitations of SRME when 2D data are affected by the off-plane propagation, lacking the near-offset traces or contain 25 distortions caused for example by the streamer feathering.
In our processing scheme, we chose first to apply 2D SRME, which is unlikely to affect the primaries. After SRME the vast majority of the free surface multiples were attenuated. However, in certain areas of the accretionary wedge (e.g. between 55-80 km of the model distance in Fig. 3), we observed residual multiples still strongly contaminating the later arrivals. These residuals are visible on the gathers presented in Fig. 4 (a-b) both with and without NMO correction applied. To further attenuate 30 remaining multiples we applied High-Resolution Radon Demultiple (HRRD, Hargreaves and Cooper, 2001). Data after HRRD are presented in the corresponding panels (c-d) of Fig. 4. Note that the multiple arrivals were better suppressed now, although generally weaker signal amplitudes are observed in the processed part of the gathers. The fact that SRME was unable to fully remove these arrivals can be an indicator of the out-of-the-plane wavefield propagation in this area associated with the complex geology and/or cross-dip inclination of the structures.

MCS data -velocity analysis
In this section, we will throw some light upon the issue of building velocity model from short-streamer MCS data. In the classical VA the moveout curve is (in a typical case) approximated by the hyperbolic relationship between zero-offset time 5 and velocity. Stacking velocities are often picked in certain distance intervals and are further interpolated between the analysis points. This procedure is usually repeated several times with the simultaneous refining of the sampling interval leading to the more detailed velocity estimation (Yilmaz, 2001). The sampling of the velocity picks supposed to be tailored by the interpreter to the degree of horizontal and vertical velocity heterogeneities defining the structure under consideration. Therefore the capability to build the accurate velocity model via VA will firstly rely on the ability to track the moveouts, which in turn 10 can be affected in the countless way (e.g. noise, multiples, streamer length, arrival time, AVO variations, frequency bandwidth, velocity itself, geological structure etc. etc.). Secondly it will depend on the optimal velocity picks sampling, and therefore on the initial interpretation of the data and the corresponding velocity field.
In The results confirm that the meaningful information about the hyperbolic moveouts can be retrieved only in the shallow part where the points of the semblance amplitudes are sufficiently well focused and allow to track the increasing velocity trend with time. However, on the intermediate and later times the VA panels become more chaotic and no distinct focusing points can potentially be picked. This is of course straightforward consequence of the increasing depth of the target to be imaged. 25 Moreover, it is also worth to mention that the final FWI model ( Fig. 3(b)) manifests few places where velocity gradient with depth becomes negative. Some of these areas are related to existence of Low-Velocity Zones (LVZ) which might be difficult to detect through the classical VA.

MCS data -slope tomography
To further improve the velocity model obtained via VA, one may consider application of some variant of reflection travel- generated in the model with raytracing and those measured from the data (Bishop et al., 1985;Farra and Madariaga, 1988).
In addition to traveltimes, slopes of locally-coherent events can be picked in common-shot and common-receiver gathers and included as objective measures in the inversion (Billette and Lambaré, 1998). Development of so called slope tomography or stereotomography proved that this technique gives a promising alternative for estimating laterally heterogeneous velocity models from reflection seismic data (see Lambaré, 2008, for a review).

5
In the data domain, the input data (picks) are defined by their source to receiver two-way traveltime and two slopes -namely the horizontal component of the slowness vectors at the shot and receiver positions ( Fig. 6(a-b)). For a given velocity model, each locally-coherent event is associated to two ray segments (and therefore two one-way traveltimes) emerging at certain angles from a scattering position (whose position needs to be estimated) towards the source and receiver (Billette and Lambaré, 1998).
Different variants of ST were developed since the time it was first proposed. Some were aiming on reducing sensitivity of the 10 method to the quality of the input data (Biondi, 1992). Other established the connection with migration-based velocity analysis where the picking is performed in the prestack migrated volume (Chauris et al., 2002;Nguyen et al., 2008). More recently Tavakoli  This parsimonious parametrization is achieved by using the focusing equations of Chauris et al. (2002) to position the scattering points in depth through a migration of two kinematic invariants (two-way traveltime and one slope) instead of processing the coordinates of the scattering positions as optimization parameters. 20 One of the advantageous assumptions about the input data for ST is that the picked events need to be only locally coherent (see Fig. 6(a-b)). Unlike traveltime tomography, it is not necessary to follow continuous arrival across the whole trace gather either follow some specified interface/horizon (Billette and Lambaré, 1998). Tracking locally-coherent events by semi-automatic picking instead of continuous reflections leads generally to denser sampling, and hence better resolved velocity model. The attributes of locally-coherent events, referred to as kinematic invariants, are simultaneously picked in the common-shot and 25 common-receiver gathers by local slant stack (Billette et al., 2003). Alternatively, one can infer them from de-migration of the dip attribute and scattering positions picked in the pre-stack depth-migrated volume with the benefits of a better SNR and an easier interpretation or QC (Guillaume et al., 2008). cold-coloured dots (blue, light blue and cyan) follow seaward dipping structures, while warm-coloured dots (yellow, orange and red) trail the interfaces of opposite inclination. Not surprisingly the amount of picks decreases significantly with increasing depth, although we can observe quite dense collection of picks around the plate interface on the landward side between the backstop and subducting oceanic plate at depth around 15 km. To express in a more quantitative way the relative sampling of the model by the picks we calculate their hit-rate inside square grid cells of size 0.5 km. Consequently we clip 10 % of the 5 highest values, normalize the results between 0 and 1 and divide into 3 uniform intervals. Fig. 7(c) presents the PSDM section with superimposed shading masks corresponding to the normalized sampling of the model by picks. The brightest area of the section was sampled between 100-66 % of the maximum (clipped) picks hit-rate value. The next two grey-shaded areas were sampled between 66-33 % and 33-0 % respectively. The black area define the part of the model not sampled by picks.
To conclude on this analysis we can make the statement that the robustness of the ST is reflected by the distribution of the scat-10 tering points demonstrating high consistency with the dipping structures observed in the PSDM section. This in turn assures regular and redundant information which might be exploited for velocity model building. However, the amount of the picks reduces significantly with depth as the reflectivity of the structure decreases. This is more pronounced on the landward part (backstop) of the model. Therefore, the constraints of velocity model at greater depths -although significantly better than those provided by the classical VA -are weaker than at the shallow part.

PSDM
During the PSDM step, we tested different background velocity models, targeting on the best flattening of the CIGs. Here we consider only four of them. The first one was built from MCS data by classical VA, while the second one is the FWI model inferred from the OBS data ( Fig. 3(b)). In this sense, PSDM of MCS data can be seen as a additional QC of the FWI model. To further check how much we can improve the the PSDM results by refining these velocity models, we perform RMO picking in 20 the pre-stack migrated gathers which were subsequently minimized by ST. The original and refined velocity models produced four PSDM images. We essentially tested Kirchhoff PSDM as well as Controlled Beam Migration (CBM) with the aim of improving SNR of the final result (considering our low-fold dataset). CBM is an extension of Beam Migration technology which combines the steep-dip imaging capabilities of Kirchhoff techniques with the multi-arrival abilities of WEM (Vinje et al., 2008). The results obtained with both migration techniques however, did not provide clear justification which approach 25 was superior and therefore, for simplicity, we decided to adopt PSDM in the classical Kirchhoff version.

Results
In the following sections we present the PSDM results inferred from the four above-mentioned background velocity models.
We assess them by looking at the flatness of the CIGs as well as through the correlation of the migrated sections with the underlying velocity models. Furthermore, we assess the main observed structures against the results reported by previous 30 geological studies conducted in this area.

Imaging
The PSDM sections, superimposed on the VA-and FWI-related velocity models with which they were computed, are presented in Fig. 8. Amplitudes of the stacks were scaled with a function of depth equal to depth 1.5 . Generally, all PSDM sections exhibit similar quality in the shallow seabed area (first 2-3 km). However the PSDM sections obtained with the FWI and FWI+ST velocity models (Fig. 8(c-d)) show much better focused reflectivity at intermediate and large depths. This is especially well 5 pronounced in the area below the accretionary prism (5-10 km depth, 55-100 km distance) as well as in the landward dipping subducting oceanic crust down to about 15 km depth.
We validate resulting PSDM sections against the corresponding background velocity models. Indeed, structure imaged in PSMD sections inferred from VA velocity models is less focus than in those inferred from the FWI models. This is the conse-10 quence of the fact that VA models lack resolution and accuracy in term of velocity estimation. Comparing panels (a) and (b) in Fig. 8 on can observe that the VA-related velocity models exhibit a similar level of smoothness, although the PSDM section corresponding to the VA model updated by ST (Fig. 8(b)) looks more geologically relevant. The main changes in the PSDM sections computed with the VA and VA+ST velocity model are shown in the area where ST introduced the most significant velocity updates. These perturbations are shown in Fig. 8(e). Colour-scale was clipped between ±1000 km/s for better readability, 15 however, the lowest velocity perturbations (dark blue colour) reach −1500 km/s providing some flavour of the inaccuracy of the VA model.
Analogous comparison of the FWI and FWI+ST velocity models and related PSDM sections (Fig. 8(c-d)) leads also to few conclusions. First, the high-resolution FWI model provides focused PSDM image (Fig. 8(c)). Second, the reflectivity image 20 follows nicely the absolute velocity changes mapped by the FWI model. Third, the updated FWI+ST model marginally improves the final PSDM section (Fig. 8(d)). This improvement results from relatively small velocity perturbations introduced in the FWI model by ST (compare Fig. 8(e-f)). Also, the ST damages some dipping velocity structures which are more visible in the FWI model than in the FWI+ST model (see backstop between 5-10 km depth or accretionary prism between 65-85 km distance). The last effect is related to the nature of the perturbations typically introduced by the reflection data which are more 25 sensitive to the vertical velocity variations. In contrast, the long-offset OBS data are better suited to capture the horizontal and steeply dipping velocity changes.
To further QC the imaging results in Fig. 9 we present a set of CIGs after PSDM using each of the four velocity models. The CIGs inferred from the VA model ( Fig. 9(a)  ning PSDM, the events are flatter ( Fig. 9(b)). However under-corrected RMOs are still visible in the deeper part of the gathers.
Additionally, we start seeing some over-corrected reflections (e.g. CIG 3245) indicating too slow velocities. For comparison, CIGs inferred from the final FWI model (Fig. 9(c)) show that the majority of reflections are flat including those originating from the deep crust. Refining the FWI model by ST (Fig. 9(d) The overall trend after ST is the up-shifting of the reflectors suggesting that the velocity refinement by ST mainly lower the FWI velocities. This likely highlights the imprint of the anisotropy. The velocities estimated by FWI being close to the horizontal velocities, while those constrained by MCS data being closer to the normal moveout counterparts. In all cases, some RMOs are more pronounced at the side edges of the profile. The effect is caused by the lack of coverage in these areas, which 5 translates to weak constraints during velocity model building.
To quantitatively express the depth changes in the PSDM results introduced by ST update of the FWI model we estimate the depth shifts between the PSDM sections inferred from FWI and FWI+ST models. We calculated local cross-correlation for each two corresponding traces from the mentioned sections which led to consistent map of depth changes between them.

Geological consistency
Before interpretation step, we filter-out (using curvelet transform (Górszczyk et al., 2015)) some of the strong migration smiles apparent in the final PSDM sections which we addressed as residual multiples or out-of-the-plane arrivals after analysis in pre-stack domain. To foster the joint interpretation of the PSDM results and FWI velocity models, we overlay migrated section on the (i) FWI+ST model, (ii) gradient image, i.e., average of horizontal and vertical derivative of the FWI velocity model (Fig.   25 11(a-b)) and (iii) perturbation model -the FWI model after removing velocity trend obtained by 5th order polynomial fitting of each vertical profile of the FWI model (Figure 12(a-b)). The resulting sections provide not only an image of the reflectivity but also a high-resolution background insight on the seismic properties (P-wave velocity changes) of the structural units delineated by the reflectors. The resulting images show the high consistency between the velocity structures and the reflectivity image, improving continuity and delineation of some interfaces which are locally better visible in PSDM section than in the velocity 30 model and vice versa. Based on the deformation features and the velocity structures identified in the obtained images, we classify the geology of this region into block-segments including different deformation domains (see inset in Fig. 11(b)), which are bounded by three major faults and the basement -namely subducting plate interface.

Structure of the subducting oceanic crust
The oceanic crust in the region of Tokai is commonly described as being affected by volcanism (Le Pichon et al., 1996;Nakanishi et al., 1998;Martin, 2003;Mazzotti et al., 2002;Dessa et al., 2004b;Kodaira et al., 2002;Takahashi et al., 2002). The main reason for this is the proximity of the Izu-Bonin Arc which develops periodical volcanic ridges on its western flank (see white arrows in Fig. 1(a)), which are further subducted beneath the central Japan. Such setting can explain the undulating shape 5 of the subducting oceanic crust with the local increasing/lowering of velocity values observed in Fig. 11(a) as well as weak and rather difficult to distinguish boundaries between the layers of the subducting crust. In the resulting images, the Moho boundary is derived from both absolute velocity model ( Fig. 11(a)) and the positive change of the gradient value (red color) in Fig. 11(b). The migrated image at this depth (not surprisingly owing 4.5 km streamer length) is strongly affected by so-called "migration smiles". To support interpretation of upper mantle one can additionally calculate refraction ray-tracing in the final 10 FWI model and track the rays corresponding to Pn arrival, which tend to be channelled along the Moho boundary (see Fig. 13(e) in Górszczyk et al., 2017, for the illustration).
Within the subducting oceanic crust, we can distinguish discontinuous and local reflections which can be interpreted as a top of layer 3. This interpretation is supported by the image in Fig. 11 (Dessa et al., 2004b;Kodaira et al., 2002;Tsuji et al., 2013;Lallemand, 2014). Above the top of layer 2 with solid line we interpret the top of the sedimentary subduction channel acting as a decollement which is relatively well supported by reflectivity of the PSDM image and by the high-resolution velocity gradient changes.
Towards land-side, between 40-55 km of model distance we clearly point the negative gradient of velocity creating major LVZ.
The top of this LVZ is consistent with the plate interface. In our case, access to the high-resolution velocity model directly 25 indicate negative velocity gradient which alternatively could be obtained from the polarization reversal in the reflectivity image -addressing for example presence of fluids. However at this depth we observe bunch of ringing-like reflections rather than clear interface suggesting relatively thick layer of accumulated deformed material. Discontinuous reflection patterns can be explained by the clastic deposition or duplexing of sedimentary or volcanic material (Gutscher et al., 1998b, a). Alternatively the LVZ can be interpreted as a damage-zone caused by a topography high passing through this area. Region of similar nature

Subduction front and recent accretionary wedge
On top of the subducting oceanic crust, the imaging results reveal complex accretion system of the Tokai segment, which can be divided into four domains. The weakly deformed domain is located at the south-east of the profile starting at ∼85 km km of model distance with two faults originating at its apex. This is most likely a subducting volcano considering the decay of velocity within the oceanic crust below it ( Fig. 11(a)) The sub-horizontal solid line between 80-95 km of distance marks the 10 strong reflector identified as an extension of the decollement since the deformation of the sediments accumulated in the trench starts to occur above this interface.
The weakly deformed domain is separated form the accretion front -moderately deformed domain -with the major thrust fault that generates ocean-floor topography at 85 km of model distance and reach the decollement around 13 km further landward direction. It corresponds to the frontal thrust of this plate subduction margin, reported by Hayman et al. (2011). The 15 whole domain shows analogous reflection surfaces as in case of the weakly deformed domain, suggesting that it consists of sedimentary layers, but is folded by several thrust faults reaching to the detachment interface. The consistency of these thrusts with the velocity models is striking even in a small details. For example gradient variations in Fig. 11(b) follows consistently the steeply dipping stacked sheets of sediments in the accretionary prism. In Fig. 12(a), shallow slope basins are clearly delineated by the blue color and separated by the thrust creating bathymetry outcrop at 73 km of model distance. Similar consistency 20 can be observed in terms of the vanishing reflectivity of the PSDM image above the decollement between 67-75 km (dotted line, zoom-panel 5) which is nicely delineated by lower velocity values in this place indicated by blue colour. Interestingly the decollement on the right side of this feature manifests itself in Fig. 11(b) as a positive (red) gradient while on the left side it becomes opposite (blue) suggesting decrease of the velocity. 25 The moderately deformed domain extends landward until the Tokai thrust separating the frontal part of the prism from heavily deformed domain (Kawamura et al., 2009;Hayman et al., 2011). This segment in turn shows less pronounced reflectivity which can be significantly augmented for the interpretation purposes by the pattern of the velocity changes in the absolute, gradient and de-trended velocity models. Structural characteristic of this domain is different from two previous deformation domains, suggesting that it has undergone longer period of deformation at this margin (according to Martin et al., 2003, 30 starting at the Upper Miocene). In particular significant increase of the velocity values just above the plate interface ( can suggest topography high (namely, evidence of Paleo Zenisu ridge) whose left flank is colliding with the old accreted thrusts (dotted line between 55-60 km of distance). We can also observe in Fig. 12(a-b) (zoom-panel 4) that the small topography popup is separating two areas of blue colour indicating zones of lower velocities. Such setting suggests that the Tokai Thrust can be rooted at the down-stepping plate interface following-up the left flank of this velocity anomaly, then it is bending above it and increases the dipping again to reach its outcrop in the seabed. 5 We also observe the thickness of sloping sediments in the shallow part of this segment is larger than in case of moderately deformed domain. This suggests longer time of accumulation and partial erosion of the uplifted, steeply dipping thrust sheets around the Kodaiba Fault -a major thrust fault with significant strike-slip displacement (Pichon et al., 1994). Strong gradient variations in Fig. 11(b) delineate geometry of this fault especially above the plate interface. High consistency of PSDM image and velocity changes can also be observed in Fig. 12(a-b) (zoom-panel 3). We root this fault at the plate boundary suggesting

Backstop area
Finally, on the most landward part of the profile, we image the old deformation complex acting now as a backstop. The defor-15 mation of large landward-dipping stacked thrust sheets can be indicated by the velocity changes correlating with reflectivity visible in the PSDM image. The hypothesis that this old complex could have been pushed landward by large subducting ridges was imposed based on the results of wide angle seismic experiments (Kodaira, 2000;Nakanishi et al., 2002) as well as sandbox tests (Dominguez et al., 2000). Placing our results in this context would require further geological expertise. Nevertheless, from the joint interpretation of velocity models and PSDM section one can divide this domain into: (i) the thick relatively not 20 deformed sediment cover of the fore-arc basin suggesting that the (ii) underlying complex of large thrust sheets between 4-10 km depth is inactive and overlay the (iii) significant amount of underplated material delineated by the increase of velocity at depth of around 11 km. 25 The ability of the wavefield to penetrate the deep crust is limited by various factors including acquisition geometry, employed equipment, geological context etc. Sufficiently dense shot coverage increases the fold and therefore spatial sampling redundancy. Ability to generate and record broadband frequency signals improves resolution and signal to noise ratio. Long streamers, although challenging to operate in the field, provide offset range allowing to track the moveouts of later arrivals originating from deep interfaces. All these aspects improve the capability of the MCS streamer data to retrieve meaningful 30 information about the deep lithosphere. Our MCS data were recorded using 4.5 km long streamer, which compared to indus- trial 15-km long streamer drastically reduces this capability. Nevertheless, we showed that the robust depth imaging of the legacy data with short offset and limited fold is possible when supported by the velocity model-building based on FWI of the wide-angle stationary receiver data. Despite the different regimes of the wavefield propagation between wide-angle and reflection data our FWI model provided accurate velocity field for PSDM. This correctness allowed to pick the slopes for ST and introduce relatively small velocity changes even at greater depths ( 15 km). In contrast, the velocity model obtained via VA 5 was far more inaccurate to allow for similar PSDM accuracy after ST update. Therefore building high-resolution FWI velocity model from wide-angle OBS data can significantly improve the PSDM results from short-streamer MCS data especially in terms of velocity accuracy below the shallow sub-seabed.

2D/3D PSDM
2D seismic data processing is inherently inaccurate because of the three-dimensional wavefield propagation within the litho-10 sphere which takes place during field experiment. Assumption about the ideal inline scattering along the shooting profile is closer to the wishful thinking rather than reality. Nevertheless, the compromise between the high costs of the 3D experiment and the accuracy of the results in terms of resolution allowing to retrieve meaningful geological information often capitalizes in the two-dimensional acquisition geometries. This in turn provides limited prospect for subsequent processing (for example accuracy of retrieved attributes) and the following quality of the results when cross-dipping structures are expected in the 15 vicinity of the 2D profile. While the imaging of deep crustal targets requires long time of wavefield propagation, it means that the wavefront is more prone to travel out of the 2D profile plane. Complexity of the geological structure that we aimed to reconstruct here combined with its significant depth allow us to believe that the final imaging results can be affected by the "3D effect" reducing the continuity of the deeper interfaces. This was also apparent from locally occurring arrivals which were stacking into migration artifacts. We decided to filter them out before final interpretation as we were unable to flatten them 20 with any velocity model even with velocity scaling ranging between 80%-120%. To overcome such issues, one can consider cross-dip processing which takes into account the 3-D character of the data (for example feathering of the streamer) and of the geology (Nedimović et al., 2003). However, ultimately we believe that there is a need to move toward future 3D experiments tackling the problem of complex deep crustal imaging (Morgan et al., 2016). Indeed, illumination of the structure from different azimuths would increase the reliability of the final image and minimize the amount of speculations in terms of geological 25 interpretation. On the other hand, this development shall be combined with the efficient acquisition logistic schemes and robust processing techniques able to handle large scale imaging problems.

On subducted oceanic ridges
The typical problem from Tokai region, which can justify use of 3D rather than 2D crustal-scale imaging, is the existence of the basement volcanic ridges and mountains subducting below the accretionary wedge and affecting the geodynamical setting of 30 the region. While their existence is undeniable (even from the bathymetric observations in Fig. 1(a)) the exact shape and scale of those features in the historical studies utilizing reflection or wide-angle seismic imaging and velocity model building, sandbox experiments or gravity and magnetic anomalies analysis is rather inconsistent. Moreover, demonstrating these ridges as continuous features of uniform height is oversimplified. To better understand the problem it is enough to look how in Fig. 1(b) the Zenisu Ridge along the axis rapidly changes its height with respect to the trench from nearly 3 km to few hundreds meters.
Therefore to precisely image and investigate the influence of such chain of distributed volcanoes of various size subducted below the complex accretionary prism, 3D imaging seems mandatory.
4.4 PSDM using FAT model 5 FWI in its classic form is a local optimization problem. In other words, the misfit functional representing difference between real and synthetic data is iteratively minimized (toward global minimum) starting from the initial model which is sufficiently close to the true one, such that it fulfils the cycle-skipping criterion. If this condition is not met, FWI will be guided toward inaccurate local minimum. Therefore accuracy of the initial FWI model determines correctness of its final version. In this sense, the robustness of the initial FWI model (in our case derived by FAT) can be also verified through PSDM. Indeed, if the  Figure 13 presents analogous comparison of PSDM gathers as in case of Fig. 9. After careful inspection of the migrated events we can conclude that the flatness of the CIGs inferred from the FAT model is locally worst than those inferred from the FWI model 9(c). In overall however, these results are much better than those obtained with VA and VA+ST velocity 15 models ( Fig. 9(a-b)). This shows how wide-angle data can be useful to build correct velocity model for PSDM using classical FAT technique. In our case, this approach outperforms the results of PSDM with classical VA model as well as its updated version with rather sophisticated ST method. On the other hand, one can argue that the improvement of PSDM results obtained with FWI as compared to FAT are not proportional to the efforts devoted to apply FWI. Note however, that the smooth FAT model is unable to bring detailed structural information which is directly accessible from the high-resolution FWI model and 20 therefore we believe that moving from FAT to FWI is absolutely crucial for better geological interpretation.

Conclusions
Presented case study shows that crustal-scale FWI of OBS data can significantly improve depth migrated images inferred from towed-streamer data. Firstly, final FWI model provides reliable information about the velocity not only at shallow depths (< 5 km) but also at depths down to the upper mantle, which are beyond the range of the typical length of streamer. These more re- 25 liable velocities translates to better-focused PSDM at depth. Secondly, high-resolution FWI velocity model brings quantitative information on the seismic properties of the main structural units at long-to-intermediate scales, which are complementary to the short scales mapped in the PSDM image. This justifies usefulness of the crustal-scale FWI models derived from wide-angle data for geological interpretation. We believe that further development of crustal-scale imaging toward 3D OBS acquisition and 3D multi-parameter visco-elastic FWI will bring us closer to better understanding of the deep crustal targets through the Tarantola, A.: Inversion of seismic reflection data in the acoustic approximation, Geophysics, 49, 1259-1266, 1984: Slope tomography based on eikonal solvers and the adjoint-state method, Geophysical Journal International, 209, 1629-1647, https://doi.org/10.1093/gji/ggx111, 2017.   Example of single OBS gather. The red and green lines show picked and synthetic traveltimes, these later being computed in the FAT velocity model. The match between the two lines gives some insight on the kinematic accuracy that may need to be achieved to prevent cycle skipping during subsequent FWI. Note also the challenging picking of first arrivals between 35km and 45km offset, where weak-amplitude first arrivals correspond to head waves that propagate along a thrust in the backstop. See Górszczyk et al. (2017) for details. Inset in (b) presents resolution gap between tomographic velocity models and migrations images ( Fig.1.13 of Claerbout, 1985) 23 Solid Earth Discuss., https://doi.org/10.5194/se-2019-33 Manuscript under review for journal Solid Earth     The symbols + denote the velocity nodes. Two rays are shot toward the source s and receiver r from the scatterer x with take-off angles Θs and Θr corresponding to one-way traveltimes Ts and Tr. The velocity model, the scatterer coordinates x and the ray attributes (Θs, Θr, Ts and Tr) form the model space of classical ST. The scatterer position x and the two take-off angles (Θs, Θr) define a dip (migration facet).
The horizontal component of the slowness vectors at the source and receiver positions, ps and pr, the two-way traveltime Tsr and the source and receiver positions (s, r) form the data space of the classical ST (adapted from (Tavakoli F. et al., 2017) 26 Solid Earth Discuss., https://doi.org/10.5194/se-2019-33 Manuscript under review for journal Solid Earth