A comparison of straight-ray and curved-ray surface wave tomography approaches at near-surface studies

. Surface waves are widely used to model shear-wave velocity of the subsurface. Surface wave tomography (SWT) has recently gained popularity for near-surface studies. Some researchers have used straight-ray SWT in which it is assumed that surface waves propagate along the straight line between receiver pairs. Alternatively, curved-ray SWT can be employed by computing the paths between the receiver pairs using a ray-tracing algorithm. SWT is a well-established method in 10 seismology and has been employed in numerous seismological studies. However, it is important to make a comparison between these two SWT approaches for near-surface applications since the amount of information and the level of complexity in near-surface are different from seismological studies. We apply straight-ray and curved-ray SWT to four near-surface examples and compare the results in terms of the quality of the final model and the computational cost. In three examples we optimise the shot positions to obtain acquisition layout which can produce high coverage of dispersion curves. In the other example, the 15 data have been acquired using a typical seismic exploration 3D acquisition scheme. We show that if the source positions are optimised, the straight-ray can produce S-wave velocity models similar to the curved-ray SWT but with lower computational cost than the curved-ray approach. Otherwise, the improvement of inversion results from curved-ray SWT can be significant.


Introduction 20
Surface waves are commonly analysed to build shear-wave velocity (VS) models. Surface wave tomography (SWT) is a well-established method in seismological studies, where numerous researchers have used it to construct subsurface velocity models at global and regional scale by inverting earthquake signals (Woodhouse and Dziewonski, 1984;Ekstrom et al., 1997;Ritzwoller and Levshin, 1998;Boschi and Dziewonski, 1999;Simons et al., 1999, Boschi andEkstrom, 2002;Yao et al., 2010). Some authors have applied SWT using ambient noise cross-correlation to retrieve regional crustal structures 25 (Shapiro et al., 2004;Lin et al., 2008, Kästle et al., 2018. SWT usually consists of three steps (Yoshizawa and Kennett, 2004;Yao et al., 2008). First, different path-averaged dispersion curves (DCs) are computed for different receiver pairs aligned with a source. Then, the DCs are inverted to build phase velocity maps at different period (frequency). Finally, the obtained phase velocity maps are inverted to produce 1D VS models at different locations. However, the efficiency of SWT can be increased by the direct inversion of the path-averaged 30 DCs, i.e., skipping the intermediate step of building phase velocity maps (Boschi and Ekstrom, 2002;Boiero, 2009;Fang et al., 2015).
Traditionally in seismology, SWT has been employed assuming great-circle propagation of surface waves (Trampert and Woodhouse, 1995;Ekstrom et al., 1997;Passier et al., 1997;Ritzwoller and Levshin, 1998;Boschi and Dziewonski, 1999; van Heijst and Woodhouse, 1999;Simons et al., 1999;Boschi and Ekstrom, 2002;Lin et al., 2008;Yao et al., 2010;Bussat 35 and Kugler, 2011;Kästle et al., 2018). However, some researchers have employed SWT not assuming the great-circle propagation of surface waves (Spetzler et al., 2002;Yoshizawa and Kennet, 2004;Lin et al., 2009). SWT has been used in seismological studies for decades and different SWT approaches have been compared by seismologists. For instance, Laske (1995) studied deviations from straight line in the propagation of long-period surface waves and concluded that they usually have small effects on the propagation phase. Spetzler et al. (2001) applied both straight-ray and curved-ray SWT methods. 40 They computed the maximum deviations of ray paths from straight lines and pointed out that this maximum is typically below the estimated resolution, except for long paths at short periods. Some studies showed that a more complex forward modelling in SWT did not improve the results (Sieminski et al., 2004;Levshin et al., 2005) while other studies reported obtaining better results (Ritzwoller et al., 2002;Yoshizawa and Kennett, 2004;Zhou et al., 2005). Trampert and Spetzler (2006) pointed out that the choice of regularization has a major impact on SWT results. They studied SWT methods based 45 on ray theory (straight-ray and curved-ray) and scattering theory in which the integral along the ray path is replaced by the integral over an influence zone. They showed that these methods are statistically alike and any model from one method can be obtained by the other one by changing the value of the regularization. They concluded that the only option to increase the resolution of the model is to increase and homogenize the data coverage. Bozdag and Trampert (2008) compared straight-ray and curved-ray SWT methods in their study and mentioned that performing ray tracing could be so time-consuming that the 50 potential gain in crustal corrections on a global scale might not be worth the additional computational effort imposed by ray tracing. Despite seismological studies, a comparison between the performance of straight-ray and curved-ray SWT at the near-surface scale is missing.
In near-surface studies, the shot locations can be optimised to ensure that a high coverage of DCs can be achieved. This 55 abundance of information facilitates shallow 2D or 3D characterization with great details. Due to its ability to provide high lateral resolution, SWT has recently attracted the attention of researchers for near-surface studies, where high lateral heterogeneity is expected. Few researchers have used SWT for near-surface characterization assuming straight ray propagation of surface waves: Kugler et al. (2007) characterized shallow-water marine sediments using Scholte waves dispersion data, Picozzi et al. (2009) applied SWT on high-frequency seismic noise data to construct a VS model up to 25 m 60 depth, Rector et al. (2015) employed SWT to obtain a VS model in a mining site, Ikeda and Tsuji (2020) successfully applied SWT in laterally heterogeneous media, Papadopoulou (2021) showed the applicability of SWT in near-surface characterization in a mining site consisting of hard rocks, Khosro Anjom (2021) constructed a 3D VS model applying SWT on a large 3D dataset acquired for testing purposes in a mining area.
Since the level of complexity and lateral heterogeneity in the near-surface is expected to be higher than in most 65 seismological studies, the straight ray approximation of surface waves may not be valid and curved-ray tomography should be used by means of ray tracing at each frequency. Fang et al. (2015) applied SWT on a shallow crustal study considering the effect of heterogeneity on wave propagation. They performed surface wave ray tracing at each frequency using a fastmarching method (Rawlinson and Sambridge, 2004). Wu et al. (2018) applied curved-ray SWT to obtain a shallow VS model at a mining site. Barone et al. (2021) applied different tomography methods, including eikonal tomography, to 3D 70 active seismic data.
In curved-ray SWT, at each iteration a ray tracing method is applied at each frequency component to compute the ray path between the source and receivers. Even though this can increase the accuracy of the final model, it will lead to higher computational cost compared with straight-ray SWT. The computational efficiency is of great importance in seismic nearsurface since, compared to seismological studies, the abundance of data at active seismic near-surface projects can increase 75 the computational cost significantly. Therefore, it is necessary investigate the gained improvement together with the associated additional computational cost from the curved-ray SWT over the straight-ray approach.
We apply straight-ray and curved-ray SWT on four datasets. Two examples include 3D synthetic models containing lateral velocity heterogeneity. We then apply SWT on two field datasets to evaluate the method on real data. For each dataset, 3D VS models from straight-and curved-ray SWT are obtained by direct inversion of DCs, and the accuracy and computational 80 efficiency of the two approaches are compared.

Method
In this section, the applied methodology is described. Besides explaining the straight-ray and curved-ray SWT approaches and the differences between them, we also describe the employed procedure to optimise source positions and the process to estimate the DCs from the raw data. 85

Optimisation of source layout
For a given (random or regular) array configuration, we can optimise the locations of shots to ensure having high coverage DCs with minimum number of shots based on the guidelines by Da Col et al. (2020). In this approach, many shot positions are defined as the potential shot candidates. For each shot, we find all receiver pairs aligned with the shot. After computing all the possible receiver pairs for all the defined shot positions, the shots are sorted based on the number of in-line receiver 90 pairs. Then we pick the shots which could provide the greatest number of unique pairs (i.e., potential DCs). If the data coverage is satisfactory also from the azimuthal point of view, we consider the selected shots as the final ones. Otherwise, more shots are added to increase the data coverage. We also use a dataset (case study 4) where the acquisition layout mimics at a smaller scale the classical seismic exploration 95 3D cross-spread acquisition scheme with orthogonal lines of sources and receivers. This dataset, not being optimised (for a SWT study) will help analysing the criticalities introduced by a non-optimal acquisition scheme.

Estimation of DCs
Once the acquisition layout is finalised, the DCs are estimated from the acquired data. We use a MATLAB code (Papadopoulou, 2021) that automatically retrieves the DCs between each receiver pair that are collinear with a source. 100 Here, we provide the general concepts based on which the code estimated the DCs from the raw data (for detail see Papadopoulou, 2021). For each receiver pair, a frequency-domain narrow band-pass Gaussian filter, which was originally proposed by Dziewonski and Hales (1972), is used to analyse the traces into monochromatic components. The traces are then cross-correlated frequency by frequency to produce the cross-correlation matrix. The phase velocities of surface waves correspond to the maxima on the cross-correlation matrix, but there are many maxima because in the two-station method the 105 observed phase delay is invariant under 2 translation (Magrini et al., 2022). Hence, to avoid ambiguity in picking the correct maxima, a reference DC is used. The reference DC is estimated automatically using multichannel analysis method (Park et al., 1998) for the positions near to the receiver pair. The code picks all candidate DCs on each cross-correlation matrix. Then, the candidate that is closest to the reference DC at all frequencies is picked. Afterward, a set of QC processes allow to reject data points that do not follow the smooth trend of the DC and also to remove poor quality DCs. 110 For the frequency band of the generic i th estimated DC, we put the corresponding phase velocity values into a vector ( i V ).
The input data for the inversion is a vector (

1D forward modelling
We carry out our experiments in a Cartesian coordinate system. The subsurface is discretized into a set of 3D grid blocks where it is assumed that the only unknown parameter of each grid block is the VS value while Poisson ratio (ν) and density (ρ) are assumed to be known as a priori information. Several 1D model points m are defined and for each one, the DC is 120 computed using a Haskell (1953) and Thomson (1950) forward model modified by Dunkin (1965). For the k th model point k m , the local phase velocity for a given frequency   k Vf is determined as: (3)

Computation of forward response
To obtain the forward response in the curved-ray SWT, first the ray path between the generic receiver pair R1-R2 for each 125 frequency of the DC should be computed. Having computed the phase velocities as a function frequency at the position of each model point, the 2D phase slowness maps at each frequency are built. Then, ray tracing (Noble et al., 2014) is performed at every built phase velocity map to compute the frequency dependent ray paths between the receiver pair ( 12 RR l ).
To evaluate the accuracy of the ray-tracing algorithm, we have applied it in a to a homogeneous media and noticed that the error (i.e., deviation from straight-line) in this condition is almost zero (not shown here). The frequency dependent ray path 130 between the receiver pair is discretized to many points. At the generic i th discretised point along the path, the phase slowness ( i p ) is computed using a bilinear interpolation among the computed 1D local forward models at the four surrounding model points as: where x and y show the position of each point. Figure 1 shows a schematic representation of the bilinear interpolation along 135 the discretised path between the receiver pair. 6 The corresponding phase velocity (   12 RR Vf ) is equal to the inverse of the computed phase slowness: We obtain the vector of the simulated DC for the receiver pair (R1-R2) by: 145 where 1 , , n ff represent the frequency components of the corresponding DC. It should be noted that each estimated DC may have a frequency band different from the others and therefore, the lengths of DCs are not necessarily the same. The

Inversion algorithm
The employed inversion algorithm is based on the method proposed by Boiero (2009). We solve the inverse problem aiming at minimising the misfit function (  ) which is defined as: where m shows the vector of the model parameters, obs d is the observed data,   fw m represents the forward response of the model that is computed from Eq. (8). The spatial regularization matrix p R contains values of 1 and -1 for the constrained parameters and zeros elsewhere (see Auken and Christiansen, 2004, for details) and the extent of variation of each model parameter with respect to its neighbouring cells is controlled by the covariance matrix P R C . To reduce the impact of spatial 160 regularization on the inversion results, in all four examples in this study, a large value (10 6 ) is assigned to P R C . It means that the VS difference between the neighbouring cells is constrained to 1000 m/s. The matrix obs C consists of the uncertainties of the observed data and is obtained as: where V σ is computed using Eq.
(2). The defined misfit function (Eq. (9)) is minimised iteratively. At the n th iteration, the 165 current model n m is updated as (Boiero, 2009): where G is the sensitivity matrix of the data and  represents the damping factor (see Marquardt, 1963 To estimate the DCs from raw data, we have used the auto-picking code (Papadopoulou, 2021) in which the DCs are sampled uniformly in frequency. This means that each DC is non-uniformly sampled in terms of wavelength which can drive the inversion algorithms to the shallowest part of the subsurface without any significant updates in the deeper portion of the initial velocity model (Khosro Anjom and Socco, 2019). To address this issue, a wavelength-based weighting scheme was 175 applied in the inversion process to compensate for this non-uniformity (see Khosro Anjom et al., 2021, for details). Hence, the obs C is modified as: where , ij  is the standard deviation of the i th data point of the j th DC, and , ij w is the corresponding weight that is computed as: 180 where , ij   represents the wavelength difference between the data point i of the j th DC and the data point with the smallest wavelength distance from i, and ,max j   is the maximum computed wavelength difference for the j th DC.

Results
In this section, we apply straight-ray and curved-ray SWT approaches on four datasets and compare the results. For each 185 example, the straight-ray and curved-ray SWT inversions start from the same initial model. Other inversion parameters ( P R C and  ) are also the same for the sake of comparison. It should be noted that only VS values are updated during the 8 inversion and the other parameters (h, ν, and ρ) are fixed. In case of the synthetic examples, the true values of ν and ρ are used in the inversion. For the field examples, ν and ρ are approximated based on the available a priori information. Having erroneous values of ν and ρ can induce errors in the inversion results even though the sensitivity of surface waves to VS is 190 more than ν (and way more than ρ).

Case study 1: the Blocky model
The Blocky model consists of a sequence of layers with vertically increasing velocity values, surrounding two blocks of velocity anomalies which extend 4 m in horizontal and vertical directions (Fig. 2). The receivers are located in a regular grid with 1 m spacing in an area of 20 m  20 m (Fig. 2a). Sixteen shots (Fig. 2a) were chosen to generate the raw data after 195 optimising the source positions (explained in Section 2.1). The synthetic data were generated using a 3D-finite difference (FD) code (SOFI3D software described in Bohlen, 2002) and no error has been added to the data. The code is an FD modelling program based on the FD approach described by Virieux (1986) and Levander (1988)    The estimated 971 DCs are shown in Fig. 3a.   Figure 4 shows that straight-and curved-ray SWT have modelled the location and the value of the high-velocity anomaly quite accurately. The model from the curved-ray method is slightly superior at the grid blocks surrounding the high-velocity box (the black arrows in Fig. 4). In case of low-velocity anomaly, curved-ray SWT has provided better results, since the bottom half of the low velocity block is better resolved by the curved-ray approach (the white arrows in Fig. 4b and e). As 230 shown in Table 4, the curved-ray approach has produced slightly lower model misfit than straight-ray SWT.

Case study 2: the Sand Bar model
The Sand Bar model is designed to simulate a saturated environment where a sand layer is buried in unconsolidated clays.
The model contains a curved-shape high-velocity anomaly (the sand layer) embedded between two low-velocity clay layers (Fig. 5). The geophysical parameters of the Sand Bar model are shown in Table 2. The receivers are distributed at the surface 235 as a regular grid with 2 m spacing (Fig. 5a). We defined sources at the same location of receivers and for each source, we computed the aligned receiver pairs. Then, we picked 13 shots (Fig. 5a) which provided the highest data coverage to generate the synthetic data. The same finite difference code used for the Blocky model was used to obtain the Sand Bar synthetic dataset and no error was added to the synthetic data. The source is a function of Ricker wavelet with dominant frequency of 40 Hz, and the minimum element size of the mesh grid was set to 0.1 m to prevent numerical dispersion. The 240 time stepping was equal to 1.0e-5 to avoid simulation instability.   We can see in Fig. 7 that the VS models from both approaches are like each other. Figure 7 shows that both methods have successfully located the high-velocity anomaly. Not only the velocity values are close to the true model ( Fig. 5b to d), also the shape of the anomaly has been retrieved clearly. The vertical slices at X (Fig. 7a and d) and Y directions ( Fig. 7b and e) do not display significant differences. The areas marked in dashed black in the horizontal slices ( Fig. 7c and f) shows that the 260 boundaries of the anomaly are slightly clearer in the curved-ray approach (Fig. 7f) and also the VS values in these areas are closer to the true VS value (Fig. 5d).

Case study 3: Pijnacker field
The data were acquired in a field near Pijnacker, South Holland, Netherlands (Fig. 8a). An area of 27 m×30 m was investigated by 120 geophones and 59 shot locations (Fig. 8b). The shot positions were optimised (Section 2.1) for the 265 15 locations inside the array area. Fifteen shot locations which provided the highest coverage were chosen as the optimised shot positions. Moreover, 44 shot locations were chosen outside the acquisition area, each one at 3 m distance from every geophone located at the outer boundary of the acquisition area. The source was a vibrator that emitted a linear sweep signal from 2 to 100 Hz for 5 seconds at a force level of 1150 N. Some of the available shallow well data close to the field are depicted in Fig. 8c. They show that the field mainly consists of clay, together with peat and possibly sand in some locations. 270 The extracted972 DCs from the data are displayed in Fig. 8d.

275
Each inversion block extends 3 m horizontally. In this case, the initial model contains 6 layers where the layers get thicker with depth. The first two layers are 1 m thick, following by two layers of 2 m and two of 3 m. The initial model is defined regardless of the well information. The well data are used later to assess the inversion results. The inversion started from an initial VS value of 60 m s -1 . Since the medium was (almost) saturated, a high ν value (0.45) was chosen for the initial model.
The ρ values in the medium were assumed to be low (1700 kg m -3 ) because it consisted of unconsolidated materials. The 280 16 values of the initial model parameters are reported in Table 3. The values of the misfit function at different iterations are displayed in Fig. 9, and the SWT inversion results for straight-and curved-ray methods are depicted in Fig. 10.

290
We can see in Fig. 10 that also in this case the VS models from the straight-and curved-ray SWT are similar. The difference between the horizontal slices ( Fig. 10a and d) are clearer than the vertical ones. As shown in dashed black, we can see that the cells around the high velocity portion have lower VS values in the curved-ray (Fig. 10d) than straight-ray (Fig. 10a). A previous 2D full waveform study (Bharadwaj et al., 2015) on a clay-field which was not very far from the field location of our study, obtained a VS model in range of 40-80 m s -1 up to 15 m depth. This is in agreement with the inversion results 295 shown in Fig. 10. The high velocity portions relate to the sand. It can be seen in the vertical slices in Fig. 10 that the depth of high-velocity part (sand layer) is mainly in range of 2-9 m which seems reasonable given the a priori well data (Fig. 8c).

Case study 4: CNR field
The field data were acquired at National Research Council (CNR) headquarter in Turin, Italy ( Fig. 11a and b). The site consists of compacted sand and gravel formations surrounding an artificial loose sand body. The sand body occupies an area 300 of 5 m  5 m at the surface and the maximum depth reaches 2.5 m. The receiver layout consists of 4 lines which are 2.5 m apart. Each acquisition line includes 18 vertical 4.5 Hz geophones with 0.5 m spacing. The used source was an 8 kg hammer.
The acquisition was carried out in 83 shot locations. Figure 11c shows the acquisition setup which mimics a typical exploration 3D cross-spread acquisition scheme at a smaller scale. The estimated 315 DCs are shown in Fig. 11d.  Fig. 12. Both straight-and curved-ray SWT inversion started from the same initial model and the results are presented in Fig. 13.  Figure 13 shows that the difference between straight-and curved-ray models are more pronounced in this example. There are 320 some cells with relatively high velocity values inside the sand body in the model obtained from the straight-ray (Fig. 13a).
The boundaries of the loose sand body at the surface are better retrieved by the curved-ray SWT (Fig. 10d). The area shown in dashed white in Fig. 13b and e shows that the gradual increase of the VS values with depth inside the sand body is clearer in the model from the curved-ray (Fig. 13e). The black arrow in Fig. 13c shows the high velocity cells inside the loose sand body in the retrieved model from the straight-ray SWT. The reason is that in this area (close to the interface of the sand body 325 and the background medium) the assumed paths in the straight-ray approach are much shorter than the true paths and therefore, the obtained VS from the inversion becomes unrealistically high. This artefact does not exist in the corresponding slice from the curved-ray SWT (Fig. 13f).

Discussion
We have applied straight-ray and curved-ray SWT on four datasets and compared the results. In this section, we investigate 330 the results in more details considering ray paths, data weighting, models and data misfits, and computational cost.

Ray paths
The improvement of the model obtained by the curved-ray SWT with respect to straight-ray SWT, particularly at the boundaries of velocity anomalies has been shown in the synthetic and real world examples. Some selected examples of the computed ray paths at the last iteration of the curved-ray SWT are depicted in Fig. 14. 335 In all the three models in Fig. 14 the receivers A and B are located outside the velocity anomalies, and we see that the 340 computed ray paths between them do not cross the anomalies. Therefore, the obtained paths do not deviate considerably from straight lines. In Fig. 14a, the high-velocity anomaly exists at the depth range of 3-6 m. Hence, we can see that the high-frequency components of the DCs, which correspond to the shallow parts of the model, do not deviate from straight 22 lines. But the lower frequencies (i.e., higher wavelengths) for the C-D and G-H pairs have deviated from straight lines and travelled through the high-velocity parts. In Fig. 14b, the depth of velocity anomalies is in range of 2 to 6 m. We see that also 345 in this case the ray paths for higher frequencies have almost no deviations from straight lines since they do not cross the anomalies. However, we can see for the obtained paths between B-C and D-E pairs that the lower frequencies have bypassed the low-velocity anomaly. Similarly, lower-frequencies in case of the G-F pair have deviated from straight paths and travelled through the high-velocity anomaly. In Fig. 14c, the sand body (low-velocity anomaly) starts at the surface and reaches to maximum of 2.5 m depth. Its area shrinks from 5 m × 5 m at the surface to 3 m × 3 m at 2.5 m depth. The 350 shrinkage in size of the anomaly can be seen in the computed path for the B-G, C-D, and E-F pairs, where the degree of the deviation from the straight line decreases as the depth increases (frequency decreases).
Even though the exact boundaries of the anomaly (sand layer) are unknown for the Pijnacker field, the computed ray paths can provide helpful insights. For instance, the computed ray paths from straight-and curved-ray SWT, for the DCs data with the wavelengths in range of 6-9 m are displayed in Fig. 15. 355 Figure 15. The computed ray paths for the data points with wavelengths in range of 6-9 m for the Pijnacker field dataset from: (a) the last iteration of the straight-ray, (b) the last iteration of curved-ray SWT inversion. The colours of ray paths correspond to the computed pathaveraged phase velocity along the path.
Since the initial VS model is vertically and horizontally homogeneous, the initial ray paths for both straight-and curved-ray 360 SWT are straight lines. As shown in Fig. 15a, the ray paths do not change during the inversion in the straight-ray approach.
However, the paths are updated at every iteration of the curved-ray SWT inversion. We see that some areas in Fig. 15b (shown in dashed red) are bypassed by almost all the rays even though the data coverage of these areas in the straight-ray approach (Fig. 15a) is considerably high. Therefore, these portions correspond to the low velocity materials, i.e., clay and peat. The area between these low-velocity portions has both higher concentration of ray paths and higher average phase 365 velocity values. Therefore, they probably show the sand layer. These locations agree with the obtained VS model from the curved-ray SWT inversion (Fig. 10d).

Misfits and computational costs
We have shown the inversion results from the straight-ray and curved-ray SWT. In this part, we compare the results quantitatively. We carried out the inversions on 40 cores on a cluster with the processor type of Intel® Xeon® E5-2650 v3. 370 In Table 4, we report the number of iterations (ni), the running time (rt), the maximum memory consumption during the inversion process (memmax), the relative data misfit at the last iteration of the inversion (ed), the relative model misfit for all cells (em) and only the ones in the depth range of the target (et), and the cost of inversion to be run at Microsoft cloud service.
We define the relative data misfit ( For each parameter in Table 4, the last column shows the relative difference between the curved-ray and straight-ray show the value of each parameter from curved-ray and straight-ray approaches, respectively, and J is the number of examples for which there is an existing value for that parameter. We can see in Table 4 that in all examples except for the Blocky model, curved-ray SWT has converged in less iterations than the straight-ray. However, the curved-ray SWT has increased rt by 25 % compared to straight-ray. For all the case studies, curved-ray SWT needed more memory (11 % by average) than straight-ray. In terms of data misfit (ed), straight-ray approach has provided better performance than the curved ray. We can also see that the difference between the obtained ed values from the straight-and curved-ray SWT is negligible in case of the synthetic examples (the Sand bar and Blocky 390 models), but the difference is more pronounced for the field examples (the CNR and Pijnacker). Despite having higher ed, the curved-ray approach has produced lower model misfits than straight-ray. Using curved-ray SWT has decreased the overall model misfit (em) and the target model misfit (et) by 2 % and 4 %, respectively. Finally, we see in Table 4 that using curved-ray SWT has increased the computational cost by an average of 25 %.

Impact of the data coverage 395
In all the presented examples, the computed VS models from the straight-ray and curved-ray SWT do not differ significantly except for the CNR field. Since the source positions had not been optimised in this example, the DCs coverage is low, particularly in the shallower portion of the medium. Figure 14 depicts the ray paths of the DCs data with the wavelengths in range of 0-1 m. Figure 16. The ray paths for the data points of CNR example with wavelengths in range of 0-1 m from: (a) the last iteration of the straightray, (b) the last iteration of curved-ray SWT inversion. The boundaries of the sand body at the surface are superimposed in blue. The colours of ray paths correspond to the computed path-averaged phase velocity along the path.
We can see in Fig. 16a that some areas of the medium are not covered with straight-rays, especially outside the sand body. It should be noted that for both cases, the ray paths at the first iteration are straight lines since the initial model has a constant 405 VS value for all cells. However, in the curved-ray approach (Fig. 16b) the ray paths are flexible and can adjust to the updated subsurface velocity during the inversion process. Figure 16b also shows that the ray paths have been responded properly to the edges of the loose sand body, travelling through the faster part of the model.

Weighting effect
As mentioned previously, uneven sampling of DC data in terms of wavelength can be problematic in SWT inversion. For 410 instance, most of the extracted DC data (81 %) of the Pijnacker field have wavelengths less than 3 m while the available well data from the area suggest that the depth of the target is expected to vary in range of 2-7 m. This can be a serious problem since the inversion might reach the defined stopping criteria without any significant updates in the deeper portion of the initial velocity model. Fig. 17 shows the obtained VS models with ( Fig. 17a and b) and without ( Fig. 17c and d) the wavelength-based weighting at 5.5 m depth. 415 Figure 17. The impact of weighting on the SWT inversion results for the Pijnacker dataset. The computed VS model at the depth of 5.5 m from: (a) weighted straight-ray, (b) weighted curved-ray, (c) non-weighted straight-ray, (d) non-weighted curved-ray SWT.
In Fig. 17, we can see the improvement of the model after applying the wavelength-based weighting method ( Fig. 17a and b) compared with the non-weighted model ( Fig. 17c and d) where the non-weighted inversions have barely retrieved any 420 pattern to model the target (sand layer).

Comparison with previous studies
We have shown that optimisation of source positions can provide higher data coverage than a typical 3D cross-spread acquisition scheme. We have evaluated straight-ray and curved-ray at the near-surface scale and comparing our results might not be necessarily agree with previous (global or regional scale) seismological studies. For instance, Spetzler et al. (2001) 425 pointed out that the maximum deviations of ray paths from straight lines is mostly below the estimated resolution. However, we have shown that the deviation from straight lines can be resolved at the near-surface scale (Fig. 14). We also showed that in case of low data coverage, using curved-ray approach significantly improves the obtained VS model. This result might not agree completely with the result of the (seismological) study by Trampert and Spetzler (2006) where they pointed out that increasing the data coverage is the main factor to increase the resolution of the model. Our results show that in case of high 430 data coverage, the gained model improvement in curved-ray approach may not worth the additional computational effort, which agrees with the result of the study by Bozdag and Trampert (2008).

Conclusions
We have applied SWT to four datasets and built near-surface VS models. We have compared the obtained results from the straight-ray and curved-ray SWT in terms of data misfit, model misfit, and computational cost. We showed that compared to 435 the straight-ray approach, using curved-ray SWT improves the accuracy of the computed VS model. We illustrated that the acquisition layout can play an important role in the obtained data coverage and consequently in the inversion results. We showed that the classical cross-spread acquisition layout (which was used in the CNR example) may not provide high DC coverage. In this case, the improvement of inversion results from curved-ray SWT can be significant. We also showed that in case of high data coverage, which can be achieved by optimisation of source positions, the difference between the obtained 440 VS models from the straight-ray and curved-ray can be very small even in the presence of high lateral variation of the velocity.
Code and data availability. The data may be available by contacting the corresponding author. The code may also be available by contacting the corresponding author after carrying out some additional work to make it user-friendly.

445
Author contributions. MK worked on the processing and inversion of all the datasets, with coordination and supervision of ES and LVS. MK and ES contributed to seismic data acquisition in Pijancker field and LVS coordinated the seismic data acquisition in the CNR site. MK wrote the original paper draft, with contribution from all the authors.
Competing interests. The contact author declares that neither him nor his co-authors have any competing interests. 450