Anisotropic P-wave traveltime tomography

. We present the implementation of Thomsen's weak anisotropy approximation for VTI media within TOMO3D, our code for 2-D and 3-D joint refraction and reflection traveltime tomographic inversion. In addition to the inversion of seismic P-wave velocity and reflector depth, the code can now retrieve models of the Thomsen's parameters δ and ε . Here we test this new implementation following four different strategies on a canonical synthetic experiment in ideal conditions with 10 the purpose of estimating the maximum capabilities and potential weak points of our modeling tool and strategies. First, we study the sensitivity of traveltimes to the presence of a 25 % anomaly in each of the parameters. Next, we invert for two combinations of parameters, ( v, δ, ε ) and ( v, δ, v ⊥ ), following two inversion strategies, simultaneous and sequential, and compare the results to study their performances and discuss their advantages and disadvantages. Simultaneous inversion is the preferred strategy and the parameter combination ( v, δ, ε ) produces the best overall results. The only advantage of the 15 parameter combination ( v, δ, v ⊥ ) is a better recovery of the magnitude of v . In each case we derive the fourth parameter from the equation relating ε , v ⊥ and v . Recovery of v , ε and v ⊥ is satisfactory whereas δ proves to be impossible to recover even in the most favorable scenario. However, this does not hinder the recovery of the other parameters, and we show that it is still possible to obtain a rough approximation of δ distribution in the medium by sampling a reasonable range of homogeneous initial δ models and averaging the final δ models that are satisfactory in terms of data fit.


Introduction
An isotropic velocity field is the rare exception in the Earth subsurface.Anisotropy is a multiscale phenomenon, and its causes are diverse.In the crust, it can be produced by the preferred orientation of mineral grains or their crystal axes (Schulte-Pelkum and Mahan, 2014;Almqvist and Mainprice, 2017), the alignment of cracks and fracture networks and the presence of fluids (Crampin, 1981;Maultzsch et al., 2003;Yousef and Angus, 2016), or the bedding of layers much thinner than the wavelength used to explore them (Backus, 1962;Johnston and Christensen, 1995;Sayers, 2005).In the mantle, anisotropy is related to the alignment of olivine crystals due to mantle flow (Nicolas and Christensen, 1987;Montagner et al., 2007), aligned melt inclusions (Holtzman et al., 2003;Kendall et al., 2005), large-scale deformation (Vinnik et al., 1992;Vauchez et al., 2000), and preexisting lithospheric fabric (Kendall et al., 2006), among others.Anisotropy has proven an informative physical property in the understanding of the Earth's interior (Ismaïl and Mainprice, 1998;Long and Becker, 2010), most particularly in continental rifts (Eilon et al., 2016), mid-ocean ridges (Dunn et al., 2001), and subduction zones (Long and Silver, 2008).
The theory of anisotropic wave propagation has been described in several publications (Kraut, 1963;Babuska and Cara, 1991).Numerous formulations of varying complexity have been proposed to approximate anisotropy depending on its magnitude and the symmetry conditions of the medium (Nye, 1957).Twenty-one elastic stiffness parameters define the most general anisotropic medium with the lowest symmetry conditions, whereas for the highest symmetry not equivalent to isotropy, only five parameters are needed (Almqvist and Mainprice, 2017).Regarding the strength of anisotropy, in view of the overall success of isotropic methods in studying the Earth's subsurface, and of the experimental evidence and sample measurements available, it is admitted that the anisotropy is generally weak (Thomsen, 1986).Specifically, anisotropy is considered weak when Thomsen's parameters are much smaller than 1, i.e. for a ~20 % or smaller velocity variation with angle.Precisely Thomsen (1986) presented the formulation for the transverse isotropy symmetry on weakly anisotropic media, which is the reference that we follow in this work.Thomsen's parameters are by far the most common and convenient combinations of stiffness tensor elements used in seismic anisotropy modeling (Tsvankin, 1996;Thomsen and Anderson, 2015).Applications of simpler approximations exist, as is the case of elliptical symmetry (Song et al., 1998;Giroux and Gloaguen, 2012), as well as others that assume the most general anisotropic model (Zhou and Greenhalgh, 2008).
The objectives of this work are (1) presenting the anisotropic version of TOMO3D (Meléndez et al., 2015) for the study of VTI weakly-anisotropic media in terms of the Thomsen's parameters δ and ε using P-wave arrival times, and (2) comparing several parametrizations and inversion strategies under optimal and equal conditions for all parameters with the purpose of defining an upper limit to the code's capabilities, an ideal but generalizable estimation of the code's performance, as well as highlighting its potential weaknesses.Moreover, the development of this code is motivated by the need to combine wideangle and near-vertical traveltime picks in field data applications, as we plan to do with the trench-parallel 2-D profile in Sallarès et al. (2013), which is affected by a ~15 % anisotropy judging from the mismatch in the interplate boundary locations obtained separately from near-vertical and wide-angle data.The P-wave velocity, δ and ε models obtained from the modeling of field data would be useful and geologically informative by themselves, but they would also serve as initial models in anisotropic FWI.
In the following section, we describe the anisotropy formulation and the modifications implemented on our 3-D joint refraction and reflection traveltime tomography code TOMO3D to incorporate the inversion of Thomsen's parameters.Next, in the third section, we present the synthetic tests performed and their results, including accuracy and sensitivity analyses and synthetic inversions.In section four, these results are discussed and interpreted in terms of the ability of the code to retrieve both the velocity field and Thomsen's anisotropy parameters.Finally, in the last section we summarize the main conclusions of this work.

Modeling anisotropy
The first part of this section is a general overview of the treatment of anisotropy within the field of seismic inversion, while the second one describes the implementation of Thomsen's weak VTI anisotropy formulation in the 3-D joint refraction and reflection traveltime tormography code TOMO3D.

Anisotropy in seismic inversion methods
Anisotropy was first incorporated to seismic inversion methods in traveltime tomography with the development of the linearized perturbation theory (Cervený, 1982;Cervený and Jech, 1982;Jech and Psencik, 1989).Previously, the approach to deal with anisotropy was to approximately remove its estimated effect to then apply an isotropic method (e.g.McCann et al., 1989).Linearized perturbation theory was first implemented in anisotropic traveltime tomography by Chapman and Pratt (1992) and Pratt and Chapman (1992) assuming the weak anisotropy approximation, which allowed them to use isotropic ray tracing and approximate anisotropy effects as being caused by small perturbations of the isotropic system.The initial development of anisotropic ray tracing is attributed to Cervený (1972).Methods for anisotropic ray tracing and traveltime computation depend on the symmetry assumptions made regarding the medium.The most common of those is rotational symmetry around a vertical pole.This formulation is known as vertical transverse isotropy (VTI), also polar anisotropy (e.g.Rüger and Alkhalifah, 1996;Alkhalifah, 2002), and it is the simplest geologically applicable case: it reproduces the symmetry exhibited by minerals in sedimentary rocks, and that produced by parallel cracks or fine layering.Furthermore, it significantly simplifies the mathematical formulae since anisotropy is defined by only five parameters, which contributes to a greater computational efficiency.The generalization of VTI to a tilted symmetry axis is the so called tilted transverse isotropy (TTI).Some authors argue that it is not possible to distinguish TTI from VTI in real experimental cases without a priori information (Bakulin et al., 2009).Assuming the most general anisotropic media has also become rather usual, in particular with the improvement of computational resources, allowing for a more detailed and complex reconstruction of the subsurface physical properties (e.g.Zhou and Greenhalgh, 2005) although successful field data applications are yet to be achieved to the best of our knowledge.Regarding the inversion process, the main difficulty arises from the trade-off between velocity heterogeneity and anisotropy (e.g.Bezada et al., 2014).To the best of our knowledge, Stewart (1988) was the first to propose an inversion algorithm, specifically for the recovery of Thomsen's parameters in a weakly-anisotropic VTI medium.Other authors have produced inversion algorithms for different formulations such as azimuthal anisotropy (e.g.Eberhart-Phillips and Henderson, 2004;Dunn et al., 2005) or a 3-D TTI medium (e.g.Zhou and Greenhalgh, 2008).
Concerning FWI, anisotropy in active data is typically modeled following Thomsen's parameters and the VTI and/or TTI approximation for the medium.The first anisotropic wave propagators appeared during the 80s and 90s (e.g.Helbig, 1983;Alkhalifah, 1998) and new improvements on this matter continue today (e.g.Fowler et al., 2010;Duveneck and Bakker, 2011).When performing anisotropic full waveform inversion (FWI), both 2-D and 3-D, some authors choose to invert only for the velocity field, fixing the initial anisotropy models throughout the inversion because it simplifies the process (e.g.Prieux et al., 2011;Warner et al., 2013).However, other works have explored the feasibility of multiparameter inversions, that is, using different combinations of velocity and anisotropy parameters and of inversion strategies (e.g.Gholami et al., 2013a,b;Alkhalifah and Plessix, 2014).

Anisotropy in TOMO3D: Thomsen's weak VTI anisotropy formulation
We adapted TOMO3D (Meléndez et al., 2015) to perform anisotropic ray tracing and traveltime calculations, as well as inversion of Thomsen's parameters for P-wave data.In TOMO3D, the forward problem solver is parallelized to simultaneously trace rays for multiple sources and receivers, and it uses an hybrid ray tracing algorithm that combines the graph or shortest path method (Moser, 1991) and the bending refinement method (Moser et al., 1992).The inverse problem is solved sequentially using the LSQR algorithm (Paige and Saunders, 1982).Velocity models are discretized as 3-D orthogonal and vertically-sheared grids that can account for topography and/or bathymetry.Velocity values are assigned to the grid nodes, and the velocity field is built by trilinear interpolation within each cell.Apart from first-arrival traveltimes, the code allows for the inversion of reflection traveltimes to obtain the geometry of major geological boundaries associated to impedance contrasts that produce strong seismic energy reflections in the data recordings.Such reflecting interfaces are modeled as 2-D grids independent of the velocity grid.The code is also prepared to extract information from the water-layer multiple of refracted and reflected seismic phases (Meléndez et al., 2014).A detailed description of the code can be found in Meléndez (2014).
Our anisotropy formulation is based on Thomsen (1986) and specifically in the following weakly-anisotropic velocity equation for the P-wave velocity: (1) where v a is the anisotropic velocity, v is the velocity along the symmetry axis (α 0 in Thomsen,1986), θ is the angle with respect to the symmetry axis, and δ and ε are Thomsen's parameters controlling the anisotropic P-wave propagation.
Studying the cases of θ=0 and θ=π/2 the meaning of ε becomes clear: it is the relative difference between the velocities along and across the symmetry axis, that we refer to as parallel and perpendicular velocities, respectively.
(2) According to Thomsen (1986) the meaning of δ is far from intuitive but the author states that it is associated to the nearvertical anisotropic response and shows that it relates v and the normal move-out velocity (V NMO ).V NMO models are a byproduct of multichannel seismic reflection data processing, a mathematical construct that involves the assumptions of a stratified media with constant velocity layers and of small spread, i.e. near-vertical propagation.It does not seem wise to try estimating it by other means, less so if the data and modeling used do not necessarily fulfill the assumptions for the normal move-out correction that define V NMO .Moreover, we want to combine traveltimes from as many types of seismic data sets as possible, notably from multichannel reflection (near-vertical propagation) and wide-angle (sub-horizontal propagation) experiments, in order to have the best polar coverage, and with that, the best recovery of v and ε (or v ⊥ ).Thus, we do not consider V NMO to be a useful parameter in describing the general anisotropic VTI media for our modeling method, and we did not implement parametrizations (v, V NMO , ε) and (v, V NMO , v ⊥ ).We do think however that V NMO models can help in the building of initial δ models, just as the comparison between near-vertical propagation and sub-horizontal data can provide an initial estimation of ε.In conclusion, we only considered Eq. ( 2), and we implemented two parametrizations of the medium: (v, δ, ε) and (v, δ, v ⊥ ).From here on, for simplicity, we will refer to them as P[ε] and P[v ⊥ ] respectively.
The linearized inverse problem matrix equation including anisotropy parameters for a refraction-only case is as follows (3) Smoothing (L) and damping (D) constraints for δ and ε parameters follow the same formulation described in Meléndez (2014) for velocity parameters.The kernels (G) have been modified to account for anisotropy.The linearized and discretized equation that relates the traveltime residual of the n-th refracted pick to changes in the model parameters is written as (4) In each of the three terms, the first summation corresponds to the model cells that are illuminated by the n-th ray path.A cell is considered illuminated if it contains a ray path segment.The second summation is over the eight nodes of each of those cells.In the third term, εis replaced by v ⊥ when using this alternative parametrization.
is the traveltime residual for the nth refracted pick, u a =1/v a is the anisotropic slowness, u=1/v is the along-axis slowness, Δu i , Δδ j , and Δε k (or ) are the parameter perturbations for each illuminated cell in their respective grids, and the r m factors are the weights that distribute these perturbations among the eight nodes of each illuminated cell according to the trilinear interpolation used to define the four fields (u, δ, ε, and v ⊥ ).
In order to build the kernel matrices we need to compute two partial derivatives for each model parameter.The first order partial derivative of traveltime with respect to the anisotropic slowness is the ray path segment s i within each cell that is covered in a given traveltime at a given slowness From Eq. ( 1) the first order partial derivatives of the anisotropic slowness with respect to the model parameters (u, δ, ε, and (10)

Synthetic tests
We have performed a number of tests using canonical synthetic models made of an anomaly centered in a uniform background with two main objectives: (1) checking that the newly implemented anisotropic traveltime tomography method works properly, and (2) providing a quantitative measure of the potential recovery of anisotropy based on P-wave traveltimes alone.First, we run a sensitivity test to assess the effect that a variation in each model parameter has in the synthetic traveltimes, and we calibrated the code by comparing the synthetic data that it generates to analytically calculated data.Next, we performed a number of synthetic inversion tests considering both possible parametrizations and inversion strategies.
These tests are conducted under ideal and equal conditions for all parameters with the purpose of obtaining an upper-limit but widely-applicable estimation of the code's performance, and detecting any potential weak points.
The models in all these tests are cubes of 5-km-long edge.The background model of all four parameters is set to a constant value, i.e. v, δ, ε and v ⊥ background models are homogeneous.Note that the z axis positive direction points downwards.Grid spacing is 0.125 km for the four parameters in all three dimensions so that differences in model discretization do not influence the test results.The volume of the anomaly is determined by the 3σ region of a 3-D Gaussian function centered in the cube setting 3σ = 0.5 km.The values of v, δ, ε and v ⊥ within this volume are homogeneously increased resulting in a discretized representation of a spherical anomalous body.

Sensitivity
We define the sensitivity of a parameter as the difference between the first arrival traveltimes with and without the anomaly in that parameter.Figure 2 shows synthetic and analytic sensitivities for two selected meridians.We express sensitivity both as normalized and as relative traveltime difference.In its normalized form, traveltime differences are divided by the greatest of these differences among all parameters, i.e. normalized to 1, whereas in its relative form, these differences are given with respect to the traveltime without the anomaly and multiplied by 100.Note that the analytic response does not change between meridians, i.e. given the symmetry of the models and of the anisotropic formulation, sensitivity is independent of the azimuth angle.
The acquisition configuration consists of 482 diametrically-opposed source-receiver pairs.Each receiver records exclusively the first arrival traveltime from its corresponding source for a total of 482 traveltimes (Fig. 1).Sources and receivers are located at 2.5-km of the center of the model, at the locus defined by the surface of the sphere inscribed in the cube, and placed at the crossing points of 32 meridians with 15 parallels, and at each pole.
According to the definition of sensitivity, alternately for v, δ, and ε, the said anomaly was added at the center of the cube representing a 25 % increase on the background value, while the models for the rest of parameters in the parametrization remained homogeneous.Table 1 summarizes the background and anomaly values for all parameters in each sensitivity test, along with their equivalence in the alternative parametrization.Since v is related to ε and v ⊥ through Eq. ( 2) its sensitivity pattern changes depending on the parametrization used (Fig. 2).Indeed, a 25 % increase in v with respect to equivalent background models in P[ε] and P[v ⊥ ] (Table 1) yields different sensitivities because of the different parameters involved (ε or v ⊥ ) in the representation of the medium.Contrarily, δ sensitivity pattern is independent of the parametrization used.ε and v ⊥ sensitivities are only defined in their respective parametrizations.However, a 25 % increase in v ⊥ yields an equivalent ε of 0.45 which is greater than the ~0.2 limit for weak anisotropy approximation.Thus, instead of establishing the comparison with v ⊥ sensitivity based on a proportional anomaly increment and measuring its effect on traveltimes, we based it on an equal traveltime change, i.e. the same change in traveltime requires a change of 25 % in ε but only a ~3.4 % change in v ⊥ (Table 1), indicating that data is ~7 times more sensitive to v ⊥ than to ε changes.
v sensitivity in P[ε] is the highest for all angles, 4.5 % to 5 % (Fig. 2), and it can be shown that expressed in its relative form, the analytic solution is a constant 5 % (see mathematical proof in supplementary material).In its normalized form it follows the same sinusoidal pattern as in P[v ⊥ ] (Fig. 2e), and both have equal maxima in the directions parallel to the symmetry axis.
In both cases minima are found in the directions perpendicular to the symmetry axis, but in P[v ⊥ ] they reach down to 0 whereas in P[ε] the value is ~0.85.As expected from Eq. ( 1), ε sensitivity goes to 0 % in the directions parallel to the symmetry axis and has its maxima (~0.8 % or ~0.15) for the polar angles perpendicular to it.v ⊥ sensitivity would follow the same angular dependence but with maxima of the order of magnitude of v sensitivities.Finally, δ sensitivity is, at its maxima, more than one order of magnitude smaller than v sensitivity, around 0.25 % or 0.05.These sensitivity results indicate that we can generally expect similar recoveries for v and v ⊥ , better than for ε, and that retrieving δ might prove complicated.Keep in mind that these sensitivities for v, δ, and ε are produced by an anomaly that represents a 25 % increase with respect to the background value, and that an anomaly in v ⊥ would produce the same sensitivity pattern as ε with only a ~3.4 % increment.
The differences in synthetic sensitivities between meridians arise from the discretization of the model space in a Cartesian system of coordinates.Such approximation inevitably defines privileged directions for ray tracing, and consequently produces differences in synthetic traveltimes.The mismatch between synthetic and analytic sensitivities (Fig. 2) occurs because the discretization used cannot represent the surface of a perfect sphere.These effects are most notable in the v sensitivity, in P[ε] and to a lesser extent in P[v ⊥ ], precisely because it is the most sensitive parameter, and thus the errors in the representation of a sphere and the existence of privileged directions have a much larger influence on the calculated traveltimes.Figure S1 shows how refining the v model reduces the relative traveltime error (Fig. S1a), and generates a more accurate sensitivity pattern (Fig. S1b).In a real case study one can always refine the grid spacing of a particular parameter to achieve better accuracy, but here we wish to test the performance of the code in the modeling and recovery of each parameter under the same conditions, i.e. equivalent anomalies and identical model discretization.

Accuracy
For the four simulations in the sensitivity analysis, we compared the synthetic traveltimes obtained with our code to the analytic solution to quantify the accuracy of the code's performance.The comparison along the two selected meridians at 0 rad and π/4 rad azimuths is displayed in Fig. 3 expressed as relative traveltime error, i.e. the difference between synthetic and analytic traveltimes relative to the latter in percentage.Table 2 contains the mean of the relative traveltime errors and their respective mean deviations for these four tests and along the two selected meridians, as well as the overall values for each of them.
Comparison of Fig. 2a,b with the values in Table 2 and Fig. 3 indicates that the forward calculation of traveltimes is accurate enough with respect to the traveltime residuals expected for the selected anomalies.Sensitivity is at least 5 times and up to two orders of magnitude greater than traveltime accuracy errors depending on the parameters, with the exception of angles for which sensitivity tends to zero. Figure S2 illustrates that the code is able to reproduce nearly identical accuracies using both parametrizations.Furthermore, given that we are using the same synthetic traveltimes that we used for the sensitivity analysis, this also implies that the sensitivity patterns obtained with the alternative parametrization would be virtually equal to those in Fig. 2.

Inversion results
For the inversion tests, we considered a synthetic medium defined by the anomaly models of all four parameters.Here we refer to these models as target models, and the goal of the inversion is to retrieve the heterogeneity in each of them.These tests are conducted for the two parametrizations of the anisotropic medium described in section 2: P[ε] and P[v ⊥ ].Note that, in order to perform the inversion tests on equivalent cases for both parametrizations, the heterogeneity in v ⊥ is calculated with Eq. ( 2) considering the 25 % anomalies in v and ε, which yields a ~29.3 % anomaly in v ⊥ (Table 3).If not indicated otherwise, we use background models as initial models.Finally, we study the potential recovery of δ because inverting this particular parameter proves notoriously difficult due to its low sensitivity (Fig. 2).
The synthetic data set is made of 114 sources each recorded at 113 receivers for a total 12,882 first arrival traveltimes.For the acquisition geometry, again sources and receivers are located at the surface defined by the sphere inscribed in the cube (Fig. 1).The 114 positions at the surface of this sphere are shared by sources and receivers, and each receiver records all sources, except for the one source located at its same position.
For both parametrizations, we compared two inversion strategies: simultaneously inverting for all parameters and a two-step sequential inversion.First, in Figs 5 and 6 we show the best results for the simultaneous inversion strategy.For each parametrization we derived the fourth parameter applying Eq. ( 2).
Table 4 shows several statistical measures to quantify the quality of these inversion results.As a measure of data fit improvement, we provide the RMS of traveltime residuals for the first and last iterations.As a measure of model recovery or fit, for each parameter, we calculate the mean relative difference for the background area between the inverted model and either the target or the initial one, as they are identical in this area, as well as for the anomaly area comparing the inverted model to both the target and the initial models.In the case of a perfect recovery, mean relative difference for the background area would be 0 %, whereas for the anomaly area, it would be 0 % when using the target model as a reference, and 25 % (~29.3 % for v ⊥ ) when comparing to the initial model.
In an attempt to improve the recovery of δ, we repeated these two tests for different values of the smoothing constraints, but it proved impossible.Correlation lengths tested for all four parameters include 0.25 km (twice the grid spacing), 0.5 km, and 1 km.The weights of the smoothing submatrices for each parameter, λ in Eq. ( 3), were varied between 1 and 100, with intermediate values of 2, 5, 10, 20, 30, and 60.For successful inversions, very similar results for the other parameters were obtained regardless of the final δ model.In other words, the low sensitivity of δ makes it extremely hard, if not impossible, to recover this parameter from traveltime data, but for this same reason it has little or no influence on the recovery of v and ε The two-step sequential inversion strategy was also tested for both parametrizations P[ε] and P[v ⊥ ].For the first step, we tested two options: (a) inverting for v while fixing δ and ε or v ⊥ and (b) fixing only δ.In the second step we used the inverted models from step 1 as initial models and tested three options: (c) inverting for all three parameters, (d) fixing only δ, when following option (a) in step 1, and (e) fixing v and/or ε or v ⊥ , when following option (b) in step 1. Figure 4 shows a flowchart describing all the sequential inversion combinations tested.Again smoothing constraints were varied for similar correlation lengths and submatrix weights as detailed for the simultaneous inversion case.
As indicated in Fig. 4, in the case of P[ε], the best result (Fig. 7) was obtained inverting for v and ε while fixing δ in the first step, and fixing only ε in the second step, whereas for P[v ⊥ ], the best combination for the two-step inversion (Fig. 8) was fixing only δ in step 1, and inverting for the three parameters in step 2. Tables 5 and 6 summarize the statistical quantification of data and model fit for each parametrization.

Modelling δ
Observing that good results for v and ε or v ⊥ are achieved regardless of the result in δ, and knowing that the sensitivity of δ is notably smaller than that of the other parameters, we explored a strategy to have an estimate of this parameter.First, as a reference, we considered an unrealistically optimal scenario in which the real v and ε or v ⊥ models are known to us.Figs 9 and 10 show the resulting δ achieved by repeating inversions in Figs 5 and 6 with v and ε or v ⊥ target models as initial models.Table 7 summarizes the traveltime residuals RMS and the mean relative difference for each parameter in these inversions.Again, these two tests were repeated for ranges of smoothing constraints in all four parameters, as described for the cases in Figs 5 and 6.Table 7 and Figs 9 and 10 correspond to the best results obtained, which indicate that the recovery of δ is, at best, extremely complicated due to the limited sensitivity of traveltime data to changes in this parameter.
Next, we decided to try neglecting δ in Eq. ( 1), and we repeated a number of inversions, such as the ones displayed in Figs 5 and 6, following the equation ( 11) The purpose of these tests was checking whether it was possible to invert v and ε or v ⊥ with data generated following Eq.( 1) using the approximation in Eq. ( 11) given that the influence of δ on the results for other parameters is rather small, that δ cannot be accurately retrieved from traveltime alone, and that it has the smallest sensitivity.To do so, a homogeneous model of δ=0 was fixed throughout the inversions.These tests were unsuccessful, with noticeably poorer results than when considering a dependence on δ (Table 8).However, they were useful in proving that even if a detailed δ model is not necessary to successfully retrieve the other parameters, at least a rough approximation of the δ field is needed to recover the other parameters, e.g. the background δ model that we used as initial model in inversions displayed in Figs 5 and 6.
Finally, we tested whether it would be possible to obtain at least this rough approximation of δ in the medium, valid in the sense that it allows for the successful recovery of the rest of parameters, using any a priori information available such as compilations of anisotropy measurements (e.g.Thomsen, 1986;Almqvist and Mainprice, 2017).Once again we repeated inversions from Figs 5 and 6 (initial δ = 0.16), now using different homogeneous initial models for δ within a range of possible values from 0.1 to 0.24.Table 9 contains  The rough estimate of the δ field could be built, for instance, as the average of the mean δ values for the inverted models in the central subranges defined by the change in magnitude of the final RMS of traveltime residuals.One final inversion could be run using a homogeneous initial δ model with this average value, with the additional option of fixing it and inverting only for the other two parameters.As mentioned in subsection 2.2, potentially more detailed initial δ models could be obtained from the normal move-out correction of near-vertical reflection seismic data.

Discussion
We have tested two parametrizations of the VTI anisotropic media, P[ε] (v, δ, ε) and P[v ⊥ ] (v, δ, v ⊥ ), and two inversion strategies, simultaneously inverting for all three parameters and a two-step sequential process fixing some of the parameters in each step.We consider three criteria for evaluating and comparing the quality of the inversion results obtained following the four possible combinations of strategies and parametrizations: visual inspection of the results, as well as traveltime data and model fits.For both inversion strategies and both parametrizations, the recovery of the parameters is positively correlated with their respective sensitivities; the more sensitive parameters are systematically better recovered.
For the simultaneous inversion, both parametrizations were able to produce acceptable final results (Figs 5 and 6).According to our tests, P[ε] provides the best outcome, specifically because data and model fits (Table 4) are better for this option but, more importantly, because of the difference in the quality of the recovery of ε.However, visual comparison of the recovery of v as well as the v model fit in Table 4 indicate that P[v ⊥ ] yields a slightly better result regarding the magnitude of the anomaly for this parameter, as is particularly evident at its center.In P[ε], given the disparity in sensitivities between v and ε, the former might be prone to accounting for data misfit that actually corresponds to the latter, whilst in P[v ⊥ ] this disparity is less acute.The boundaries of the anomalies for both velocities are still better retrieved with P[ε].Also, the recovery of δ, even though it is far from acceptable, is significantly better in the case of P[ε].This might be explained because the disparity of δ sensitivity is greater with respect to the other two parameters in P In general, sequential inversion is a more complex process that requires more human intervention and fine tuning in each step.In addition, fixing some of the parameters in the first step may result in the inverted parameters artificially accounting for part of the data misfit that is actually related to the fixed ones.This can more easily lead convergence into a local minimum, and it might be impossible to correct this tendency in the second step.For this inversion strategy, it is also P[ε] that produces the best results (Figs 7 and 8).Data fit and the model fit of ε are slightly better than for the simultaneous inversion of this parametrization (Tables 4 and 5), whereas model fits and the visual aspect of both velocities are almost identical to those obtained by simultaneously inverting all three parameters (Figs 5 and 7).Visually, it is difficult to decide whether the recovery of ε is better or not than for the simultaneous inversion (Figs 5 and 7).As for δ, recovery is unsuccessful and artifacts appear in the background area of the model but, according to both its model fit and its visual aspect, it is notably better than for the simultaneous inversion.As in the case of simultaneous inversion, the only advantage of using P[v ⊥ ] instead of P[ε] is that it yields a better recovery of the anomaly magnitude of v (Tables 4 and 6).The results for both velocities are virtually identical to those obtained by simultaneous inversion of this parametrization (Figs 6 and 8).δ and ε are not properly retrieved, but the results are significantly better than for the simultaneous inversion of this same parametrization.
δ has been shown to be by far the most complicated parameter to retrieve because of the low sensitivity of traveltime to its variation (Fig. 2).Even when excellent v and ε or v ⊥ models are available, i.e. the target models for these parameters in our synthetic tests, the recovery of δ is limited at best (Figs 9 and 10 and Table 7).However, and for the same reason, poor recoveries of δ do not affect the recovery of the other two parameters, meaning that a detailed δ model is not necessary to satisfactorily retrieve v and ε or v ⊥ (Figs 5-10).Still, our inversion tests also proved that neglecting δ in Eqs ( 1) and ( 3) is not an option, the accuracy in the recovery of the other parameters resulting severely affected (Table 8).Thus, even if a detailed inversion of δ is, at the very least, hard to achieve, and it is not needed for a successful result in the other parameters, some sort of simple, even homogeneous, initial δ model with a value or values about the average δ in the medium, is necessary for a good recovery of the other parameters.
We showed that given some a priori information on the range of possible δ values in the medium, it should be possible to create the necessary initial δ model.In order to illustrate this, we chose a range of δ values for the initial model and rerun the inversions in Figs 5 and 6.The results indicate that for any homogeneous initial δ model in a certain subrange close to the actual average δ value of the medium, the results for v and ε or v ⊥ are satisfactory and virtually identical to those of the original inversions (Table 9).This subrange is easily defined by looking at the final RMS of traveltime residuals, which experiences a notorious change of one order of magnitude.Any model within this subrange works similarly well as initial δ model.Alternatively, a possibly more robust selection of the constant value for a homogeneous initial δ model might be the mean (or also the median or the mode) of the mean δ values for the inverted models in this subrange.It is worth noting that, whereas for the purpose of this work we used the same discretization for all parameters, in a real case study it would probably be recommendable to use a coarser discretization for δ than for the other parameters, and in general a finer discretization for the more sensitive parameters (Fig. S1).Indeed, a heterogeneity of a given spatial scale and relative variation will produce a greater effect on data for a parameter of greater sensitivity.Thus, for a parameter of greater sensitivity, it will be easier for the code to identify smaller heterogeneities both in scale and variation, which will require a finer grid.

Conclusions
We have successfully implemented and tested a new anisotropic traveltime tomography code.For this implementation we had to modify both the forward problem and the inversion algorithms of the TOMO3D code (Meléndez et al., 2015).The forward problem was adapted to compute the velocities observed by rays considering Eq. ( 1) for the weak VTI anisotropy formulation in Thomsen (1986).The inversion solver was extended to include the δ, ε and v ⊥ kernels in the linearized forward problem matrix equation, as well as smoothing and damping matrices for these parameters defined following the same scheme as for velocity in the isotropic code (Eqs 3-10).
Regarding the synthetic tests, we first determined the sensitivity of traveltime data to changes in each of the parameters defining anisotropy in the medium (v, δ, ε, v ⊥ ) (Fig. 2), and we checked the proper performance of the code by comparing the synthetic traveltimes produced with their respective analytic solutions (Fig. 3).Next, we performed canonical inversion tests to compare two possible media parametrization, P[ε] and P[v ⊥ ], and two possible inversion strategies, simultaneous and sequential.According to our tests, both parametrizations have their strengths: P[ε] produces the best overall result in the sense that all parameters are acceptably recovered, with the exception of δ, and trade-off between parameters is lower, but P[v ⊥ ] yields the best result for the magnitude of the anomaly in v. Regarding the inversion strategy, simultaneous inversion is more straightforward and involves less human intervention, and given that both strategies yield similar results, it would be our first choice.Sequential inversion is always a more complex process that can be shown to work in a synthetic case because the target models are available, but in field data applications the complexity would most likely be unmanageable.
These tests were conducted under ideal conditions, and thus the conclusions provided by their results are an upper-limit but generalizable estimation of the strengths and weaknesses in the code's performance.
An acceptable recovery of δ turned out to be impossible due to the small sensitivity of traveltimes to this parameter, but we verified that it cannot simply be neglected in the equations.Whereas the recovery of the other parameters is not significantly affected by that of δ, a rough estimate of the average δ value in the medium is necessary and sufficient to generate a homogeneous initial model that allows for satisfactory inversion results in these other parameters.We also proved that it is possible to obtain it, provided that some a priori knowledge on δ values in the medium is available to define a range of plausible values, such as field or laboratory measurements.

Sensitivity test for
Model area P  Step 1 30 -0.5 0.5 21.4 2.9 ---1.3 8.2 13.5 0.5 22.8 5.1 Step 2 0.5 -0.3 0.5 21.4 3.0 0.9 5.3 19.4 ---0.5 22.8 5.1 the initial and final RMS of traveltime residuals, as well as δ mean values for the inverted model along with the corresponding mean deviations.It is straightforward to note that, for a central subrange of the tested initial δ values, final RMS values are an order of magnitude smaller, a few tenths of millisecond compared to the few milliseconds outside this subrange.Specifically, very similar results to those in Figs 5 and 6 in terms of traveltime residuals RMS are produced by initial δ values between 0.13 and 0.22 for P[ε], and between 0.12 and 0.22 for P[v ⊥ ].The narrowing of the initial δ distribution to a smaller subrange of mean δ values for the inverted models is indicative of a good general convergence trend.
the quality of recovery in terms of data and model fit for the simultaneous inversion of both parametrizations (Figs5 and 6).Initial and final RMS values for traveltime residuals to quantify data fit.For each parameter, as a measure of model fit, we computed the mean relative differences between the background areas of the inverted and target models (BG), the anomaly areas of the inverted and initial models (AI), and the anomaly areas of the inverted and target models (AT).Since the initial model is equal to the target model in the background area, it is not necessary to calculate the difference between inverted and initial models for this area.The ideal difference value for BG is 0 %.AT and AI indicate the resemblance between true and recovered anomalies, and their ideal values are 0 % and 25 % (~29.3 % in the case of v ⊥ ) respectively.Recovery is consistent with sensitivity (Fig.2): v and v ⊥ are well retrieved, ε is only partially recovered with P[ε], whilst inversion of δ is unsuccessful in both cases.
Figure 1: (a) Horizontal and (b) vertical views of the acquisition geometry for the inversion tests.114 sources and 114 receivers (red boxes) are located at 2.5 km from the center of the model, at the locus defined by the surface of the sphere inscribed in the 630

Figure 2 :
Figure 2: Sensitivities for the meridians at azimuths 0 rad (left) and π/4 rad (right) as a function of the polar angle (origin in the downward vertical axis).(a) and (b) Synthetic relative sensitivities in percentage, (c) and (d) synthetic normalized sensitivities, and (e) normalized analytic sensitivities.Sensitivity values displayed correspond to all sourcereceiver pairs along the selected 640

Figure 3 :
Figure 3: For both selected meridians, 0 rad and π/4 rad azimuths, relative traveltime errors in percentage with respect to the analytic value for each of the four simulations used in the sensitivity analysis.Polar angle origin is in the downward vertical axis.Mean values and deviations are shown in Table 2.

Figure 4 :
Figure 4: Flowchart describing the two steps for all the sequential inversion options tested.The best options for the sequential inversion strategy in P[ε] (Fig. 7) and in P[v ⊥ ] (Fig. 8) are marked in green and red respectively.

Figure 5 :
Figure 5: Simultaneous inversion with P[ε].Horizontal slices of the relative differences between target and initial (first row), final 650

Figure 6 :
Figure 6: Same as Fig. 5 but with P[v ⊥ ]. ε is derived from Eq. (2).The quality of the recovery is in correlation with sensitivity (Fig.660

Figure 7 :Figure 8 :
Figure 7: Same as Fig. 5 but for the two-step sequential inversion strategy with P[ε].In the first step only δ was fixed.In the second step, only ε was fixed.Final models from step 1 were used as initial model for step 2. v (1st column) is well recovered in step

Figure 9 :Figure 10 :
Figure 9: Same as Fig. 5 but using target models as initial models for all parameters in P[ε] but δ.This test was conducted to study 675

Table 7 : Same as Table 4 but for the simultaneous inversion of
P[ε] and P[v ⊥ ] using v and ε or v ⊥ target