On the choice of finite element for applications in geodynamics

Thieulot, Cedric; Bangerth, Wolfgang

doi:https://doi.org/10.5194/se-13-229-2022

Articles | Volume 13, issue 1


Method article

28 Jan 2022


Abstract

Geodynamical simulations over the past decades have widely been built on quadrilateral and hexahedral finite elements. For the discretization of the key Stokes equation describing slow, viscous flow, most codes use either the unstable Q₁×P₀ element, a stabilized version of the equal-order Q₁×Q₁ element, or more recently the stable Taylor–Hood element with continuous (Q₂×Q₁) or discontinuous ($Q_{2} \times P_{- 1}$) pressure. However, it is not clear which of these choices is actually the best at accurately simulating “typical” geodynamic situations.

Herein, we provide a systematic comparison of all of these elements for the first time. We use a series of benchmarks that illuminate different aspects of the features we consider typical of mantle convection and geodynamical simulations. We will show in particular that the stabilized Q₁×Q₁ element has great difficulty producing accurate solutions for buoyancy-driven flows – the dominant forcing for mantle convection flow – and that the Q₁×P₀ element is too unstable and inaccurate in practice. As a consequence, we believe that the Q₂×Q₁ and $Q_{2} \times P_{- 1}$ elements provide the most robust and reliable choice for geodynamical simulations, despite the greater complexity in their implementation and the substantially higher computational cost when solving linear systems.

Received: 03 Jun 2021 – Discussion started: 15 Jun 2021 – Revised: 24 Nov 2021 – Accepted: 26 Nov 2021 – Published: 28 Jan 2022

1 Introduction

For the past several decades, the geodynamics community's workhorse for numerical simulations of the incompressible Stokes equations has been the use of (continuous) piecewise bilinear and/or trilinear velocity and piecewise constant (discontinuous) pressure finite elements, often in combination with the penalty method for the solution of the resulting linear systems (e.g., Donea and Huerta, 2003). This velocity–pressure pair is often referred to as the Q₁×P₀ Stokes element and sometimes as the Q₁×Q₀ element (Gresho and Sani, 2000). It is used, for example, in the ConMan (King et al., 1990), SOPALE (Fullsack, 1995), SLIM3D (Popov and Sobolev, 2008), CitcomCU (Moresi and Gurnis, 1996; Zhong, 2006), CitcomS (Zhong et al., 2000; McNamara and Zhong, 2004; Zhong et al., 2008), Ellipsis (Moresi et al., 2003; O'Neill et al., 2006), UnderWorld (Moresi et al., 2003), DOUAR (Braun et al., 2008), and FANTOM (Thieulot, 2011) codes and has therefore been used in hundreds of publications.

The popularity of this element can be explained by its very small memory footprint and ease of implementation and use. On the other hand, it has a rather low convergence order that makes it difficult to achieve high accuracy; maybe more importantly, the element is known not to satisfy the so-called Ladyzhenskaya–Babuška–Brezzi (LBB) condition (e.g., Donea and Huerta, 2003) and is therefore unstable. This instability noticeably manifests itself through oscillatory pressure modes (e.g., Fig. 18 of Thieulot et al., 2008 or Fig. 36 of Thieulot, 2014) and makes the element unsuitable for large-scale three-dimensional simulations coupled to iterative solvers (May and Moresi, 2008). The unreliability of the pressure also makes this element a dubious choice for models in which some of the parameters – e.g., the density or the viscosity – depend on the pressure.

The more modern alternative to this choice is the Taylor–Hood element that uses (continuous) polynomials of degree k for the velocity and of degree k−1 for the pressure, where k≥2.¹ This element is not only LBB-stable, but owing to its higher polynomial degree is also convergent of higher order. It is therefore widely used in commercial flow solvers and is also the default element for the Aspect code in geodynamics (Kronbichler et al., 2012; Heister et al., 2017). This element is obviously more difficult to implement, and building efficient solvers and preconditioners is also more complicated (Kronbichler et al., 2012; Clevenger et al., 2020). However, these drawbacks can be mitigated by building on one of the widely available finite-element libraries that have appeared over the past 20 years; for example, Aspect inherits all of its finite-element functionality from the deal.II library (see Bangerth et al., 2007; Arndt et al., 2020). We will note that one can also use a number of variations of the underlying idea of the Taylor–Hood element, for example on quadrilaterals and hexahedra by using $Q_{k} \times P_{- (k - 1)}$ (see, for instance, May et al., 2015, Lechmann et al., 2011, and Thielmann and Kaus, 2012) in which the pressure is discontinuous and of (total) polynomial degree k−1, but missing the part of the finite-element space on every cell that distinguishes the space Q_k on quadrilaterals and hexahedra from the space P_k that is typically used on triangles and tetrahedra.² Another variation is to enrich the pressure space by a constant shape function on each cell (see, for example, Boffi et al., 2011, and the references therein). All of these alternatives are stable for k≥2, and in keeping with common usage of the term, we will also refer to all of these variations as Taylor–Hood or Taylor–Hood-like elements even though they are strictly speaking not what Taylor and Hood proposed in Taylor and Hood (1973).

A third option is the use of Q₁×Q₁ elements with both velocity and pressure using bilinear or trilinear shape functions. This combination of elements is not LBB-stable by default, but numerous stabilization techniques – typically adding a pressure-dependent term to the mass conservation equation – have been proposed in the literature (see, e.g., Norburn and Silvester, 2001; Elman et al., 2014; Gresho et al., 1995). Herein, we will discuss in particular the variation by Dohrmann and Bochev (2004) that is simple to implement and does not involve any tunable parameter. This approach is used in the Rhea code (Burstedde et al., 2009, 2013) in conjunction with adaptive mesh refinement (AMR), allowing for the numerical solution of whole Earth models at high resolutions (Stadler et al., 2010; Alisic et al., 2012). Another example of the use of this method is the work of Leng and Zhong (2011), also using AMR, to study thermochemical mantle convection. Both the ELEFANT code with an application to the 3D thermal state of curved subduction zones (Plunder et al., 2018) and the GALE code (Moresi et al., 2012), with application to the 3D shapes of metamorphic core complexes (Le Pourhiet et al., 2012) or oceanic plateau subduction (Arrial and Billen, 2013), use the stabilized Q₁×Q₁ method. Finally the ADELI code was coupled to a stabilized Q₁×Q₁ flow solver in the context of lithosphere–asthenosphere interaction studies (Cerpa et al., 2014, 2015, 2018).

The availability of all of these options leads us to the main question of this paper: which element should one use in geodynamics computations based on the Stokes equations? Or, in the absence of clear-cut conclusions, which ones should not be used? On the face of it, this seems like a simple question: the consensus in the computational science community is that using moderately high-degree elements (say, k=3 or k=4) yields the best accuracy for a given computational effort (measured in CPU cycles) unless one wants to change the solver technology to use matrix-free methods whereby even higher polynomial degrees become more efficient. This conclusion is based on the higher convergence order of higher-degree methods but balanced by the rapidly growing cost of matrix assembly and linear solver effort for higher-degree methods. On the other hand, the recommendation to use higher-degree methods is predicated on the assumption that the solution is smooth enough – say, the velocity is in the Sobolev space H^{k+1} of functions that have, loosely speaking, at least k+1 derivatives – that one can actually achieve a convergence rate of 𝒪(h^k) in the energy norm and 𝒪(h^{k+1}) in the L₂ norm, where h is the mesh size. This assumption generally requires that all coefficients, such as density and viscosity, are sufficiently smooth on length scales resolvable by the mesh. This may not be the case in realistic geodynamics problems given that density and viscosity often depend discontinuously on the solution variables (velocity or strain rate, pressure, temperature, and compositional variables); indeed, in many models, the viscosity may vary by orders of magnitude on short length scales.

Such considerations put into question whether higher-order methods are really worth the effort for actual geodynamics simulations. Given these divergent theoretical thoughts, the only way to resolve the question is by way of numerical comparisons. We have consequently extended Aspect so that it can use all of the element combinations above, and we will use these implementations in the comparisons in this paper.

Goals of this paper. Having outlined the conflict between the expected superiority of higher-degree elements for the Stokes equation on the one hand and the expected lack of smoothness of solutions in realistic geodynamic cases on the other, our goals in the paper are as follows.

1. Quantitatively compare the solution accuracy of the various options (Q₁×P₀, $Q_{k} \times Q_{k - 1}$, $Q_{k} \times P_{- (k - 1)}$, and stabilized Q₁×Q₁) using a variety of analytical benchmarks for which the exact solution is known. As we will see below, there is little point working with k>2 in geodynamics applications, and so the only cases we consider for Taylor–Hood-like elements are Q₂×Q₁ and $Q_{2} \times P_{- 1}$.
2. Extend these numerical comparisons to cases in which it is known that the stabilized Q₁×Q₁ demonstrates problematic behavior that may make it unusable in many practical situations. In particular, we will consider the case of buoyancy-driven flows.
3. Conclude our considerations by comparing the available options using a realistic geodynamical application. This will allow us to draw conclusions as to what element one might want to recommend for geodynamics applications.

While we have approached this study with an open mind and without a strong prior idea of which element might be the best, let us end this Introduction by noting that members of the crustal dynamics and mantle convection communities have occasionally expressed a dislike of the stabilized Q₁×Q₁ element for its inability to deal with large lithostatic pressures and free surfaces absent special modifications of the formulation. For example, Arrial and Billen (2013) comment on the need to modify the physical description of the problem due to the stabilization (with references replaced by ones listed at the end of this paper).

All the models were run with the open source code Gale. […] Gale uses Q₁–Q₁ elements to describe the pressure and the velocity. However, this formulation is unstable and a slight compressible term is added in the divergence equation to stabilize it (Dohrmann and Bochev, 2004). Ideally, this term should be applied on the dynamic pressure and not on the full pressure. To fix this, a hydrostatic term corresponding to the reference density and temperature profile, is subtracted from the full pressure and the body force vector.

Few other negative comments concerning the Q₁×Q₁ element appear on record in the published literature, although one can find the following quote in Lehmann et al. (2015).

We do not consider the Q₁×Q₁/stab element (Dohrmann and Bochev, 2004; Bochev et al., 2006; Burstedde et al., 2009), as stabilization of this element is achieved by introducing an artificial compressibility that dominates for flows mainly driven by buoyancy variations (May et al., 2015). In geophysical flow models this yields unphysical pressure artifacts for cases where both the free surface of the Earth and mantle flow are considered, because the driving density contrast between cold sinking plates and the warmer surrounding Earth's mantle is much smaller than the density difference between rocks and air (Kaus et al., 2010; Popov and Sobolev, 2008; Mishin, 2011). In our experience, this results in artificial “compaction” of the Earth’s mantle if Q₁×Q₁/stab element is used, which makes them unsuitable for these purposes.

Indeed, our numerical experiments will encounter a similar issue; see Sect. 6.

We are not aware of any other significant publications in the geodynamics literature that specifically discuss the relative trade-offs between the elements we consider herein, specifically between the Q₁×P₀ and Taylor–Hood elements, and consequently believe that our discussions here are useful for the community.

2 The governing equations

For the purpose of this paper, we are concerned with the accurate numerical solution of the incompressible Stokes equations:

\begin{array}{llll} (1) & - \nabla \cdot [2 η ε (u)] + \nabla p & = ρ g & \text{in } Ω, \\ (2) & - \nabla \cdot u & = 0 & \text{in } Ω, \end{array}

where η is the viscosity, ρ the density, g the gravity vector, ε(⋅) denotes the symmetric gradient operator defined by $ε (u) = \frac{1}{2} (\nabla u + \nabla u^{T})$ , and $Ω \subset R^{d}, d = 2$ or 3 is the domain of interest. Both the viscosity η and the density ρ will, in general, be spatially variable; in applications, this is often through nonlinear dependencies on the strain rate ε(u) or the pressure, but the exact reasons for the spatial variability are not of importance to us here: what matters is that these coefficients may vary strongly and on short length scales.

In applications, the equations above will be augmented by appropriate boundary conditions and will be coupled to additional and often time-dependent equations, such as ones that describe the evolution of the temperature field or of the composition of rocks (see, for example, Schubert et al., 2001; Turcotte and Schubert, 2012). This coupling is also not of interest to us here.

3 Discretization using finite-element methods

3.1 Formulation and basic error estimates

For the comparisons we intend to make in this paper, Eqs. (1)–(2) are discretized using the finite-element method. A straightforward application of the Galerkin method yields the following finite-dimensional variational problem: find $u_{h} \in U_{h}, p_{h} \in P_{h}$ so that

\begin{matrix} (3) & \begin{aligned} (ε (v_{h}), 2 η ε (u_{h})) - (\nabla \cdot v_{h}, p_{h}) & = (v_{h}, ρ g), \\ - (q_{h}, \nabla \cdot u_{h}) & = 0, \end{aligned} \end{matrix}

for all test functions $v_{h} \in U_{h}, q_{h} \in P_{h}$. Here, $(a, b) = \int_{Ω} a (x) b (x) d x$. For simplicity, we have omitted terms introduced through the treatment of boundary conditions. The finite-dimensional, piecewise polynomial spaces 𝒰_h and 𝒫_h can be chosen in a variety of ways, as discussed in the Introduction. In particular, if they are chosen as 𝒰_h=Q_k and 𝒫_h=Q_{k−1} – i.e., the Taylor–Hood element – then the discrete problem is known to satisfy the LBB condition and the solution is stable (Elman et al., 2014). Here, Q_s is the space of continuous functions that are obtained on each cell K of a mesh 𝕋 by mapping polynomials of degree at most s in each variable from the reference cell [0,1]^d. Likewise, the problem is stable if one chooses 𝒰_h=Q_k and 𝒫_h=P_{−(k−1)}, where now P_{−s} is the space of discontinuous functions obtained by mapping polynomials of total degree at most s from the reference cell. In both of these cases, we expect from fundamental theorems of the finite-element method (see, for example, Elman et al., 2014) that the convergence rates are optimal, i.e., that the errors satisfy the relationships

\begin{matrix} (4) & \begin{aligned} ‖ \nabla (u - u_{h}) ‖_{L_{2}} & = O (h^{k}), \\ ‖ u - u_{h} ‖_{L_{2}} & = O (h^{k + 1}), \\ ‖ p - p_{h} ‖_{L_{2}} & = O (h^{k}), \end{aligned} \end{matrix}

where h is the maximal diameter over all cells in the mesh 𝕋.
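To make the difference between the spaces Q_s and P_{−s} defined above concrete, the following minimal Python sketch counts the polynomial shape functions per cell in each space. The function names `dim_Q` and `dim_P` are illustrative only and do not come from any of the codes discussed here; the counts rely on the standard formulas (s+1)^d for tensor-product polynomials and binomial(s+d, d) for polynomials of total degree at most s.

```python
from math import comb


def dim_Q(s, d):
    """Number of polynomials of degree at most s in each of d variables."""
    return (s + 1) ** d


def dim_P(s, d):
    """Number of polynomials of total degree at most s in d variables."""
    return comb(s + d, d)


for d in (2, 3):
    # Per-cell counts; for continuous spaces, neighboring cells share unknowns.
    print(f"d={d}: Q2 velocity shape functions per cell = {d * dim_Q(2, d)}, "
          f"Q1 pressure = {dim_Q(1, d)}, P-1 pressure = {dim_P(1, d)}")
```

In two dimensions, for example, the Q₁ pressure has four shape functions per cell (1, x, y, xy on the reference cell), whereas the discontinuous P₋₁ pressure has three (1, x, y); the missing bilinear term is what distinguishes the two spaces on a single cell.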

On the other hand, if one chooses 𝒰_h=Q₁ and 𝒫_h=P₀, i.e., the unstable Q₁×P₀ element with piecewise linear continuous velocities and piecewise constant discontinuous pressure, then the best convergence rates one can hope for would satisfy the following relationships based solely on interpolation error estimates:

\begin{matrix} (5) & \begin{aligned} ‖ \nabla (u - u_{h}) ‖_{L_{2}} & = O (h), \\ ‖ u - u_{h} ‖_{L_{2}} & = O (h^{2}), \\ ‖ p - p_{h} ‖_{L_{2}} & = O (h) . \end{aligned} \end{matrix}

In practice, if the numerical solution shows pressure oscillations (see for instance Sani et al., 1981a, b), one will not even observe the rates shown above but might in fact obtain a worse pressure convergence rate, for example $‖ p - p_{h} ‖_{L_{2}} = O (h^{1 / 2})$.
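Whether a rate such as 𝒪(h), 𝒪(h²), or 𝒪(h^{1/2}) is actually attained is typically checked by computing errors on a sequence of refined meshes and estimating the exponent from consecutive pairs. The sketch below shows this standard estimate; the function name `observed_orders` is hypothetical and the error values are synthetic, chosen only to illustrate the formula, not results from any benchmark.

```python
import math


def observed_orders(h, err):
    """Estimate the convergence order between consecutive meshes.

    If err ~ C * h**r, then r = log(err_i / err_{i+1}) / log(h_i / h_{i+1}).
    """
    return [math.log(err[i] / err[i + 1]) / math.log(h[i] / h[i + 1])
            for i in range(len(h) - 1)]


# Synthetic data that behaves exactly like 3 * h**2 (illustration only).
mesh_sizes = [1 / 8, 1 / 16, 1 / 32, 1 / 64]
errors = [3.0 * hi ** 2 for hi in mesh_sizes]
orders = observed_orders(mesh_sizes, errors)
print(orders)
```

For real computations the estimated orders usually approach the asymptotic rate only once the mesh is fine enough to resolve the solution.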

Finally, if one uses 𝒰_h=Q₁ and 𝒫_h=Q₁, then this unstable element combination can be made stable if one replaces the discrete formulation (3) by the following stabilized version due to Dohrmann and Bochev (2004):

\begin{matrix} (6) & \begin{aligned} (ε (v_{h}), 2 η ε (u_{h})) - (\nabla \cdot v_{h}, p_{h}) & = (v_{h}, ρ g), \\ - (q_{h}, \nabla \cdot u_{h}) - ((I - π_{0}) q_{h}, \frac{1}{η} (I - π_{0}) p_{h}) & = 0 . \end{aligned} \end{matrix}

Here, I is the identity operator and π₀ is the projection onto piecewise constant functions – i.e., π₀f is the function that on each cell is equal to the mean value of f on that cell. For this element, the rates one might hope for are as follows (see again Dohrmann and Bochev, 2004):

\begin{matrix} (7) & \begin{aligned} ‖ \nabla (u - u_{h}) ‖_{L_{2}} & = O (h), \\ ‖ u - u_{h} ‖_{L_{2}} & = O (h^{2}), \\ ‖ p - p_{h} ‖_{L_{2}} & = O (h) . \end{aligned} \end{matrix}

Dohrmann and Bochev (2004) report that for some test cases, one might in fact obtain $‖ p - p_{h} ‖_{L_{2}} = O (h^{t})$ with t≈1.5, though it is not clear whether this rate can be obtained for all possible applications. We also observe this improved rate in one of our benchmarks in Sect. 5.
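To illustrate what the stabilization term in Eq. (6) looks like at the level of a single cell, the following sketch assembles the local matrix with entries ((I − π₀)φ_i, (1/η)(I − π₀)φ_j) for the four bilinear pressure shape functions on a square cell of side h with constant viscosity. It is a minimal sketch under these simplifying assumptions and is not taken from Aspect or any other code; the names `db_local_matrix` and `quad_form` are hypothetical. It uses the identity ((I − π₀)q, (I − π₀)p) = (q, p) − |K| (π₀q)(π₀p), so that the matrix equals (M − m mᵀ/|K|)/η, where M is the pressure mass matrix and m_i is the integral of φ_i over the cell.

```python
from fractions import Fraction


def db_local_matrix(h, eta):
    """Local stabilization matrix on a square cell of side h (constant eta).

    Vertices are ordered lexicographically: (0,0), (h,0), (0,h), (h,h).
    """
    h = Fraction(h)
    eta = Fraction(eta)
    # 1D mass matrix of the two linear shape functions on [0, h].
    m1d = [[h / 3, h / 6], [h / 6, h / 3]]
    # 2D mass matrix as a tensor product (x index: i % 2, y index: i // 2).
    mass = [[m1d[i % 2][j % 2] * m1d[i // 2][j // 2] for j in range(4)]
            for i in range(4)]
    area = h * h
    # Integral of each shape function over the cell.
    m = [sum(row) for row in mass]
    return [[(mass[i][j] - m[i] * m[j] / area) / eta for j in range(4)]
            for i in range(4)]


def quad_form(c, q, p):
    """Evaluate q^T C p for nodal value vectors q and p."""
    return sum(q[i] * c[i][j] * p[j] for i in range(4) for j in range(4))


C = db_local_matrix(h=1, eta=1)
constant = [1, 1, 1, 1]
linear_in_x = [0, 1, 0, 1]  # nodal values of p = x on the unit cell
print(quad_form(C, constant, constant))        # 0
print(quad_form(C, linear_in_x, linear_in_x))  # 1/12
```

Cell-wise constant pressures are not penalized, whereas any variation of the pressure within a cell is; for p = x on the unit cell the penalty equals the integral of (x − 1/2)², i.e., 1/12.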

We end this section by noting that in many of the setups we use in Sect. 5, the boundary conditions we impose lead to a problem in which the pressure is only determined up to an additive constant. The same is then true for the linear system one has to solve after discretization. As a consequence, we can only meaningfully compute quantities such as $‖ p - p_{h} ‖_{L_{2}}$ if both the exact and the numerical solution are normalized; a typical normalization is to ensure that their mean values are zero. Aspect enforces this normalization after solving the linear system.
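As a simple illustration of such a normalization, the sketch below removes the area-weighted mean from a cell-wise constant pressure field. This is an assumption-laden toy version (piecewise constant pressure, hypothetical function name `normalize_zero_mean`) and does not reproduce how Aspect implements the normalization.

```python
def normalize_zero_mean(p_cells, areas):
    """Shift cell-wise pressure values so that their area-weighted mean is zero."""
    total_area = sum(areas)
    mean = sum(p * a for p, a in zip(p_cells, areas)) / total_area
    return [p - mean for p in p_cells]


# Three cells with areas 1, 1, and 2 (made-up values for illustration).
areas = [1.0, 1.0, 2.0]
p_normalized = normalize_zero_mean([1.0, 3.0, 5.0], areas)
print(p_normalized)
```

The same shift has to be applied to the exact pressure before the error $‖ p - p_{h} ‖_{L_{2}}$ is evaluated; otherwise the difference between the two arbitrary constants dominates the error.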

3.2 A closer look at the error estimates

A comparison of Eq. (4) with Eqs. (5) and (7) would suggest that the Taylor–Hood element can obtain substantially better rates of convergence if one only chooses the polynomial degree k large enough.

However, this is an incomplete understanding because the 𝒪(h^m) notation hides the fact that the constants in this behavior depend on the solution. More specifically, a complete description of the error behavior would replace Eq. (4) by the following statement: there are constants $C_{1}, C_{2}, C_{3} < \infty$ so that

\begin{matrix} (8) & \begin{aligned} ‖ \nabla (u - u_{h}) ‖_{L_{2}} & \leq C_{1} h^{k} ‖ \nabla^{k + 1} u ‖_{L_{2}}, \\ ‖ u - u_{h} ‖_{L_{2}} & \leq C_{2} h^{k + 1} ‖ \nabla^{k + 1} u ‖_{L_{2}}, \\ ‖ p - p_{h} ‖_{L_{2}} & \leq C_{3} h^{k} ‖ \nabla^{k} p ‖_{L_{2}} . \end{aligned} \end{matrix}

The validity of these statements clearly depends on the solution being regular enough so that ∇^{k+1}u and ∇^k p actually exist and are square-integrable – in other words, that $u \in H^{k + 1}$ and p∈H^k, where H^k represents the usual Sobolev function spaces.³ On the other hand, all that is guaranteed by the existence theory for partial differential equations is that u∈H¹ and $p \in L_{2} = H^{0}$; any further smoothness should only be expected if, for example, the domain Ω is convex and if viscosity η and right-hand side ρg are also smooth. Indeed, this is the case for many artificial benchmarks for which these functions are chosen a priori; on the other hand, in “realistic” geodynamics applications, one might expect η and ρ to be discontinuous at phase boundaries and potentially vary widely. In such cases, one needs to accept that the solutions only satisfy u∈H^q and $p \in H^{q - 1}$ with q≥1 but possibly $q < k + 1$. Numerical analysis predicts that in such cases, the best-case rates in Eq. (8) will be replaced by the following:

\begin{matrix} (9) & \begin{aligned} ‖ \nabla (u - u_{h}) ‖_{L_{2}} & \leq C_{1} h^{\min \{q - 1, k\}} ‖ \nabla^{\min \{q, k + 1\}} u ‖_{L_{2}}, \\ ‖ u - u_{h} ‖_{L_{2}} & \leq C_{2} h^{\min \{q, k + 1\}} ‖ \nabla^{\min \{q, k + 1\}} u ‖_{L_{2}}, \\ ‖ p - p_{h} ‖_{L_{2}} & \leq C_{3} h^{\min \{q - 1, k\}} ‖ \nabla^{\min \{q - 1, k\}} p ‖_{L_{2}} . \end{aligned} \end{matrix}

Similar considerations apply for the Q₁×P₀ and the stabilized Q₁×Q₁ combinations; a closer examination yields the following rates that would replace Eqs. (5) and (7):

\begin{matrix} (10) & \begin{aligned} ‖ \nabla (u - u_{h}) ‖_{L_{2}} & \leq C_{1} h^{\min \{q - 1, 1\}} ‖ \nabla^{\min \{q, 2\}} u ‖_{L_{2}}, \\ ‖ u - u_{h} ‖_{L_{2}} & \leq C_{2} h^{\min \{q, 2\}} ‖ \nabla^{\min \{q, 2\}} u ‖_{L_{2}}, \\ ‖ p - p_{h} ‖_{L_{2}} & \leq C_{3} h^{\min \{q - 1, 1\}} ‖ \nabla^{\min \{q - 1, 1\}} p ‖_{L_{2}} . \end{aligned} \end{matrix}

In other words, we will only benefit from the added expense of the Taylor–Hood element with k≥2 if the solution is sufficiently smooth, namely if at least $q > k \geq 2$. Whether q>2 indeed holds in a given situation is a question of partial differential equation (PDE) theory and is difficult to answer in general without using particular knowledge of η, ρg, and Ω. On the other hand, one can observe convergence rates experimentally for a number of cases of interest, so in some sense, it would be legitimate to ask the following question: what is the regularity index q of typical solutions in geodynamics applications? At the same time, answering this question requires careful convergence studies on problems that are already typically quite challenging to solve on any reasonable mesh, let alone several further refined ones. As a consequence, we cannot answer this question in the generality stated above. Instead, we will approach it below by considering a number of benchmarks that illustrate typical features of geodynamic settings in an abstracted way (in Sect. 5), followed by a model application (in Sect. 6). In particular, the examples in Sect. 5.2 and 5.3 will illustrate cases in which the exact solution is not smooth enough to achieve the optimal convergence rate.
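The interplay between the regularity index q and the polynomial degree k in Eqs. (9) and (10) can be tabulated with a few lines of Python. The helper name `guaranteed_rates` is hypothetical, and the values of q below are arbitrary examples rather than measured regularities of any benchmark.

```python
def guaranteed_rates(q, k):
    """Exponents of h guaranteed by Eq. (9) for regularity q and velocity degree k.

    With k = 1, the same expressions reproduce Eq. (10).
    """
    return {"grad_u": min(q - 1, k), "u": min(q, k + 1), "p": min(q - 1, k)}


for q in (1.5, 2, 3):
    # Compare the lowest-order elements (k = 1) with Taylor-Hood (k = 2).
    print(f"q={q}: k=1 -> {guaranteed_rates(q, 1)}, k=2 -> {guaranteed_rates(q, 2)}")
```

For q ≤ 2 both rows of exponents coincide, which is the quantitative content of the statement that the more expensive element only pays off if q > 2.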

We end this section by noting that all of the estimates shown above guarantee that the error on the left of an inequality decreases at least at the rate shown on the right side, but they do not rule out that, on a given sequence of meshes, the rate is in fact better. Indeed, this often happens: for example, if one aligns meshes with a discontinuity in coefficients (as we do for the SolCx benchmark discussed in Sect. 5.2), one often observes optimal rates – or convergence rates between the minimal theoretically guaranteed and the optimal ones – for some elements even if the solution lacks regularity. Actually observing the minimal theoretically guaranteed convergence rate for solutions lacking regularity often requires choosing randomly arranged meshes – a case we will not consider herein.

4 Comments about the use of the Q₁×Q₁ element in geodynamics computations

Before delving into the details of numerical experiments, let us consider one other theoretical aspect. An interesting complication of geodynamics simulations compared to many other applications of the Stokes equations is that the hydrostatic component of the pressure is often vastly larger than the dynamic pressure, even though only the dynamic component is responsible for driving the flow. As we will discuss in the following, this has no importance when using the Q₁×P₀ or the Taylor–Hood elements, but it turns out to be rather inconvenient when using a stabilized formulation that contains an artificial compressibility term. This issue is also mentioned in the quote from Arrial and Billen (2013) reproduced in the Introduction and in May et al. (2015).

To illustrate the issue, consider the force balance equation (Eq. 1). We can split the pressure into hydrostatic and dynamic components, $p = p_{s} + p_{d}$ , where we define the hydrostatic pressure via the relationship

\begin{matrix} (11) & \frac{\partial}{\partial z} p_{s} = ρ_{ref} (z) g_{z} (z), \end{matrix}

coupled with the normalization that p_s=0 at the top of the domain. In defining p_s this way, we have made the assumption that the vertical component g_z of the gravity vector dominates its other components. Furthermore, we have introduced a reference density ρ_ref that somehow reflects a depth-dependent profile. As we will discuss below, there is really no unique or accepted way to define this profile, though one should generally think of it as capturing the bulk of the three-dimensional variation in the density via a one-dimensional function.
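For a given reference profile, Eq. (11) together with the normalization p_s=0 at the top can be integrated numerically. The sketch below does so with the trapezoidal rule on a set of heights z (increasing upward, last entry at the top of the domain). The function name `hydrostatic_pressure`, the constant density of 3300 kg m⁻³, and g_z=−10 m s⁻² are illustrative assumptions, not values used in the paper.

```python
def hydrostatic_pressure(z, rho_ref, g_z):
    """Integrate dp_s/dz = rho_ref(z) * g_z(z) downward from p_s = 0 at z[-1].

    z must be sorted in increasing order; rho_ref and g_z are callables.
    """
    p = [0.0] * len(z)
    for i in range(len(z) - 2, -1, -1):
        dz = z[i + 1] - z[i]
        # Trapezoidal approximation of the integral of rho_ref * g_z on [z_i, z_{i+1}].
        integral = 0.5 * dz * (rho_ref(z[i]) * g_z(z[i])
                               + rho_ref(z[i + 1]) * g_z(z[i + 1]))
        p[i] = p[i + 1] - integral
    return p


# Illustrative setup: 1 km deep column, constant density, gravity pointing down.
heights = [100.0 * i for i in range(11)]
p_s = hydrostatic_pressure(heights, lambda z: 3300.0, lambda z: -10.0)
print(p_s[0], p_s[-1])
```

With these assumed values, the pressure at the bottom of the column is ρ_ref |g_z| times the column height, i.e., 3.3×10⁷ Pa, and it vanishes at the top.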

By splitting the pressure in this way, Eq. (1) can then be rewritten as follows:

- \nabla \cdot [2 η ε (u)] + \nabla p_{d} = ρ g - ρ_{ref} g_{z} e_{z} \quad \text{in } Ω .
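The intermediate step behind this rewriting can be spelled out as follows, under the assumption already made above that p_s depends on z only, so that its gradient has only a vertical component (e_z denotes the vertical unit vector):

```latex
\nabla p = \nabla p_{s} + \nabla p_{d}
         = \frac{\partial p_{s}}{\partial z} e_{z} + \nabla p_{d}
         = ρ_{ref} g_{z} e_{z} + \nabla p_{d} .
```

Inserting this expression into Eq. (1) and moving the known term ρ_ref g_z e_z to the right-hand side yields the equation above.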

Since this is the only equation in which the pressure appears, it is obvious that the velocity field so computed is the same whether one uses the original formulation solving for u and p or the one solving for u and p_d. Put differently, the velocity field does not depend on how one chooses the reference density ρ_ref. The original formulation is recovered by using the simplest choice, ρ_ref=0. As a consequence, many geodynamics codes use formulations that only compute the dynamic pressure p_d using a reference density ρ_ref(z). Importantly, however, there is no canonical choice for this reference density: one might choose a constant reference density, a depth-dependent adiabatic profile, or one computed at each time step by laterally averaging the current three-dimensional density field $ρ (x, y, z, t)$; each of these options – and likely more – has been used in numerical simulations found in the literature. In any case, pressure-dependent coefficients such as the density or viscosity are then evaluated by using p_s+p_d, where p_d is computed as part of the solution of the Stokes problem and p_s is the hydrostatic pressure defined by Eq. (11) using the particular choice of reference density used by a code. On the other hand, the Aspect code notably always computes the full pressure instead of splitting it into hydrostatic and dynamic components (see the discussion in Kronbichler et al., 2012), corresponding to the particular choice ρ_ref=0.

The problem with the stabilized Q₁×Q₁ formulation – different from the use of the other element choices – is that the velocity field computed from the Stokes solution is not independent of the choice of the reference density. This is because the mass conservation equation is modified by the stabilization term and – in the simple case of a constant viscosity – reads

\begin{matrix} (12) & - \nabla \cdot u - \frac{1}{η} Π p_{d} = 0 . \end{matrix}

Here, $Π = (I - π_{0})$ is the operator that corresponds to the stabilization term in Eq. (6).⁴

The point of these considerations is that different choices of ρ_ref (including the choice ρ_ref=0 that leads to the original formulation) do have an effect here because they lead to different $p_{d} = p - p_{s}$ for which Πp_d is different: that is, the amount of artificial compressibility depends on the splitting of the pressure into static and dynamic pressures. In other words, the discretization errors $‖ u - u_{h} ‖_{L_{2}}$ and $‖ \nabla (u - u_{h}) ‖_{L_{2}}$ discussed in the previous section will in general depend on the choice of the reference density profile, and the latter will need to be carefully defined in order to lead to acceptable error levels. As we will show in the benchmarking section, the specific choice of ρ_ref in fact has a rather large effect. This is in line with the previously quoted comments in Arrial and Billen (2013).

Let us end this section by commenting on two aspects of why this issue may not be as relevant in other contexts in which stabilized formulations have been used. First, in many important applications of the Stokes equations, the flow is not driven by buoyancy effects but by inflow and outflow boundary conditions (e.g., Turek, 1999; Zienkiewicz and Taylor, 2002). Indeed, in those conditions both the density and the gravity vector are generally considered spatially constant, and the choice of reference density and hydrostatic pressure is then obvious and unambiguous. In these cases, computations are always performed with only the dynamic pressure because the hydrostatic pressure does not enter the problem at all except in the rare cases of fluids with pressure-dependent viscosities.

Second, while we have here considered the stabilization first introduced in Dohrmann and Bochev (2004), earlier stabilized formulations used a pressure Laplacian in place of the operator Π above. (See, for example, Brezzi and Pitkäranta, 1984, or the variation in Silvester and Kechkar, 1990, as well as the analysis in Bochev et al., 2006.) That is, instead of Eq. (12) they used a formulation of the form

\begin{matrix} (13) & - \nabla \cdot u - c h^{2} Δ p = 0, \end{matrix}

where c is a tuning parameter that also incorporates the viscosity. If one uses this formulation for cases in which the reference density is chosen as a function that is constant with depth – as was often done in earlier mantle convection codes considering the Boussinesq approximation – and if one computes in a Cartesian box with a constant gravity vector g=ge_z, then p_s is a linear function, and consequently Δp_s=0. In other words, $Δ p = Δ (p - p_{s}) = Δ p_{d}$, which implies that the computed velocity field again does not depend on the exact choice of ρ_ref as long as it is chosen constant. This property does not hold for the formulation of Dohrmann and Bochev: for linear pressures p_s we have Πp_s≠0 and hence $Π p \neq Π (p - p_{s}) = Π p_{d}$, because Π subtracts from p_s the average value on each cell, leaving a piecewise linear discontinuous function.
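As a worked example of this difference, consider a single square or cubic cell K of side h whose vertical extent is z_0 ≤ z ≤ z_0+h, and a linear hydrostatic pressure p_s = az + b; the symbols a, b, and z_0 are introduced here only for illustration, with a = ρ_ref g for a constant reference density. Under these assumptions one finds:

```latex
\begin{aligned}
Δ p_{s} &= 0, \\
π_{0} p_{s} &= a \left( z_{0} + \tfrac{h}{2} \right) + b, \\
Π p_{s} = (I - π_{0}) p_{s} &= a \left( z - z_{0} - \tfrac{h}{2} \right), \\
‖ Π p_{s} ‖_{L_{2} (K)}^{2} &= a^{2} \frac{h^{d + 2}}{12} .
\end{aligned}
```

The Laplacian-based term therefore does not see a linear p_s at all, whereas Πp_s is a sawtooth function whose amplitude within each cell is proportional to ah, i.e., to the hydrostatic pressure difference across one cell.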

Of course, whether one uses the Dohrmann–Bochev formulation (Eq. 12) or the addition of a pressure Laplacian as in Eq. (13), the formulation is consistent. That is, as the mesh size h goes to zero, the added stabilization term also goes to zero. In the limit, the numerical solution therefore satisfies the original mass conservation equation. In other words, the limit is independent of the choice of ρ_ref, even though the solutions on a finite mesh are not.

5 Numerical results for artificial benchmarks

In this section, let us present computational results for three analytical problems and a buoyancy-driven flow community benchmark. While the first of these (Sect. 5.1) is simply used to establish the best convergence rates one can hope for in the case of smooth solutions, the remaining test cases were chosen because they illustrate aspects of what we think “typical” solutions of geodynamic applications look like in an abstracted, controlled way. In particular, the “SolCx” benchmark in Sect. 5.2 demonstrates features of solutions in which the mesh can be aligned with sharp features in the viscosity, and the “SolVi” benchmark in Sect. 5.3 does so in the more common case in which the mesh cannot be aligned. Finally, the “sinking block” case in Sect. 5.4 shows a buoyancy-driven situation in which all of the discussions of the previous section on the choice of a reference density will come into play. All of these cases are simple enough that we know (quantitative or qualitative features of) the solution to sufficient accuracy to investigate convergence rigorously.

While these benchmarks provide us with insight that allows us to conjecture which elements may or may not work in practical application, they still are just abstract benchmarks. As a consequence, we will consider an actual geodynamic application in Sect. 6.

All models are run with the Aspect code. We have limited ourselves to two-dimensional cases as we do not expect that three-dimensional models would shed any more light on the conclusions reached. Although Aspect is built for adaptive mesh refinement (AMR), we have chosen not to use this feature in order to reflect the fact that the majority of existing codes use structured meshes.

5.1 The Donea and Huerta benchmark

Let us start our numerical experiments with the simple 2D benchmark presented in Donea and Huerta (2003). The exact definition involves lengthy formulas not worth repeating here, but in short it consists of the following ingredients: (i) the domain is a unit square, (ii) the viscosity and density are set to 1, and (iii) velocity and pressure fields are chosen to correspond to smooth polynomials describing circular flow with no-slip boundary conditions. We then choose an (unphysical) gravity vector field that produces these velocity and pressure fields. This setup produces the smooth solution shown in Fig. 1 for which we would expect that the higher-order Taylor–Hood element is highly accurate.
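For orientation, the sketch below evaluates one polynomial velocity and pressure field with exactly the properties listed above: divergence-free, zero velocity on the boundary of the unit square, and a pressure with zero mean. To the best of our recollection these expressions match the form given in Donea and Huerta (2003), but this is an assumption that should be checked against that reference; the function names are hypothetical, and the corresponding body force is not reproduced here.

```python
def velocity(x, y):
    """Polynomial, divergence-free velocity that vanishes on the unit-square boundary."""
    u = x**2 * (1 - x)**2 * (2*y - 6*y**2 + 4*y**3)
    v = -y**2 * (1 - y)**2 * (2*x - 6*x**2 + 4*x**3)
    return u, v


def pressure(x, y):
    """Polynomial pressure with zero mean over the unit square."""
    return x * (1 - x) - 1.0 / 6.0


def divergence(x, y, eps=1e-5):
    """Central finite-difference approximation of du/dx + dv/dy."""
    dudx = (velocity(x + eps, y)[0] - velocity(x - eps, y)[0]) / (2 * eps)
    dvdy = (velocity(x, y + eps)[1] - velocity(x, y - eps)[1]) / (2 * eps)
    return dudx + dvdy


def mean_pressure(n=200):
    """Midpoint-rule approximation of the mean pressure on an n x n grid."""
    pts = [(i + 0.5) / n for i in range(n)]
    return sum(pressure(x, y) for x in pts for y in pts) / n**2


print(divergence(0.3, 0.7), velocity(0.0, 0.4), mean_pressure())
```

The checks in the last line only confirm the stated properties numerically (up to finite-difference and quadrature error); they say nothing about the accuracy of any finite-element solution.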

Figure 1. Donea and Huerta benchmark. Velocity (a) and pressure (b) fields obtained on a 32×32 mesh with Q₂×Q₁ elements.
