An Adjoint Method for a High-Order Discretization of Deforming Domain Conservation Laws for Optimization of Flow Problems

An Adjoint Method for a High-Order Discretization of Deforming Domain Conservation Laws for Optimization of Flow Problems

Abstract

The fully discrete adjoint equations and the corresponding adjoint method are derived for a globally high-order accurate discretization of conservation laws on parametrized, deforming domains. The conservation law on the deforming domain is transformed into one on a fixed reference domain by the introduction of a time-dependent mapping that encapsulates the domain deformation and parametrization, resulting in an Arbitrary Lagrangian-Eulerian form of the governing equations. A high-order discontinuous Galerkin method is used to discretize the transformed equation in space and a high-order diagonally implicit Runge-Kutta scheme is used for the temporal discretization. Quantities of interest that take the form of space-time integrals are discretized in a solver-consistent manner. The corresponding fully discrete adjoint method is used to compute exact gradients of quantities of interest along the manifold of solutions of the fully discrete conservation law. These quantities of interest and their gradients are used in the context of gradient-based PDE-constrained optimization.

The adjoint method is used to solve two optimal shape and control problems governed by the isentropic, compressible Navier-Stokes equations. The first optimization problem seeks the energetically optimal trajectory of a 2D airfoil given a required initial and final spatial position. The optimization solver, driven by gradients computed via the adjoint method, reduced the total energy required to complete the specified mission nearly an order of magnitude. The second optimization problem seeks the energetically optimal flapping motion and time-morphed geometry of a 2D airfoil given an equality constraint on the -directed impulse generated on the airfoil. The optimization solver satisfied the impulse constraint to greater than digits of accuracy and reduced the required energy between a factor of and , depending on the value of the impulse constraint, as compared to the nominal configuration.

1Introduction

Optimization problems constrained by Partial Differential Equations (PDEs) commonly arise in engineering practice, particularly in the context of design or control of physics-based systems. A majority of the research in PDE-constrained optimization has been focused on steady or static PDEs, with a large body of literature detailing many aspects of the subject, including continuous and discrete adjoint methods [1], parallel implementations [6], one-shot or infeasible path methods [6], and generalized reduced gradient or feasible path methods [1]. This emphasis on steady problems is largely due to the fact that

However, there is a large class of problems where steady analysis is insufficient, such as problems that are inherently dynamic and problems where a steady-state solution does not exist or cannot be found reliably with numerical methods. Flapping flight is an example of the first type, a fundamentally unsteady problem that has become increasingly relevant due to its application to Micro-Aerial Vehicles (MAVs) [11]. Systems with chaotic solutions, such as those encountered in turbulent flows, are an example of the second type of problems where steady analysis breaks down. Design and control of these types of systems calls for time-dependent PDE-constrained optimization of the form

where the last constraint corresponds to a system of conservation laws with solution , parametrized by ; the objective and constraint functions of the optimization take the form of space-time integrals of pointwise, instantaneous quantities of interest and over the surface of the body .

In this work, the large computational cost associated with time-dependent PDE-constrained optimization will be addressed by two means. The first is the development of a globally high-order numerical discretization of conservation laws on deforming domains 1. For many important problems, high-order methods have been shown to require fewer spatial degrees of freedom [12] and time steps [13] for a given level of accuracy compared to low-order methods. Highly accurate quantities of interest, usually the time-average of a relevant surface- or volume-integrated quantity, is paramount, at least at convergence, since they drive the optimization trajectory through the objective function and constraints. Large errors in quantities of interest will cause the optimization procedure to be driven by discretization errors causing termination at a suboptimal design/control. The second approach to reduce the computational impact of time-dependent optimization is the use of gradient-based optimization techniques due to their rapid convergence properties, particularly when compared to derivative-free alternatives.

An efficient technique for computing derivatives of optimization functionals, required by gradient-based optimization solvers, is the adjoint method. It has proven its utility in the context of output-based mesh adaptivity and gradient-based PDE-constrained optimization as only a single linearized dual solve is required to compute the gradient of a single quantity of interest with respect to any number of parameters. In the context of partial differential equations, the adjoint equations can be derived at either the continuous, semi-discrete, or fully discrete level. The fully discrete adjoint method will be the focus of this work as it ensures discrete consistency [5] of computed gradients, i.e. the gradient of the discrete solution, including discretization errors, is computed. Discrete consistency is beneficial in the context of gradient-based optimization as inconsistent gradients may cause convergence of black-box optimizers to be slowed or hindered [15], unless specialized optimization algorithms are employed that handle gradient inexactness [16].

In this work, a globally high-order numerical discretization of general systems of conservation laws, defined on deforming domains, is introduced and the corresponding fully discrete adjoint equations derived. The goal is to harness the advantages of high-order methods in the context of gradient-based optimization. The solution of the adjoint equations – the dual solution – will be used to construct exact gradients of fully discrete quantities of interest. A Discontinuous Galerkin Arbitrary Lagrangian-Eulerian (DG-ALE) method [17] is used for the high-order spatial discretization. Previous work on the adjoint method for conservation laws on deforming domains predominantly considers a Finite Volume (FV) spatial discretization [1], and recently extended to DG-ALE schemes [20]. The DG-ALE discretization is chosen rather than FV due to its stable, high-order discretization of convective fluxes. The Geometric Conservation Law (GCL) is satisfied in the DG-ALE scheme through the introduction of an element-level auxiliary equation. The fully discrete adjoint equations derived in this work fully incorporate this GCL augmentation [23], ensuring discrete consistency is maintained.

High-order temporal discretization will be achieved using a Diagonally Implicit Runge-Kutta (DIRK) [24] method, marking a departure from previous work on unsteady adjoints, which has mostly considered temporal discretization via Backward Differentiation Formulas (BDF) [9], with some work on space-time DG discretizations [23]. Apart from being limited to second-order accuracy, if A-stability is required, high-order BDF schemes require special techniques for initialization [14]. While DIRK schemes require additional work to achieve high-order convergence in the form of additional nonlinear solves at a given timestep, this investment has been shown to be worthwhile [14]. The fully discrete, time-dependent adjoint equations corresponding to a Runge-Kutta temporal discretization has been studied in the context of ODEs [26] and optimal control [27]. A difficulty that arises when considering Runge-Kutta discretizations is the stages encountered during the primal (forward) and dual (reverse) solves will not align if the continuous or semi-discrete adjoint method is employed. This difficulty cannot arise in the fully discrete formulation as only the terms computed in the primal solve are used in the derivation of the discrete adjoint equations.

High-order discretization of integrated quantities of interest will be done in a solver-consistent manner, that is, spatial integrals will be evaluated via integration of the finite element shape functions used for the spatial discretization and temporal integrals evaluated via the DIRK scheme from the temporal discretization. This ensures the discretization order of quantities of interest exactly matches the PDE discretization. This marks a departure from the traditional method of simply using the trapezoidal rule for temporal integration of quantities of interest [10], which is limited to second-order accuracy. Since the fully discrete adjoint method is used, discretization of the quantities of interest is accounted for in the adjoint equations, which is the final component in ensuring discrete consistency of their gradients.

This notion of a solver-consistent discretization of quantities of interest has an immediate connection to adjoint-consistency for stationary problems [29]. A necessary condition for adjoint-consistency is that the quantity of interest is discretized with the same scheme as the governing equations and boundary conditions [31]; however, this does not refer to the practical issue of evaluating the integrals that arise in the finite-dimensional primal or adjoint residual. Solver-consistency ensures the same approximation is used to evaluate the integrals arising the primal residual and quantity of interest, and therefore the adjoint residual. A key property of adjoint-consistent discretizations is they possess optimal convergence rates in the -norm and in quantities of interest [32].

With the high-order numerical discretization in place, and the corresponding adjoint method developed, the proposed methodology is employed to solve deforming-domain PDE-constrained optimization problems, such as optimal control and shape optimization for fluid flows. Existing techniques for parametrizing the domain deformation include

The latter approach is adopted in this work. The proposed methodology, including the high-order primal discretization, fully discrete adjoint method, and domain parametrization, will be demonstrated on relevant optimal control and shape optimization problems. An important component will be the incorporation of solution-based constraints, which has previously been done only through heuristic penalization of the objective function [22].

The remainder of this document is organized as follows. Section 2 introduces the general form of a system of conservation laws defined on deforming domains, as well as the globally high-order discretization using DG-in-space, DIRK-in-time and a solver-consistent discretization of output integrals. Section 3 derives the fully discrete adjoint method corresponding to the high-order discretization from Section 2, with relevant implementational details discussed in Section 4. Section 5 couples the adjoint framework to state-of-the-art optimization software to solve a realistic optimization problem of determining energetically optimal flapping motions of an airfoil. An alternate derivation of the adjoint equations is provided in Appendix A by interpreting the dual variables as Lagrange multipliers of an auxiliary PDE-constrained optimization problem.

2Governing Equations and Discretization

This section is devoted to the treatment of conservation laws on a parametrized, deforming domain using an Arbitrary Lagrangian-Eulerian (ALE) description of the governing equations and exposition of a globally high-order numerical discretization of the ALE form of the system of conservation laws. Subsequently, Section 3 will develop the corresponding fully discrete adjoint equations and the adjoint method for constructing gradients of quantities of interest.

The methods introduced in this work are not necessarily limited to Partial Differential Equations (PDE) that can be written as conservation laws (Equation 2). In Section 2.2, the chosen spatial discretization (discontinuous Galerkin Arbitrary Lagrangian-Eulerian method) is applied to the PDE, resulting in a system of first-order Ordinary Differential Equations (ODE), which is the point of departure for all adjoint-related derivations. Time-dependent PDEs that are not conservation laws can be written similarly at the semi-discrete level after application of an appropriate spatial discretization, e.g. a continuous finite element method for parabolic PDEs. In this work, the scope is limited to first-order temporal systems, or those which are recast as such.

2.1System of Conservation Laws on Deforming Domain: Arbitrary Lagrangian-Eulerian Description

Consider a general system of conservation laws, defined on a parametrized, deforming domain, , written at the continuous level as

where the physical flux is decomposed into an inviscid and a viscous part , is the solution of the system of conservation laws, represents time, and is a vector of parameters. This work will focus on the case where the domain is parametrized by , although extension to other types of parameters, e.g. constants defining the conservation law, is straightforward.

The conservation law on a deforming domain is transformed into a conservation law on a fixed reference domain through the introduction of a time-dependent mapping between the physical and reference domains, resulting in an Arbitrary Lagrangian-Eulerian description of the governing equations.

Denote the physical domain by and the fixed, reference domain by , where is the number of spatial dimensions. At each time , let be a time-dependent diffeomorphism between the reference domain and physical domain: , where is a point in the reference domain and is the corresponding point in the physical domain at time and parameter configuration .

Time-dependent mapping between reference and physical domains.
Time-dependent mapping between reference and physical domains.

The transformed system of conservation laws from (Equation 2), under the mapping , defined on the reference domain takes the form

where denotes spatial derivatives with respect to the reference variables, . The transformed state vector, , and its corresponding spatial gradient with respect to the reference configuration take the form

where , , , and the arguments have been dropped, for brevity. The transformed fluxes are

For details regarding the derivation of the transformed equations, the reader is referred to [17].

When integrated using inexact numerical schemes, this ALE formulation does not satisfy the Geometric Conservation Law (GCL) [38]. This is overcome by introduction of an auxiliary variable , defined as the solution of

The auxiliary variable, is used to modify the transformed conservation law according to

where the GCL-transformed state variables are

and the corresponding fluxes

It was shown in [17] that the transformed equations (Equation 7) satisfy the GCL. In the next section, the ALE description of the governing equations (Equation 3) and (Equation 7) will be converted to first-order form and discretized via a high-order discontinuous Galerkin method.

2.2Spatial Discretization: Arbitrary Lagrangian-Eulerian Discontinuous Galerkin Method

The ALE description of the conservation law without GCL augmentation will be considered first. To proceed, the second-order system of partial differential equations in (Equation 3) is converted to first-order form

where is introduced as an auxiliary variable to represent the spatial gradient of the . Equation (Equation 10) is discretized using a standard nodal discontinuous Galerkin finite element method [39], which after local elimination of the auxiliary variables leads to the following system of ODEs

where is the block-diagonal, symmetric, fixed mass matrix, is the vectorization of at all nodes in the high-order mesh, and is the nonlinear function defining the DG discretization of the inviscid and viscous fluxes.

The GCL augmentation is treated identically, i.e. conversion to first-order form and subsequent application of the discontinuous Galerkin finite element method, where is taken as the state variable. The result is a system of ODEs corresponding to a high-order ALE scheme that satisfies the GCL

where each term is defined according to their counterparts in (Equation 11). From the conservation law defining (Equation 6), the corresponding flux is continuous, implying the physical flux can be used as the numerical flux. This implies no information is required from neighboring elements and (Equation 6) can be solved at the element level, i.e. statically condensed. Furthermore, the residual, , does not depend on itself since the physical flux is independent of .

Since the equation for does not depend on , it can be solved independently of the equation for . This enables to be considered an implicit function of , i.e. through application of the implicit function theorem. Then, (Equation 12) reduces to

Equations (Equation 11) and (Equation 13) are abstracted into the following system of ODEs

for convenience in the derivation of the fully discrete adjoint equations. Evaluation of the residual, , in (Equation 14) at parameter and time requires evaluation of the mapping, and , and , if GCL augmentation is employed. The implicit dependence of on requires special treatment when computing derivatives with respect to , which will be required in the adjoint method (Section Section 3). Discussion of treatment of such terms will be deferred to Section ?.

A convenient property of this DG-ALE scheme is that all computations are performed on the reference domain which is independent of time and parameter. An implication of this is that the mass matrix of the ODE (Equation 14) is also time- and parameter-independent. This property simplifies all adjoint computations introduced in Section 3 as terms involving and are identically zero. This, in turn, simplifies the implementation of the adjoint method and translates to computational savings since contractions with these third-order tensor are not required; see [9] for a discretization with parameter-dependent mass matrices and the corresponding adjoint derivation. In subsequent sections, it will be assumed that the mass matrix is time- and parameter-independent.

The DG-ALE scheme outlined in this section constitutes a spatial discretization, which yields a system of Ordinary Differential Equations (ODEs) when applied to the PDE in (Equation 2). The semi-discrete form of the conservation law is the point of departure for the remainder of this document. The subsequent development applies to any system of ODEs of the form (Equation 14) without relying on the specific spatial discretization scheme employed. The DG-ALE scheme was chosen to provide a high-order, stable spatial discretization of the conservation law (Equation 2). With the high-order spatial discretization of the state equations introduced, focus is shifted toward high-order temporal discretization to yield a globally high-order accurate numerical scheme.

2.3Temporal Discretization: Diagonally Implicit Runge-Kutta

The two prevailing classes of high-order implicit temporal integration schemes are:

BDF schemes are popular since high-order accuracy can be achieved at the cost of the solution of a single nonlinear system of equations of size at each time step. However, they suffer from initialization issues and are limited to second-order, if A-stability is required. In contrast, IRK schemes are single-step methods that can be A-stable and arbitrarily high-order, at the cost of solving an enlarged nonlinear system of equations of size , for an -stage scheme. For practical problems, this can be prohibitively expensive, in terms of memory and CPU time.

Table 1: Butcher Tableau for -stage diagonally implicit Runge-Kutta scheme

A particular subclass of the IRK schemes, known as Diagonally Implicit Runge-Kutta (DIRK) schemes [24], are capable of achieving high-order accuracy with the desired stability properties, without requiring the solution of an enlarged system of equations. The DIRK schemes are defined by a lower triangular Butcher tableau (Table Table 1) and take the following form when applied to (Equation 14)

for and , where are the number of time steps in the temporal discretization and is the number of stages in the DIRK scheme. The temporal domain, is discretized into segments with endpoints , with the th segment having length for . Additionally, in (Equation 15), is used to denote the approximation of at the th stage of time step

From (Equation 15), a complete time step requires the solution of a sequence of nonlinear systems of equation of size .

Given this exposition on the spatio-temporal discretization of the deforming domain conservation law, the expression for the quantity of interest must be discretized. The next section introduces a spatio-temporal discretization of the quantities of interest that is consistent with that of the conservation law itself.

2.4Solver-Consistent Discretization of Quantities of Interest

In this section, the high-order discretization of quantities of interest – space-time integrals of a nonlinear function of the solution of the conservation law – is considered. The output quantity takes the form

In the context of the optimization problem in (Equation 1), corresponds to either the objective or a constraint function.

Discretization of quantities of interest will introduce an additional discretization error, on top of that in the approximation of itself. To ensure the quantity of interest discretization does not dominate, thereby lowering the global order of the scheme, it is necessary that its discretization order matches that of the discretization of the governing equation. Clearly, it is wasteful to discretize this to a higher order than the state equation, using a similar argument.

For these reasons, discretization of (Equation 17) will be done in a solver-consistent manner, i.e. the spatial and temporal discretization used for the governing equation will also be used for the quantities of interest. Define as the approximation of using the DG shape functions from the spatial discretization of the governing equations. Then, the solver-consistent spatial discretization of (Equation 17) becomes

ensuring the spatial integration error in the quantity of interest exactly matches that of the governing equations. Solver-consistent temporal discretization requires the semi-discrete functional in (Equation 18) be converted to an ODE, which is accomplished via differentiation of (Equation 18) with respect to

Augmenting the semi-discrete governing equations with this ODE (Equation 19) yields the system of ODEs

Application of the DIRK temporal discretization introduced in Section 2.3 yields the fully discrete governing equations and corresponding solver-consistent discretization of the quantity of interest (Equation 17)

for , , and is defined in (Equation 16). Finally, the functional in (Equation 17) is evaluated at time to yield the solver-consistent approximation of

While the spatially solver-consistent discretization of quantities of interest is widely used, particularly in the context of finite element methods, temporal discretization is commonly done via low-order quadrature rules, usually the trapezoidal rule [22]. The main advantage of this solver-consistent discretization is the asymptotic discretization order of the governing equation and quantity of interest are guaranteed to exactly match, which ensures there is no wasted error in “over-integrating” one of the terms. Furthermore, solver-consistency extends the notion of adjoint-consistency that requires the discrete adjoint equations represent a consistent discretization of the continuous adjoint equations, but does not necessarily address the practical issue of evaluating the integrals that arise in each term of the finite-dimensional primal and adjoint residual. Solver-consistency ensures the integrals arising in the primal residual and quantity of interest are approximated identically, which in turn implies that each term in the adjoint equations uses the same integral approximation. The solver-consistent discretization also has the advantage of a natural and convenient implementation given the spatial and temporal discretization implementation. Finally, this method has the additional convenience of keeping a high-order accurate “current” value of the integral, i.e. at time step , to high-order accuracy. This property does not hold for high-order numerical quadrature since will involve , where depends on the quadrature rule used.

3Fully Discrete, Time-Dependent Adjoint Equations

The purpose of this section is to derive an expression for the total derivative of the discrete quantity of interest in (Equation 22), which can be expanded as

that does not depend on the sensitivities of the state variables, and . Each of the state variable sensitivities is the solution of a linear evolution equation of the same dimension and number of steps as the primal equation (Equation 15), rendering these quantities intractable to compute when is large. Elimination of the state variable sensitivities from (Equation 23) is accomplished through introduction of the adjoint equations corresponding to the functional , and the corresponding dual variables. From the derivation of the adjoint equation in Section 3.1, an expression for the reconstruction of the gradient of , independent of the state variables sensitivities, follows naturally. At this point, it is emphasized that represents any quantity of interest whose gradient is desired, such as the optimization objective function or a constraint. This section concludes with a discussion of the advantages of the fully discrete framework in the setting of the high-order numerical scheme.

Before proceeding to the derivation of the adjoint method, the following definitions are introduced for the Runge-Kutta stage equations and state updates

for and . Differentiation of these expressions with respect to gives rise to the fully discrete sensitivity equations

where , , and arguments have been dropped.

3.1Derivation

The derivation of the fully discrete adjoint equations corresponding to the quantity of interest, , begins with the introduction of test variables

for and . To eliminate the state sensitivities from the expression for in (Equation 23), multiply the sensitivity equations (Equation 25) by the test variables, integrate (i.e. sum in the discrete setting) over the time domain, and subtract from the expression for the gradient in (Equation 23) to obtain

The right side of the equality in (Equation 26) is an equivalent expression for for any value of the test variables since the terms in the brackets are zero, i.e. the sensitivity equations. Re-arrangement of terms in (Equation 26) leads to the following expression for , where the state variable sensitivities have been isolated

The dual variables, and , which have remained arbitrary to this point, are chosen as the solution to the following equations

for and . These are the fully discrete adjoint equations corresponding to the primal evolution equations in (Equation 24) and quantity of interest . Defining the dual variables as the solution of the adjoint equations in (Equation 28), the expression for in (Equation 27) reduces to

which is independent of the state sensitivities. Finally, elimination of the auxiliary terms, and , in equations (Equation 27) and (Equation 28) through differentiation of their expressions in (Equation 24) gives rise to the adjoint equations

for and and the expression for gradient reconstruction, independent of state sensitivities,

specialized to the case of a DIRK temporal discretization. From inspection of (Equation 31), it is clear that the initial condition sensitivity is the only sensitivity term required to reconstruct . The presence of this term does not destroy the efficiency of the adjoint method for two reasons:

In the first case, can be computed analytically once is known. The next section details efficient computation of using the adjoint method of the steady-state problem.

3.2Parametrization of Initial Condition

Suppose the initial condition is defined as the solution of the nonlinear system of equations – whose Jacobian is invertible at – which is most likely the fully discrete steady-state form of the governing equations

Differentiating with respect to the parameter leads to the expansion

where arguments have been dropped for brevity. Assuming the Jacobian matrix is invertible, multiply the preceding equation by the and rearrange to obtain

This reveals the term can be computed at the cost of one linear system solve of the form and an inner product . The only operation whose cost scales with the size of is the evaluation of and subsequent inner product. Given this exposition on the fully discrete, time-dependent adjoint method and the discrete adjoint method for computing , a discussion is provided detailing the advantages of the fully discrete framework when computing gradients of output quantities before discussing implementation details in Section 4.

3.3Benefits of Fully Discrete Framework

In the context of optimization, the fully discrete adjoint method is advantageous compared to the continuous or semi-discrete version as it is guaranteed that the resulting derivatives will be consistent with the quantity of interest, . This emanates from the fact that in the fully discrete setting, the discretization errors are also differentiated. This property is practically relevant as convergence guarantees and convergence rates of many black-box optimizers are heavily dependent on consistent gradients of optimization functionals.

Additionally, when Runge-Kutta schemes are chosen for the temporal discretization, the fully discrete framework is particularly advantageous since the stages are rarely invariant with respect to the direction of time, that is to say,

where is from the Butcher tableau. Temporal invariance of a Runge-Kutta scheme, as defined in (Equation 32) is significant when computing adjoint variables. During the primal solve, will be computed at for and its stage values at for and . If the same RK scheme is applied to integrate the semi-discrete adjoint equations backward in time, the primal solution will be required at for and . Due to condition (Equation 32), the solution to the primal problem was not computed during the forward solve. Obtaining the primal solution at this time requires interpolation, which complicates the implementation, degrades the accuracy of the computed adjoint variables, and destroys discrete consistency of the computed gradients. This issue does not arise in the fully discrete setting as only terms computed during the primal solve appear in the adjoint equations, by construction.

The next section is devoted to detailing an efficient and modular implementation of the fully discrete adjoint method on deforming domains.

4Implementation

Implementation of the fully discrete adjoint method introduced in Section 3 relies on the computation of the following terms from the spatial discretization

Here, is the mass matrix of the semi-discrete conservation law, and is the spatial residual vector with derivatives (Jacobian) and . As in the previous section, is the discretization of the spatial integral of the output quantity of interest with derivatives and . The mass matrix, spatial flux, Jacobian of spatial flux, and output quantity are standard terms required by an implicit solver and will not be considered further. The Jacobian transpose is explicitly mentioned as additional implementational effort is required when performing parallel matrix transposition. The derivatives with respect to are rarely required outside adjoint method computations and will be considered further in Section 4.1. As indicated in Section 2.2, all relevant derivatives of the mass matrix are zero since it is independent of time, parameter, and state variable, which is an artifact of the transformation to a fixed reference domain.

The parallel implementation of all semi-discrete quantities in (Equation 33) is performed using domain decomposition, where each processor contains a subset of the elements in the mesh, including a halo of elements to be communicated with neighbors [41]. Linear systems of the form

are solved in parallel using a GMRES solver with a block Incomplete-LU (ILU) preconditioner.

Given the availability of all terms in (Equation 33), the solution of the primal problem and integration of the output quantity is given in Algorithm ?. The solution of the corresponding fully discrete adjoint equation, and reconstruction of the gradient of , is given in Algorithm ?.

A well-documented implementational issue corresponding to the unsteady adjoint method pertains to storage and I/O demands. The adjoint equations are solved backward in time and require the solution of the primal problem at each of the corresponding steps/stages. Therefore, the adjoint computations cannot begin until all primal states have been computed. Additionally, this implies all primal states must be stored since they will be required in reverse order during the adjoint computation. For most problems, storing all primal states in memory will be infeasible, requiring disk I/O, which must be performed in parallel to ensure parallel scaling is maintained. There have been a number of strategies to minimize the required I/O operations, such as local-in-time adjoint strategies [42] and checkpointing [43]. For the DG-ALE method in this work, the cost of I/O was not significant compared to the cost of assembly and solving the linearized system of equations.

In this work, the 3DG software [46] was used for the high-order DG-ALE scheme. The temporal discretization and unsteady adjoint method were implemented in the Model Order Reduction Testbed (MORTestbed) [47] code-base, which was used to wrap 3DG such that all data structures, and thus all parallel capabilities, were inherited.

4.1Partial Derivatives of Residuals and Output Quantities

This section details computation of partial derivatives of the residual, , and the output quantity, , with respect to the parameter . The DG-ALE discretizations of Section 2.2, with and without GCL augmentation, are considered separately as the implicit dependence of on requires special treatment.

Without GCL Augmentation

When the GCL augmentation is not considered, the dependence of and on the parameter is solely due to the domain parametrization. Therefore, the following expansion of the partial derivatives with respect to is exploited

where and are determined solely from the domain parametrization in Section 4.2 and the terms

are determined from the form of the governing equations and spatial discretization outlined in Section 2. From the expressions in (Equation 34), the terms in (Equation 35) are not explicitly required in matrix form, rather matrix-vector products with and from Section 4.2 are required.

With GCL Augmentation

For the DG-ALE scheme with GCL augmentation, the dependence of and on the parameter arises from two sources, the domain parametrization and the implicit dependence of on . Therefore, the chain rule expansions in (Equation 34) must include an additional term to account for the dependence of on

Similar to the previous section, the terms and are determined solely from the domain parametrization in Section 4.2 and

are determined from the form of the governing equations and spatial discretization in Section 2. The only remaining term is defined as the solution of the following ODE

obtained by direct differentiation of (Equation 12). The last equality uses the fact that is independent of , which can be deduced from examination of the governing equation for (Equation 6). Equation (Equation 36) is discretized with the same DIRK scheme used for the temporal discretization of the state equation.

4.2Time-Dependent, Parametrized Domain Deformation

A crucial component of the fully discrete adjoint method on deforming domains is a time-dependent parametrization of the domain, amenable to parallel implementation. A parallel implementation is required as domain deformation will involve operations on the entire computational mesh and will be queried at every stage of each time step of both the primal and dual solves, according to Algorithms ? and ?. In this work, the domain parametrization is required to be sufficiently general to handle shape deformation, as well as kinematic motion. Additionally, the domain deformation must be sufficiently smooth to ensure sufficient regularity of the transformed solution, and the spatial and temporal derivatives must be analytically available for fast, accurate computation of the deformation gradient, , and velocity, , of the mapping, .

The domain deformation will be defined by the superposition of a rigid body motion and a spatially varying deformation. To avoid large mesh velocities at the far-field, which could arise from rigid rotations of the body, the blending maps of [17] are used. First, define a spatial configuration consisting of a rigid body motion (, ) and deformation () to the reference domain