OpenMDAO Overview 2019
Justin S. Gray
NASA Glenn Research Center, Cleveland, OH, USA
E-mail: [email protected]

John T. Hwang
University of California, San Diego, San Diego, CA, USA
E-mail: [email protected]

Joaquim R. R. A. Martins
Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI, USA
E-mail: [email protected]

Kenneth T. Moore
DB Consulting Group (NASA Glenn Research Center), Cleveland, OH, USA
E-mail: [email protected]

Bret A. Naylor
DB Consulting Group (NASA Glenn Research Center), Cleveland, OH, USA
E-mail: [email protected]

Abstract Multidisciplinary design optimization (MDO) is concerned with solving design problems involving coupled numerical models of complex engineering systems. While various MDO software frameworks exist, none of them take full advantage of state-of-the-art algorithms to solve coupled models efficiently. Furthermore, there is a need to facilitate the computation of the derivatives of these coupled models for use with gradient-based optimization algorithms to enable design with respect to large numbers of variables. In this paper, we present the theory and architecture of OpenMDAO, an open-source MDO framework that uses Newton-type algorithms to solve coupled systems and exploits problem structure through new hierarchical strategies to achieve high computational efficiency. OpenMDAO also provides a framework for computing coupled derivatives efficiently and in a way that exploits problem sparsity. We demonstrate the framework’s efficiency by benchmarking scalable test problems. We also summarize a number of OpenMDAO applications previously reported in the literature, which include trajectory optimization, wing design, and structural topology optimization, demonstrating that the framework is effective in both coupling existing models and developing new multidisciplinary models from the ground up. Given the potential of the OpenMDAO framework, we expect the number of users and developers to continue growing, enabling even more diverse applications in engineering analysis and design.

Keywords Multidisciplinary design optimization · Coupled systems · Complex systems · Sensitivity analysis · Derivative computation · Adjoint methods · Python

1 Introduction

Numerical simulations of engineering systems have been widely developed and used in industry and academia. Simulations are often used within an engineering design cycle to inform design choices. Design optimization (the use of numerical optimization techniques with engineering simulation) has emerged as a way of incorporating simulation into the design cycle.

Multidisciplinary design optimization (MDO) arose from the need to simulate and design complex engineering systems involving multiple disciplines. MDO serves this need in two ways. First, it performs the coupled simulation of the engineering system, taking into account all the interdisciplinary interactions. Second, it performs the simultaneous optimization of all design variables, taking into account the coupling and the interdisciplinary design tradeoffs. MDO is sometimes referred to as MDAO (multidisciplinary analysis and optimization) to emphasize that the coupled analysis is useful on its own. MDO was first conceived to solve aircraft design problems, where disciplines such as aerodynamics,
structures, and controls are tightly coupled and require design tradeoffs [39]. Since then, numerical simulations have advanced in all disciplines, and the power of computer hardware has increased dramatically. These developments make it possible to advance the state-of-the-art in MDO, but other more specific developments are needed.

There are two important factors when evaluating MDO strategies: implementation effort and the computational efficiency. The implementation effort is arguably the most important because if the work required to implement a multidisciplinary model is too large, the model will simply never be built. One of the main MDO implementation challenges is that each analysis code consists of a specialized solver that is typically not designed to be coupled to other codes or to be used for numerical optimization. Additionally, these solvers are often coded in different programming languages and use different interfaces. These difficulties motivated much of the early development of MDO frameworks, which provided simpler and more efficient ways to link discipline analyses together. While these MDO frameworks introduce important innovations in software design, modular model construction, and user interface design, they treat each discipline analysis as an explicit function evaluation; that is, they assume that each discipline is an explicit mapping between inputs and outputs. This limits the efficiency of the nonlinear solution algorithms that could be used to find a solution to the coupled multidisciplinary system. Furthermore, these MDO frameworks also present the combined multidisciplinary model as an explicit function to the optimizer, which limits the efficiency when computing derivatives for gradient-based optimization of higher-dimensional design spaces. Therefore, while these first framework developments addressed the most pressing issue by significantly lowering the implementation effort for multidisciplinary analysis, they did not provide a means for applying the most efficient MDO techniques.

The computational efficiency of an MDO implementation is governed by the efficiency of the coupled (multidisciplinary) analysis and the efficiency of the optimization. The coupled analysis method that is easiest to implement is a fixed-point iteration (also known as nonlinear block Gauss–Seidel iteration), but for strongly coupled models, Newton-type methods are potentially more efficient [40, 44, 58, 15]. When it comes to numerical optimization, gradient-based optimization algorithms scale much better with the number of design variables than gradient-free methods. The computational efficiency of both Newton-type analysis methods and gradient-based optimization is, in large part, dependent on the cost and accuracy with which the necessary derivatives are computed.

One can always compute derivatives using finite differences, but analytic derivative methods are much more efficient and accurate. Despite the extensive research into analytic derivatives and their demonstrated benefits, they have not been widely supported in MDO frameworks because their implementation is complex and requires deeper access to the analysis code than can be achieved through an approach that treats all analyses as explicit functions. Therefore, users of MDO frameworks that follow this approach are typically restricted to gradient-free optimization methods, or gradient-based optimization with derivatives computed via finite differences.

The difficulty of implementing MDO techniques with analytic derivatives creates a significant barrier to their adoption by the wider engineering community. The OpenMDAO framework aims to lower this barrier and enable the widespread use of analytic derivatives in MDO applications.

Like other frameworks, OpenMDAO provides a modular environment to more easily integrate discipline analyses into a larger multidisciplinary model. However, OpenMDAO V2 improves upon other MDO frameworks by integrating discipline analyses as implicit functions, which enables it to compute derivatives for the resulting coupled model via the unified derivatives equation [67]. The computed derivatives are coupled in that they take into account the full interaction between the disciplines in the system. Furthermore, OpenMDAO is designed to work efficiently in both serial and parallel computing environments. Thus, OpenMDAO provides a means for users to leverage the most efficient techniques, regardless of problem size and computing architecture, without having to incur the significant implementation difficulty typically associated with gradient-based MDO.

This paper presents the design and algorithmic features of OpenMDAO V2 and is structured to cater to different types of readers. For readers wishing to just get a quick overview of what OpenMDAO is and what it does, reading this introduction, the overview of applications (Sec. 7, especially Table 13), and the conclusions (Sec. 8) will suffice. Potential OpenMDAO users should also read Sec. 3, which explains the basic usage and features through a simple example. The remainder of the paper provides a background on MDO frameworks and the history of OpenMDAO development (Sec. 2), the theory behind OpenMDAO (Sec. 4), and the details of the major new contributions in OpenMDAO V2 in terms of multidisciplinary solvers (Sec. 5) and coupled derivative computation (Sec. 6).

2 Background

The need for frameworks that facilitate the implementation of MDO problems and their solution was identified soon after MDO emerged as a field. Various requirements have been identified over the years. Early on, Salas and Townsend [88] detailed a large number of requirements that they categorized under software design, problem formulation, problem execution, and data access. Later, Padula and Gillian
[81] more succinctly cited modularity, data handling, parallel processing, and user interface as the most important requirements. While frameworks that fulfill these requirements to various degrees have emerged, the issue of computational efficiency or scalability has not been sufficiently highlighted or addressed.

The development of commercial MDO frameworks dates back to the late 1990s with iSIGHT [29], which is now owned by Dassault Systèmes and renamed Isight/SEE. Various other commercial frameworks have been developed, such as Phoenix Integration’s ModelCenter/CenterLink, Esteco’s modeFRONTIER, TechnoSoft’s AML suite, Noesis Solutions’ Optimus, SORCER [59], and Vanderplaats’ VisualDOC [2]. These frameworks have focused on making it easy for users to couple multiple disciplines and to use the optimization algorithms through graphical user interfaces (GUIs). They have also been providing wrappers to popular commercial engineering tools. While this focus has made it convenient for users to implement and solve MDO problems, the numerical methods used to converge the multidisciplinary analysis (MDA) and the optimization problem are usually not state-of-the-art. For example, these frameworks often use fixed-point iteration to converge the MDA. When derivatives are needed for a gradient-based optimizer, finite-difference approximations are used rather than more accurate analytic derivatives.

When solving MDO problems, we have to consider how to organize the discipline analysis models, the problem formulation, and the optimization algorithm in order to obtain the optimum design with the lowest computational cost possible. The combination of the problem formulation and organizational strategy is called the MDO architecture. MDO architectures can be either monolithic (where a single optimization problem is solved) or distributed (where the problem is partitioned into multiple optimization subproblems). Martins and Lambe [69] describe this classification in more detail and present all known MDO architectures.

In an attempt to facilitate the exploration of the various MDO architectures, Tedford and Martins [93] developed pyMDO. This was the first object-oriented framework that focused on automating the implementation of different MDO architectures [73]. In pyMDO, the user defined the general MDO problem once, and the framework would reformulate the problem in any architecture with no further user effort. Tedford and Martins [94] used this framework to compare the performance of various MDO architectures, concluding that monolithic architectures vastly outperform the distributed ones. Marriage and Martins [66] integrated a semi-analytic method for computing derivatives based on a combination of finite-differencing and analytic methods, showing that the semi-analytic method outperformed the traditional black-box finite-difference approach.

The origins of OpenMDAO began in 2008, when Moore et al. [75] identified the need for a new MDO framework to address aircraft design challenges at NASA. Two years later, Gray et al. [31] implemented the first version of OpenMDAO (V0.1). An early aircraft design application using OpenMDAO to implement gradient-free efficient global optimization was presented by Heath and Gray [43]. Gray et al. [32] later presented benchmarking results for various MDO architectures using gradient-based optimization with analytic derivatives in OpenMDAO.

As the pyMDO and OpenMDAO frameworks progressed, it became apparent that the computation of derivatives for MDO presented a previously unforeseen implementation barrier that these frameworks needed to address. The methods available for computing derivatives are finite-differencing, complex-step, algorithmic differentiation, and analytic methods. The finite-difference method is popular because it is easy to implement and can always be used, even without any access to source code, but it is subject to large inaccuracies. The complex-step method [91, 70] is not subject to these inaccuracies, but it requires access to the source code to implement. Both finite-difference and complex-step methods become prohibitively costly as the number of design variables increases because they require rerunning the simulation for each additional design variable. Algorithmic differentiation (AD) uses a software tool to parse the code of an analysis tool to produce new code that computes derivatives of that analysis [38, 76]. Although AD can be efficient, even for large numbers of design variables, it does not handle iterative simulations efficiently in general. Analytic methods are the most desirable because they are both accurate and efficient even for iterative simulations [67]. However, they require significant implementation effort. Analytic methods can be implemented in two different forms: the direct method and the adjoint method. The choice between these two methods depends on how the number of functions that we want to differentiate compares to the number of design variables. In practice, the adjoint method tends to be the more commonly used method.

Early development of the adjoint derivative computation was undertaken by the optimal control community in the 1960s and 1970s [11], and the structural optimization community adapted those developments throughout the ’70s and ’80s [1]. This was followed by the development of adjoint methods for computational fluid dynamics [51], and aerodynamic shape optimization became a prime example of an application where the adjoint method has been particularly successful [84, 12, 16]. When computing the derivatives of coupled systems, the same methods that are used for single disciplines apply. Sobieszczanski-Sobieski [90] presented the first derivation of the direct method for coupled systems, and Martins et al. [72] derived the coupled adjoint method. One of the first applications of the coupled adjoint method was in high-fidelity aerostructural optimization [71]. The results from the work on coupled derivatives
highlighted the promise of dramatic computational cost reductions, but also showed that existing frameworks were not able to handle these methods. Their implementation required linear solvers and support for distributed memory parallelism that no framework had at the time.

In an effort to unify the theory for the various methods for computing derivatives, Martins and Hwang [67] derived the unified derivatives equation. This new generalization showed that all the methods for computing derivatives can be derived from a common equation. It also showed that when there are both implicitly and explicitly defined disciplines, the adjoint method and chain rule can be combined in a hybrid approach. Hwang et al. [50] then realized that this theoretical insight provided a sound and convenient mathematical basis for a new software design paradigm and set of numerical solver algorithms for MDO frameworks. Using a prototype implementation built around the unified derivatives equation [50, 68], they solved a large-scale satellite optimization problem with 25,000 design variables and over 2 million state variables. Later, Gray et al. [33] developed OpenMDAO V1, a complete rewrite of the OpenMDAO framework based on the prototype work of Hwang et al. with the added ability to exploit sparsity in a coupled multidisciplinary model to further reduce computational cost.

Collectively, the work cited above represented a significant advancement of the state-of-the-art for MDO frameworks. The unified derivatives equation, combined with the new algorithms and framework design, enabled the solution of significantly larger and more complex MDO problems than had been previously attempted. In addition, OpenMDAO had now integrated three different methods for computing total derivatives into a single framework: finite-difference, analytic, and semi-analytic. However, this work was all done using serial discipline analyses and run on a serial computing environment. The serial computing environment presented a significant limitation, because it precluded the integration of high-fidelity analyses into the coupled models.

To overcome the serial computing limitation, Hwang and Martins [47] parallelized the data structures and solver algorithms from their prototype framework, which led to the modular analysis and unified derivatives (MAUD) architecture. Hwang and Martins [45] used the new MAUD prototype to solve a coupled aircraft allocation-mission-design optimization problem. OpenMDAO V1 was then modified to incorporate the ideas from the MAUD architecture. Gray et al. [36] presented an aeropropulsive design optimization problem constructed in OpenMDAO V1 that combined a high fidelity aerodynamics model with a low fidelity propulsion model, executed in parallel. One of the central features of the MAUD architecture, enabling the usage of parallel computing and high-fidelity analyses, was the use of hierarchical, matrix-free linear solver design. While advantageous for large parallel models, this feature was inefficient for smaller serial models. The need to support both serial and parallel computing architectures led to the development of OpenMDAO V2, a second rewrite of the framework, which is presented in this paper.

Recently, the value of analytic derivatives has also motivated the development of another MDO framework, GEMS, which is designed to implement bi-level distributed MDO architectures that are more useful in some industrial settings [27]. This stands in contrast to OpenMDAO, which is focused mostly on the monolithic MDO architectures for best possible computational efficiency.

3 Overview of OpenMDAO V2

In this section, we introduce OpenMDAO V2, present its overall approach, and discuss its most important feature: efficient derivative computation. To help with the explanations, we introduce a simple model and optimization problem that we use throughout Secs. 3 and 4.

3.1 Basic description

OpenMDAO is an open-source software framework for multidisciplinary design, analysis, and optimization (MDAO), also known as multidisciplinary design optimization (MDO). It is primarily designed for gradient-based optimization; its most useful and unique features relate to the efficient and accurate computation of the model derivatives. We chose the Python programming language to develop OpenMDAO because it makes scripting convenient, it provides many options for interfacing to compiled languages (e.g., SWIG and Cython for C and C++, and F2PY for Fortran), and it is an open-source language. OpenMDAO facilitates the solution of MDO problems using distributed-memory parallelism and high-performance computing (HPC) resources by leveraging MPI and the PETSc library [3].

3.2 A simple example

This example consists of a model with one scalar input, x, two “disciplines” that define state variables (y_1, y_2), and one scalar output, f. The equations for the disciplines are

(Discipline 1)   y_1 = y_2^2,   (1)

(Discipline 2)   \exp(-y_1 y_2) - x y_2 = 0,   (2)

where Discipline 1 computes y_1 explicitly and Discipline 2 computes y_2 implicitly. The equation for the model output f is

f = y_1^2 - y_2 + 3.   (3)
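As a rough illustration of what converging this coupled model involves, the following sketch solves Eqs. (1) and (2) for the two state variables with a hand-written Newton iteration and then evaluates f from Eq. (3). It is plain Python rather than OpenMDAO code: the function name `solve_simple_model`, the starting guess, the tolerance, and the choice of Cramer's rule for the 2x2 Newton step are assumptions made for this example only.

```python
import math


def solve_simple_model(x, y1=1.0, y2=1.0, tol=1e-12, max_iter=50):
    """Converge Eqs. (1)-(2) with Newton's method and return (y1, y2, f).

    Hypothetical helper for illustration; not part of the OpenMDAO API.
    """
    for _ in range(max_iter):
        e = math.exp(-y1 * y2)
        # Both disciplines written in residual form.
        r1 = y1 - y2 ** 2      # Discipline 1, Eq. (1)
        r2 = e - x * y2        # Discipline 2, Eq. (2)
        if max(abs(r1), abs(r2)) < tol:
            break
        # Jacobian of (r1, r2) with respect to (y1, y2).
        a, b = 1.0, -2.0 * y2
        c, d = -y2 * e, -y1 * e - x
        det = a * d - b * c
        # Newton step: solve J * delta = -r with Cramer's rule.
        y1 += (-r1 * d + b * r2) / det
        y2 += (-a * r2 + c * r1) / det
    else:
        raise RuntimeError("Newton iteration did not converge")
    f = y1 ** 2 - y2 + 3.0     # Model output, Eq. (3)
    return y1, y2, f


if __name__ == "__main__":
    y1, y2, f = solve_simple_model(x=1.0)
    print(f"y1 = {y1:.6f}, y2 = {y2:.6f}, f = {f:.6f}")
```

Solving both residuals together, instead of iterating between the disciplines, is the kind of Newton-type approach that the introduction contrasts with fixed-point iteration.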
Figure 1 visualizes the variable dependencies in this model using a design structure matrix. We show components that compute variables on the diagonal and dependencies on the off-diagonals. From Fig. 1, we can easily see the feedback loop between the two disciplines, as well as the overall sequential structure with the model input, the coupled disciplines, and the model output. We will refer back to this model and optimization problem periodically throughout Secs. 3 and 4.

To minimize f with respect to x using gradient-based optimization, we need the total derivative df/dx. In Sec. 3.4 we use this example to demonstrate how OpenMDAO computes the derivative.

[Figure 1: diagram with four blocks on the diagonal: Model input (x = 1), Discipline 1 (y_1 = y_2^2), Discipline 2 (\exp(-y_1 y_2) - x y_2 = 0), and Model output (f = y_1^2 - y_2 + 3); the off-diagonal entries carry x, y_1, and y_2.]

Fig. 1: Extended design structure matrix (XDSM) [62] for the simple model. Components that compute variables are on the diagonal, and dependencies are shown on the off-diagonals, where an entry above the diagonal indicates a forward dependence and vice versa. Blue indicates an independent variable, green indicates an explicit function, and red indicates an implicit function.

3.3 Approach and nomenclature

OpenMDAO uses an object-oriented programming paradigm and an object composition design pattern. Narrowly focused classes, each providing specific functionality, are combined to achieve the desired functionality during execution. In this section, we introduce the four most fundamental types of classes in OpenMDAO: Component, Group, Driver, and Problem. Note that for the Component class, the end user actually works with one of its two derived classes, ExplicitComponent or ImplicitComponent, which we describe later in this section.

MDO has traditionally considered multiple “disciplines” as the units that need to be coupled through coupling variables. In OpenMDAO, we consider more general components, which can represent a whole discipline analysis or can perform a smaller subset of calculations representing only a portion of a whole discipline model. Components share a common interface that allows them to be integrated to form a larger model. This modular approach allows OpenMDAO to automate tasks that are performed repeatedly when building multidisciplinary models. Instances of the Component class provide the lowest-level functionality representing basic calculations. Each component instance maps input values to output values via some calculation. A component could be a simple explicit function, such as y = sin(x); it could involve a long sequence of code; or it could call an external code that is potentially written in another language. In multidisciplinary models, each component can encapsulate just a part of one discipline, a whole discipline, or even multiple disciplines. In our simple example, visualized in Fig. 1, there are four components: Discipline 1 and the model output are components that compute explicit functions, Discipline 2 is a component that computes an implicit function, and the model input is a special type of component with only outputs and no inputs.

Another fundamental class in OpenMDAO is Group, which contains components, other groups, or a mix of both. The containment relationships between groups and components form a hierarchy tree, where a top-level group contains other groups, which contain other groups, and so on, until we reach the bottom of the tree, which is composed only of components. Group instances serve three purposes: (1) they help to package sets of components together, e.g., the components for a given discipline; (2) they help create better-organized namespaces (since all components and variables are named based on their ancestors in the tree); and (3) they facilitate the use of hierarchical nonlinear and linear solvers. In our simple example, the obvious choice is to create a group containing Discipline 1 and Discipline 2, because these two form a coupled pair that needs to be converged for any given value of the model input. The hierarchy of groups and components collectively forms the model.

Children of the Driver base class define algorithms that iteratively call the model. For example, a subclass of Driver might implement an optimization algorithm or execute design of experiments (DOE). In the case of an optimization algorithm, the design variables are a subset of the model inputs, and the objective and constraint functions are a subset of the model outputs.

Instances of the Problem class act as a top-level container, holding all other objects. A problem instance contains both the groups and components that constitute the model hierarchy, and also contains a single driver instance. In addition to serving as a container, a problem also provides the user interface for model setup and execution.

Figure 2 illustrates the relationships between instances of the Component, Group, and Driver classes, and introduces the nomenclature for derivatives. The driver repeatedly calls model (i.e., the top-level instance of Group, which in turn contains groups that contain other groups that contain the component instances). The derivatives of the model outputs with respect to the model inputs are considered to
be total derivatives, while the derivatives of the component outputs with respect to the component inputs are considered to be partial derivatives. This is not the only way to define the difference between partial and total derivatives, but this is a definition that suits the present context and is consistent with previous work on the computation of coupled derivatives [72]. In the next section, we provide a brief explanation of how OpenMDAO computes derivatives.

3.4 Derivative computation

As previously mentioned, one of the major advantages of OpenMDAO is that it has the ability to compute total derivatives for complex multidisciplinary models very efficiently via a number of different techniques. Total derivatives are derivatives of model outputs with respect to model inputs. In the example problem from Sec. 3.2, the total derivative needed to minimize the objective function is just the scalar df/dx. Here, we provide a high-level overview of the process for total derivative computation because the way it is done in OpenMDAO is unique among computational modeling frameworks. The mathematical and algorithmic details of total derivative computation are described in Sec. 4.

Total derivatives are difficult and expensive to compute directly, especially in the context of a framework that must deal with user-defined models of various types. As mentioned in the introduction, there are various options for computing derivatives: finite differencing, complex step, algorithmic differentiation, and analytic methods. The finite-difference method can always be used because it just requires re-running the model with a perturbation applied to the input. However, the accuracy of the result depends heavily on the magnitude of the perturbation, and the errors can be large. The complex-step method yields accurate results, but it requires modifications to the model source code to work with complex numbers. The computational cost of these methods scales with the number of input variables, since the model needs to be re-run for a perturbation in each input. OpenMDAO provides an option to use either of these methods, but their use is only recommended when the ease of implementation justifies the increase in computational cost and loss of accuracy.

As described in the introduction, analytic methods have the advantage that they are both efficient and accurate. OpenMDAO facilitates the derivative computation for coupled systems using analytic methods, including the direct and adjoint variants. To use analytic derivative methods in OpenMDAO, the model must be built such that any internal implicit calculations are exposed to the framework. This means that the model must be cast as an implicit function of design variables and implicit variables with associated residuals that must be converged. For explicit calculations, OpenMDAO performs the implicit transformation automatically, as discussed in Sec. 4.3. When integrating external analysis tools with built-in solvers, this means exposing the residuals and the corresponding state variable vector. Then, the total derivatives are computed in a two-step process: (1) compute the partial derivatives of each component; and (2) solve a linear system of equations that computes the total derivatives. The linear system in Step 2 can be solved in a forward (direct) or a reverse (adjoint) form. As mentioned in the introduction, the cost of the forward method scales linearly with the number of inputs, while the reverse method scales linearly with the number of outputs. Therefore, the choice of which form to use depends on the ratio of the number of outputs to the number of inputs. The details of the linear systems are derived and discussed in Sec. 4. For the purposes of this section, it is sufficient to understand that the total derivatives are computed by solving these linear systems, and that the terms in these linear systems are partial derivatives that need to be provided.

In the context of OpenMDAO, partial derivatives are defined as the derivatives of the outputs of each component with respect to the component inputs. For an ExplicitComponent, which is used when outputs can be computed as an analytic function of the inputs, the partial derivatives are the derivatives of these outputs with respect to the component inputs. For an ImplicitComponent, which is used when a component provides OpenMDAO with residual equations that need to be solved, the partial derivatives are of these residuals with respect to the component input and output variables. Partial derivatives can be computed much more simply and with lower computational cost than total derivatives. OpenMDAO supports three techniques for computing partial derivatives: full-analytic, semi-analytic, and mixed-analytic.

When using the full-analytic technique, OpenMDAO expects each and every component in the model to provide partial derivatives. These partial derivatives can be computed either by hand differentiation or via algorithmic differentiation. For the example model in Sec. 3.2, the partial derivatives can easily be hand-derived. Discipline 1 is an ExplicitComponent defined as y_1 = y_2^2 (one input and one output), so we only need the single partial derivative:

\frac{\partial y_1}{\partial y_2} = 2 y_2.   (4)

Discipline 2 is an ImplicitComponent, so it is defined as a residual that needs to be driven to zero, R = \exp(-y_1 y_2) - x y_2 = 0. In this case, we need the partial derivatives of this
[Figure 2: diagram of a Problem that contains a Driver (e.g., optimizer) and a Group named model. The driver supplies model inputs and receives model outputs and total derivatives. Nested Group instances contain Component instances, each of which takes component inputs and returns component outputs and partial derivatives.]
Fig. 2: Relationship between Driver, Group, and Component classes. An instance of Problem contains a Driver instance,
and the Group instance named “model”. The model instance holds a hierarchy of Group and Component instances. The
derivatives of a model are total derivatives, and the derivatives of a component are partial derivatives.
residual function with respect to all the variables:

\[ \frac{\partial R}{\partial y_1} = -y_2 \exp(-y_1 y_2), \tag{5} \]
\[ \frac{\partial R}{\partial y_2} = -y_1 \exp(-y_1 y_2) - x, \tag{6} \]
\[ \frac{\partial R}{\partial x} = -y_2 . \tag{7} \]

Finally, we also need the partial derivatives of the objective function component:

\[ \frac{\partial f}{\partial y_1} = 2 y_1, \qquad \frac{\partial f}{\partial y_2} = -1 . \tag{8} \]

When using the semi-analytic technique, OpenMDAO automatically computes the partial derivatives for each component using either the finite-difference or complex-step methods. This is different from applying these methods to the whole model because it is done component by component, and therefore it does not require the re-convergence of the coupled system. For instances of an ImplicitComponent, only partial derivatives of the residual functions are needed (e.g., Eqs. (5), (6), and (7) in the example). Since residual evaluations do not involve any nonlinear solver iterations, approximating their partial derivatives is much less expensive and more accurate. The technique is called "semi-analytic" because while the partial derivatives are computed numerically, the total derivatives are still computed analytically by solving a linear system.

In the mixed-technique, some components provide analytic partial derivatives, while others approximate the partials with finite-difference or complex-step methods. The mixed-technique offers great flexibility and is a good option for building models that combine less costly analyses without analytic derivatives and computationally expensive analyses that do provide them. If the complex-step method is used for some of the partial derivatives, the net result is effectively identical to the fully-analytic method. If finite differences are used to compute some of the partial derivatives, then some accuracy is lost, but overall the net result is still better than either the semi-analytic approach or finite differencing the coupled model to compute the total derivatives.

3.5 Implementation of the Simple Example

We now illustrate the use of the OpenMDAO basic classes by showing the code implementation of the simple model we presented in Sec. 3.2.

The run script is listed in Fig. 3. In Block 1, we import several classes from the OpenMDAO API, as well as the components for Discipline 1 and Discipline 2, which we show later in this section. In Block 2, we instantiate the four components shown in Fig. 1, as well as a group that combines the two disciplines, called states_group. In this group, we connect the output of Discipline 1 to the input of Discipline 2 and vice-versa. Since there is coupling within this group, we also assign a Newton solver to be used when running the model and a direct (LU) solver to be used for the linear solutions required for the Newton iterations and the total derivative computation. For the model output, we define a component "inline", using a convenience class provided by the OpenMDAO standard library. In Block 3, we create the top-level group, which we appropriately name model, and we add the relevant subsystems to it and make the necessary connections between inputs and outputs.

In Block 4, we specify the model inputs and model outputs, which in this case correspond to the design variable and objective function, respectively, since we are setting up the model to solve an optimization problem. In Block 5, we create the problem, assign the model and driver, and run setup to signal to OpenMDAO that the problem construction is complete so it can perform the necessary initialization. In Block 6, we illustrate how to set a model input, run the model, and read the value of a model output, and in Block 7, we run the optimization algorithm and print the results.

import numpy as np

# Block 1: OpenMDAO and component imports
from openmdao.api import Problem, Group, ScipyOptimizeDriver
from openmdao.api import IndepVarComp, ExecComp
from openmdao.api import NewtonSolver, DirectSolver
from disciplines import Discipline1, Discipline2

# Block 2: creation of all the components and groups
# except the top-level group
input_comp = IndepVarComp('x')

states_group = Group()
states_group.add_subsystem('discipline1_comp', Discipline1())
states_group.add_subsystem('discipline2_comp', Discipline2())
states_group.connect('discipline1_comp.y1', 'discipline2_comp.y1')
states_group.connect('discipline2_comp.y2', 'discipline1_comp.y2')
states_group.nonlinear_solver = NewtonSolver(iprint=0)
states_group.linear_solver = DirectSolver(iprint=0)

output_comp = ExecComp('f=y1**2-y2+3.')

# Block 3: creation of the top-level group
model = Group()
model.add_subsystem('input_comp', input_comp)
model.add_subsystem('states_group', states_group)
model.add_subsystem('output_comp', output_comp)
model.connect('input_comp.x', 'states_group.discipline2_comp.x')
model.connect('states_group.discipline1_comp.y1', 'output_comp.y1')
model.connect('states_group.discipline2_comp.y2', 'output_comp.y2')

# Block 4: specification of the model input (design variable)
# and model output (objective)
model.add_design_var('input_comp.x')
model.add_objective('output_comp.f')

# Block 5: creation of the problem and setup
prob = Problem()
prob.model = model
prob.driver = ScipyOptimizeDriver()
prob.setup()

# Block 6: set a model input; run the model; and print a model output
prob['input_comp.x'] = 1.
prob.run_model()
print(prob['output_comp.f'])

# Block 7: solve the optimization problem and print the results
prob.run_driver()
print(prob['input_comp.x'], prob['output_comp.f'])

Fig. 3: Run script for the simple example. This script depends on a disciplines file that defines the components for Disciplines 1 and 2 (see Fig. 4).

In Fig. 4, we define the actual computations and partial derivatives for the components for the two disciplines. Both classes inherit from OpenMDAO base classes and implement methods in the component API, but they are different because Discipline 1 is explicit while Discipline 2 is implicit. For both, setup is where the component declares its inputs and outputs, as well as information about the partial derivatives (e.g., sparsity structure and whether to use finite differences to compute them). In Discipline 1, compute maps inputs to outputs, and compute_partials is responsible for providing partial derivatives of the outputs with respect to inputs. In Discipline 2, apply_nonlinear maps inputs and outputs to residuals, and linearize computes the partial derivatives of the residuals with respect to inputs and outputs. More details regarding the API can be found in the documentation on the OpenMDAO website 1.

1 http://www.openmdao.org/docs

The component that computes the objective function is built using the inline ExecComp. ExecComp is a helper class in the OpenMDAO standard library that provides a convenient shortcut for implementing an ExplicitComponent for simple and inexpensive calculations. This provides the user a quick mechanism for adding basic calculations like summing values or subtracting quantities. However, ExecComp uses the complex-step method to compute the derivatives, so it should not be used for expensive calculations or where there is a large input array.

Figure 5 shows a visualization of the model generated automatically by OpenMDAO. The hierarchy structure of the groups and components is shown on the left, and the dependency graph is shown on the right. This diagram is useful for understanding how data is exchanged between components in the model. Any connections above the diagonal indicate feed-forward data relationships, and connections below the diagonal show feedback relationships that require a nonlinear solver.
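The distinction between feed-forward and feedback connections can be illustrated without OpenMDAO. The following is a minimal sketch, assuming that the components execute in the order in which they are added in Fig. 3; the helper name `classify` and the flattened component list are hypothetical and are not part of the OpenMDAO API.

```python
# Assumed execution order of the components in the simple example (Fig. 3).
order = ["input_comp", "discipline1_comp", "discipline2_comp", "output_comp"]

# (source component, target component) pairs taken from the connect() calls.
connections = [
    ("input_comp", "discipline2_comp"),
    ("discipline1_comp", "discipline2_comp"),
    ("discipline2_comp", "discipline1_comp"),
    ("discipline1_comp", "output_comp"),
    ("discipline2_comp", "output_comp"),
]


def classify(order, connections):
    """Split connections into feed-forward and feedback lists (hypothetical helper)."""
    position = {name: i for i, name in enumerate(order)}
    feed_forward, feedback = [], []
    for source, target in connections:
        # A source that runs before its target lies above the diagonal.
        if position[source] < position[target]:
            feed_forward.append((source, target))
        else:
            feedback.append((source, target))
    return feed_forward, feedback


feed_forward, feedback = classify(order, connections)
print("feed-forward:", feed_forward)
print("feedback:", feedback)
```

Under these assumptions, the only feedback connection is the one from Discipline 2 back to Discipline 1, which is why the group containing the two disciplines is the one that needs a nonlinear solver.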
import numpy as np
from openmdao.api import ExplicitComponent, ImplicitComponent


class Discipline1(ExplicitComponent):

    def setup(self):
        self.add_input('y2')
        self.add_output('y1')
        self.declare_partials('y1', 'y2')

    def compute(self, inputs, outputs):
        # Explicit relation y1 = y2^2
        outputs['y1'] = inputs['y2'] ** 2

    def compute_partials(self, inputs, partials):
        # Eq. (4)
        partials['y1', 'y2'] = 2 * inputs['y2']


class Discipline2(ImplicitComponent):

    def setup(self):
        self.add_input('x')
        self.add_input('y1')
        self.add_output('y2')
        self.declare_partials('y2', 'x')
        self.declare_partials('y2', 'y1')
        self.declare_partials('y2', 'y2')

    def apply_nonlinear(self, inputs, outputs, residuals):
        # Residual R = exp(-y1 y2) - x y2
        residuals['y2'] = (np.exp(-inputs['y1'] * outputs['y2']) -
                           inputs['x'] * outputs['y2'])

    def linearize(self, inputs, outputs, partials):
        # Eqs. (7), (5), and (6), respectively
        partials['y2', 'x'] = -outputs['y2']
        partials['y2', 'y1'] = (-outputs['y2'] * np.exp(-inputs['y1'] *
                                                         outputs['y2']))
        partials['y2', 'y2'] = (-inputs['y1'] *
            np.exp(-inputs['y1'] * outputs['y2']) - inputs['x'])
Fig. 4: Definition of the components for Discipline 1 and Discipline 2 for the simple example, including the computation of
the partial derivatives.
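Hand-derived partial derivatives such as Eqs. (4) to (8) are easy to get wrong, so it can be useful to check them against the complex-step method mentioned above. The following is a minimal sketch in plain Python that does not use OpenMDAO; the function names (`discipline1`, `residual2`, `objective`, `complex_step`), the step size, and the evaluation point are assumptions chosen for illustration.

```python
import cmath
import math


def discipline1(y2):
    """Explicit relation of Discipline 1: y1 = y2^2."""
    return y2 * y2


def residual2(x, y1, y2):
    """Residual of Discipline 2: R = exp(-y1 y2) - x y2."""
    return cmath.exp(-y1 * y2) - x * y2


def objective(y1, y2):
    """Objective function: f = y1^2 - y2 + 3."""
    return y1 * y1 - y2 + 3.0


def complex_step(func, args, index, h=1e-30):
    """Approximate d(func)/d(args[index]) as Im[func(... + ih)] / h."""
    perturbed = [complex(a) for a in args]
    perturbed[index] += 1j * h
    return func(*perturbed).imag / h


# Arbitrary evaluation point (not a converged state of the coupled model).
x, y1, y2 = 1.0, 0.5, 0.7
e = math.exp(-y1 * y2)

# Each entry pairs the analytic value with its complex-step approximation.
checks = {
    "dy1/dy2 (Eq. 4)": (2.0 * y2, complex_step(discipline1, (y2,), 0)),
    "dR/dy1 (Eq. 5)": (-y2 * e, complex_step(residual2, (x, y1, y2), 1)),
    "dR/dy2 (Eq. 6)": (-y1 * e - x, complex_step(residual2, (x, y1, y2), 2)),
    "dR/dx (Eq. 7)": (-y2, complex_step(residual2, (x, y1, y2), 0)),
    "df/dy1 (Eq. 8)": (2.0 * y1, complex_step(objective, (y1, y2), 0)),
    "df/dy2 (Eq. 8)": (-1.0, complex_step(objective, (y1, y2), 1)),
}

errors = {name: abs(exact - approx) for name, (exact, approx) in checks.items()}
for name, err in errors.items():
    print(f"{name}: abs error = {err:.2e}")
```

Because the complex-step approximation has no subtractive cancellation, the two values should agree to near machine precision, which makes a mismatch a strong sign of an error in the analytic expression.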
Fig. 5: Visualization of the simple model generated automatically by OpenMDAO. In the hierarchy tree on the left, the darker blue blocks are groups, the lighter blue blocks are components, pink blocks are component inputs, and grey blocks are component outputs.

4 Theory

As previously mentioned, one of the main goals in OpenMDAO is to efficiently compute the total derivatives of the model outputs ($f$) with respect to model inputs ($x$), and we stated that we could do this using partial derivatives computed with analytic methods. For models consisting purely of explicit functions, the basic chain rule can be used to achieve this goal. However, when implicit functions are present in the model (i.e., any functions that require iterative nonlinear solvers), the chain rule is not sufficient. In this section,

\[ \frac{df}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y} \frac{dy}{dx}, \tag{9} \]

where we distinguish the quantity $f$ from the function $F$ that computes it using lowercase and uppercase, respectively. Using this notation, total derivatives account for the implicit relation between variables, while the partial derivatives are just explicit derivatives of a function [47]. The only derivative in the right-hand side of Eq. (9) that is not partial is $dy/dx$, which captures the change in the converged values for $y$ with respect to $x$. Noting the implicit dependence by $R(x, y) = 0$, we can differentiate it with respect to $x$ to obtain

\[ \frac{dr}{dx} = \frac{\partial R}{\partial x} + \frac{\partial R}{\partial y} \frac{dy}{dx} = 0. \tag{10} \]

Re-arranging this equation, we get the linear system

\[ \underbrace{\frac{\partial R}{\partial y}}_{m \times m} \underbrace{\frac{dy}{dx}}_{m \times n} = - \underbrace{\frac{\partial R}{\partial x}}_{m \times n} . \tag{11} \]
Now $dy/dx$ can be computed by solving this linear system, which is constructed using only partial derivatives. This linear system needs to be solved $n$ times, once for each component of $x$, with the column of $\partial R/\partial x$ that corresponds to the element of $x$ as the right-hand side. Then, $dy/dx$ can be used in Eq. (9) to compute the total derivatives. This approach is known as the direct method.

There is another way to compute the total derivatives based on these equations. If we substitute the linear system (11) into the total derivative equation (9), we obtain

\[ \frac{df}{dx} = \frac{\partial F}{\partial x} - \underbrace{\overbrace{\frac{\partial F}{\partial y}}^{1 \times m} \overbrace{\left[\frac{\partial R}{\partial y}\right]^{-1}}^{m \times m}}_{\psi^T} \frac{\partial R}{\partial x} . \tag{12} \]

By grouping the terms $[\partial R/\partial y]^{-1}$ and $\partial F/\partial y$, we get an $m$-vector, $\psi$, which is the adjoint vector. Instead of solving for $dy/dx$ with Eq. (11) (the direct method), we can instead solve a linear system with $[\partial F/\partial y]^T$ as the right-hand side to compute $\psi$:

\[ \underbrace{\left[\frac{\partial R}{\partial y}\right]^T}_{m \times m} \underbrace{\psi}_{m \times 1} = \underbrace{\left[\frac{\partial F}{\partial y}\right]^T}_{m \times 1} . \tag{13} \]

This linear system needs to be solved once for each function of interest $f$. If $f$ is a vector variable, then the right-hand side for each solution is the corresponding row of $\partial F/\partial y$. The transpose of the adjoint vector, $\psi^T$, can then be used to compute the total derivative,

\[ \frac{df}{dx} = \frac{\partial F}{\partial x} - \psi^T \frac{\partial R}{\partial x} . \tag{14} \]

This is the adjoint method, and the derivation above shows why the computational cost of this method is proportional to the number of outputs and independent of the number of inputs. Therefore, if the number of inputs exceeds the number of outputs, the adjoint method is advantageous, while if the opposite is true, then the direct method has the advantage. The main idea of these analytic methods is to compute total derivatives (which account for the solution of the models) using only partial derivatives (which do not require the solution of the models).

As mentioned in Sec. 2, these analytic methods have been extended to MDO applications [90, 72, 67]. All of these methods have been used in MDO applications, but as was discussed in Sec. 2, the implementations tend to be highly application specific and not easily integrated into an MDO framework.

To overcome the challenge of application-specific derivative computations, Hwang and Martins [47] developed the modular analysis and unified derivatives (MAUD) architecture, which provides the mathematical and algorithmic framework to combine the chain rule, direct, and adjoint methods into a single implementation that works even when using models that utilize distributed memory parallelism, such as computational fluid dynamics (CFD) and finite element analysis (FEA) codes.

4.2 Nonlinear Problem Formulation

OpenMDAO V1 and V2 were designed based on the algorithms and data structures of MAUD, but V2 includes several additions to the theory and algorithms to enable more efficient execution for serial models. In this section, we summarize the key MAUD concepts and present the new additions in OpenMDAO V2 that make the framework more efficient for serial models. The core idea of MAUD is to formulate any model (including multidisciplinary models) as a single nonlinear system of equations. This means that we concatenate all variables (model inputs and outputs, and both explicit and implicit component variables) into a single vector of unknowns, $u$. Thus, in all problems, we can represent the model as $R(u) = 0$, where $R$ is a residual function defined in such a way that this system is equivalent to the original model.

For the simple example from Sec. 3.2, our vector of unknowns would be $u = (x, y_1, y_2, f)$, and the correct residual function is

\[ R(u) = \begin{bmatrix} r_x \\ r_{y_1} \\ r_{y_2} \\ r_f \end{bmatrix} = \begin{bmatrix} x - x^* \\ y_1 - y_2^2 \\ \exp(-y_1 y_2) - x y_2 \\ f - (y_1^2 - y_2 + 3) \end{bmatrix} = 0. \tag{15} \]

Although the variable $x$ is not an "unknown" (it has a value that is set explicitly), we reformulate it into an implicit form by treating it as an unknown and adding a residual that forces it to the expected value of $x^*$. Using this approach, any computational model can be written as a nonlinear system of equations such that the solution of the system yields the same outputs and intermediate values as running the original computational model.

Users do not actually need to re-formulate their problems in this fully implicit form because OpenMDAO handles the translation automatically via the ExplicitComponent class, as shown in the code snippet in Fig. 4. However, the framework does rely on the fully implicit formulation for its internal representation.

The key benefit of representing the whole model as a single monolithic nonlinear system is that we can use the unified derivatives equation [67, 47], which generalizes all analytic derivative methods. The unified derivatives equation can be written as

\[ \left[\frac{\partial R}{\partial u}\right] \left[\frac{du}{dr}\right] = I = \left[\frac{\partial R}{\partial u}\right]^T \left[\frac{du}{dr}\right]^T , \tag{16} \]
where $u$ is a vector containing inputs, implicitly defined variables, and outputs, and $R$ represents the corresponding residual functions. The matrix $du/dr$ contains a block with the total derivatives that we ultimately want (i.e., the derivatives of the model outputs with respect to the inputs, $df/dx$). Again, we use lowercase and uppercase to distinguish between quantities and functions, as well as the convention for total and partial derivatives introduced earlier. For the simple example in Sec. 3.2 the total derivative matrix is

\[
\frac{du}{dr} =
\begin{bmatrix}
\frac{dx}{dr_x} & \frac{dx}{dr_{y_1}} & \frac{dx}{dr_{y_2}} & \frac{dx}{dr_f} \\
\frac{dy_1}{dr_x} & \frac{dy_1}{dr_{y_1}} & \frac{dy_1}{dr_{y_2}} & \frac{dy_1}{dr_f} \\
\frac{dy_2}{dr_x} & \frac{dy_2}{dr_{y_1}} & \frac{dy_2}{dr_{y_2}} & \frac{dy_2}{dr_f} \\
\frac{df}{dr_x} & \frac{df}{dr_{y_1}} & \frac{df}{dr_{y_2}} & \frac{df}{dr_f}
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
\frac{dy_1}{dx} & \frac{dy_1}{dr_{y_1}} & \frac{dy_1}{dr_{y_2}} & 0 \\
\frac{dy_2}{dx} & \frac{dy_2}{dr_{y_1}} & \frac{dy_2}{dr_{y_2}} & 0 \\
\frac{df}{dx} & \frac{df}{dr_{y_1}} & \frac{df}{dr_{y_2}} & 1
\end{bmatrix},
\tag{17}
\]

where the middle term shows the expanded total derivative matrix and the right-most term simplifies these derivatives. The middle term is obtained by inserting $u = [x, y_1, y_2, f]^T$ and $r = [r_x, r_{y_1}, r_{y_2}, r_f]^T$. The simplification in the right-most term is possible because from Eq. (15), we know that for example,

– linearize(p, u): Perform any one-time linearization operations, e.g., computing partial derivatives of the residuals or approximating them via finite differences.
– apply_linear(du, dr): Compute a Jacobian-vector product, and place the result in the storage vector. For the forward mode this product is

\[ dr = \left[\frac{\partial R}{\partial u}\right] du , \tag{18} \]

and for the reverse mode, it is

\[ du = \left[\frac{\partial R}{\partial u}\right]^T dr . \tag{19} \]

– solve_linear(du, dr): Multiply the inverse of the Jacobian with the provided right-hand side vector (or solve a linear system to compute the product without explicitly computing the inverse), and place the result in the storage vector. For the forward mode,

\[ du = \left[\frac{\partial R}{\partial u}\right]^{-1} dr , \tag{20} \]
directly implement any of the five basic API methods. Instead, the user implements the compute and compute_partials methods that the ExplicitComponent base class uses to implement the necessary lower level methods, as shown in Algorithm (1). The negative sign in line 8 of Algorithm (1) indicates that the partial derivatives for the implicit transformation are the negative of the partial derivatives for the original explicit function. As shown in Eq. (15), the implicit transformation for the explicit output $f$ is given by

\[ r_f = f - (y_1^2 - y_2 + 3) , \tag{22} \]

which explains the negative sign.

Algorithm 1 ExplicitComponent API
1: function apply_nonlinear(p, u, r)
2:     r ← u − compute(p)
3:     return r
4: function solve_nonlinear(p, u)
5:     u ← compute(p)
6:     return u
7: function linearize(p)
8:     [∂R/∂u] ← −compute_partials(p)
9:     return [∂R/∂u]

For subclasses of ImplicitComponent, such as Discipline2 in Fig. 4, only apply_nonlinear is strictly required, and solve_nonlinear is optional. (The base class implements a method that does not perform any operations.) For many models, such as the example in Fig. 3, it is sufficient to rely on one of the nonlinear solvers in OpenMDAO's standard library to converge the implicit portions of a model. Alternatively, a component that wraps a complex discipline analysis can use solve_nonlinear to call the specialized nonlinear solver built into that analysis code.

In the following section, we discuss the practical matter of using the API methods to accomplish the nonlinear and linear solutions required to execute OpenMDAO models. In both the nonlinear and linear cases, there are two strategies employed, depending on the details of the underlying model being worked with: monolithic and hierarchical. While in our discussion we recommend using each strategy for certain types of models, in actual models, the choice does not need to be purely one or the other. Different strategies can be employed at different levels of the model hierarchy to match the particular needs of any specific model.

In addition, it is important to note that the usage of one strategy for the nonlinear solution does not prescribe that same strategy for the linear solution. In fact, it is often the case that a model using the hierarchical nonlinear strategy would also use the monolithic linear strategy. The converse is also true: Models that use the monolithic nonlinear strategy will often use the hierarchical linear strategy. This asymmetry of nonlinear and linear solution strategies is one of the central features in OpenMDAO that enables the framework to work efficiently with a range of models that have vastly different structures and computational needs.

5 Monolithic and Hierarchical Solution Strategies

OpenMDAO uses a hierarchical arrangement of groups and components to organize models, define execution order, and control data passing. This hierarchical structure can also be used to define nonlinear and linear solver hierarchies for models. While in some cases it is better to match the solver hierarchy closely to that of the model structure, in most cases, better performance is achieved when the solver structure is more monolithic than the associated model. The framework provides options for both nonlinear and linear solvers, and allows the user to mix them at the various levels of the model hierarchy to customize the solver strategy for any given model.

The hierarchical model structure and solver structure used in OpenMDAO were first proposed as part of the MAUD architecture [47]. In addition, MAUD also included several algorithms that implement monolithic and hierarchical solvers in the model hierarchy that OpenMDAO also adopted: monolithic Newton's method, along with hierarchical versions of nonlinear block Gauss–Seidel, nonlinear block Jacobi, linear block Gauss–Seidel, and linear block Jacobi. In addition to these solvers, OpenMDAO V2 implements a new hierarchical nonlinear solver that improves performance for very tightly coupled models (e.g., hierarchical Newton's method). It also includes a monolithic linear solver strategy that enables much greater efficiency for serial models.

This section describes the new contributions in OpenMDAO, along with a summary of the relevant strategies and solver algorithms adopted from the MAUD architecture.

5.1 Nonlinear Solution Strategy

Although the user may still implement any explicit analyses in the traditional form using ExplicitComponent, OpenMDAO internally transforms all models into the implicit form defined by MAUD, i.e., $R(u) = 0$. For the simple example problem from Sec. 3.2, this transformation is given by Eq. (15). While the transformation is what makes it possible to leverage the unified derivatives equation to compute total derivatives, it also yields a much larger implicit system that now represents the complete multidisciplinary model including all intermediate variables. The larger system is more challenging to converge, and may not be solvable in monolithic form. OpenMDAO provides a hierarchical nonlinear strategy that allows individual subsystems in the model to be solved first, which makes the overall problem more tractable. The hierarchical nonlinear strategy represents a trade-off between solution robustness and solution efficiency because it is typically more robust and more expensive.

5.1.1 Monolithic Nonlinear Strategy

In some cases, treating the entire model as a single monolithic block provides a simple and efficient solution strategy. This is accomplished with a pure Newton's method that iteratively applies updates to the full $u$ vector until the residual vector is sufficiently close to zero, via

\[ \left[\frac{\partial R}{\partial u}\right] \Delta u = -r . \tag{23} \]

In practice, pure Newton's method is usually used together with a globalization technique, such as a line search, to improve robustness for a range of initial guesses. OpenMDAO's Newton solver uses these methods in its actual implementation. For simplicity, we omit the globalization techniques from the following descriptions. Since these techniques do not change the fundamentals of Newton's method, we can do this without loss of generality.

Algorithm 2 shows the pseudocode for a pure Newton's method implemented using the OpenMDAO API. All variables are treated as implicit and updated in line 4, which uses solve_linear to implement Eq. (23). Note that solve_nonlinear is never called anywhere in Algorithm 2; only apply_nonlinear is called to compute the residual vector, $r$. This means that no variables, not even outputs of an ExplicitComponent, have their values directly set by their respective components. When the pure Newton's method works, as is the case for the states_group in the example model shown in Fig. 5, it is a highly efficient algorithm for solving a nonlinear system. The challenge with pure Newton's method is that even with added globalization techniques, it still may not converge for especially complex models with large numbers of states. Pure Newton's method is particularly challenging to apply to large multidisciplinary models built from components that wrap disciplinary analyses with their own highly customized nonlinear solver algorithms. This is because some specialized disciplinary solvers include customized globalization schemes (e.g., pseudo time continuation) and linear solver preconditioners that a pure Newton's method applied at the top level of the model cannot directly take advantage of.

Algorithm 2 Pure Newton's Method
1: r ← apply_nonlinear(p, u, r)
2: while ||r|| > ε do
3:     [∂R/∂u] ← linearize(p, u)
4:     Δu ← solve_linear(−r)
5:     u ← u + Δu
6:     r ← apply_nonlinear(p, u, r)

5.1.2 Hierarchical Nonlinear Strategy

For some models, the monolithic nonlinear strategy may be numerically unstable and fail to converge on a solution. In those cases, the hierarchical strategy may provide more robust solver behavior. Consider that each level of the model hierarchy, from the top level model group all the way down to the individual components, contains a subset of the unknowns vector, $u_{child}$, and the corresponding residual equations, $R_{child}(u_{child}) = 0$. For any level of the hierarchy, a given subsystem (which can be a component or group of components) is a self-contained nonlinear system, where any variables from external components or groups are inputs that are held constant for that subsystem's solve_nonlinear. Therefore, we can apply a nonlinear solver to any subsystem in the hierarchy to converge that specific subset of the nonlinear model. The hierarchical nonlinear strategy takes advantage of this subsystem property to enable more robust top level solvers.

OpenMDAO implements a number of nonlinear solution algorithms that employ a hierarchical strategy. The most basic two algorithms are the nonlinear block-Gauss–Seidel and nonlinear block-Jacobi algorithms used by Hwang and Martins [47]. Both of these algorithms use simple iterative strategies that repetitively call solve_nonlinear for all the child subsystems in sequence, until the residuals are sufficiently converged.

OpenMDAO V2 introduces a new hierarchical Newton's method solver that extends the use of this strategy to multidisciplinary models composed of a set of more tightly coupled subsystems. Compared to the pure Newton's method of Algorithm (2), the hierarchical Newton algorithm adds an additional step that recursively calls solve_nonlinear on all child subsystems of the parent system, as shown in Algorithm 3.

Algorithm 3 Hierarchical Newton's Method
1: for all child in subsystems do
2:     u_child ← child.solve_nonlinear(p_child, u_child)
3: r ← apply_nonlinear(p, u, r)
4: while ||r|| > ε do
5:     [∂R/∂u] ← linearize(p, u)
6:     Δu ← solve_linear(−r)
7:     u ← u + Δu
8:     for all child in subsystems do
9:         u_child ← child.solve_nonlinear(p_child, u_child)
10:    r ← apply_nonlinear(p, u, r)

We refer to Algorithm 3 as the hierarchical Newton's method, because although each child subsystem solves for its own unknowns ($u_{child}$), the parent groups are responsible
14 Gray, Hwang, Martins, Moore, and Naylor
for those same unknowns as well. Since each level of the 5.2 Linear Solution Strategy
hierarchy sees the set of residuals from all of its children,
the size of the Newton system (the number of state variables As discussed above, some nonlinear solvers require their
it is converging) increases as one moves higher up the hier- own linear solvers to compute updates for each iteration.
archy, making it increasingly challenging to converge. The OpenMDAO also uses a linear solver to compute total deriva-
recursive solution of subsystems acts as a form of nonlinear tives via Eq. (16). The inclusion of linear solvers in the frame-
preconditioning or globalization to help stabilize the solver, work, and the manner in which they can be combined, is one
but fundamentally, the top level Newton solver is dealing of the unique features of OpenMDAO.
with the complete set of all residual equations from the en- There are two API methods that are useful for imple-
tire model. menting linear solvers: apply linear and solve linear.
In an analogous fashion to the nonlinear solvers, the linear
There is another, arguably more common, formulation solvers can employ either a monolithic or hierarchical strat-
for applying Newton’s method to nested models where the egy. In this context, a monolithic strategy is one that works
solver at any level of the model hierarchy sees only the sub- with the entire partial derivatives Jacobian (∂ R/∂ u) as a sin-
set of the implicit variables that it alone is responsible for. gle block in-memory. A hierarchical linear strategy is one
In this formulation, the Newton system at any level is much that leverages a matrix-free approach.
smaller because it does not inherit the states and residuals
from any child systems. Instead, it treats any child calcu- 5.2.1 Hierarchical Linear Strategy
lations as if they were purely explicit. We refer to this for-
mulation as the “reduced-space Newton’s method”. In Ap- The hierarchical linear solver strategy is an approach that
pendix B, we prove that the application of the hierarchi- relies on the use of the apply linear and solve linear
cal Newton’s method yields the exact same solution path as methods in the OpenMDAO API. As such, it is a matrix-free
that of a reduced-space Newton’s method. The proof demon- strategy. This strategy was originally proposed by Hwang
strates that exact recursive solutions for uchild (i.e., Rchild (uchild ) =and Martins [47], and we refer the reader to that work for
0) (lines 1, 2, 8, and 9 in Algorithm 3) reduce the search space for the parent solver to only the subset of the u vector that is owned exclusively by the current system and not by any of the solvers from its children.

While perfect sub-convergence is necessary to satisfy the conditions of the proof, in practice, it is not necessary to fully converge the child subsystems for every top level hierarchical Newton iteration. Once the nonlinear system has reached a sufficient level of convergence, the recursion can be turned off, reverting the solver to the more efficient monolithic strategy.

A hybrid strategy that switches between monolithic and hierarchical strategies was investigated by Chauhan et al. [15] in a study where they found that the best performing nonlinear solver algorithm changes with the strength of the multidisciplinary coupling. Their results underscore the need for OpenMDAO to support both hierarchical and monolithic nonlinear solver architectures, because they show that different problems require different treatments. The mixture of the two often yields the best compromise between stability and performance.

a more detailed presentation of these concepts, including an extensive treatment of how parallel data passing is integrated into this linear strategy. OpenMDAO implements the hierarchical linear solver strategy proposed by MAUD to support integration with computationally expensive analyses, i.e., parallel distributed memory analyses such as CFD and FEA. Models that benefit from this strategy tend to have fewer than ten components that are computationally expensive, with at least one component having on the order of a hundred thousand unknowns. The linear block-Gauss–Seidel and linear block-Jacobi solvers are the two solvers in the OpenMDAO standard library that use the hierarchical strategy. Algorithms (4) and (5) detail the forward (direct) formulation of the two hierarchical linear solvers. There are also separate reverse (adjoint) formulations for these solvers, which are explained in more detail by Hwang and Martins [47]. For integration with PDE solvers, the forward and reverse forms of these algorithms allow OpenMDAO to leverage existing, highly specialized linear solvers used by discipline analyses as part of a recursive preconditioner for a top-level OpenMDAO Krylov subspace solver in a coupled multidisciplinary model [47].
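The forward-mode linear block-Jacobi iteration mentioned above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not OpenMDAO's implementation: the function name `linear_block_jacobi`, the nested-list `blocks` layout, and the small two-subsystem example are all hypothetical. Each diagonal block stands in for a child's `solve_linear`, and each off-diagonal block stands in for a sibling's `apply_linear`.

```python
import numpy as np


def linear_block_jacobi(blocks, rhs, tol=1e-12, max_iter=500):
    """Solve sum_j blocks[i][j] @ du[j] = rhs[i] with a block-Jacobi iteration.

    blocks[i][i] is child i's own Jacobian block (stand-in for solve_linear);
    blocks[i][j] with i != j is the coupling from child j (stand-in for apply_linear).
    """
    n = len(rhs)
    du = [np.zeros_like(r, dtype=float) for r in rhs]
    for _ in range(max_iter):
        # Residual of the full coupled linear system at the current iterate.
        res = [rhs[i] - sum(blocks[i][j] @ du[j] for j in range(n)) for i in range(n)]
        if np.sqrt(sum(np.dot(r, r) for r in res)) <= tol:
            break
        # Each child solves against the previous iterate of its siblings,
        # so the child solves are independent of one another.
        du = [
            np.linalg.solve(
                blocks[i][i],
                rhs[i] - sum(blocks[i][j] @ du[j] for j in range(n) if j != i),
            )
            for i in range(n)
        ]
    return du


# Hypothetical two-subsystem model with weak coupling (assumed values).
A11 = np.array([[4.0, 1.0], [1.0, 3.0]])
A12 = np.array([[0.5], [0.2]])
A21 = np.array([[0.3, 0.1]])
A22 = np.array([[5.0]])
blocks = [[A11, A12], [A21, A22]]
rhs = [np.array([1.0, 2.0]), np.array([3.0])]

du = linear_block_jacobi(blocks, rhs)
reference = np.linalg.solve(np.block(blocks), np.concatenate(rhs))
```

Because every child uses the previous iterate of its siblings, the solves inside one sweep could run in parallel; a block Gauss–Seidel variant would instead use the newest available values and run the children in sequence.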
Algorithm 5 Linear Block Jacobi (forward mode)
1: while ||r|| > ε do
2:   for all child_i in subsystems do
3:     du_child_i ← child_i.solve_linear(dr_i)
4:   for all child_i in subsystems do
5:     for all child_j in subsystems : i ≠ j do
6:       dr_i ← dr_i − child_j.apply_linear(du_child_j)

5.2.2 Monolithic Linear Strategy

Although the hierarchical linear solver strategy is an efficient approach for models composed of computationally expensive analyses, it can introduce significant overhead for models composed of hundreds or thousands of computationally simple components. The hierarchical linear solver strategy relies on the use of the apply_linear and solve_linear methods, which only provide linear operators that must be recursively called on the entire model hierarchy. While recursion is generally expensive in and of itself, the cost is exacerbated because OpenMDAO is written in Python, an interpreted language where loops are especially costly. For many models, it is feasible to assemble the entire partial derivative Jacobian matrix in memory, which then allows the use of a direct factorization to solve the linear system more efficiently. As long as the cost of computing the factorization is reasonable, this approach is by far the simplest and most efficient way to implement the solve_linear method. This represents a significant extension from the previously developed hierarchical formulation [47], and as we will show in Sec. 5.3, this approach is crucial for good computational performance on models with many components.

The matrix assembly can be done using either a dense matrix or a sparse matrix. In the sparse case, OpenMDAO relies on the components to declare the nonzero partial derivatives, as shown in Fig. 4. Broadly speaking, at the model level, the partial derivative Jacobian is almost always very sparse, even for simple models. Figure 5, which includes a visualization of ∂R/∂u^T, shows that even a small model has a very sparse partial derivative Jacobian. In the vast majority of cases, the factorization is more efficient when using a sparse matrix assembly.

The monolithic linear solver strategy is primarily designed to be used with a direct linear solver. A direct factorization is often the fastest, and certainly the simplest type of linear solver to apply. However, this strategy can also be used with a Krylov subspace solver, assuming we either do not need to use a preconditioner or want to use a preconditioner that is also compatible with the monolithic strategy (e.g., incomplete LU factorization). Krylov subspace solvers are unique because they can be used with both the hierarchical and monolithic linear solver strategies, depending on what type of preconditioner is applied.

Monolithic and hierarchical linear solver strategies can be used in conjunction with each other as part of a larger model. At any level of the model hierarchy, a monolithic strategy can be used, which causes all components below that level to store their partial derivatives in the assembled Jacobian matrix. Above that level, however, a hierarchical linear solver strategy can still be used. This mixed linear solver strategy is crucial for achieving good computational efficiency for larger models. Aeropropulsive design optimization is a good example where this is necessary. Gray et al. [36] coupled a RANS CFD analysis to a 1-D propulsion model using OpenMDAO with a hierarchical linear solver strategy to combine the matrix-free Krylov subspace from the CFD with the monolithic direct solver used for the propulsion analysis.

5.3 Performance Study for Mixed Linear Solver Strategy

The specific combination of hierarchical and monolithic linear solvers that will give the best performance is very model-specific, which is why OpenMDAO’s flexibility to allow different combinations is valuable.

This sensitivity of computational performance to solver strategy can be easily demonstrated using an example model built using the OpenAeroStruct [53] library. OpenAeroStruct is a modular, lower-fidelity, coupled aerostructural modeling tool which is built on top of OpenMDAO V2. Consider a notional model that computes the average drag coefficient for a set of aerostructural wing models at different angles of attack, as shown in Fig. 6. The derivatives of average drag with respect to the shape design variables can be computed via a single reverse mode linear solution. This reverse mode solution was tested with two separate solver strategies: (1) pure monolithic with a direct solver at the top of the model hierarchy; (2) mixed hierarchical/monolithic with a linear block Gauss–Seidel solver at the top and a direct solver for each angle of attack case. Figure 7 compares the computational costs of these two linear solver strategies and examines how the computational cost scales with increasing number of components. For this problem, the scaling is achieved by increasing the number of angle of attack conditions included in the average. As the number of angle of attack cases increases, the number of components and number of variables goes up as well, since each case requires its own aerostructural analysis group. The data in Fig. 7 shows that both linear solver strategies scale nearly linearly with an increasing number of variables, which indicates very good scaling for the direct solver. This solver relies on the sparse LU factorization in SciPy [80]. However, the difference between the purely monolithic and the mixed hierarchical/monolithic linear solver strategies is 1.5 to 2 orders of magnitude in total computation time. This difference in computational cost is roughly independent of problem size, which demonstrates
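As a rough illustration of the monolithic strategy of Sec. 5.2.2, the sketch below assembles a small sparse ∂R/∂u, factorizes it once with SciPy's sparse LU (`scipy.sparse.linalg.splu`), and reuses the factorization for both a forward (direct) and a reverse (adjoint) total-derivative solve. The matrices and the variable names (`dR_du`, `dR_dx`, `df_du`, `df_dx`) are made-up assumptions for a four-state, two-design-variable model; this is not code from OpenMDAO.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

# Assumed partial derivatives of a hypothetical model R(x, u) = 0 with output f(x, u).
dR_du = csc_matrix(np.array([
    [2.0, 0.0, 0.0, 0.0],
    [-1.0, 3.0, 0.0, 0.5],
    [0.0, -1.0, 4.0, 0.0],
    [0.0, 0.0, -2.0, 5.0],
]))                                            # 4 residuals x 4 states (sparse)
dR_dx = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0],
                  [0.5, 0.0]])                 # 4 residuals x 2 design variables
df_du = np.array([[0.0, 0.0, 1.0, 1.0]])       # 1 function x 4 states
df_dx = np.array([[0.1, 0.0]])                 # 1 function x 2 design variables

# Factorize the assembled Jacobian once; both modes reuse the factors.
lu = splu(dR_du)

# Forward (direct) mode: one linear solve per design variable.
du_dx = -lu.solve(dR_dx)
total_forward = df_dx + df_du @ du_dx

# Reverse (adjoint) mode: one transposed solve per function of interest.
psi = lu.solve(np.ascontiguousarray(df_du.T), trans="T")
total_reverse = df_dx - psi.T @ dR_dx
```

Both modes give the same total derivative df/dx; which one is cheaper depends on whether there are fewer design variables (forward) or fewer functions of interest (reverse).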
computing the total derivative sparsity. More details are included in Appendix A.

[Figure 10: plot with curves labeled "No coloring" and "With coloring"; axis labels "Normalized time" and "Number of design variables".]

Fig. 10: Comparison of total derivatives computation time computed with (blue) and without (orange) the combined linear solution feature.

6.1.1 Computational Savings from Combined Linear Solutions

The combined linear solution feature is leveraged by the Dymos optimal control library, which is built using OpenMDAO. To date, Dymos has been used to solve a range of optimal control problems [24], including canonical problems such as Bryson’s minimum time to climb problem [10], as well as the classic brachistochrone problem posed by Bernoulli [6]. It has also been used to solve more complex optimal trajectory problems for electric aircraft [23, 89].

To demonstrate the computational improvement, we present results showing how the cost of solving for the total derivatives Jacobian scales with and without the combined linear solutions feature for Bryson’s minimum time to climb problem implemented in the Dymos example problem library. Figure 10 shows the variation of the total derivatives computation time as a function of the number of time steps used in the model. The greater the number of time steps, the greater the number of constraints in the optimization problem for which we need total derivatives. We can see that the combined linear solution offers significant reductions in the total derivative computation time, and more importantly, shows superior computational scaling.

6.2 Sparsity from Quasi-decoupled Parallel Models

Combining multiple linear solutions offers significant computational savings with no requirement for additional memory allocation; thus, it is a highly efficient technique for reducing the computational cost of solving for total derivatives, even when running in serial. However, it is not possible to use that approach for all models. In particular, a common model structure that relies on parallel execution for computational efficiency, which we refer to as “quasi-decoupled”, prevents the use of combined linear solutions and demands a different approach to exploit its sparsity. In this section, we present a method for performing efficient linear solutions for derivatives of quasi-decoupled systems that enables the efficient use of parallel computing resources for reverse mode linear solutions.

A quasi-decoupled model is one with an inexpensive serial calculation bottleneck at the beginning, followed by a more computationally costly set of parallel calculations for independent model outputs. The data passing in this model is such that one set of outputs gets passed to multiple downstream components that can run in parallel. A typical example of this structure can be found in multipoint models, where the same analysis is run at several different points, e.g., multiple CFD analyses that are run for the same geometry, but at different flow conditions [85, 77, 26]. In these cases, the geometric calculations that translate the model inputs to the computational grid are the serial bottleneck, and the multiple CFD analyses are the decoupled parallel computations, which can be solved in an embarrassingly parallel fashion. This model can be run efficiently in the forward direction for nonlinear solutions (making it practically forward decoupled), but the linear reverse mode solutions to compute total derivatives can no longer be run in parallel.

One possible solution to address this challenge is to employ a constraint aggregation approach [60, 63]. This approach allows the adjoint method to work efficiently because it collapses many constraint values into a single scalar, hence recovering the adjoint method efficiency. Though this may work in some cases, constraint aggregation is not well-suited to problems where the majority of the constraints being aggregated are active at the optimal solution, as is the case for equality constraints. In these situations, the conservative nature of the aggregation function is problematic because it prevents the optimizer from tightly satisfying all the equalities. Kennedy and Hicken [57] developed improved aggregation methods that offer a less conservative and more numerically stable formulation, but despite the improvements, aggregation is still not appropriate for all applications. In these cases, an alternate approach is needed to maintain efficient parallel reverse mode (adjoint) linear solutions.

When aggregation cannot be used, OpenMDAO uses a solution technique that retains the parallel efficiency at the cost of a slight increase in required memory. First, the memory allocated for the serial bottleneck calculations in the right-hand side and solution vectors is duplicated across all processors. Only the variables associated with the bottleneck
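The conservativeness of constraint aggregation discussed in this section can be illustrated with the Kreisselmeier–Steinhauser (KS) function, a commonly used aggregate. This choice is an assumption made for illustration: the text here does not state which aggregation function is meant. The helper name `ks_aggregate` and the value of `rho` are likewise hypothetical.

```python
import numpy as np


def ks_aggregate(g, rho=50.0):
    """KS aggregate of constraint values g_i <= 0 (smooth upper bound on max(g)).

    Shifting by max(g) before exponentiating avoids floating-point overflow.
    """
    g = np.asarray(g, dtype=float)
    g_max = np.max(g)
    return g_max + np.log(np.sum(np.exp(rho * (g - g_max)))) / rho


# Inactive constraints: the aggregate stays close to the largest value.
g_inactive = np.array([-0.30, -0.05, -0.01])
ks_inactive = ks_aggregate(g_inactive)

# All constraints exactly active (as for satisfied equality constraints):
# every g_i is 0, yet the aggregate is strictly positive.
g_active = np.zeros(3)
ks_active = ks_aggregate(g_active)
```

The KS value always lies between max(g) and max(g) + ln(n)/rho. When all n constraints are active at once, the aggregate sits at the upper end of that range, so an optimizer enforcing KS ≤ 0 is pushed away from the point where the constraints are exactly satisfied. This is the conservativeness the text refers to.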
[Table: application problems. Model-structure diagrams omitted; cells marked … were not recoverable.]

Problem | Model structure | Design variables | Objective | Constraints
CubeSat MDO | Optimizer; attitude dynamics; thermal; solar power; energy storage; comm. | Solar panel angle, antenna angle, num. radiators, power distribution, attitude profile, solar panel controls | Data downloaded | Batt. charge rate, batt. charge level
… | … | … | Fuel burn | …
… | Aerodynamics (3-D CFD); functionals (CL, CD) | Shape | Drag coefficient | Lift coefficient
Allocation-design optimization | Optimizer; aerodynamics (3-D CFD); dynamics; aerodynamic surrogate; propulsion; profit | Wing shape, altitude profiles, cruise Mach, allocation | Profit | Wing geometry, thrust, demand, fleet limits
Aero-propulsive optimization | Optimizer; aerodynamics (3-D CFD); propulsion (CEA); functionals | Inlet shape | Fuel burn | Trim
… | Optimizer; structures (2-D FEA) | Element densities | Compliance | Mass fraction
implementation of a mixed hierarchical-monolithic linear solver strategy in the coupled model.

Jasa et al. [53] developed OpenAeroStruct, a low-fidelity wing aerostructural library whose development was motivated by the absence of a tool for fast wing design optimization. OpenAeroStruct implements a vortex lattice model for the aerodynamic analysis and a beam finite-element model for structural analysis. These analyses are coupled, enabling the aerostructural analysis of lifting surfaces. Each of the models was implemented in OpenMDAO from the ground up, making use of the hierarchical representation and solvers for the best possible coupled solution efficiency (seconds using one processor) as demonstrated in Fig. 7. As a result, OpenAeroStruct efficiently computes aerostructural derivatives through the coupled-adjoint method, enabling fast aerostructural analysis (solutions in seconds) and optimizations (converged results in minutes). OpenAeroStruct has already been used in a number of applications, including the multifidelity optimization under uncertainty of a tailless aircraft [25, 13, 82, 20, 21, 8, 61, 96, 19, 4, 83, 14]. Significant computational efficiency was achieved for OpenAeroStruct by using the sparse-assembled Jacobian matrix feature with a monolithic linear solver strategy, thanks to the highly sparse nature of many of the underlying calculations.

Chung et al. [17] developed a framework for setting up structural topology optimization problems and formulations. Using this platform, they implemented three popular topology optimization approaches. Even though structural topology optimization involves only one discipline, they found that the framework benefited from the modularity and the more automated derivative computation. The increased modularity made it easier to restructure and extend the code, allowing the authors to quickly change the order of operations in the process to demonstrate the importance of correct sequencing. This structural topology optimization framework is expected to facilitate future developments in multiscale and multidisciplinary topology optimization. This work, in addition to OpenAeroStruct, provides an excellent example of using OpenMDAO as a low-level software layer to develop new disciplinary analysis solvers.

Hwang and Ning [49] developed and integrated low-fidelity propeller, aerodynamic, structural, and mission analysis models using OpenMDAO for NASA’s X-57 Maxwell research aircraft, which features distributed electric propulsion. They solved MDO problems with up to 101 design variables and 74 constraints that converged in a few hundred model evaluations. Numerical experiments showed that the scaling of the optimization time with the number of mission points was, at worst, linear. The inclusion of a fully transient mission analysis model of the aircraft performance was shown to offer significantly different results than a basic multipoint optimization formulation. The need to include the transient analysis is an example of why analytic derivatives are needed for these types of problems: they offer the required computational efficiency and accuracy that could not be achieved using monolithic finite differencing.

Other work that used OpenMDAO V2 includes a framework for the solution of ordinary differential equations [48], a conceptual design model for aircraft electric propulsion [9], and a mission planning tool for the X-57 aircraft [89]. Application-focused work has included the design of a next-generation airliner considering operations and economics [87], design and trajectory optimization of a morphing wing aircraft [52], and trajectory optimization of an aircraft with a fuel thermal management system [54]. OpenMDAO is also being used extensively by the wind energy community for wind turbine design [79, 5, 98, 99, 74, 30, 22] and wind farm layouts [95, 92].

8 Conclusions

The OpenMDAO framework was developed to facilitate the multidisciplinary analysis and design optimization of complex engineering systems. While other frameworks exist that address the same purpose, OpenMDAO has evolved in the last few years to incorporate state-of-the-art algorithms that enable it to address optimization problems of unprecedented scale and complexity.

Two main areas of development made this possible: algorithms for the solution of coupled systems and methods for derivative computation. The development of efficient derivative computation was motivated by the fact that gradient-based optimization is our only hope for solving large-scale problems that involve computationally expensive models; thus, efficient gradient computations that are scalable are required. Because most models and coupled systems exhibit some degree of sparsity in their problem structure, OpenMDAO takes advantage of the sparsity for both storage and computation.

To achieve the efficient solution of coupled systems, OpenMDAO implements known state-of-the-art monolithic methods and has developed a flexible hierarchical approach that enables users to group models according to the problem structure so that computations can be nested, parallelized, or both.

To compute total coupled derivatives efficiently in a scalable way, OpenMDAO uses analytic methods in two modes: forward and reverse. The forward mode is equivalent to the coupled direct method, and its cost scales linearly with the number of design variables. The reverse mode is equivalent to the coupled adjoint method, and its cost scales linearly with the number of functions of interest but is independent of the number of design variables. This last characteristic is particularly desirable because many problems have a number of design variables that is larger than the number of functions of interest (objective and constraints). Furthermore, in cases with large numbers of constraints, these can often be aggregated.

Problem sparsity is also exploited in the coupled derivative computation by a new approach we developed that uses graph coloring. We also discussed a few other techniques to increase the efficiency of derivative computations using the hierarchical problem representation.

The algorithms in OpenMDAO work best if the residuals of the systems involved are available, but when they are not available, it is possible to formulate the models solely in terms of their inputs and outputs.

The efficiency and scalability of OpenMDAO were demonstrated in several examples. We also presented an overview of various previously published applications of OpenMDAO to engineering design problems, including satellite, wing, and aircraft design. Some of these problems involved tens of thousands of design variables and a similar number of constraints. Other problems involved costly high-fidelity models, such as CFD and finite element structural analysis with millions of degrees of freedom. While the solution of the problems in these applications would have been possible with single purpose implementations, OpenMDAO made it possible to use state-of-the-art methods with a much lower development effort.

Based on the experience of these applications, we conclude that while OpenMDAO can handle traditional disciplinary analysis models effectively, it is most efficient when these models are developed from the ground up using OpenMDAO with a fine-grained modularity to take full advantage of the problem sparsity, lower implementation effort, and built-in derivative computation.

Replication of Results

Most of the codes required to replicate the results in this paper are available under open source licenses and are maintained in version control repositories. The OpenMDAO framework is available from GitHub (github.com/OpenMDAO). The OpenMDAO website (openmdao.org) provides installation instructions and a number of examples. The code for the simple example of Sec. 3 is listed in Figs. 3 and 4 and can be run as is once OpenMDAO is installed. The scripts used to produce the scaling plots in Figs. 7 and 10 are available as supplemental material in this paper. In addition to requiring OpenMDAO to be installed, these scripts require OpenAeroStruct (github.com/mdolab/OpenAeroStruct) and Dymos (github.com/OpenMDAO/dymos). The scaling plots for the AMD problem (Fig. 12) involve a complex framework that includes code that is not open source, and therefore, we are not able to provide scripts for these results. Finally, although no results are shown in applications mentioned in Sec. 7, the code for two of these applications, OpenAeroStruct (github.com/mdolab/OpenAeroStruct) and the satellite MDO (github.com/OpenMDAO/CADRE), is also available.

Acknowledgements

The authors would like to thank the NASA ARMD Transformational Tools and Technologies project for their support of the OpenMDAO development effort. Joaquim Martins was partially supported by the National Science Foundation (award number 1435188). We would like to acknowledge the invaluable advice from Gaetan Kenway and Charles Mader on efficient implementation of OpenMDAO for high-fidelity work. Also invaluable was the extensive feedback provided by Eric Hendricks, Rob Falck, and Andrew Ning. Finally, we would also like to thank Nicolas Bons, Benjamin Brelje, Anil Yildirim, and Shamsheer Chauhan for their review of this manuscript and their helpful suggestions.

Conflict of interest statement: On behalf of all authors, the corresponding author states that there is no conflict of interest.

References

1. Arora J, Haug EJ (1979) Methods of design sensitivity analysis in structural optimization. AIAA Journal 17(9):970–974, doi:10.2514/3.61260
2. Balabanov V, Charpentier C, Ghosh DK, Quinn G, Vanderplaats G, Venter G (2002) Visualdoc: A software system for general purpose integration and design optimization. In: 9th AIAA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, Atlanta, GA
3. Balay S, Abhyankar S, Adams M, Brown J, Buschelman K, Dalcin L, Dener A, Eijkhout V, Gropp W, Karpeyev D, Kaushik D, Knepley M, May D, McInnes LC, Mills R, Munson T, Rupp K, Sanan P, Smith B, Zampini S, Zhang H (2018) PETSc users manual. Tech. Rep. ANL-95/11 - Revision 3.10, Argonne National Laboratory
4. Baptista R, Poloczek M (2018) Bayesian optimization of combinatorial structures. In: Dy J, Krause A (eds) Proceedings of the 35th International Conference on Machine Learning, PMLR, Stockholmsmässan, Stockholm Sweden, Proceedings of Machine Learning Research, vol 80, pp 462–471, URL http://proceedings.mlr.press/v80/baptista18a.html
5. Barrett R, Ning A (2018) Integrated free-form method for aerostructural optimization of wind turbine blades. Wind Energy 21(8):663–675, doi:10.1002/we.2186
6. Bernoulli J (1696) A new problem to whose solution mathematicians are invited. Acta Eruditorum 18:269
7. Betts JT, Huffman WP (1991) Trajectory optimization on a parallel processor. Journal of Guidance, Control, and Dynamics 14(2):431–439, doi:10.2514/3.20656
8. Bons NP, He X, Mader CA, Martins JRRA (2017) Multimodality in aerodynamic wing design optimization. In: 18th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Denver, CO, doi:10.2514/6.2017-3753
9. Brelje BJ, Martins JRRA (2018) Development of a conceptual design model for aircraft electric propulsion with efficient gradients. In: Proceedings of the AIAA/IEEE Electric Aircraft Technologies Symposium, Cincinnati, OH, doi:10.2514/6.2018-4979
10. Bryson AE (1999) Dynamic optimization. Addison Wesley Longman Menlo Park, CA
11. Bryson AE, Ho YC (1975) Applied Optimal Control: Optimization, Estimation, and Control. John Wiley & Sons
12. Carrier G, Destarac D, Dumont A, Méheut M, Din ISE, Peter J, Khelil SB, Brezillon J, Pestana M (2014) Gradient-based aerodynamic optimization with the elsA software. In: 52nd Aerospace Sciences Meeting, doi:10.2514/6.2014-0568
13. Chaudhuri A, Lam R, Willcox K (2017) Multifidelity uncertainty propagation via adaptive surrogates in coupled multidisciplinary systems. AIAA Journal pp 235–249, doi:10.2514/1.J055678
14. Chauhan SS, Martins JRRA (2018) Low-fidelity aerostructural optimization of aircraft wings with a simplified wingbox model using openaerostruct. In: Proceedings of the 6th International Conference on Engineering Optimization, EngOpt 2018, Springer, Lisbon, Portugal, pp 418–431, doi:10.1007/978-3-319-97773-7_38
15. Chauhan SS, Hwang JT, Martins JRRA (2018) An automated selection algorithm for nonlinear solvers in MDO. Structural and Multidisciplinary Optimization 58(2):349–377, doi:10.1007/s00158-018-2004-5
16. Chen S, Lyu Z, Kenway GKW, Martins JRRA (2016) Aerodynamic shape optimization of the Common Research Model wing-body-tail configuration. Journal of Aircraft 53(1):276–293, doi:10.2514/1.C033328
17. Chung H, Hwang JT, Gray JS, Kim HA (2018) Implementation of topology optimization using OpenMDAO. In: 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA, AIAA, Kissimmee, FL, doi:10.2514/6.2018-0653
18. Coleman TF, Verma A (1998) The efficient computation of sparse Jacobian matrices using automatic differentiation. SIAM Journal of Scientific Computing 19(4):1210–1233
19. Cook LW (2018) Effective formulations of optimization under uncertainty for aerospace design. PhD thesis, University of Cambridge, doi:10.17863/CAM.23427
20. Cook LW, Jarrett JP, Willcox KE (2017) Extending horsetail matching for optimization under probabilistic, interval, and mixed uncertainties. AIAA Journal pp 849–861, doi:10.2514/1.J056371
21. Cook LW, Jarrett JP, Willcox KE (2017) Horsetail matching for optimization under probabilistic, interval and mixed uncertainties. In: 19th AIAA non-deterministic approaches conference, p 0590, doi:10.2514/6.2017-0590
22. Dykes K, Damiani R, Roberts O, Lantz E (2018) Analysis of ideal towers for tall wind applications. In: 2018 Wind Energy Symposium, AIAA, doi:10.2514/6.2018-0999
23. Falck RD, Chin JC, Schnulo SL, Burt JM, Gray JS (2017) Trajectory optimization of electric aircraft subject to subsystem thermal constraints. In: 18th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Denver, CO
24. Falck RD, Gray JS, Naylor B (2019) Optimal control within the context of multidisciplinary design, analysis, and optimization. In: AIAA SciTech Forum, AIAA, doi:10.2514/6.2019-0976
25. Friedman S, Ghoreishi SF, Allaire DL (2017) Quantifying the impact of different model discrepancy formulations in coupled multidisciplinary systems. In: 19th AIAA non-deterministic approaches conference, p 1950, doi:10.2514/6.2017-1950
26. Gallard F, Meaux M, Montagnac M, Mohammadi B (2013) Aerodynamic aircraft design for mission performance by multipoint optimization. In: 21st AIAA Computational Fluid Dynamics Conference, American Institute of Aeronautics and Astronautics, doi:10.2514/6.2013-2582, URL http://dx.doi.org/10.2514/6.2013-2582
27. Gallard F, Lafage R, Vanaret C, Pauwels B, Guénot D, Barjhoux PJ, Gachelin V, Gazaix A (2017) GEMS: A python library for automation of multidisciplinary design optimization process generation. In: 18th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference
28. Gebremedhin AH, Manne F, Pothen A (2005) What color is your Jacobian? Graph coloring for computing derivatives. SIAM Review 47(4):629–705
29. Golovidov O, Kodiyalam S, Marineau P, Wang L, Rohl P (1998) Flexible implementation of approximation concepts in an MDO framework. In: 7th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, American Institute of Aeronautics and Astronautics, doi:10.2514/6.1998-4959
30. Graf P, Dykes K, Damiani R, Jonkman J, Veers P (2018) Adaptive stratified importance sampling: hybridization of extrapolation and importance sampling monte carlo methods for estimation of wind turbine extreme loads. Wind Energy Science (Online) 3(2), doi:10.5194/wes-3-475-2018
31. Gray J, Moore KT, Naylor BA (2010) OpenMDAO: An open source framework for multidisciplinary analysis and optimization. In: Proceedings of the 13th AIAA/ISSMO Multidisciplinary Analysis Optimization Conference, Fort Worth, TX, AIAA 2010-9101
32. Gray J, Moore KT, Hearn TA, Naylor BA (2013) Standard platform for benchmarking multidisciplinary design analysis and optimization architectures. AIAA Journal 51(10):2380–2394, doi:10.2514/1.J052160
33. Gray J, Hearn T, Moore K, Hwang JT, Martins JRRA, Ning A (2014) Automatic evaluation of multidisciplinary derivatives using a graph-based problem formulation in OpenMDAO. In: Proceedings of the 15th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Atlanta, GA, doi:10.2514/6.2014-2042
34. Gray JS, Martins JRRA (2018) Coupled aeropropulsive design optimization of a boundary layer ingestion propulsor. The Aeronautical Journal (In press)
35. Gray JS, Chin J, Hearn T, Hendricks E, Lavelle T, Martins JRRA (2017) Chemical equilibrium analysis with adjoint derivatives for propulsion cycle analysis. Journal of Propulsion and Power 33(5):1041–1052, doi:10.2514/1.B36215
36. Gray JS, Mader CA, Kenway GKW, Martins JRRA (2017) Modeling boundary layer ingestion using a coupled aeropropulsive analysis. Journal of Aircraft doi:10.2514/1.C034601
37. Gray JS, Kenway GKW, Mader CA, Martins JRRA (2018) Aeropropulsive design optimization of a turboelectric boundary layer ingestion propulsion. In: 2018 Aviation Technology, Integration, and Operations Conference, AIAA, Atlanta, GA, doi:10.2514/6.2018-3976
38. Griewank A (2000) Evaluating Derivatives. SIAM, Philadelphia
39. Haftka RT (1977) Optimization of flexible wing structures subject to strength and induced drag constraints. AIAA Journal 15(8):1101–1106, doi:10.2514/3.7400
ence, part of AIAA Aviation 2016 (Washington, DC), doi:10.2514/6.2016-4297
43. Heath C, Gray J (2012) OpenMDAO: Framework for flexible multidisciplinary design, analysis and optimization methods. In: Proceedings of the 53rd AIAA Structures, Structural Dynamics and Materials Conference, Honolulu, HI, AIAA-2012-1673
44. Heil M, Hazel AL, Boyle J (2008) Solvers for large-displacement fluid–structure interaction problems: segregated versus monolithic approaches. Computational Mechanics 43(1):91–101, doi:10.1007/s00466-008-0270-6
45. Hwang JT, Martins JRRA (2015) Parallel allocation-mission optimization of a 128-route network. In: Proceedings of the 16th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Dallas, TX
46. Hwang JT, Martins JRRA (2016) Allocation-mission-design optimization of next-generation aircraft using a parallel computational framework. In: 57th AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, American Institute of Aeronautics and Astronautics, doi:10.2514/6.2016-1662
47. Hwang JT, Martins JRRA (2018) A computational architecture for coupling heterogeneous numerical models and computing coupled derivatives. ACM Transactions on Mathematical Software (In press)
48. Hwang JT, Munster DW (2018) Solution of ordinary differential equations in gradient-based multidisciplinary design optimization. In: 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA, AIAA, Kissimmee, FL, doi:10.2514/6.2018-1646
49. Hwang JT, Ning A (2018) Large-scale multidisciplinary optimization of an electric aircraft for on-demand mobility. In: 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA, AIAA, Kissimmee, FL, doi:10.2514/6.2018-1384
50. Hwang JT, Lee DY, Cutler JW, Martins JRRA (2014) Large-scale multidisciplinary optimization of a small
40. Haftka RT, Sobieszczanski-Sobieski J, Padula SL satellite’s design and operation. Journal of Spacecraft
(1992) On options for interdisciplinary analysis and and Rockets 51(5):1648–1663, doi:10.2514/1.A32751
design optimization. Structural Optimization 4:65–74, 51. Jameson A (1988) Aerodynamic design via control the-
doi:10.1007/BF01759919 ory. Journal of Scientific Computing 3(3):233–260
41. He P, Mader CA, Martins JRRA, Maki KJ (2018) An 52. Jasa JP, Hwang JT, Martins JRRA (2018) Design and
aerodynamic design optimization framework using a trajectory optimization of a morphing wing aircraft.
discrete adjoint approach with openfoam. Computers In: 2018 AIAA/ASCE/AHS/ASC Structures, Structural
& Fluids doi:10.1016/j.compfluid.2018.04.012, ¡p¿in Dynamics, and Materials Conference; AIAA SciTech
press¡/p¿ Forum, Orlando, FL
42. Hearn DT, Hendricks E, Chin J, Gray JS, Moore DKT 53. Jasa JP, Hwang JT, Martins JRRA (2018) Open-
(2016) Optimization of turbine engine cycle analy- source coupled aerostructural optimization using
sis with analytic derivatives. In: 17th AIAA/ISSMO Python. Structural and Multidisciplinary Optimization
Multidisciplinary Analysis and Optimization Confer- 57:1815–1827, doi:10.1007/s00158-018-1912-8
54. Jasa JP, Mader CA, Martins JRRA (2018) Trajectory optimization of supersonic air vehicle with thermal fuel management system. In: AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Atlanta, GA, doi:10.2514/6.2018-3884
55. Jones M, Plassmann P (1993) A parallel graph coloring heuristic. SIAM Journal on Scientific Computing 14(3):654–669, doi:10.1137/0914041
56. Karp RM, Wigderson A (1985) A fast parallel algorithm for the maximal independent set problem. Journal of the Association for Computing Machinery 32(4):762–773
57. Kennedy GJ, Hicken JE (2015) Improved constraint-aggregation methods. Computer Methods in Applied Mechanics and Engineering 289:332–354, doi:10.1016/j.cma.2015.02.017
58. Keyes DE, McInnes LC, Woodward C, Gropp W, Myra E, Pernice M, Bell J, Brown J, Clo A, Connors J, Constantinescu E, Estep D, Evans K, Farhat C, Hakim A, Hammond G, Hansen G, Hill J, Isaac T, Jiao X, Jordan K, Kaushik D, Kaxiras E, Koniges A, Lee K, Lott A, Lu Q, Magerlein J, Maxwell R, McCourt M, Mehl M, Pawlowski R, Randles AP, Reynolds D, Riviere B, Rude U, Scheibe T, Shadid J, Sheehan B, Shephard M, Siegel A, Smith B, Tang X, Wilson C, Wohlmuth B (2013) Multiphysics simulations: Challenges and opportunities. International Journal of High Performance Computing Applications 27(1):4–83, doi:10.1177/1094342012468181
59. Kolonay RM, Sobolewski M (2011) Service oriented computing environment (SORCER) for large scale, distributed, dynamic fidelity aeroelastic analysis. In: Optimization, International Forum on Aeroelasticity and Structural Dynamics, IFASD 2011, 26–30
60. Kreisselmeier G, Steinhauser R (1979) Systematic control design by optimizing a vector performance index. In: International Federation of Active Controls Symposium on Computer-Aided Design of Control Systems, Zurich, Switzerland, doi:10.1016/S1474-6670(17)65584-8
61. Lam R, Poloczek M, Frazier P, Willcox KE (2018) Advances in Bayesian optimization with applications in aerospace engineering. In: 2018 AIAA Non-Deterministic Approaches Conference, p 1656, doi:10.2514/6.2018-1656
62. Lambe AB, Martins JRRA (2012) Extensions to the design structure matrix for the description of multidisciplinary design, analysis, and optimization processes. Structural and Multidisciplinary Optimization 46:273–284, doi:10.1007/s00158-012-0763-y
63. Lambe AB, Martins JRRA, Kennedy GJ (2017) An evaluation of constraint aggregation strategies for wing box mass minimization. Structural and Multidisciplinary Optimization 55(1):257–277, doi:10.1007/s00158-016-1495-1
64. Lyu Z, Kenway GK, Paige C, Martins JRRA (2013) Automatic differentiation adjoint of the Reynolds-averaged Navier–Stokes equations with a turbulence model. In: 21st AIAA Computational Fluid Dynamics Conference, San Diego, CA, doi:10.2514/6.2013-2581
65. Mader CA, Martins JRRA, Alonso JJ, van der Weide E (2008) ADjoint: An approach for the rapid development of discrete adjoint solvers. AIAA Journal 46(4):863–873, doi:10.2514/1.29123
66. Marriage CJ, Martins JRRA (2008) Reconfigurable semi-analytic sensitivity methods and MDO architectures within the πMDO framework. In: Proceedings of the 12th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Victoria, British Columbia, Canada, AIAA 2008-5956
67. Martins JRRA, Hwang JT (2013) Review and unification of methods for computing derivatives of multidisciplinary computational models. AIAA Journal 51(11):2582–2599, doi:10.2514/1.J052184
68. Martins JRRA, Hwang JT (2016) Multidisciplinary design optimization of aircraft configurations – Part 1: A modular coupled adjoint approach. Lecture series, Von Karman Institute for Fluid Dynamics, Rode Saint Genèse, Belgium, ISSN 0377-8312
69. Martins JRRA, Lambe AB (2013) Multidisciplinary design optimization: A survey of architectures. AIAA Journal 51(9):2049–2075, doi:10.2514/1.J051895
70. Martins JRRA, Sturdza P, Alonso JJ (2003) The complex-step derivative approximation. ACM Transactions on Mathematical Software 29(3):245–262, doi:10.1145/838250.838251
71. Martins JRRA, Alonso JJ, Reuther JJ (2004) High-fidelity aerostructural design optimization of a supersonic business jet. Journal of Aircraft 41(3):523–530, doi:10.2514/1.11478
72. Martins JRRA, Alonso JJ, Reuther JJ (2005) A coupled-adjoint sensitivity analysis method for high-fidelity aero-structural design. Optimization and Engineering 6(1):33–62, doi:10.1023/B:OPTE.0000048536.47956.62
73. Martins JRRA, Marriage C, Tedford NP (2009) pyMDO: An object-oriented framework for multidisciplinary design optimization. ACM Transactions on Mathematical Software 36(4):20:1–20:25, doi:10.1145/1555386.1555389
74. McWilliam MK, Zahle F, Dicholkar A, Verelst D, Kim T (2018) Optimal aero-elastic design of a rotor with bend-twist coupling. Journal of Physics: Conference Series 1037(4):042009, URL http://stacks.iop.org/1742-6596/1037/i=4/a=042009
75. Moore K, Naylor B, Gray J (2008) The development of an open-source framework for multidisciplinary analysis and optimization. In: Proceedings of the 10th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Victoria, BC, Canada, AIAA 2008-6069
76. Naumann U (2011) The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation, SIAM
77. Nemec M, Zingg DW, Pulliam TH (2004) Multipoint and multi-objective aerodynamic shape optimization. AIAA Journal 42(6):1057–1065, doi:10.2514/1.10415
78. Nielsen EJ, Kleb WL (2006) Efficient construction of discrete adjoint operators on unstructured grids using complex variables. AIAA Journal 44(4):827–836, doi:10.2514/1.15830
79. Ning A, Petch D (2016) Integrated design of downwind land-based wind turbines using analytic gradients. Wind Energy 19(12):2137–2152, doi:10.1002/we.1972
80. Oliphant TE (2007) Python for scientific computing. Computing in Science and Engineering 9(3):10, doi:10.1109/MCSE.2007.58
81. Padula SL, Gillian RE (2006) Multidisciplinary environments: A history of engineering framework development. In: Proceedings of the 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, doi:10.2514/6.2006-7083, AIAA 2006-7083
82. Palar PS, Shimoyama K (2017) Polynomial-chaos-kriging-assisted efficient global optimization. In: Computational Intelligence (SSCI), 2017 IEEE Symposium Series on, IEEE, pp 1–8, doi:10.1109/SSCI.2017.8280831
83. Peherstorfer B, Beran PS, Willcox KE (2018) Multifidelity Monte Carlo estimation for large-scale uncertainty propagation. In: 2018 AIAA Non-Deterministic Approaches Conference, p 1660, doi:10.2514/6.2018-1660
84. Peter JEV, Dwight RP (2010) Numerical sensitivity analysis for aerodynamic optimization: A survey of approaches. Computers and Fluids 39(3):373–391, doi:10.1016/j.compfluid.2009.09.013
85. Reuther JJ, Jameson A, Alonso JJ, Rimlinger MJ, Saunders D (1999) Constrained multipoint aerodynamic shape optimization using an adjoint formulation and parallel computers, part 1. Journal of Aircraft 36(1):51–60, doi:10.2514/2.2413
86. Roy S, Crossley WA, Moore KT, Gray JS, Martins JRRA (2018) Next generation aircraft design considering airline operations and economics. In: 2018 AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA, Kissimmee, FL, doi:10.2514/6.2018-1647
87. Roy S, Crossley WA, Moore KT, Gray JS, Martins JRRA (2018) Next generation aircraft design considering airline operations and economics. In: AIAA/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, Kissimmee, FL, doi:10.2514/6.2018-1647
88. Salas AO, Townsend JC (1998) Framework requirements for MDO application development. In: 7th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, pp 98–4740
89. Schnulo SL, Jeff Chin RDF, Gray JS, Papathakis KV, Clarke SC, Reid N, Borer NK (2018) Development of a multi-segment mission planning tool for SCEPTOR X-57. In: 2018 Multidisciplinary Analysis and Optimization Conference, AIAA, Atlanta, GA, doi:10.2514/6.2018-3738
90. Sobieszczanski-Sobieski J (1990) Sensitivity of complex, internally coupled systems. AIAA Journal 28(1):153–160, doi:10.2514/3.10366
91. Squire W, Trapp G (1998) Using complex variables to estimate derivatives of real functions. SIAM Review 40(1):110–112
92. Stanley APJ, Ning A (2018) Coupled wind turbine design and layout optimization with non-homogeneous wind turbines. Wind Energy Science doi:10.5194/wes-2018-54
93. Tedford NP, Martins JRRA (2006) On the common structure of MDO problems: A comparison of architectures. In: Proceedings of the 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Portsmouth, VA, AIAA 2006-7080
94. Tedford NP, Martins JRRA (2010) Benchmarking multidisciplinary design optimization algorithms. Optimization and Engineering 11(1):159–183, doi:10.1007/s11081-009-9082-6
95. Thomas J, Gebraad P, Ning A (2017) Improving the FLORIS wind plant model for compatibility with gradient-based optimization. Wind Engineering 41(5):313–329, doi:10.1177/0309524X17722000
96. Tracey BD, Wolpert D (2018) Upgrading from Gaussian processes to Student’s t processes. In: 2018 AIAA Non-Deterministic Approaches Conference, p 1659, doi:10.2514/6.2018-1659
97. Welsh DJA, Powell MB (1967) An upper bound for the chromatic number of a graph and its application to timetabling problems. The Computer Journal 10(1):85–86, doi:10.1093/comjnl/10.1.85
98. Zahle F, Tibaldi C, Pavese C, McWilliam MK, Blasques JPAA, Hansen MH (2016) Design of an aeroelastically tailored 10 MW wind turbine rotor. Journal of Physics: Conference Series 753(6):062008, URL http://stacks.iop.org/1742-6596/753/i=6/a=062008
99. Zahle F, Sørensen NN, McWilliam MK, Barlas A (2018) Computational fluid dynamics-based surrogate optimization of a wind turbine blade tip extension for maximising energy production. In: Journal of Physics:
A Coloring of Total Derivative Jacobians

A.1 Determining Total Derivative Coloring

In the toy problem illustrated in Fig. 8, the forward coloring of the model inputs is obvious. However, for a large problem with an unordered total Jacobian matrix, it is not easy to identify a coloring. There is a wide variety of serial coloring algorithms [97, 56, 55, 18, 28], originally developed for coloring partial derivative Jacobians. There is also a set of parallel coloring algorithms developed for parallel distributed memory applications, such as CFD [55, 78, 65, 64, 41]. What we propose here is that these coloring algorithms are also applicable to coloring total derivative Jacobian calculations based on the unified derivatives equations (16).

To apply a coloring algorithm, we need to know the total derivative Jacobian sparsity pattern a priori, but this information is not easily available. However, the sparsity pattern of the partial derivative Jacobian matrix, $\partial R/\partial u$, is known to OpenMDAO a priori from the combination of user-declared partial derivatives and the connections made during model construction. Therefore, we developed a method in OpenMDAO that computes the total derivative sparsity given the partial derivative Jacobian sparsity.

To determine the total Jacobian sparsity pattern for a given state, OpenMDAO computes a randomized total derivative Jacobian using linear solutions of Eq. (16) with randomized values for $\partial R/\partial u$. Intuitively, one can understand how using randomized partial derivatives would yield a relatively robust estimate of the total derivative sparsity pattern, but we provide a more detailed logical argument for why this approach is appropriate in Sec. A.2. A single randomized total derivative Jacobian is likely to give the correct sparsity pattern, but we can reduce the likelihood of errors in the sparsity by summing the absolute values of multiple randomly generated total derivative Jacobians.

Once we have the total derivative sparsity pattern, OpenMDAO applies a coloring algorithm based on the work of Coleman and Verma [18] to identify the reduced set of linear solutions needed to compute the total derivative Jacobian. As we demonstrate in Sec. 5.3, coloring can offer significant performance improvements for problems that have sparse total derivative Jacobians.

A.2 Justification for Coloring with Randomized Total Derivative Jacobians

In theory, it would be possible to color the total derivative Jacobian based on the actual Jacobian computed around the initial condition of the model. There is a risk, however, that the initial condition of the model will happen to be at a point where some of the total derivatives in the model are incidentally zero, although they will take nonzero values elsewhere in the design space. The incidental zero would potentially result in an incorrect coloring, and so it is to be avoided if possible. Instead of using the actual Jacobian values, we generate a randomized total derivative Jacobian that is statistically highly unlikely to create any incidental zero values.

From Eq. (16), we know that the total derivative Jacobian matrix, $du/dr$, is equal to the inverse of the partial derivative Jacobian matrix, $\partial R/\partial u$. Our task reduces to the general mathematical problem of determining the sparsity structure of a matrix inverse given the sparsity structure of the matrix itself, assuming that the matrix is large. First, we note that we do not need the sparsity structure of all of $du/dr$; we only need the rows and columns corresponding to $df/dx$, which is almost always a much smaller matrix because $du/dr$ contains all the intermediate model variables as well as the model inputs and model outputs. From Cramer’s rule, we know that

\[
\left[ \frac{\partial R}{\partial u}^{-1} \right]_{ij} = \frac{\operatorname{adj}(\partial R/\partial u)_{ij}}{\det(\partial R/\partial u)} , \tag{25}
\]

which is to say that the $(i, j)$th entry of the inverse of the Jacobian is the quotient of the $(i, j)$th entry of the adjugate of the Jacobian and the determinant of the Jacobian. The Jacobian is invertible for well-posed models, so the determinant is always nonzero. Thus, only the numerator in Eq. (25) determines if a particular term is nonzero. If the matrix is $n \times n$, where $n$ is large, each term in the adjugate and, thus, the inverse is the sum of a large number of terms that are products of $n - 1$ partial derivatives.

If we produce a partial derivative Jacobian matrix with random values for all nonzero terms, it would be highly improbable that this sum of a large number of terms would be incidentally zero. Therefore, we assume that each entry of the inverse of the random partial derivative Jacobian that is zero is actually a zero in the sparsity structure of the true total derivative Jacobian.

B Equivalence between Hierarchical and Reduced-Space Newton’s Methods

OpenMDAO introduces a new hierarchical Newton’s method formulation for the sake of improved numerical flexibility; however, this proof shows that the new full-space method can be made mathematically identical to the more traditional reduced-space Newton’s method if one always fully converges the internal nonlinear system associated with $R_y$. Consider the residual of an arbitrary implicit function, $R_x(x) = r_x$, at some non-converged value of $x$. If $r_x$ is actually a function of the nonlinear system $R_y(x, y) = 0$, converged for a
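specific value of $x$:

\[
R_x(x) = C(x, y) = r_x \tag{26}
\]
\[
R_y(y) = D(x, y) = 0, \tag{27}
\]

then we call $R_x$ the reduced-space residual function, with the full space composed of the vectors $x$ and $y$ of lengths $n$ and $m$, respectively.

Our ultimate goal is to solve for $x$ such that $R_x(x) = 0$, using Newton’s method. The traditional Newton’s method iteration consists of computing $\Delta x$ by solving a linear system of size $n$:

\[
\left[ \frac{\partial R_x}{\partial x} \right] \Delta x = -r_x . \tag{28}
\]

Expanding $\partial R_x/\partial x$ to account for the intermediate calculation of $y$ gives

\[
\left[ \frac{\partial C}{\partial x} + \frac{\partial C}{\partial y} \frac{dy}{dx} \right] \Delta x = -r_x . \tag{29}
\]

By differentiating Eq. (27) with respect to $x$, we find that

\[
\frac{dR_y}{dx} = \frac{\partial R_y}{\partial x} + \frac{\partial R_y}{\partial y} \frac{dy}{dx} = 0, \tag{30}
\]

and, since $R_y = D(x, y)$, solving for $dy/dx$ gives

\[
\frac{dy}{dx} = -\left[ \frac{\partial D}{\partial y} \right]^{-1} \frac{\partial D}{\partial x} . \tag{31}
\]

Combining Eqs. (29) and (31) gives a formula for the Newton update of the reduced-space function as

\[
\left[ \frac{\partial C}{\partial x} - \frac{\partial C}{\partial y} \left[ \frac{\partial D}{\partial y} \right]^{-1} \frac{\partial D}{\partial x} \right] \Delta x = -r_x . \tag{32}
\]

Now, instead of the reduced-space form of Eq. (26), consider a full-space form that deals with both $x$ and $y$ simultaneously as one vector:

\[
R_u(u) = R(x, y) = \begin{bmatrix} C(x, y) \\ D(x, y) \end{bmatrix} = r_u = \begin{bmatrix} r_x \\ r_y \end{bmatrix} . \tag{33}
\]

This full-space form is the mathematical representation used by OpenMDAO for any system or subsystem, as demonstrated in Eq. (15). The Newton update for the flattened system must be solved for via a linear system of size $(n + m)$:

\[
\left[ \frac{\partial R_u}{\partial u} \right] \Delta u = -r_u . \tag{34}
\]

If we apply the hierarchical Newton algorithm to the full-space formulation, then we can assume that any time Eq. (34) is solved, $y$ has first been found such that $r_y = 0$. Expanding $\partial R_u/\partial u$ and setting $r_y = 0$ in Eq. (34) yields

\[
\begin{bmatrix} \dfrac{\partial C}{\partial x} & \dfrac{\partial C}{\partial y} \\[2ex] \dfrac{\partial D}{\partial x} & \dfrac{\partial D}{\partial y} \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
= -\begin{bmatrix} r_x \\ 0 \end{bmatrix} . \tag{35}
\]

Solving Eq. (35) for $\Delta y$ and back substituting yields

\[
\left[ \frac{\partial C}{\partial x} - \frac{\partial C}{\partial y} \left[ \frac{\partial D}{\partial y} \right]^{-1} \frac{\partial D}{\partial x} \right] \Delta x = -r_x . \tag{36}
\]

Now note that Eqs. (32) and (36) are identical, and therefore applying the hierarchical Newton algorithm to the full-space model (size $n + m$) gives exactly the same $\Delta x$ as the reduced-space Newton algorithm applied to the smaller, reduced-space model (size $n$). Since the updates $\Delta x$ are the same, assuming complete convergence of all child subsystems, the path that the hierarchical Newton’s method takes on the size $(n + m)$ formulation will be identical to the path the reduced-space Newton’s method takes on the smaller size $n$ formulation.

C Equivalence between Recursive and Hierarchical Broyden’s Methods

Broyden’s Second Method computes an approximate update of the inverse Jacobian via

\[
\left[ \frac{\partial R_x}{\partial x} \right]_n^{-1} = J_n^{-1} \cong J_{n-1}^{-1} + \frac{\Delta x_n - J_{n-1}^{-1} \Delta r_n}{\|\Delta r_n\|^2} \Delta r_n^T . \tag{37}
\]

Then, the Newton update is applied using the approximate inverse Jacobian via

\[
\Delta x = -J_n^{-1} r_x . \tag{38}
\]

C.1 Reduced-space Broyden

Consider the same composite model structure given in Eqs. (26) and (27). From Eq. (32), we know that

\[
\left[ \frac{\partial R_x}{\partial x} \right]_n = \left[ \frac{\partial C}{\partial x} - \frac{\partial C}{\partial y} \left[ \frac{\partial D}{\partial y} \right]^{-1} \frac{\partial D}{\partial x} \right]_n . \tag{39}
\]

To simplify the algebra, we now define a new variable, $\beta$, as

\[
\beta = \left[ \frac{\partial R_x}{\partial x} \right]_{n-1}^{-1} = \left[ \frac{\partial C}{\partial x} - \frac{\partial C}{\partial y} \left[ \frac{\partial D}{\partial y} \right]^{-1} \frac{\partial D}{\partial x} \right]_{n-1}^{-1} . \tag{40}
\]

Now we can substitute this into Eq. (37) to calculate the Broyden update to the inverse Jacobian as

\[
J_n^{-1} = \beta + \frac{\Delta x - \beta \Delta r_x}{\|\Delta r_x\|^2} \Delta r_x^T . \tag{41}
\]

Finally, if we substitute this into Eq. (38), we get the commanded update to the state value:

\[
\Delta x_n = -\left( \beta + \frac{\Delta x - \beta \Delta r_x}{\|\Delta r_x\|^2} \Delta r_x^T \right) r_x . \tag{42}
\]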
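As a minimal numerical sketch of Eqs. (37) and (38), the following NumPy code applies the Broyden inverse-Jacobian update to a small nonlinear system. The two-state residual, the starting point, and the function names `broyden_inverse_update`, `broyden_solve`, and `example_residual` are illustrative assumptions; they are not part of OpenMDAO and are not taken from the paper.

```python
import numpy as np


def broyden_inverse_update(J_inv_prev, dx, dr):
    """Inverse-Jacobian update of Eq. (37): build J_n^-1 from J_{n-1}^-1."""
    correction = np.outer(dx - J_inv_prev @ dr, dr) / (dr @ dr)
    return J_inv_prev + correction


def broyden_solve(residual, x0, J_inv0, tol=1e-10, max_iter=50):
    """Repeat the step of Eq. (38), refreshing J^-1 with Eq. (37) each time."""
    x = np.asarray(x0, dtype=float)
    J_inv = np.asarray(J_inv0, dtype=float)
    r = residual(x)
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        dx = -J_inv @ r                      # Eq. (38)
        r_new = residual(x + dx)
        J_inv = broyden_inverse_update(J_inv, dx, r_new - r)  # Eq. (37)
        x, r = x + dx, r_new
    return x, J_inv


def example_residual(x):
    # Hypothetical two-state residual with a root at x = (1, 2).
    return np.array([x[0] ** 2 + x[1] - 3.0, x[0] + x[1] ** 2 - 5.0])


x0 = np.array([1.2, 1.8])
# Assumption for this sketch: start from the exact inverse Jacobian at x0.
J0 = np.array([[2.0 * x0[0], 1.0], [1.0, 2.0 * x0[1]]])
x_sol, J_inv_final = broyden_solve(example_residual, x0, np.linalg.inv(J0))
```

By construction, the updated inverse satisfies the secant condition $J_n^{-1} \Delta r_n = \Delta x_n$, which is a convenient check on any implementation of Eq. (37).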
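and then,

\[
J_{n-1}^{-1} =
\begin{bmatrix}
\beta & -\left[ \dfrac{\partial C}{\partial x} \right]^{-1} \dfrac{\partial C}{\partial y} \, \gamma \\[2ex]
-\left[ \dfrac{\partial D}{\partial y} \right]^{-1} \dfrac{\partial D}{\partial x} \, \beta & \gamma
\end{bmatrix} . \tag{48}
\]

\[
\|\Delta r_u\| = \|\Delta r_x\| . \tag{50}
\]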
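The block relations used in Appendices B and C can be checked numerically. The sketch below uses random stand-in matrices for the four partial derivative blocks (an assumption made purely for illustration; the sizes and variable names are hypothetical). It confirms that the full-space step of Eq. (35) reproduces the reduced-space step of Eq. (32), that $\beta$ from Eq. (40) is the top-left block of the full inverse Jacobian, and that the bottom-left block equals $-[\partial D/\partial y]^{-1} (\partial D/\partial x) \beta$. It also shows that a full-space residual change with a zero $y$ part has the same norm as its $x$ part, which is our reading of Eq. (50) when the inner system is always converged.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4  # illustrative sizes of x and y

# Random stand-ins for the partial derivative blocks in Eq. (35).
# The diagonal shifts only keep the blocks comfortably invertible.
dC_dx = rng.standard_normal((n, n)) + n * np.eye(n)
dC_dy = rng.standard_normal((n, m))
dD_dx = rng.standard_normal((m, n))
dD_dy = rng.standard_normal((m, m)) + m * np.eye(m)
r_x = rng.standard_normal(n)

# Reduced-space Newton step, Eq. (32).
reduced_jac = dC_dx - dC_dy @ np.linalg.solve(dD_dy, dD_dx)
dx_reduced = np.linalg.solve(reduced_jac, -r_x)

# Full-space Newton step with r_y = 0, Eq. (35).
full_jac = np.block([[dC_dx, dC_dy], [dD_dx, dD_dy]])
r_u = np.concatenate([r_x, np.zeros(m)])
dx_full = np.linalg.solve(full_jac, -r_u)[:n]

# beta of Eq. (40) compared with blocks of the full inverse Jacobian.
beta = np.linalg.inv(reduced_jac)
full_inv = np.linalg.inv(full_jac)
top_left = full_inv[:n, :n]
bottom_left = full_inv[n:, :n]
bottom_left_expected = -np.linalg.solve(dD_dy, dD_dx) @ beta

# A residual change whose y part is zero has the norm of its x part.
dr_x = rng.standard_normal(n)
dr_u = np.concatenate([dr_x, np.zeros(m)])
norms_match = np.isclose(np.linalg.norm(dr_u), np.linalg.norm(dr_x))
```

Because the random blocks carry no physical meaning, this only verifies the linear algebra, not the behavior of any particular model.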