Gradient-Based Optimization

Gradient-based optimization is a mathematical method for finding a local optimum (often a minimum) of an objective function by varying a set of parameters of interest.

For real industrial cases, it is often impossible to find an analytical expression of the objective function in terms of the geometric parameters that you want to investigate. When no derivatives can be computed directly from analytical expressions, you iteratively find the path towards the minimum based on local discrete values.

Sequential Quadratic Programming (SQP) uses the discrete 1st-order derivatives (gradients) to find the path to the local minimum of the objective function. In Simcenter STAR-CCM+, the Adjoint solver computes the necessary gradient of the objective function (cost function) with respect to the geometric parameters.

For example, to obtain the minimum pressure drop of a duct system with respect to geometric parameters, you compute the necessary gradient information through the Adjoint solver and launch the SQP search in Design Manager to move to the point with minimal pressure drop from your initial guess. For more details of the workflow, see Setting up a Gradient-Based Optimization Study.

Gradient Calculation

The gradient information of the objective function is defined as follows:

$$\frac{\mathrm{d}L^{T}}{\mathrm{d}D} = \frac{\mathrm{d}X_{\mathrm{surface}}^{T}}{\mathrm{d}D}\,\frac{\mathrm{d}L^{T}}{\mathrm{d}X_{\mathrm{surface}}} \tag{5176}$$

where

  • $D$ is the vector of design parameters.
  • $L$ is the objective that you want to optimize, with or without constraints.
  • $X_{\mathrm{surface}}$ is the surface mesh.
  • $\mathrm{d}X_{\mathrm{surface}}^{T}/\mathrm{d}D$ is the geometric sensitivity in Simcenter STAR-CCM+. See Geometric Sensitivity.
  • $\mathrm{d}L^{T}/\mathrm{d}X_{\mathrm{surface}}$ is the surface sensitivity in Simcenter STAR-CCM+. See Surface Sensitivity.

The gradient $\mathrm{d}L^{T}/\mathrm{d}D$ is computed for each boundary surface as the product of the geometric sensitivity and the surface sensitivity. After this step, you obtain a field of gradient sensitivity on the boundaries.
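As an illustration of this product (a minimal sketch only; the array names and shapes are assumptions, not Simcenter STAR-CCM+ data structures), the following Python snippet applies the chain rule of Eqn. (5176) to hypothetical sensitivity arrays for one boundary surface:

```python
import numpy as np

# Minimal sketch: assemble dL/dD from hypothetical sensitivity arrays
# exported for one boundary surface (random data stands in for real values).
n_nodes = 4      # number of surface mesh nodes (illustrative)
n_params = 2     # number of design parameters D

# Geometric sensitivity dX_surface/dD: displacement of each surface node
# coordinate per unit change of each design parameter.
dX_dD = np.random.rand(n_nodes, 3, n_params)

# Surface sensitivity dL/dX_surface: adjoint-computed sensitivity of the
# objective to the surface node positions.
dL_dX = np.random.rand(n_nodes, 3)

# Chain rule of Eqn. (5176): contract over nodes and coordinate directions.
dL_dD = np.einsum('ncp,nc->p', dX_dD, dL_dX)
print(dL_dD)     # one gradient entry per design parameter
```

Each entry of the resulting vector is the sensitivity of the objective to one design parameter, which is the gradient information that the SQP search consumes.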

Sequential Quadratic Programming (SQP) Method

The SQP approach searches the path to the optimum value in an iterative way from the initial point $x_0$ that you select.

Consider a gradient-based optimization with only one geometric parameter as the input parameter in Design Manager. The objective function is then $f(x)$. The minimum of the function satisfies $f'(x) = 0$. At each iteration, SQP calculates the path $\Delta x$ towards the minimum of the objective function using an approximation of $f(x)$.

Starting from the initial point $x_0$, $f(x)$ can be approximated using a 2nd-order Taylor expansion around $x_0$:

$$f(x) \approx f(x_0) + f'(x_0)\,(x - x_0) + \frac{1}{2} f''(x_0)\,(x - x_0)^2 \tag{5177}$$

The minimum of the approximated objective function is at the point where $f'(x) = 0$. Calculate the first derivative of Eqn. (5177):

$$f'(x) = f'(x_0) + f''(x_0)\,(x - x_0) = 0 \tag{5178}$$

From the equation above, you can find the next point $x_1 = x_0 - \frac{f'(x_0)}{f''(x_0)}$.
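As a purely illustrative check of this step (the test function is an assumption, not part of the Simcenter STAR-CCM+ workflow), a single Newton step applied to a quadratic objective lands exactly on the minimum:

```python
# Illustrative quadratic objective: f(x) = (x - 3)^2 + 1, minimum at x = 3.
f_prime = lambda x: 2.0 * (x - 3.0)   # analytical f'(x)
f_second = lambda x: 2.0              # analytical f''(x)

x0 = 10.0                             # initial guess
x1 = x0 - f_prime(x0) / f_second(x0)  # step derived from Eqn. (5178)
print(x1)                             # 3.0: exact, because f is quadratic
```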

In a general notation, starting from the point $x_k$, the next point is $x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}$.

You iteratively build a sequence $x_0, x_1, \ldots, x_k, x_{k+1}, \ldots, x^*$, with which you move towards the point $x^*$, the converged value of the parameter where the minimum of $f(x)$ appears.

The 1st-order derivative $f'(x_k)$ is provided by the gradient sensitivity from Eqn. (5176).

Since the 2nd-order derivative is computationally expensive, Design Manager approximates it numerically from the 1st-order derivatives of two successive iterations:

$$f''(x_k) = \frac{f'(x_k) - f'(x_{k-1})}{x_k - x_{k-1}}$$

In optimization theory, the direct calculation of $x_{k+1} = x_k - \frac{f'(x_k)}{f''(x_k)}$ using the 1st- and 2nd-order derivatives is called the Newton Method. Approximating the 2nd-order derivative from the 1st-order derivatives of the current and previous iterations is called the Quasi-Newton Method.
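The following Python sketch illustrates this quasi-Newton update under the assumption that the 1st-order derivative of a simple test function is available analytically; in the actual workflow, that information comes from the adjoint gradient sensitivity instead:

```python
# Quasi-Newton (secant) iteration on an assumed test objective
# f(x) = (x - 1)^4 + (x - 1)^2, whose minimum lies at x = 1.
def f_prime(x):
    return 4.0 * (x - 1.0) ** 3 + 2.0 * (x - 1.0)

x_prev, x_k = 3.0, 2.5    # two starting points for the secant estimate
for _ in range(50):
    # Approximate f''(x_k) from the last two 1st-order derivatives.
    f_second = (f_prime(x_k) - f_prime(x_prev)) / (x_k - x_prev)
    x_prev, x_k = x_k, x_k - f_prime(x_k) / f_second
    if abs(x_k - x_prev) < 1.0e-10:
        break

print(x_k)                # converges towards the minimum at x = 1
```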

In an optimization with $n$ parameters, you extend the equation to:
$$\begin{bmatrix} x_{1,k+1} \\ x_{2,k+1} \\ \vdots \\ x_{n,k+1} \end{bmatrix} = \begin{bmatrix} x_{1,k} \\ x_{2,k} \\ \vdots \\ x_{n,k} \end{bmatrix} - \left(\nabla^2 f\right)^{-1} \nabla f \tag{5179}$$

$\nabla f$ is the 1st-order derivative of the objective function, which is also called the Jacobian matrix:

$$J = \nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix} \tag{5180}$$

$\nabla^2 f$ is the 2nd-order derivative of the objective function, which is also called the Hessian matrix:

$$H = \nabla^2 f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix} \tag{5181}$$
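To illustrate Eqn. (5179) for $n = 2$ (an assumed quadratic test objective, not adjoint data), the following sketch performs one Newton step; solving a linear system avoids forming the Hessian inverse explicitly:

```python
import numpy as np

# Assumed test objective: f(x) = (x1 - 1)^2 + 2*(x2 + 2)^2, minimum at (1, -2).
def jacobian(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)])

def hessian(x):
    return np.array([[2.0, 0.0],
                     [0.0, 4.0]])

x_k = np.array([5.0, 5.0])                            # current design point
step = np.linalg.solve(hessian(x_k), jacobian(x_k))   # solves (Hessian) * step = Jacobian
x_next = x_k - step                                   # update from Eqn. (5179)
print(x_next)                                         # [ 1. -2.]: exact for a quadratic
```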

For an optimization with constraints, you replace the simple objective function $f(x)$ with the Lagrangian of the problem:

$$L(x, \lambda, \sigma) = f(x) - \lambda\, h(x) - \sigma\, g(x) \tag{5182}$$
where
  • $L$ is the Lagrangian.
  • $f(x)$ is the objective function.
  • $h(x)$ is the inequality constraint of the optimization problem, which satisfies $h(x) \geq 0$.
  • $g(x)$ is the equality constraint of the optimization problem, which satisfies $g(x) = 0$.
  • $\lambda$ is the Lagrange multiplier for the inequality constraint $h(x)$.
  • $\sigma$ is the Lagrange multiplier for the equality constraint $g(x)$.

This example of the Lagrangian is based on the assumption of one geometric parameter. You can extend the equation to matrix form for multiple geometric parameters.

The next point in the design space is then:

$$\begin{bmatrix} x_{k+1} \\ \lambda_{k+1} \\ \sigma_{k+1} \end{bmatrix} = \begin{bmatrix} x_{k} \\ \lambda_{k} \\ \sigma_{k} \end{bmatrix} - \left(\nabla^2 L\right)^{-1} \nabla L \tag{5183}$$
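As a final illustration (an assumed test problem with one parameter and a single equality constraint, so the inequality multiplier $\lambda$ is omitted), one Newton step on the Lagrangian recovers the constrained minimum:

```python
import numpy as np

# Assumed test problem: minimize f(x) = x^2 subject to g(x) = x - 1 = 0.
# Lagrangian: L(x, sigma) = x^2 - sigma * (x - 1)

def grad_L(x, sigma):
    # Gradient of L with respect to (x, sigma)
    return np.array([2.0 * x - sigma, -(x - 1.0)])

def hess_L(x, sigma):
    # Hessian of L with respect to (x, sigma)
    return np.array([[2.0, -1.0],
                     [-1.0, 0.0]])

x_k, sigma_k = 0.0, 0.0                                # initial point and multiplier
step = np.linalg.solve(hess_L(x_k, sigma_k), grad_L(x_k, sigma_k))
x_next, sigma_next = np.array([x_k, sigma_k]) - step   # update from Eqn. (5183)
print(x_next, sigma_next)                              # 1.0 2.0: x = 1 satisfies the constraint
```

Because the objective is quadratic and the constraint is linear, a single step reaches the point where the gradient of the Lagrangian vanishes.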