Least Squares

Least squares linear and quadratic models create a polynomial that approximates the data sample globally as a best fit.

For example, for one design study containing two input parameters $x_1$ and $x_2$, a linear polynomial for the predicted value $\hat{y}$ including constant and linear terms is defined as:

$$\hat{y}(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \qquad (5151)$$

A quadratic polynomial including constant, linear terms, interaction, and squared terms is defined as:

$$\hat{y}(\mathbf{x}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_1 x_2 + \beta_5 x_2^2 \qquad (5152)$$
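
For illustration, evaluating such a quadratic polynomial for given coefficients is straightforward. The following sketch uses hypothetical coefficient values that are not part of the original text:

```python
# Evaluate the two-parameter quadratic polynomial
# y_hat = b0 + b1*x1 + b2*x2 + b3*x1^2 + b4*x1*x2 + b5*x2^2
def predict_quadratic(beta, x1, x2):
    b0, b1, b2, b3, b4, b5 = beta
    return b0 + b1 * x1 + b2 * x2 + b3 * x1**2 + b4 * x1 * x2 + b5 * x2**2

# Hypothetical coefficients and input values, chosen only for illustration.
y_hat = predict_quadratic([1.0, 2.0, -1.0, 0.5, 0.0, 0.25], 2.0, 1.0)
```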

The fit quality of a polynomial at a data point is measured by its residual $\varepsilon_i$, defined as the difference between the actual value $y_i$ and the predicted value $\hat{y}_i$:

$$\varepsilon_i = y_i - \hat{y}_i \qquad (5153)$$

In matrix notation, the equation is:

$$\boldsymbol{\varepsilon} = \mathbf{y} - \hat{\mathbf{y}} = \mathbf{y} - X \boldsymbol{\beta} \qquad (5154)$$

Here, $X$ represents the design matrix. For example, the design matrix of a quadratic model with two input parameters $x_1$ and $x_2$ from $m$ simulations is:

$$X = \begin{pmatrix}
1 & x_{1,1} & x_{2,1} & x_{1,1}^2 & x_{1,1} x_{2,1} & x_{2,1}^2 \\
1 & x_{1,2} & x_{2,2} & x_{1,2}^2 & x_{1,2} x_{2,2} & x_{2,2}^2 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
1 & x_{1,m} & x_{2,m} & x_{1,m}^2 & x_{1,m} x_{2,m} & x_{2,m}^2
\end{pmatrix} \qquad (5155)$$
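
As an illustration, the design matrix above can be assembled as follows. This is a NumPy sketch with hypothetical sample values, not part of the original text:

```python
import numpy as np

# Hypothetical sample of m = 4 simulations with two input parameters.
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([1.0, 0.5, 2.0, 1.5])

# Design matrix for the quadratic model: each row holds the terms
# 1, x1, x2, x1^2, x1*x2, x2^2 evaluated at one simulation.
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])
```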

The vector $\boldsymbol{\beta}$ of the unknown coefficients is:

$$\boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{pmatrix} \qquad (5156)$$

where $k+1$ is the number of polynomial terms (for the quadratic model above, $k = 5$).

The best possible fit minimizes the residual sum of squares $L$ over the sample data:

$$L = \sum_{i=1}^{m} \varepsilon_i^2 = (y_1 - \hat{y}_1)^2 + (y_2 - \hat{y}_2)^2 + \dots + (y_m - \hat{y}_m)^2 \qquad (5157)$$

In matrix form:

$$L = (\mathbf{y} - X \boldsymbol{\beta})^t (\mathbf{y} - X \boldsymbol{\beta}) \qquad (5158)$$

Expanding the product gives:

$$L = \mathbf{y}^t \mathbf{y} - \boldsymbol{\beta}^t X^t \mathbf{y} - \mathbf{y}^t X \boldsymbol{\beta} + \boldsymbol{\beta}^t X^t X \boldsymbol{\beta} \qquad (5159)$$

Due to the dimensions of the vectors $\mathbf{y}$ and $\boldsymbol{\beta}$ and the matrix $X$, both $\boldsymbol{\beta}^t X^t \mathbf{y}$ and $\mathbf{y}^t X \boldsymbol{\beta}$ are scalars. Therefore, $\mathbf{y}^t X \boldsymbol{\beta} = (\boldsymbol{\beta}^t X^t \mathbf{y})^t = \boldsymbol{\beta}^t X^t \mathbf{y}$, and the equation can be rewritten as:

$$L = \mathbf{y}^t \mathbf{y} - 2 \boldsymbol{\beta}^t X^t \mathbf{y} + \boldsymbol{\beta}^t X^t X \boldsymbol{\beta} \qquad (5160)$$

The minimum of the sum of squares is found where the gradient of $L$ with respect to $\boldsymbol{\beta}$ is zero:

$$\frac{\partial L}{\partial \boldsymbol{\beta}} = -2 X^t \mathbf{y} + 2 X^t X \boldsymbol{\beta}, \qquad \frac{\partial L}{\partial \boldsymbol{\beta}} = 0 \;\Rightarrow\; X^t X \boldsymbol{\beta} = X^t \mathbf{y} \;\Rightarrow\; \boldsymbol{\beta} = (X^t X)^{-1} X^t \mathbf{y} \qquad (5161)$$
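
As an illustrative sketch, the normal-equations solution $\boldsymbol{\beta} = (X^t X)^{-1} X^t \mathbf{y}$ can be computed with NumPy. The sample values below are hypothetical, generated from a known quadratic so that the fit can be checked against the true coefficients:

```python
import numpy as np

# Hypothetical data: sample a known quadratic in two parameters.
rng = np.random.default_rng(0)
x1 = rng.uniform(-1, 1, 20)
x2 = rng.uniform(-1, 1, 20)
y = 2.0 + 0.5 * x1 - 1.5 * x2 + 3.0 * x1**2 + 0.25 * x1 * x2 - 2.0 * x2**2

# Quadratic design matrix: columns 1, x1, x2, x1^2, x1*x2, x2^2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x1 * x2, x2**2])

# Normal equations: solve X^t X beta = X^t y for beta.
beta = np.linalg.solve(X.T @ X, X.T @ y)
```

Because the sample data here are exactly quadratic, the solve recovers the generating coefficients.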

Least squares surrogates usually do not pass through the sample points directly; instead, they pass close to the sample points in the design space to obtain the best global fit. The least squares solution can be computed using the singular value decomposition (SVD) of the matrix $X$. This method remains efficient for large numbers of sample points.
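
A minimal sketch of an SVD-based solve uses NumPy's `lstsq`, which solves the least squares problem via singular value decomposition. The noisy linear sample below is hypothetical:

```python
import numpy as np

# Hypothetical noisy samples of a linear trend in two parameters.
rng = np.random.default_rng(1)
x1 = rng.uniform(-1, 1, 50)
x2 = rng.uniform(-1, 1, 50)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.1 * rng.normal(size=50)

# Linear design matrix: columns 1, x1, x2.
X = np.column_stack([np.ones_like(x1), x1, x2])

# lstsq minimizes ||y - X beta||^2 using the SVD of X, which stays
# numerically robust even when X^t X is ill-conditioned.
beta, residual_ss, rank, sing_vals = np.linalg.lstsq(X, y, rcond=None)
```

Note that, consistent with the text above, the fitted surrogate does not pass through the noisy sample points; it only minimizes the residual sum of squares.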

Number of Evaluations

To determine the vector $\boldsymbol{\beta}$ of unknown coefficients, you are advised to run at least twice as many simulations as the dimension of $\boldsymbol{\beta}$.

  • For a linear model, the dimension of $\boldsymbol{\beta}$ is $n+1$, where $n$ is the number of input parameters in the design study:
    $$\hat{y} = \beta_0 + \sum_{i=1}^{n} \beta_i x_i \qquad (5162)$$
  • For a quadratic model, the dimension of $\boldsymbol{\beta}$ is $1 + 2n + n(n-1)/2$, where $n$ is the number of input parameters in the design study:
    $$\hat{y} = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \sum_{j=1}^{n} \beta_{jj} x_j^2 + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \beta_{ij} x_i x_j \qquad (5163)$$
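
As a quick check of these counts, the dimension formulas can be sketched in code. The function names are illustrative, not from the original text:

```python
def n_coeffs_linear(n: int) -> int:
    # Constant term + n linear terms.
    return n + 1

def n_coeffs_quadratic(n: int) -> int:
    # Constant + n linear + n squared + n(n-1)/2 interaction terms.
    return 1 + 2 * n + n * (n - 1) // 2

# Recommended simulation count: twice the dimension of beta.
# For a quadratic model with two input parameters, beta has 6 entries.
runs_quadratic_2_params = 2 * n_coeffs_quadratic(2)
```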