Multivariate Gaussian Distribution and Standard Gaussian Distribution
- Python and Machine Learning for Integrated Circuits -
- An Online Book -
Python and Machine Learning for Integrated Circuits                                                           http://www.globalsino.com/ICs/        



=================================================================================

The Gaussian distribution, also known as the normal distribution, is one of the most widely used probability distributions in statistics. It is characterized by its bell-shaped curve and is often used to model continuous random variables in various fields of science, engineering, and social sciences.

For a standard multivariate Gaussian distribution (also known as the standard multivariate normal distribution), the covariance matrix is equal to the identity matrix. This is a fundamental property of the standard Gaussian distribution, and it is related to the fact that the variables in a standard Gaussian distribution are independent and have unit variance:

  1. Independence: In a standard Gaussian distribution, the variables in the multivariate vector are independent of each other. In other words, there is no linear relationship (covariance) between them. This means that off-diagonal elements in the covariance matrix are zero.

  2. Unit Variance: In a standard Gaussian distribution, each variable has a variance of 1. This is why the diagonal elements of the covariance matrix are all equal to 1.

In mathematical terms, the covariance matrix for a standard 2D Gaussian distribution looks like this:

          \Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I -------------------------------------- [3864a]
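This property can be verified numerically. Below is a minimal sketch using NumPy (the random seed and sample size are arbitrary choices): samples drawn from a standard 2D Gaussian should have a sample covariance close to the identity matrix.

    import numpy as np

    # Draw samples from a standard 2D Gaussian: zero mean, identity covariance.
    rng = np.random.default_rng(0)
    samples = rng.multivariate_normal(mean=np.zeros(2), cov=np.eye(2), size=100_000)

    # The sample covariance matrix should be close to the identity matrix.
    print(np.cov(samples, rowvar=False).round(2))
    # Approximately:
    # [[1. 0.]
    #  [0. 1.]]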

To satisfy the properties required for a valid multivariate normal distribution, the covariance matrix should be symmetric and positive definite. That is, valid covariance matrices should satisfy the following criteria:

  1. Symmetry: A valid covariance matrix must be symmetric. This means that \Sigma_{ij} = \Sigma_{ji} for all elements i and j.

  2. Positive Definiteness: A valid covariance matrix must be positive definite. This means that all of its eigenvalues should be positive.

We can have a diagonal covariance matrix with positive values on the diagonal. This is a valid covariance matrix, and we can adjust the diagonal values to control the variances of each dimension.

          \Sigma = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} -------------------------------------- [3864b]

Here, σ₁² and σ₂² are the variances of the first and second dimensions, respectively.
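As a quick illustration (a minimal sketch; the variances 4.0 and 0.25 are arbitrary values), sampling with such a diagonal covariance matrix yields uncorrelated dimensions whose sample variances match the diagonal entries:

    import numpy as np

    # Diagonal covariance: uncorrelated dimensions with different variances.
    sigma1_sq, sigma2_sq = 4.0, 0.25
    cov = np.diag([sigma1_sq, sigma2_sq])

    rng = np.random.default_rng(1)
    samples = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=100_000)
    print(samples.var(axis=0).round(2))  # approximately [4.   0.25]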

When dealing with multivariate Gaussian distributions and estimating parameters like the covariance matrix, the issue of singularity becomes more relevant. If the covariance matrix is singular, it means that the variables are not linearly independent, and there is some linear dependency among them. In the context of maximum likelihood estimation (MLE) for multivariate Gaussian distributions, a singular covariance matrix can lead to challenges.

Specifically, a singular covariance matrix is non-invertible: its inverse does not exist. In linear regression or multivariate analysis, this singularity can cause problems such as multicollinearity. Multicollinearity occurs when two or more variables in a regression model are highly correlated, leading to instability in the estimation of coefficients. It can result in large standard errors, making it difficult to assess the significance of individual predictors.

In a univariate Gaussian distribution (single-variable), the covariance matrix is a scalar (variance), and the issues of singularity related to a covariance matrix are not applicable. The singularity concern typically arises in the context of multivariate Gaussian distributions with a covariance matrix involving multiple variables.
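The non-invertibility of a singular covariance matrix is easy to demonstrate (a minimal sketch; the matrix below is constructed so that its second row is twice its first, i.e., the two variables are linearly dependent):

    import numpy as np

    # Singular covariance matrix: the second row is a multiple of the first.
    cov = np.array([[1.0, 2.0],
                    [2.0, 4.0]])

    print(np.linalg.det(cov))  # 0.0, so the matrix is singular
    try:
        np.linalg.inv(cov)
    except np.linalg.LinAlgError as err:
        print("Inversion failed:", err)  # "Singular matrix"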

In the given probability density function (PDF) for a multivariate Gaussian distribution:

          p(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right) -------------------- [3864c]

where:

  • d is the dimensionality of the multivariate distribution.
  • μ is the mean vector.
  • Σ is the covariance matrix.
  • |Σ| denotes the determinant of the covariance matrix.

The term |Σ| is the determinant of the covariance matrix, and it is zero if and only if the covariance matrix is singular. In other words, a singular covariance matrix means that the variables are linearly dependent, and there is some redundancy in the information provided by them. If |Σ| = 0, the term |Σ|^{1/2} in the denominator of the PDF becomes zero, and the PDF itself becomes undefined (unbounded). This situation is often associated with issues of multicollinearity in statistical modeling. Therefore, if |Σ| is zero, it leads to problems in the calculation of the PDF, and it indicates that the covariance matrix is singular, that is, that the variables in the multivariate distribution are linearly dependent.
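In practice, the density of Equation 3864c can be evaluated with scipy.stats.multivariate_normal (a sketch; the mean, covariance, and evaluation point are arbitrary). Note that, in recent SciPy versions, passing a singular covariance matrix raises numpy.linalg.LinAlgError unless allow_singular=True is given; the exact behavior may vary by version.

    import numpy as np
    from scipy.stats import multivariate_normal

    mu = np.array([0.0, 0.0])
    cov = np.array([[1.0, 0.8],
                    [0.8, 1.0]])  # non-singular: det = 1 - 0.64 = 0.36

    # Evaluate the PDF of Equation 3864c at a point.
    print(multivariate_normal(mean=mu, cov=cov).pdf([0.5, -0.2]))

    # A singular covariance matrix is rejected.
    singular = np.array([[1.0, 1.0],
                         [1.0, 1.0]])  # det = 0
    try:
        multivariate_normal(mean=mu, cov=singular)
    except np.linalg.LinAlgError:
        print("Singular covariance: the density in Equation 3864c is undefined.")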

The probability density function (PDF) of a multivariate Gaussian distribution is given by,

          P(x) = p(x_1, x_2) -------------------------------------------- [3864cb]

Then, the density function is expressed by,

          p(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right) ----------------------------- [3864cc]

where,

x is the vector of random variables. In the current case, we have,

                 x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}

μ is the mean vector.

Σ is the covariance matrix.

|Σ| is the determinant of the covariance matrix.

(x − μ)^T is the transpose of the vector (x − μ).

Σ^{-1} is the inverse of the covariance matrix.

Equation 3864cc describes the probability density of the random vector x following a multivariate Gaussian distribution with mean μ and covariance matrix Σ. The term (2π)^{d/2} |Σ|^{1/2} in the denominator ensures normalization, and the exponential term is the multivariate generalization of the standard Gaussian distribution.

For d = 2 (the two-dimensional case), Equation 3864cc can be expanded to,

          p(x_1, x_2) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\right) --------- [3864cd]
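Equation 3864cd can be coded directly and checked against SciPy's implementation (a minimal sketch; the helper name gaussian_pdf_2d and the mean, covariance, and test point are illustrative choices):

    import numpy as np
    from scipy.stats import multivariate_normal

    def gaussian_pdf_2d(x, mu, cov):
        """Equation 3864cd: the density of a 2D Gaussian, written out explicitly."""
        diff = x - mu
        norm_const = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
        return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

    mu = np.array([1.0, -1.0])
    cov = np.array([[2.0, 0.3],
                    [0.3, 1.0]])
    x = np.array([0.5, 0.0])

    print(gaussian_pdf_2d(x, mu, cov))                   # hand-rolled density
    print(multivariate_normal(mean=mu, cov=cov).pdf(x))  # SciPy reference value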

If n < d, where n is the number of observed data points and d is the dimensionality of the distribution (number of variables), it generally means that there are more variables than observations, which can lead to issues of singularity and non-invertibility of the covariance matrix. In the given PDF, the term |Σ| is the determinant of the covariance matrix Σ, and if n < d, it increases the likelihood that Σ is singular; therefore, there is an increased risk of singularity in the covariance matrix, which can lead to numerical instability and issues in the calculation of the PDF. If n ≤ d, the sample covariance matrix is almost guaranteed to be singular because the maximum rank of a sample covariance matrix (with the mean estimated from the data) is n − 1.
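This rank limit is easy to reproduce numerically (a minimal sketch with arbitrary n = 5 observations in d = 10 dimensions):

    import numpy as np

    rng = np.random.default_rng(2)
    n, d = 5, 10  # fewer observations than dimensions
    X = rng.normal(size=(n, d))

    S = np.cov(X, rowvar=False)  # d x d sample covariance matrix
    print(np.linalg.matrix_rank(S))           # at most n - 1 = 4
    print(np.isclose(np.linalg.det(S), 0.0))  # True: S is singular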

For the conditional distribution of x1 given x2 in a jointly normal distribution, we have,

          x_1 \mid x_2 \sim \mathcal{N}\left(\mu_{1|2}, \Sigma_{1|2}\right) ------------------------------------------- [3818d]

          \mu_{1|2} = \mu_1 + \Sigma_{1,2} \Sigma_{2,2}^{-1} (x_2 - \mu_2) ----------------------- [3818e]

          \Sigma_{1|2} = \Sigma_{1,1} - \Sigma_{1,2} \Sigma_{2,2}^{-1} \Sigma_{2,1} ----------------------- [3818f]

Expression 3818d indicates that the conditional distribution of x1 given x2 is a normal distribution with mean μ_{1|2} and covariance matrix Σ_{1|2}. Equation 3818e gives the mean of the conditional distribution, a linear combination involving the mean of x1 (μ_1), the covariance between x1 and x2 (Σ_{1,2}), and the inverse of the covariance of x2 (Σ_{2,2}^{-1}). Equation 3818f gives the covariance matrix of the conditional distribution; it involves the covariance of x1 (Σ_{1,1}), the covariance between x1 and x2 (Σ_{1,2} and Σ_{2,1}), and the inverse of the covariance matrix of x2 (Σ_{2,2}^{-1}). These are the standard conditioning formulas for the multivariate normal distribution, and their use assumes that the joint distribution of x1 and x2 is multivariate normal.
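Equations 3818e and 3818f translate directly into code. Below is a minimal sketch for a jointly Gaussian pair (the helper name conditional_gaussian and all numerical values are illustrative choices, not from the original text):

    import numpy as np

    def conditional_gaussian(mu, cov, x2_obs, d1):
        """Mean and covariance of x1 | x2 = x2_obs (Equations 3818e and 3818f).

        mu and cov describe the joint distribution of (x1, x2);
        d1 is the dimensionality of x1.
        """
        mu1, mu2 = mu[:d1], mu[d1:]
        S11 = cov[:d1, :d1]
        S12 = cov[:d1, d1:]
        S21 = cov[d1:, :d1]
        S22_inv = np.linalg.inv(cov[d1:, d1:])

        mu_cond = mu1 + S12 @ S22_inv @ (x2_obs - mu2)  # Equation 3818e
        cov_cond = S11 - S12 @ S22_inv @ S21            # Equation 3818f
        return mu_cond, cov_cond

    mu = np.array([0.0, 1.0])
    cov = np.array([[2.0, 0.6],
                    [0.6, 1.0]])
    print(conditional_gaussian(mu, cov, x2_obs=np.array([1.5]), d1=1))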

Figure 3864a shows Gaussian contours for a stretched covariance matrix that is nearly singular. Because the covariance matrix is close to singular, the contours show that the distribution is strongly stretched along one or more axes. Notice how these contours may exhibit distortions or even collapse in certain directions. This is a result of using a poorly conditioned covariance matrix, which can lead to numerical instability and unreliable estimates of the underlying distribution.

[Figure: Gaussian contours with a stretched covariance matrix]

Figure 3864a. Gaussian contours where the covariance matrix is nearly singular (code).
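The figure links to its own code, which is not reproduced here; the following independent sketch produces similar contours (the correlation 0.999 is an arbitrary value chosen to make the covariance matrix nearly singular):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import multivariate_normal

    # Strong correlation stretches the contours along one direction,
    # similar in spirit to Figure 3864a.
    mu = np.array([0.0, 0.0])
    cov = np.array([[1.0, 0.999],
                    [0.999, 1.0]])  # det = 1 - 0.999**2, close to zero

    x, y = np.meshgrid(np.linspace(-3, 3, 200), np.linspace(-3, 3, 200))
    density = multivariate_normal(mean=mu, cov=cov).pdf(np.dstack((x, y)))

    plt.contour(x, y, density, levels=10)
    plt.title("Gaussian contours with a nearly singular covariance matrix")
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.show()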

Some ways to handle or mitigate issues related to a singular covariance matrix include:

  1. Constrain Σ to be diagonal. Constraining the covariance matrix (Σ) to be diagonal is a common approach to mitigate issues related to singularity, especially in the context of Gaussian Mixture Models (GMMs) and other statistical models. A diagonal covariance matrix implies that the variables are uncorrelated, and it can help address problems associated with collinearity or linear dependence among variables. When the covariance matrix is constrained to be diagonal, it simplifies the estimation process and can prevent numerical instability, especially when dealing with a limited number of observations compared to the number of variables.

    When the covariance matrix is diagonal, the off-diagonal elements are zero, and only the variances of the individual variables are considered. For a 2-dimensional case with variables X1 and X2, the diagonal covariance matrix looks like:

              \Sigma = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix} --------------------------------------- [3864d]

    where σ₁² and σ₂² are the variances of X1 and X2, respectively. Each variable's variance is on the diagonal, and there are no covariance terms.

    In general, for a d-dimensional case, the diagonal covariance matrix has the form:

              \Sigma = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_d^2 \end{bmatrix} --------------------------------------- [3864e]

    This form simplifies the estimation process, and it is useful when there are concerns about numerical stability or when there is limited data, potentially leading to a singular covariance matrix.

    Figure 3864b shows Gaussian contours with diagonal covariance matrices. Constraining a Gaussian distribution to have a diagonal covariance matrix results in axis-aligned contours. This means that the covariance between different variables is assumed to be zero, and the distribution is oriented along the coordinate axes. The ellipses representing the contours of the Gaussian distribution are aligned with the axes, and the contours for this easy-to-compute case are well-behaved.

    [Figure: Gaussian contours with diagonal covariance matrices]

    Figure 3864b. Gaussian contours with diagonal covariance matrices (code).

  2. Constrain Σ to be Σ = σ²I. Constraining the covariance matrix (Σ) to be proportional to the identity matrix (σ²I) is a common approach to handle or mitigate issues related to a singular covariance matrix, and it is often referred to as isotropic covariance. In this case, the MLE of σ² is given by (see the sketch after this list),

              \sigma^2 = \frac{1}{nd} \sum_{i=1}^{n} \sum_{j=1}^{d} \left(x_j^{(i)} - \mu_j\right)^2 --------------------------------------- [3864f]

    where:

    • n is the number of samples.
    • d is the number of features (dimensions).
    • x_j^{(i)} is the j-th feature of the i-th sample.
    • μ_j is the mean of the j-th feature across all samples.
    • σ² is a scalar.

    This expression represents the mean of the squared differences between each data point and the mean along each feature dimension. It is a way of estimating a common variance term (σ²) under the isotropic assumption.

    However, the biggest limitation of this option is that it assumes the features are uncorrelated: constraining the covariance matrix to be proportional to the identity matrix (σ²I) makes the covariance matrix a scaled identity matrix, with no off-diagonal elements representing covariances between different features.

    One thing we can do instead is to modify the MLE of the covariance matrix by adding a small diagonal value, as illustrated in the sketch after this list. This is a common technique to address numerical stability issues, because it ensures that the matrix remains invertible. By adding a small diagonal value to the MLE, the resulting matrix is guaranteed to be invertible, which is important for computations involving covariance matrices. However, this method may not be the best solution; in some cases, Factor Analysis can be a better approach for such ML problems. In high-dimensional settings, estimating the covariance matrix accurately becomes challenging, and the estimate may be singular or nearly singular. Factor Analysis, by constraining the model parameters, can help mitigate these issues.

  3. Regularization: Use regularization techniques, such as ridge regression, which add a penalty term to the objective function to prevent overfitting and stabilize the estimation.

  4. Dimensionality Reduction: If the variables are highly correlated, consider reducing the dimensionality of the data using techniques like principal component analysis (PCA).

  5. Data Cleaning: Ensure that there are no linearly dependent variables in the dataset. If there are, consider removing one of the variables or transforming them.
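The sketch below illustrates options 1 and 2, plus the diagonal-loading idea discussed under option 2, on synthetic data (all numbers, including the eps value, are arbitrary illustrative choices):

    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=(50, 4))  # n = 50 samples, d = 4 features
    n, d = X.shape
    mu = X.mean(axis=0)

    # Option 1: diagonal covariance (keep only the per-feature variances).
    sigma_diag = np.diag(X.var(axis=0))

    # Option 2: isotropic covariance sigma^2 * I (Equation 3864f).
    sigma_sq = np.mean((X - mu) ** 2)
    sigma_iso = sigma_sq * np.eye(d)

    # Diagonal loading: add a small ridge to the full MLE covariance so that
    # the result is guaranteed to be invertible.
    eps = 1e-6
    sigma_full = np.cov(X, rowvar=False, bias=True)  # MLE (divides by n)
    sigma_ridge = sigma_full + eps * np.eye(d)
    print(np.linalg.cond(sigma_ridge))  # finite condition number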

Table 3864. Applications of Gaussian Distribution and Standard Gaussian Distribution.

  Applications                     Details
  Multiple Parameter Estimation    page3843

=================================================================================