Difference between Estimation and Approximation Errors
- Python Automation and Machine Learning for ICs -
- An Online Book -
Python Automation and Machine Learning for ICs                                                           http://www.globalsino.com/ICs/        



=================================================================================

Table 3764. Difference between estimation and approximation errors.

Definition
    Estimation error: closely related to variance; it measures how much the learned model's predictions vary across different training datasets.
    Approximation error: closely related to bias; it measures how well the best model in the chosen hypothesis class can capture the underlying true relationship between input and output.
Nature
    Estimation error: the error introduced by using a specific finite set of training data to train the model, so the learned model may not generalize well to new, unseen data.
    Approximation error: the error introduced by approximating a real-world problem, which is often complex, with a simplified model (hypothesis class).
Causes
    Estimation error: arises when the model is too complex and captures noise or random fluctuations in the training data, leading to poor generalization to new data.
    Approximation error: arises when the chosen hypothesis class is too simple and cannot capture the true underlying patterns in the data.
Impact
    High estimation error: the model performs well on the training data but poorly on new, unseen data.
    High approximation error: the model systematically underestimates or overestimates the true values.
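The contrast in the table can be illustrated numerically. The sketch below (an assumed toy setup: a sine true function with Gaussian noise, fit with NumPy polynomials) compares training and test errors as the hypothesis class grows; the low-degree fit underfits (approximation error/bias dominates) while the high-degree fit overfits (estimation error/variance dominates).

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Assumed "true" relationship for this toy illustration
    return np.sin(2 * np.pi * x)

x_train = rng.uniform(0, 1, 30)
y_train = f(x_train) + rng.normal(0, 0.2, 30)
x_test = rng.uniform(0, 1, 1000)
y_test = f(x_test) + rng.normal(0, 0.2, 1000)

results = {}
for degree in (1, 4, 15):
    # A larger polynomial degree means a larger, more complex hypothesis class
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
```

The degree-1 model cannot even fit the training data (high bias), while the degree-15 model drives the training error down yet does worse on the test set (high variance).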

The expected risk (error) of a hypothesis h_S ∈ H, which is selected based on the training dataset S from a hypothesis class H and is the output of the ERM learner over H (ERM_H), can be decomposed into the approximation error, ε_app, and the estimation error, ε_est, as follows,

          L_D(h_S) = ε_app + ε_est ------------------------------------- [3764a]

                   = ε_app + (L_D(h_S) - ε_app) ------------------------ [3764b]

          ε_app = min_{h∈H} L_D(h) ------------------------------------- [3764c]

where,

L_S(h) represents the empirical risk of a hypothesis h on a dataset S. Empirical risk is the average loss over the dataset, where the loss is a measure of how well the hypothesis approximates the true distribution of the data.

ε_app = min_{h∈H} L_D(h) is the minimum risk over all hypotheses h in the hypothesis class H. In other words, it represents the best achievable risk among all possible hypotheses in the given hypothesis class.

ε_est = L_D(h_S) - ε_app represents the excess risk. The excess risk is the difference between the risk of the selected hypothesis h_S and the best achievable risk in H.
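As a concrete illustration of the decomposition in Equation 3764a, the sketch below uses an assumed toy setup (uniform inputs, threshold labels with label noise, and a small finite hypothesis class of threshold classifiers) for which the true risk of every hypothesis can be computed exactly, so ε_app and ε_est can be read off directly for an ERM-selected hypothesis.

```python
import numpy as np

rng = np.random.default_rng(1)

NOISE = 0.1        # assumed label-flip probability
BAYES_T = 0.3      # assumed true decision boundary
H = [0.0, 0.25, 0.5, 0.75]   # finite hypothesis class of thresholds

def true_risk(t):
    # For x ~ Uniform(0, 1): the classifier h_t(x) = 1[x >= t] errs with
    # probability NOISE where it agrees with the Bayes classifier and
    # 1 - NOISE where it disagrees, giving this closed form.
    return NOISE + (1 - 2 * NOISE) * abs(t - BAYES_T)

# Approximation error: best achievable risk within H
eps_app = min(true_risk(t) for t in H)

# Draw a training set S and run ERM over H
m = 50
x = rng.uniform(0, 1, m)
y = (x >= BAYES_T).astype(int)
flip = rng.uniform(0, 1, m) < NOISE
y = np.where(flip, 1 - y, y)

def emp_risk(t):
    return np.mean((x >= t).astype(int) != y)

h_S = min(H, key=emp_risk)           # ERM output h_S
eps_est = true_risk(h_S) - eps_app   # estimation (excess) error

print(f"eps_app = {eps_app:.3f}, ERM threshold = {h_S}, eps_est = {eps_est:.3f}")
```

Because the Bayes boundary (0.3) is not in H, ε_app cannot reach the Bayes risk, while ε_est depends on which hypothesis the particular sample S leads ERM to select.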

With the theory of the hypothesis class (page3982), we have, with probability at least 1 - δ,

          L(ĥ) ≤ L(h*) + 2*sqrt( log(2|H|/δ) / (2m) ) ----------------------------------- [3764d]

where,

          L(ĥ) represents the expected loss of the learned hypothesis ĥ on unseen data.

          L(h*) represents the expected loss of the best possible hypothesis h* in the hypothesis class on unseen data.

          The second term on the right-hand side reflects the complexity of the hypothesis class and the sample size. Here:

          |H| is the size of the hypothesis class (the number of possible hypotheses).

          δ is a parameter representing the confidence level.

          m is the sample size.
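The complexity term of this bound is easy to evaluate numerically. The helper below (the function name is mine) shows how the estimation-error term shrinks with the sample size m and grows only logarithmically with |H|.

```python
import math

def estimation_error_bound(k, m, delta=0.05):
    """Uniform-convergence bound for a finite hypothesis class of size k:
    with probability >= 1 - delta, the learned hypothesis satisfies
    L(h_hat) <= L(h_star) + 2 * sqrt(log(2k/delta) / (2m))."""
    return 2 * math.sqrt(math.log(2 * k / delta) / (2 * m))

for m in (100, 1000, 10000):
    print(f"m = {m:6d}: bound = {estimation_error_bound(k=1000, m=m):.4f}")
```

Because |H| enters only through a logarithm, even a million-hypothesis class needs only modestly more data than a ten-hypothesis class to reach the same guarantee.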

Figure 3764a shows the relationship between these terms in Equation 3764a. The red points are specific hypotheses. The best hypothesis (the Bayes hypothesis) lies outside the chosen hypothesis class H. The distance between the risk of ĥ and the risk of h* is the estimation error, while the distance between h* and the Bayes hypothesis is the approximation error.

Some properties are:

  • The larger H is, the smaller the approximation error, because a larger hypothesis class is more likely to contain (or come close to) the actual hypothesis we are looking for. Conversely, if H does not contain the actual hypothesis we are searching for, then this error cannot be zero.

  • The approximation error does not depend on the training data: its definition (the minimum risk over H) makes no reference to the training dataset S.
  • If we increase the size and complexity of the hypothesis class, the approximation error decreases, but the estimation error may increase, resulting in overfitting.
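The trade-off in the last bullet can be sketched numerically with the same kind of finite threshold class as above (an assumed toy distribution): enlarging the class drives the best-in-class risk down, while the estimation-error bound of Equation 3764d grows with log |H|.

```python
import math

NOISE, BAYES_T = 0.1, 0.3   # assumed toy distribution: Uniform(0,1) inputs,
                            # threshold labels at BAYES_T with NOISE label flips

def true_risk(t):
    # Risk of the threshold classifier h_t(x) = 1[x >= t] under this distribution
    return NOISE + (1 - 2 * NOISE) * abs(t - BAYES_T)

m, delta = 200, 0.05
best_risks, bounds = [], []
for k in (2, 8, 32, 128):
    H = [i / (k - 1) for i in range(k)]    # k evenly spaced thresholds in [0, 1]
    best_risks.append(min(true_risk(t) for t in H))
    bounds.append(2 * math.sqrt(math.log(2 * k / delta) / (2 * m)))
    print(f"|H| = {k:4d}  best-in-class risk = {best_risks[-1]:.4f}"
          f"  estimation bound = {bounds[-1]:.4f}")
```

As |H| grows, the best-in-class risk (approximation side) falls toward the Bayes risk, while the estimation-error bound rises; the sum is minimized at an intermediate class size, which is the overfitting trade-off described above.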


Figure 3764a. Relationship between these terms in Equation 3764a. The enclosed blue area represents the hypothesis class H.

Figure 3764b shows the estimation and approximation errors with noisy data. The estimation error here refers to the difference between the true function and the estimated function, where the estimated function is the one obtained from the hypothesis (model) fit. The difference between the true function and the observed (noisy) data (blue scattered points) represents the approximation error. Both the estimation error and the approximation error can be positive or negative.


Figure 3764b. Estimation and approximation errors. (Code)
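Data of the kind shown in Figure 3764b can be produced with a minimal sketch like the one below (assuming a sine true function, Gaussian noise, and a cubic least-squares fit as the estimated function; these choices are illustrative, not the figure's exact code).

```python
import numpy as np

rng = np.random.default_rng(2)

x = np.linspace(0, 1, 100)
true_y = np.sin(2 * np.pi * x)                  # true function
noisy_y = true_y + rng.normal(0, 0.3, x.size)   # observed (noisy) data points

# Estimated function: least-squares cubic fit to the noisy observations
coeffs = np.polyfit(x, noisy_y, 3)
est_y = np.polyval(coeffs, x)

# Pointwise errors as described in the text: both can be positive or negative
estimation_err = true_y - est_y        # true function vs estimated function
approximation_err = true_y - noisy_y   # true function vs noisy observations

print("mean |estimation error|:", np.mean(np.abs(estimation_err)).round(3))
print("mean |approximation error|:", np.mean(np.abs(approximation_err)).round(3))
```

Because the fit averages over many noisy points, the estimation error is typically much smaller in magnitude than the point-by-point approximation error, and both change sign across the domain.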

============================================

=================================================================================