What is Probability?

Probability is a measure of how likely an event is to occur, expressed as a number between 0 and 1.

Covariance Matrix:

The covariance matrix contains all linear pairwise interactions between variables. A covariance is simply a non-normalized Pearson correlation coefficient (the correlation multiplied by the two standard deviations), so a covariance matrix is a correlation matrix that retains the scale of the data.
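
A minimal sketch of this relationship using NumPy and made-up data; it checks that cov(x, y) = corr(x, y) · std(x) · std(y), which is what ties the covariance matrix to the correlation matrix:

```python
import numpy as np

# Made-up data: 100 samples of 3 features
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

# Covariance matrix: pairwise covariances between the 3 features (3x3, symmetric)
cov = np.cov(X, rowvar=False)

# Correlation matrix: the covariance matrix with the scale (standard deviations) divided out
corr = np.corrcoef(X, rowvar=False)

# cov(x, y) = corr(x, y) * std(x) * std(y)
std = np.sqrt(np.diag(cov))
np.testing.assert_allclose(cov, corr * np.outer(std, std))
```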

Regularization:

Regularization techniques help prevent overfitting by adding a penalty term to the loss function, which discourages overly complex models, i.e. it discourages the coefficients from having large magnitudes. This improves the model's ability to generalize to unseen data.
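
A minimal sketch of L2 (ridge) regularization, assuming a linear model; the names ridge_loss and ridge_fit are just illustrative, not from any particular library:

```python
import numpy as np

def ridge_loss(w, X, y, lam):
    """Mean squared error plus an L2 penalty that discourages large coefficients."""
    residuals = X @ w - y
    mse = np.mean(residuals ** 2)
    penalty = lam * np.sum(w ** 2)  # grows with coefficient magnitude
    return mse + penalty

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam * I)^(-1) X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)
```

Larger values of lam pull the fitted coefficients toward zero, trading a little training error for better generalization.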


MLE (Maximum likelihood estimation):

We have a bunch of data and several candidate models that could have generated it. For each of these models, we calculate the probability of observing the data under that model (Model1, Model2, Model3). The model that assigns the highest probability is the winner: we pick the model that most likely produced the data.
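
A small sketch of this idea, assuming the candidate models are three Gaussians with fixed parameters (the data and the models here are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

# Made-up data and three candidate models (Gaussians with different means/stds)
data = np.array([4.8, 5.1, 5.3, 4.9, 5.0, 5.2])
candidates = {
    "Model1": norm(loc=0.0, scale=1.0),
    "Model2": norm(loc=5.0, scale=0.5),
    "Model3": norm(loc=10.0, scale=2.0),
}

# Log-likelihood of the data under each model; the highest one wins
log_likelihoods = {name: m.logpdf(data).sum() for name, m in candidates.items()}
best = max(log_likelihoods, key=log_likelihoods.get)
print(log_likelihoods)
print("Most likely model:", best)  # Model2 for this data
```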


Noise:

  1. The ability to make accurate predictions on previously unseen inputs is a key goal in machine learning and is known as generalization.

  2. Most real-world data that we collect and want to analyze has some kind of pattern or regularity underlying it. However, the individual data points we observe are "corrupted" or distorted by random noise or variability.

Why is there noise?

  1. Sometimes randomness is just part of the process: random noise refers to unpredictable fluctuations in the data that can obscure the underlying pattern. This noise can be caused by factors that are genuinely random and that we cannot control.
  2. There are hidden factors we don't see: more often, noise comes from variables that affect the data but aren't directly observed or measured. From the model's point of view, these hidden factors behave like another source of noise.

The goal is to learn the real pattern, even with all the noise. By understanding the underlying regularity, we can make better predictions and understand how things work.
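
A tiny sketch of this setup, assuming the underlying regularity is a sine curve and the noise is Gaussian (both choices are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Underlying regularity: a smooth function of x
x = np.linspace(0, 2 * np.pi, 50)
true_pattern = np.sin(x)

# Observed data: the pattern corrupted by random noise (standing in for hidden factors too)
noise = rng.normal(loc=0.0, scale=0.3, size=x.shape)
observed = true_pattern + noise

# The learning goal: recover something close to true_pattern from `observed` alone
```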

Linear Regression:

The image below illustrates a linear regression model with data points that are sampled from a Gaussian distribution. It visually represents the model's predictions at various x-values, the residuals (errors) for each point, and the variability around these predictions indicated by shaded areas, which represent the standard deviation of the distribution. This setup emphasizes the statistical assumption of normally distributed residuals in linear regression analysis.

[Figure: linear regression fit with residuals at each point and shaded bands showing the standard deviation around the predictions]
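
A rough sketch of the setup the figure describes, assuming a single feature and Gaussian noise (the numbers are made up); least-squares fitting coincides with maximum likelihood under that noise assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative assumption: y = w*x + b + Gaussian noise with standard deviation sigma
w_true, b_true, sigma = 2.0, 1.0, 0.5
x = np.linspace(0, 5, 30)
y = w_true * x + b_true + rng.normal(0, sigma, size=x.shape)

# Least-squares fit (equivalent to MLE when the residuals are normally distributed)
w_hat, b_hat = np.polyfit(x, y, deg=1)
predictions = w_hat * x + b_hat

# Residuals: vertical distances between observed points and the fitted line
residuals = y - predictions

# Estimated noise standard deviation, analogous to the shaded band in the figure
sigma_hat = residuals.std(ddof=2)
print(w_hat, b_hat, sigma_hat)
```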