
Probability & Statistics for Machine Learning & Data Science (8)


Introduction to Probability and Probability Distributions

Probability Distributions

Uniform Distribution


In a uniform distribution, every possible value in an interval occurs with the same frequency.

All subintervals of the same width have the same probability.

012

It has two parameters:

  • a: beginning of the interval
  • b: end of the interval
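
Written out (this formula is not shown above but follows directly from the description), the uniform density on $[a, b]$ is

$$f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

so the constant height $\frac{1}{b-a}$ makes the total area equal to 1.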

Normal Distribution


The normal distribution (aka Gaussian distribution) is a bell-shaped distribution that other distributions, such as the binomial, approach when n is huge.

As n gets very large, most of the probability concentrates in the center and tapers off smoothly as we move away from the center point.

[Figure: A way to fit the curve to a normal distribution]

Here, we are trying to fit a bell-shaped curve, $e^{-\frac{x^2}{2}}$, to the data shown in blue in the figure.

It has 2 parameters:

  • Mean = $\mu$
  • Standard deviation = $\sigma$

The distribution shifts as we change the mean (here, we subtract 2 from $x$).

The width of the distribution changes as we change the standard deviation, which can either widen or narrow the curve (here, we divide $x$ by 3).

The height changes when we divide the formula by the area under the curve, $\sigma\sqrt{2\pi}$ (here, we divide by $3\sqrt{2\pi}$ so that the total area becomes 1).
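
Combining the three steps (shift by the mean, scale by the standard deviation, and normalize by the area) gives the general normal PDF; with the values used here, $\mu = 2$ and $\sigma = 3$:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} = \frac{1}{3\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - 2}{3}\right)^2}$$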

[Figure: Normal distribution]

[Figure: Standard normal distribution]

The standard normal distribution has a mean of 0 and a standard deviation of 1.

Standardization, $z = \frac{x - \mu}{\sigma}$, transforms a normal distribution into the standard normal distribution.

By standardizing, we can compare different variables with different ranges.
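
As a minimal sketch of this idea (the variable names and numbers below are made up for illustration), standardizing two variables with very different ranges puts them on the same scale:

```python
import numpy as np

# Hypothetical data on two very different scales.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
incomes_usd = np.array([30_000.0, 45_000.0, 60_000.0, 75_000.0, 90_000.0])

def standardize(x):
    """Convert values to z-scores: z = (x - mean) / standard deviation."""
    return (x - x.mean()) / x.std()

# After standardization both variables have mean 0 and standard deviation 1,
# so a value of +1.0 means "one standard deviation above average" in either case.
print(standardize(heights_cm))
print(standardize(incomes_usd))
```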


In practice, we use software to compute the area under the curve instead of doing it by hand.
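
For example, a minimal sketch using SciPy (assuming the scipy package is available) to compute the area under a normal curve between two points:

```python
from scipy.stats import norm

# Standard normal: mean 0, standard deviation 1.
# Area under the curve between -1 and 1, i.e. P(-1 <= Z <= 1).
area = norm.cdf(1) - norm.cdf(-1)
print(area)  # about 0.6827

# For a general normal distribution, pass loc (mean) and scale (standard deviation).
area_general = norm.cdf(5, loc=2, scale=3) - norm.cdf(-1, loc=2, scale=3)
print(area_general)  # also about 0.6827: one standard deviation around the mean
```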

 

Chi-Squared Distribution


We assume that the noise $Z$ follows the standard normal distribution.

The noise power is roughly modeled by the square of the noise, $W = Z^2$, which captures the dispersion of the noise and determines how hard it is to interpret the received signal correctly.

Each value of $W$ can be achieved with two different values of $Z$ ($-\sqrt{w}$ and $\sqrt{w}$), so we can get the CDF of $W$ by finding the area under the standard normal curve between $-\sqrt{w}$ and $\sqrt{w}$ for each possible value of $w$.

The CDF increases at a larger rate when $w$ is small because the distribution is concentrated around 0.

The CDF is the integral of the PDF, so we can find the PDF by taking the derivative of the CDF.
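
Written out for one degree of freedom (with $\Phi$ and $\varphi$ denoting the standard normal CDF and PDF; this derivation is spelled out here for completeness), the two steps above give

$$F_W(w) = P(-\sqrt{w} \le Z \le \sqrt{w}) = \Phi(\sqrt{w}) - \Phi(-\sqrt{w}) = 2\Phi(\sqrt{w}) - 1,$$

$$f_W(w) = \frac{d}{dw} F_W(w) = 2\,\varphi(\sqrt{w}) \cdot \frac{1}{2\sqrt{w}} = \frac{1}{\sqrt{2\pi w}}\, e^{-w/2}, \qquad w > 0.$$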

As the number of degrees of freedom $k$ increases, the PDF becomes more spread out and more symmetrical.

The chi-squared distribution with $k$ degrees of freedom is the distribution of the sum of $k$ independent squared standard normal variables.
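
A minimal sketch of this fact using NumPy and SciPy (the seed, $k$, and sample size below are arbitrary choices):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
k = 3            # degrees of freedom
n = 100_000      # number of simulated values

# Sum of k independent squared standard normal variables.
z = rng.standard_normal(size=(n, k))
w = (z ** 2).sum(axis=1)

# The simulated mean and variance should be close to the chi-squared(k)
# values, which are k and 2k respectively.
print(w.mean(), w.var())          # roughly 3 and 6
print(chi2.mean(k), chi2.var(k))  # exactly 3.0 and 6.0
```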

Sampling from a Distribution

To get more data, we can create synthetic data that looks similar to the original data by constructing a distribution from the original data and then sampling from it.

Sampling means picking points that have the probabilities given by the original distribution.


The CDF makes it easy to draw samples: pick a value uniformly between 0 and 1 on the vertical axis and map it back through the CDF to find the corresponding sample location.
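
A minimal sketch of this procedure (inverse transform sampling) using NumPy and SciPy; the normal target and its parameters are made-up choices for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Target distribution: a normal with made-up parameters.
mu, sigma = 2.0, 3.0

# 1. Draw uniform values between 0 and 1 (positions on the vertical axis of the CDF).
u = rng.uniform(0, 1, size=5)

# 2. Map them back through the inverse CDF (SciPy's "percent point function")
#    to get sample locations that follow the target distribution.
samples = norm.ppf(u, loc=mu, scale=sigma)
print(samples)
```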

 

All the information provided is based on Probability & Statistics for Machine Learning & Data Science from DeepLearning.AI on Coursera.
