
Probability & Statistics for Machine Learning & Data Science (8)


Introduction to Probability and Probability Distributions

Probability Distributions

Uniform Distribution


In a uniform distribution, every possible value in an interval occurs with the same frequency.

All subintervals of the same width have the same probability.

012

It has two parameters:

  • a: beginning of the interval
  • b: end of the interval
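
Written out (this formula is not shown above but follows directly from the description), the uniform density on $[a, b]$ is

$$f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

so the constant height $\frac{1}{b-a}$ makes the total area equal to 1.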

Normal Distribution


The normal distribution (aka Gaussian distribution) is a bell-shaped distribution that other distributions, such as the binomial, approach when n is huge.

As n gets very large, most of the probability concentrates in the center and tapers off smoothly as we move away from the center point.

[Figure: A way to fit the curve to a normal distribution]

Here, we are trying to fit a bell-shaped curve, $e^{-\frac{x^2}{2}}$, to the data shown in blue in the figure.

It has 2 parameters:

  • Mean = $\mu$
  • Standard deviation = $\sigma$

The distribution shifts as we change the mean (here, we subtract 2 from $x$).

The width of the distribution changes as we change the standard deviation, which can either widen or narrow the curve (here, we divide $x$ by 3).

The height changes when we divide the formula by the area under the curve, $\sigma\sqrt{2\pi}$ (here, we divide by $3\sqrt{2\pi}$ so that the total area becomes 1).
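
Combining the three steps (shift by the mean, scale by the standard deviation, and normalize by the area) gives the general normal PDF; with the values used here, $\mu = 2$ and $\sigma = 3$:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2} = \frac{1}{3\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - 2}{3}\right)^2}$$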

[Figure: Normal distribution]

[Figure: Standard normal distribution]

The standard normal distribution has a mean of 0 and a standard deviation of 1.

Standardization, $z = \frac{x - \mu}{\sigma}$, transforms a normal distribution into the standard normal distribution.

By standardizing, we can compare different variables with different ranges.
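
As a minimal sketch of this idea (the variable names and numbers below are made up for illustration), standardizing two variables with very different ranges puts them on the same scale:

```python
import numpy as np

# Hypothetical data on two very different scales.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0, 190.0])
incomes_usd = np.array([30_000.0, 45_000.0, 60_000.0, 75_000.0, 90_000.0])

def standardize(x):
    """Convert values to z-scores: z = (x - mean) / standard deviation."""
    return (x - x.mean()) / x.std()

# After standardization both variables have mean 0 and standard deviation 1,
# so a value of +1.0 means "one standard deviation above average" in either case.
print(standardize(heights_cm))
print(standardize(incomes_usd))
```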


In practice, we use software to compute the area under the curve instead of doing it by hand.
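
For example, a minimal sketch using SciPy (assuming the scipy package is available) to compute the area under a normal curve between two points:

```python
from scipy.stats import norm

# Standard normal: mean 0, standard deviation 1.
# Area under the curve between -1 and 1, i.e. P(-1 <= Z <= 1).
area = norm.cdf(1) - norm.cdf(-1)
print(area)  # about 0.6827

# For a general normal distribution, pass loc (mean) and scale (standard deviation).
area_general = norm.cdf(5, loc=2, scale=3) - norm.cdf(-1, loc=2, scale=3)
print(area_general)  # also about 0.6827: one standard deviation around the mean
```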

 

Chi-Squared Distribution


We assume that the noise $Z$ follows the standard normal distribution.

The noise power is roughly modeled by the square of the noise, $W = Z^2$, which captures the dispersion of the noise and determines how hard it is to interpret the received signal correctly.

Each value of $W$ can be achieved with two different values of $Z$ ($-\sqrt{w}$ and $\sqrt{w}$), so we can get the CDF of $W$ by finding the area under the standard normal curve between $-\sqrt{w}$ and $\sqrt{w}$ for each possible value of $w$.

The CDF increases at a larger rate when $w$ is small because the distribution is concentrated around 0.

The CDF is the integral of the PDF, so we can find the PDF by taking the derivative of the CDF.
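
Written out for one degree of freedom (with $\Phi$ and $\varphi$ denoting the standard normal CDF and PDF; this derivation is spelled out here for completeness), the two steps above give

$$F_W(w) = P(-\sqrt{w} \le Z \le \sqrt{w}) = \Phi(\sqrt{w}) - \Phi(-\sqrt{w}) = 2\Phi(\sqrt{w}) - 1,$$

$$f_W(w) = \frac{d}{dw} F_W(w) = 2\,\varphi(\sqrt{w}) \cdot \frac{1}{2\sqrt{w}} = \frac{1}{\sqrt{2\pi w}}\, e^{-w/2}, \qquad w > 0.$$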

As the number of degrees of freedom $k$ increases, the PDF becomes more spread out and more symmetrical.

The chi-squared distribution with $k$ degrees of freedom is the distribution of the sum of $k$ independent squared standard normal variables.
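
A minimal sketch of this fact using NumPy and SciPy (the seed, $k$, and sample size below are arbitrary choices):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
k = 3            # degrees of freedom
n = 100_000      # number of simulated values

# Sum of k independent squared standard normal variables.
z = rng.standard_normal(size=(n, k))
w = (z ** 2).sum(axis=1)

# The simulated mean and variance should be close to the chi-squared(k)
# values, which are k and 2k respectively.
print(w.mean(), w.var())          # roughly 3 and 6
print(chi2.mean(k), chi2.var(k))  # exactly 3.0 and 6.0
```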

Sampling from a Distribution

To get more data, we can create synthetic data that looks similar to the original data by constructing a distribution from the original data and then sampling from it.

Sampling means picking points that have the probabilities given by the original distribution.


The CDF makes it easy to draw samples: pick a value uniformly between 0 and 1 on the vertical axis and map it back through the CDF to find the corresponding sample location.
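
A minimal sketch of this procedure (inverse transform sampling) using NumPy and SciPy; the normal target and its parameters are made-up choices for illustration:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# Target distribution: a normal with made-up parameters.
mu, sigma = 2.0, 3.0

# 1. Draw uniform values between 0 and 1 (positions on the vertical axis of the CDF).
u = rng.uniform(0, 1, size=5)

# 2. Map them back through the inverse CDF (SciPy's "percent point function")
#    to get sample locations that follow the target distribution.
samples = norm.ppf(u, loc=mu, scale=sigma)
print(samples)
```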

 

All the information provided is based on Probability & Statistics for Machine Learning & Data Science from DeepLearning.AI on Coursera.
