Probability & Statistics for Machine Learning & Data Science (15)

Describing probability distributions and probability distributions with multiple variables

Probability Distributions with Multiple Variables

Joint Distribution (Discrete) - Part 1

Here we have two histograms of children’s age and height (two variables).

Given the following dataset, what is the probability that a child is 9 years old and 49 inches tall?

0.3

Getting the count and probabilities in a table format like the one below is easier.

Joint Distribution (Discrete) - Part 2

Independent joint distributions of discrete variables

When the discrete random variables are independent, we multiply the probabilities.
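As a minimal sketch of the product rule, assume two independent fair six-sided dice (a hypothetical example, not one from the course). The joint PMF is just the product of the two marginal PMFs:

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: two independent fair six-sided dice.
p_x = {face: Fraction(1, 6) for face in range(1, 7)}
p_y = {face: Fraction(1, 6) for face in range(1, 7)}

# Under independence, p_XY(x, y) = p_X(x) * p_Y(y).
joint = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}

print(joint[(3, 5)])        # 1/36
print(sum(joint.values()))  # 1
```

Exact fractions are used so the probabilities can be checked by hand.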

Dependent joint distributions of discrete variables

If they are not independent, we can use a histogram to get the count and probabilities.

For dependent joint distributions, we divide the count of each combination of values by the total number of observations.
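The counting approach can be sketched as follows. The `children` list is a made-up dataset of 10 (age, height) pairs, chosen only so that it agrees with the numbers quoted in this post; it is not the course's actual data.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical (age, height) observations, made up for illustration.
children = [(7, 45), (7, 46), (7, 46), (8, 47), (8, 47),
            (9, 49), (9, 49), (9, 49), (10, 50), (10, 50)]

counts = Counter(children)   # how often each (age, height) pair occurs
total = len(children)        # total number of observations

# Joint PMF: count of each pair divided by the number of observations.
joint_pmf = {pair: Fraction(n, total) for pair, n in counts.items()}

print(joint_pmf[(9, 49)])  # 3/10
```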

Joint Distribution (Continuous)

Remember that continuous data has to be grouped into ranges or intervals: a continuous variable can take infinitely many values, so individual values cannot be assigned meaningful probabilities one by one.

We can visualize joint distributions of continuous variables in the form of histograms, heat maps, scatter plots, and density plots.
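A rough sketch of the binning step, using only the standard library. The `samples` values and the bin layout (1-year age bins starting at 7, 2-inch height bins starting at 45) are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical continuous measurements: (age in years, height in inches).
samples = [(7.2, 45.8), (7.9, 46.1), (8.4, 47.5), (8.6, 47.9),
           (9.1, 49.2), (9.5, 49.7), (9.8, 50.3), (10.3, 50.9)]

def bin_index(value, start, width):
    """Index k of the interval [start + k*width, start + (k+1)*width) holding value."""
    return int((value - start) // width)

# Count how many samples fall into each (age bin, height bin) cell.
binned = Counter(
    (bin_index(age, 7.0, 1.0), bin_index(height, 45.0, 2.0))
    for age, height in samples
)

# Empirical probability of each cell: cell count / number of samples.
bin_prob = {cell: n / len(samples) for cell, n in binned.items()}

print(bin_prob[(2, 2)])  # 0.375 (ages 9 to 10, heights 49 to 51)
```

The resulting cell probabilities are what a 2D histogram or heat map would display.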

Getting the expected value and variance of the dataset

What is the difference between preparing a joint distribution for discrete versus continuous variables?

1. Discrete joint distributions assign probabilities to individual outcomes, while continuous joint distributions assign probabilities to ranges or intervals of values.
2. Discrete joint distributions have a countable or finite number of possible outcomes, while continuous joint distributions have an uncountably infinite number of possible outcomes.
3. Discrete joint distributions are typically represented by scatter plots, while continuous joint distributions are represented by bar charts.

Answer

1, 2

Marginal and Conditional Distribution

The marginal distribution of a variable is what we get when we aggregate (sum) the joint distribution over all values of the other variable(s).

It’s like reducing a dimension: we focus on one variable and ignore the others.

A conditional distribution is what we get when we slice the data at a specific value of one or more features.

It’s like filtering the data.

Create a table to represent the marginal distribution for Height(Y).

Height (Y)	45	46	47	48	49	50
Probability	1/10	2/10	2/10	0	3/10	2/10

What is the formula for $p_Y(50)$?

$p_Y(50)=\sum_ip_{XY}(x_i, 50)={2\over10}$
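The sum over $x_i$ can be sketched in code. The `joint_pmf` below is a hypothetical joint distribution, invented so that its height marginal matches the table above; the course's actual joint table is not reproduced here.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint PMF p_XY(age, height), made up to match the marginal table.
joint_pmf = {
    (7, 45): Fraction(1, 10),
    (7, 46): Fraction(2, 10),
    (8, 47): Fraction(2, 10),
    (9, 49): Fraction(3, 10),
    (10, 50): Fraction(2, 10),
}

# Marginal of height: sum the joint probabilities over every age.
marginal_height = defaultdict(Fraction)
for (age, height), p in joint_pmf.items():
    marginal_height[height] += p

print(marginal_height[50])  # 1/5, i.e. 2/10
```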

We get the marginal distribution for continuous variables the same way: we aggregate along the axis we want to drop, integrating the joint density over the other variable instead of summing.

Conditional Distribution

A conditional distribution is used when we want the distribution of one variable given the value of the other.

To get it, we normalize: we divide the joint probabilities by the marginal probability of the given value.

What is the value of P(Y=46 | X=7)?

2/3

The formula, $p_{Y\mid X}(y\mid x)={p_{XY}(x, y)\over p_X(x)}$, is the specific joint probability we want divided by the sum of the probabilities in the row or column of the given value.

We are applying the conditional probability rule when we normalize.
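A minimal sketch of this normalization, assuming a hypothetical `joint_pmf` made up to agree with the values quoted in this post (it is not the course's dataset). The helper name `conditional_height_given_age` is also just illustrative.

```python
from fractions import Fraction

# Hypothetical joint PMF p_XY(age, height), made up for illustration.
joint_pmf = {
    (7, 45): Fraction(1, 10),
    (7, 46): Fraction(2, 10),
    (8, 47): Fraction(2, 10),
    (9, 49): Fraction(3, 10),
    (10, 50): Fraction(2, 10),
}

def conditional_height_given_age(joint, age):
    """Return p_{Y|X}(height | age) as a dict keyed by height."""
    # Marginal p_X(age): sum of the joint probabilities in that age's row.
    p_age = sum(p for (a, _), p in joint.items() if a == age)
    if p_age == 0:
        raise ValueError("cannot condition on an age with zero probability")
    # Normalize each joint probability in the row by the marginal.
    return {h: p / p_age for (a, h), p in joint.items() if a == age}

cond = conditional_height_given_age(joint_pmf, 7)
print(cond[46])  # 2/3
```

The conditional probabilities for a given age always sum to 1, which is a quick sanity check on the normalization.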

Another example of conditional distributions of discrete variables

The formula for the continuous conditional distribution stays the same, except we use the PDF (Probability Density Function) instead of PMF (Probability Mass Function).
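Written out, with $f$ denoting a density (the notation is an assumption here, not taken from the course slides), the continuous version reads:

$$f_{Y\mid X}(y\mid x)=\frac{f_{XY}(x, y)}{f_X(x)},\qquad f_X(x)=\int_{-\infty}^{\infty}f_{XY}(x, y)\,dy,\qquad f_X(x)>0$$

The integral plays the role that the sum over a row or column plays in the discrete case.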

All the information provided is based on the Probability & Statistics for Machine Learning & Data Science course on Coursera, from DeepLearning.AI.

License: Attribution, NonCommercial, NoDerivatives
