Probability & Statistics for Machine Learning & Data Science (15)

Describing probability distributions and probability distributions with multiple variables

Probability Distributions with Multiple Variables

Joint Distribution (Discrete) - Part 1

Here we have two histograms of children’s age and height (two variables).

Given the following dataset, what is the probability that a child is 9 years old and 49 inches tall?

Answer

0.3

Getting the count and probabilities in a table format like the one below is easier.

Joint Distribution (Discrete) - Part 2

Figure: Independent joint distributions of discrete variables

When the discrete random variables are independent, the joint probability is the product of the individual (marginal) probabilities: $p_{XY}(x, y) = p_X(x)\,p_Y(y)$.
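As a small illustration of the multiplication rule, here is a minimal sketch using two fair six-sided dice. The dice example and the names `p_x`, `p_y`, and `joint` are assumptions made for this sketch and are not taken from the course material.

```python
from fractions import Fraction

# Hypothetical example: two independent fair six-sided dice.
faces = range(1, 7)
p_x = {x: Fraction(1, 6) for x in faces}  # marginal PMF of the first die
p_y = {y: Fraction(1, 6) for y in faces}  # marginal PMF of the second die

# Under independence, p_XY(x, y) = p_X(x) * p_Y(y) for every pair (x, y).
joint = {(x, y): p_x[x] * p_y[y] for x in faces for y in faces}

# Probability that the first die shows 2 and the second shows 5.
p_2_and_5 = joint[(2, 5)]
print(p_2_and_5)  # 1/36
```

Using `Fraction` keeps the arithmetic exact, so the 36 joint probabilities add up to exactly 1.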

Figure: Dependent joint distributions of discrete variables

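If they are not independent, multiplying the marginal probabilities no longer gives the joint probability. Instead, we count how often each combination of values occurs, for example with a two-dimensional histogram or a count table.

We then divide each count by the total number of observations to get the joint probability of that combination.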
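The count-then-divide procedure can be sketched as follows. The (age, height) pairs are made up for illustration and are not the course dataset. Only the totals were chosen to agree with the answers quoted in this post, and the name `children` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical (age in years, height in inches) observations; NOT the course data.
children = [
    (7, 45), (7, 46), (8, 46), (8, 47), (8, 47),
    (9, 49), (9, 49), (9, 49), (10, 50), (10, 50),
]

# Count how often each (age, height) combination occurs.
counts = Counter(children)
n = len(children)  # total number of observations

# Joint probability of each observed combination = count / total observations.
joint = {pair: Fraction(c, n) for pair, c in counts.items()}

print(joint[(9, 49)])  # 3/10
```

Combinations that never occur in the data simply have no entry, which corresponds to an estimated probability of 0.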

Joint Distribution (Continuous)

Remember that continuous data must be grouped into ranges (intervals): a continuous variable can take infinitely many values, so the probability of any single exact value is zero and only intervals carry meaningful probability.

We can visualize joint distributions of continuous variables in the form of histograms, heat maps, scatter plots, and density plots.
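The binning step behind a two-dimensional histogram or heat map can be sketched in plain Python. The data are synthetic, and the bin edges, widths, and the helper `bin_index` are assumptions chosen only for this illustration.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic continuous measurements (made up): age in years, height in inches.
ages = [random.uniform(7.0, 11.0) for _ in range(1000)]
heights = [40.0 + a + random.gauss(0.0, 1.5) for a in ages]

def bin_index(value, low, width):
    """Index k of the interval [low + k*width, low + (k+1)*width) containing value."""
    return int((value - low) // width)

AGE_LOW, AGE_WIDTH = 7.0, 1.0          # 1-year age bins
HEIGHT_LOW, HEIGHT_WIDTH = 40.0, 2.0   # 2-inch height bins

# Count the observations that fall into each (age bin, height bin) cell.
counts = Counter(
    (bin_index(a, AGE_LOW, AGE_WIDTH), bin_index(h, HEIGHT_LOW, HEIGHT_WIDTH))
    for a, h in zip(ages, heights)
)

# Estimated probability of each cell = cell count / total observations.
n = len(ages)
joint = {cell: c / n for cell, c in counts.items()}
```

Each key of `joint` identifies a rectangle of values rather than a single point, which is the main difference from the discrete case.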

Figure: Getting the expected value and variance of the dataset

What is the difference between preparing a joint distribution for discrete versus continuous variables?

  1. Discrete joint distributions assign probabilities to individual outcomes, while continuous joint distributions assign probabilities to ranges or intervals of values.
  2. Discrete joint distributions have a countable or finite number of possible outcomes, while continuous joint distributions have an uncountably infinite number of possible outcomes.
  3. Discrete joint distributions are typically represented by scatter plots, while continuous joint distributions are represented by bar charts.

Answer

1, 2

Marginal and Conditional Distribution

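A marginal distribution is the distribution of one variable on its own, obtained by aggregating (summing) the joint distribution over all values of the other feature(s).

  • It’s like reducing a dimension: we focus on one variable and sum out the rest.

A conditional distribution is what we get when we slice the data at a specific value of one or more features and look at how the remaining variable is distributed within that slice.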

  • It’s like filtering the data.

Create a table to represent the marginal distribution for Height(Y).

Answer

Height (Y)  | 45   | 46   | 47   | 48 | 49   | 50
Probability | 1/10 | 2/10 | 2/10 | 0  | 3/10 | 2/10

What is the formula for $p_Y(50)$?

Answer

$p_Y(50) = \sum_i p_{XY}(x_i, 50) = \frac{2}{10}$
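The same sum can be computed from a joint table. The sketch below uses a hypothetical joint PMF whose height totals agree with the marginal table in this post. The split of each height across ages is invented, and `marginal` is a helper name assumed for this example.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint PMF p_XY(age, height); the age split is invented.
joint = {
    (7, 45): Fraction(1, 10), (7, 46): Fraction(1, 10),
    (8, 46): Fraction(1, 10), (8, 47): Fraction(2, 10),
    (9, 49): Fraction(3, 10), (10, 50): Fraction(2, 10),
}

def marginal(joint, axis):
    """Sum the joint PMF over the other variable (axis 0 = age X, axis 1 = height Y)."""
    out = defaultdict(Fraction)  # Fraction() defaults to 0
    for key, p in joint.items():
        out[key[axis]] += p
    return dict(out)

p_x = marginal(joint, 0)  # marginal distribution of age
p_y = marginal(joint, 1)  # marginal distribution of height

print(p_y[50])  # 1/5
```

`Fraction` reduces 2/10 to 1/5 automatically. A height of 48 has no entry here because it never occurs, matching the probability of 0 in the table.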

Figure: Examples of marginal distributions

We get the marginal distribution for continuous variables the same way: we aggregate over the other variable (integrating instead of summing) so that only the variable we want to focus on remains.

Conditional Distribution

Conditional Distribution is used when we want to find the distribution across one variable when the other one is given.

To do that we have to normalize the data by the marginal distribution of the given variable.

What is the value of P(Y=46 | X=7)?

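The conditional probability is the specific joint probability we want divided by the marginal probability of the given value, which is the sum of that row or column of the joint table: $p_{Y|X}(y \mid x) = p_{XY}(x, y) / p_X(x)$.

Normalizing in this way is simply an application of the conditional probability rule.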
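A minimal sketch of this normalization is shown below. The joint PMF is hypothetical (the way heights are split across ages is invented, not taken from the course dataset), and `conditional_y_given_x` is a helper name assumed for this example. It conditions on X=8 so as not to suggest an answer to the question above.

```python
from fractions import Fraction

# Hypothetical joint PMF p_XY(age, height); NOT the course dataset.
joint = {
    (7, 45): Fraction(1, 10), (7, 46): Fraction(1, 10),
    (8, 46): Fraction(1, 10), (8, 47): Fraction(2, 10),
    (9, 49): Fraction(3, 10), (10, 50): Fraction(2, 10),
}

def conditional_y_given_x(joint, x):
    """Return p_{Y|X}(y | x) = p_XY(x, y) / p_X(x) for every y observed with this x."""
    # Marginal probability of the given value: sum of the slice X = x.
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    if p_x == 0:
        raise ValueError(f"X={x} has probability 0, so conditioning is undefined")
    # Filter to the slice X = x, then normalize by the marginal.
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

cond = conditional_y_given_x(joint, 8)
print(cond[46])  # 1/3
print(cond[47])  # 2/3
```

After normalization the probabilities within the slice add up to 1, which is what makes the result a proper distribution over Y.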

Figure: Another example of conditional distributions of discrete variables

The formula for the continuous conditional distribution stays the same, except we use the PDF (Probability Density Function) instead of PMF (Probability Mass Function).
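Written side by side, the two cases would look roughly as follows. The symbols $f_{XY}$, $f_X$, and $f_{Y \mid X}$ are standard textbook notation assumed here, not copied from the course slides, and both expressions are only defined where the denominator is positive.

```latex
% Discrete case: probability mass functions (PMF)
p_{Y \mid X}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)},
\qquad p_X(x) = \sum_j p_{XY}(x, y_j)

% Continuous case: probability density functions (PDF)
f_{Y \mid X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)},
\qquad f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y)\, dy
```

The only structural change is that the sum defining the marginal becomes an integral.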

 

All the information provided is based on the Coursera course Probability & Statistics for Machine Learning & Data Science from DeepLearning.AI.
