
Probability & Statistics for Machine Learning & Data Science (15)

by Fresh Red 2024. 9. 28.


    Describing probability distributions and probability distributions with multiple variables

    Probability Distributions with Multiple Variables


    Joint Distribution (Discrete) - Part 1

    Here we have two histograms of children’s age and height (two variables).


    Given the following dataset, what is the probability that a child is 9 years old and 49 inches tall?

    Answer

    0.3


    It is easier to read off the counts and probabilities once the data is arranged in a table.


    Joint Distribution (Discrete) - Part 2

    Figure: Independent joint distributions of discrete variables

    When the discrete random variables are independent, we multiply the probabilities.
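    As a minimal sketch of this product rule, assume two fair six-sided dice. The dice and the names `p_x`, `p_y`, and `joint` are illustrative assumptions, not the course's dataset.

```python
from fractions import Fraction

# Hypothetical marginals: two independent fair dice (an assumption for illustration).
p_x = {face: Fraction(1, 6) for face in range(1, 7)}
p_y = {face: Fraction(1, 6) for face in range(1, 7)}

# Under independence, p_XY(x, y) = p_X(x) * p_Y(y) for every pair (x, y).
joint = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}

print(joint[(3, 5)])        # 1/36
print(sum(joint.values()))  # 1
```

    Exact fractions are used so that the joint probabilities sum to exactly 1 without floating-point rounding.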

    Figure: Dependent joint distributions of discrete variables

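    If they are not independent, we cannot multiply the marginals; instead, we count how often each combination of values occurs, for example with a histogram.

    For dependent joint distributions, we estimate the probability of each combination by dividing its count by the total number of observations.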
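    The counting approach can be sketched as follows. The `children` records are hypothetical: they were made up to agree with the 0.3 answer quoted earlier, and the individual ages and pairings are assumptions rather than the course's actual dataset.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical (age, height) observations; NOT the course's dataset.
children = [(7, 45), (7, 46), (8, 46), (8, 47), (8, 47),
            (9, 49), (9, 49), (9, 49), (9, 50), (10, 50)]

counts = Counter(children)  # how often each (age, height) pair occurs
total = len(children)       # total number of observations

# Joint PMF estimate: count of each pair divided by the number of observations.
joint = {pair: Fraction(n, total) for pair, n in counts.items()}

print(joint[(9, 49)])  # 3/10
```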

    Joint Distribution (Continuous)


    Remember that we have to group continuous data into ranges or intervals: a continuous variable can take infinitely many values, so the probability of any single exact value is zero and only intervals carry probability.
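    A rough sketch of this binning step is shown below. The `samples`, the bin widths, and the helper `bin_index` are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical continuous measurements: (age in years, height in inches).
samples = [(7.2, 45.3), (7.9, 46.1), (8.4, 46.8),
           (8.6, 47.5), (9.1, 49.2), (9.7, 49.9)]

def bin_index(value, start, width):
    """Index k of the half-open interval [start + k*width, start + (k+1)*width)."""
    return int((value - start) // width)

# Assumed bins: 1-year-wide age bins from 7, 2-inch-wide height bins from 45.
binned = Counter(
    (bin_index(age, 7.0, 1.0), bin_index(height, 45.0, 2.0))
    for age, height in samples
)
total = len(samples)

# Estimated probability that a sample falls in each (age bin, height bin) cell.
cell_prob = {cell: n / total for cell, n in binned.items()}
```

    Each key of `cell_prob` is a pair of bin indices, so the result is a discrete joint distribution over cells rather than over individual values.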

    We can visualize joint distributions of continuous variables in the form of histograms, heat maps, scatter plots, and density plots.

    Figure: Getting the expected value and variance of the dataset

    What is the difference between preparing a joint distribution for discrete versus continuous variables?

    1. Discrete joint distributions assign probabilities to individual outcomes, while continuous joint distributions assign probabilities to ranges or intervals of values.
    2. Discrete joint distributions have a countable or finite number of possible outcomes, while continuous joint distributions have an uncountably infinite number of possible outcomes.
    3. Discrete joint distributions are typically represented by scatter plots, while continuous joint distributions are represented by bar charts.

    Answer

    1, 2

    Marginal and Conditional Distribution

    A marginal distribution is what we get when we aggregate (sum) the joint distribution over the other feature(s), leaving the distribution of the feature(s) we care about.

    • It’s like reducing a dimension: we focus on one variable and ignore the others.

    A conditional distribution is what we get when we slice the data at a particular value of certain feature(s).

    • It’s like filtering the data.
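    The "aggregate versus filter" idea can be sketched on raw records. The `children` list is hypothetical (made-up ages and pairings, not the course's dataset), so the numbers below describe only this toy data.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical (age, height) observations; NOT the course's dataset.
children = [(7, 45), (7, 46), (8, 46), (8, 47), (8, 47),
            (9, 49), (9, 49), (9, 49), (9, 50), (10, 50)]
total = len(children)

# Marginal of height: ignore age and count heights over ALL records.
height_counts = Counter(height for _, height in children)
marginal_height = {h: Fraction(n, total) for h, n in height_counts.items()}

# Conditional on age == 9: keep only the matching records, then count heights.
heights_at_9 = [height for age, height in children if age == 9]
cond_height_given_9 = {
    h: Fraction(n, len(heights_at_9)) for h, n in Counter(heights_at_9).items()
}

print(marginal_height[50])      # 1/5 (that is, 2/10)
print(cond_height_given_9[49])  # 3/4
```

    Note that the conditional probabilities are divided by the number of filtered records, not by the full dataset size, so they sum to 1 on their own.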

    Create a table to represent the marginal distribution for Height(Y).

    Answer

    Height (Y)    45    46    47    48    49    50
    Probability   1/10  2/10  2/10  0     3/10  2/10

    What is the formula for $p_Y(50)$?

    Answer

    $p_Y(50)=\sum_ip_{XY}(x_i, 50)={2\over10}$

    Figure: Examples of marginal distributions

    We get the marginal distribution for continuous variables the same way: we aggregate over the other variable (integrating instead of summing), which leaves the distribution along the axis we want to focus on.
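    In symbols, the sum from the discrete case becomes an integral. This is a sketch that writes $f_{XY}$ for the joint PDF and $f_Y$ for the marginal PDF; that notation is an assumption here, not taken from the course slides.

    $$f_Y(y) = \int_{-\infty}^{\infty} f_{XY}(x, y)\,dx$$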

    Conditional Distribution

    A conditional distribution is used when we want the distribution of one variable given a known value of the other.

    To get it, we normalize the joint probabilities by the marginal probability of the given variable.


    What is the value of P(Y=46 | X=7)?

    The conditional formula, $p_{Y|X}(y \mid x) = \frac{p_{XY}(x, y)}{p_X(x)}$, divides the specific joint probability we want by the sum of the probabilities in the row or column of the given variable.

    We are applying the conditional probability rule when we normalize.
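    The normalization step can be sketched as follows. The `joint` table and the helper `conditional_on_x` are hypothetical, and the example conditions on X = 9 in made-up data, so it does not answer the question above.

```python
from fractions import Fraction

# Hypothetical joint PMF p_XY(age, height); values are assumptions for illustration.
joint = {
    (7, 45): Fraction(1, 10), (7, 46): Fraction(1, 10),
    (8, 46): Fraction(1, 10), (8, 47): Fraction(2, 10),
    (9, 49): Fraction(3, 10), (9, 50): Fraction(1, 10),
    (10, 50): Fraction(1, 10),
}

def conditional_on_x(joint, x_given):
    """Return p_{Y|X}(y | x_given) as a dict, computed as p_XY(x, y) / p_X(x)."""
    # Marginal p_X(x_given): sum the joint over every y that occurs with x_given.
    p_x = sum(p for (x, _), p in joint.items() if x == x_given)
    if p_x == 0:
        raise ValueError("cannot condition on an event with zero probability")
    return {y: p / p_x for (x, y), p in joint.items() if x == x_given}

cond = conditional_on_x(joint, 9)
print(cond)  # {49: Fraction(3, 4), 50: Fraction(1, 4)}
```

    Dividing by the marginal is what makes the sliced row sum to 1, which is why the slice is a valid distribution.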

    Figure: Another example of conditional distributions of discrete variables

    The formula for the continuous conditional distribution stays the same, except we use the PDF (Probability Density Function) instead of PMF (Probability Mass Function).
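    Written out as a sketch, with $f_{XY}$ for the joint PDF and $f_X$ for the marginal PDF of the given variable (notation assumed here, not taken from the course slides):

    $$f_{Y \mid X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)}, \qquad f_X(x) > 0$$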

     

    All the information provided is based on the Probability & Statistics for Machine Learning & Data Science course on Coursera from DeepLearning.AI.
