Describing probability distributions and probability distributions with multiple variables
Describing Distributions
Quantiles and Box-Plots



We need to look at the data not only numerically but also visually.



Visualizing data: Box-Plots



A box plot visualizes the data through its quantiles. To draw one we need the minimum, the maximum, the first quartile (25%), the median (50%), and the third quartile (75%), plus the interquartile range (IQR), which is the third quartile minus the first quartile.
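As a minimal sketch of the numbers behind a box plot, the snippet below computes the five-number summary and the IQR with Python's standard `statistics` module. The `data` list is made up for illustration, and the `"inclusive"` quartile method is an assumption; other quartile conventions give slightly different values.

```python
from statistics import quantiles

# Made-up sample, used only for illustration.
data = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]

# n=4 asks for the three cut points that split the data into quarters.
# "inclusive" treats the sample minimum and maximum as the 0% and 100% points.
q1, median, q3 = quantiles(data, n=4, method="inclusive")

# Interquartile range: third quartile minus first quartile.
iqr = q3 - q1

summary = {
    "min": min(data),
    "q1": q1,
    "median": median,
    "q3": q3,
    "max": max(data),
    "iqr": iqr,
}
print(summary)
```

The box spans `q1` to `q3`, the line inside the box marks `median`, and the whiskers extend toward `min` and `max`.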
Visualizing data: Kernel density estimation


Continuous data can be plotted with a histogram, but a histogram is made of blocky bars rather than a smooth curve.
One way to approximate the probability density function (PDF) with a smooth curve is kernel density estimation (KDE).




The blue Gaussian curves, one centered on each data point, are called kernels. Averaging them gives the smooth KDE curve.
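The idea can be sketched in a few lines of plain Python: place one Gaussian kernel on each data point and average them. The `samples`, the `bandwidth` value, and the helper names `gaussian_kernel` and `kde` are all assumptions chosen for illustration, not something taken from the course.

```python
import math

def gaussian_kernel(u):
    """Standard normal density evaluated at u."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(x, samples, bandwidth):
    """Average of Gaussian kernels centered on each sample, evaluated at x."""
    n = len(samples)
    total = sum(gaussian_kernel((x - s) / bandwidth) for s in samples)
    return total / (n * bandwidth)

# Made-up data and an assumed bandwidth (the width of each kernel).
samples = [1.0, 1.5, 2.0, 4.0, 4.2]
bandwidth = 0.5

# Evaluate the estimated density on a grid from -2.0 to 8.0 in steps of 0.1.
step = 0.1
grid = [i * step for i in range(-20, 81)]
density = [kde(x, samples, bandwidth) for x in grid]

# A density should integrate to roughly 1; check with a simple Riemann sum.
area = sum(density) * step
print(f"approximate area under the KDE: {area:.3f}")
```

A smaller bandwidth makes the curve follow individual points more closely, while a larger one smooths it out.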
Visualizing data: Violin Plots


Violin plots are useful because they combine a KDE with a box plot, showing both the shape of the distribution and its quartiles in one figure.
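To make the combination concrete, here is a rough text-only sketch: the width of each row comes from a Gaussian KDE mirrored around the center, and rows that fall between the first and third quartile are marked as the box. The data, the `bandwidth`, the scaling factor, and the rendering itself are assumptions for illustration; a plotting library would normally draw this.

```python
import math
from statistics import quantiles

def kde(x, samples, bandwidth):
    """Gaussian kernel density estimate evaluated at x."""
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return total / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

# Made-up sample and an assumed bandwidth.
samples = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]
bandwidth = 1.5

# Box-plot ingredients.
q1, median, q3 = quantiles(samples, n=4, method="inclusive")

# KDE ingredients: one row per value between the minimum and the maximum.
low, high = min(samples), max(samples)
steps = 13
rows = []
for i in range(steps + 1):
    y = low + (high - low) * i / steps
    half_width = round(100 * kde(y, samples, bandwidth))  # arbitrary scale
    in_box = q1 <= y <= q3
    rows.append((y, half_width, in_box))

# Mirror the density on both sides of a center marker.
for y, half_width, in_box in rows:
    center = "|" if in_box else "."
    bar = "#" * half_width + center + "#" * half_width
    print(f"{y:5.1f} {bar.center(41)}")
```

Rows marked with `|` correspond to the box (the middle 50% of the data), and the mirrored `#` bars trace the violin outline.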
Visualizing data: QQ plots


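Histograms are useful in visualizing the distribution of the data. A QQ plot goes one step further: it plots the quantiles of the sample against the quantiles of a theoretical distribution (here, a Gaussian), and points that fall close to a straight line indicate that the data follows that distribution.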

With the example newspaper dataset, we can see that the data does not follow a Gaussian distribution. Most of the data sits on the left with a long tail to the right, which tells us that the dataset is right-skewed.

With the sales data, both the histogram and the QQ plot indicate that the data is approximately Gaussian.
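The points of a normal QQ plot can be computed with the standard library alone. In the sketch below, the helper `qq_points`, the plotting positions `(i + 0.5) / n`, and the `sales` values are assumptions for illustration; they are not the course's dataset.

```python
from statistics import NormalDist

def qq_points(samples):
    """Pair each sorted sample value with a standard normal quantile."""
    ordered = sorted(samples)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i + 0.5) / n avoid the undefined 0% and 100% quantiles.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Made-up values standing in for a sales column.
sales = [48.0, 52.5, 45.1, 50.3, 55.8, 49.2, 51.7, 47.6, 53.4, 50.9]

points = qq_points(sales)
for theoretical_q, sample_q in points:
    print(f"{theoretical_q:+.2f}  {sample_q:.1f}")
```

If the sample is roughly Gaussian, plotting `sample_q` against `theoretical_q` gives points near a straight line whose slope is about the standard deviation and whose intercept is about the mean. A right-skewed sample bends upward away from the line at the right end.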
All the information provided is based on the course Probability & Statistics for Machine Learning & Data Science on Coursera, from DeepLearning.AI.