Machine Learning 101: Normal Distribution Vs Uniform Distribution


Normal Distribution Vs. Uniform Distribution Python Code Using Pandas

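import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# generate 100,000 values from the standard normal distribution (mean 0, std 1)
normal_distribution = np.random.normal(size=100000)

# generate 100,000 values from the uniform distribution on [0, 1)
uniform_distribution = np.random.uniform(size=100000)

# create a dataframe so both columns can be plotted together
df = pd.DataFrame({'Normal': normal_distribution,
                   'Uniform': uniform_distribution})

# draw one 50-bin histogram per column, then display the figure
df.hist(bins=50)
plt.show()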

What is a Normal Distribution?

A normal distribution, also known as a bell curve or Gaussian distribution, is a probability distribution that is widely used in statistics and machine learning.

The normal distribution is symmetric around its mean, with a tail extending in each direction.

This distribution is extremely common and approximately describes everyday measurements such as heights and IQ scores.

While a normal distribution can have any mean and any positive standard deviation, the standard normal distribution always has a mean of zero and a standard deviation (and variance) of one. The two are often confused.
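To make the difference concrete, here is a minimal sketch that converts samples from an arbitrary normal distribution into standard-normal form by subtracting the mean and dividing by the standard deviation. The mean of 170, the standard deviation of 10, and the variable names are assumptions picked for illustration only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed is arbitrary, only for reproducibility

# Hypothetical normal distribution: mean 170, standard deviation 10
heights = rng.normal(loc=170.0, scale=10.0, size=100_000)

# Standardize: subtract the sample mean, divide by the sample standard deviation
z_scores = (heights - heights.mean()) / heights.std()

print("original mean/std:    ", heights.mean(), heights.std())
print("standardized mean/std:", z_scores.mean(), z_scores.std())
```

After standardizing, the sample mean is zero and the sample standard deviation is one, up to floating-point error, which is what the standard normal distribution requires.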

What is a Uniform Distribution?

A uniform distribution is a probability distribution where all possible values have the same probability of occurring.

Since all values are equally likely to show up, a histogram of samples from a continuous uniform distribution takes on a rectangular shape as the number of samples n increases.

This is why the uniform distribution is also known as the rectangular distribution.
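The effect of the sample size on that rectangular shape can be checked numerically. The sketch below compares how uneven a 10-bin histogram is for a small and a large sample. The helper name `bin_spread` and the chosen sample sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility

def bin_spread(n, bins=10):
    """Difference between the fullest and emptiest bin, as a share of n draws."""
    samples = rng.uniform(size=n)
    counts, _ = np.histogram(samples, bins=bins, range=(0.0, 1.0))
    shares = counts / n
    return shares.max() - shares.min()

spread_small = bin_spread(100)
spread_large = bin_spread(1_000_000)

print("spread with n=100:      ", spread_small)
print("spread with n=1,000,000:", spread_large)
```

A smaller spread means the bars are closer to the same height, so the histogram looks more like a rectangle.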

A discrete uniform distribution applies when there is a finite set of possible outcomes and each one is equally likely, like the faces of a fair die.

A continuous uniform distribution applies when the outcome can be any value in an interval and every sub-interval of the same length is equally likely. The simplest example is drawing a random number from the interval [0, 1].
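Both flavors can be sampled with NumPy. The following is a small sketch, assuming a fair six-sided die for the discrete case and the interval [0, 1) for the continuous case; the sample size of 60,000 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility

# Discrete uniform: a fair six-sided die (the upper bound of integers() is exclusive)
rolls = rng.integers(low=1, high=7, size=60_000)
faces, counts = np.unique(rolls, return_counts=True)
face_shares = counts / rolls.size  # each share should be close to 1/6

# Continuous uniform: any real value in [0, 1)
draws = rng.uniform(low=0.0, high=1.0, size=60_000)

print("die faces:  ", faces)
print("face shares:", face_shares)
print("continuous min/max:", draws.min(), draws.max())
```

Note that NumPy's `uniform` includes the lower bound but excludes the upper one, so the interval is technically [0, 1) rather than [0, 1].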

In data science, where do we see the normal distribution?

The normal distribution is ubiquitous in data science, and its bell-shaped curve is easy to recognize in a histogram.

The normal distribution is often used to model real-world phenomena, such as IQ scores and height.

Many data scientists transform their data so that it more closely resembles a normal distribution, since many statistical methods assume normally distributed inputs or errors.
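As one possible illustration, the sketch below applies a log transform to right-skewed data and compares the skewness before and after. The log transform is only one of many options and is an assumption here, as are the synthetic lognormal data and the hand-written `skewness` helper.

```python
import numpy as np

def skewness(values):
    """Sample skewness: the mean cubed deviation divided by the cubed standard deviation."""
    centered = values - values.mean()
    return (centered ** 3).mean() / values.std() ** 3

rng = np.random.default_rng(seed=0)  # arbitrary seed for reproducibility

# Synthetic right-skewed data (lognormal), standing in for a real feature
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

# The log of a lognormal variable is normally distributed
transformed = np.log(skewed)

print("skewness before:", skewness(skewed))
print("skewness after: ", skewness(transformed))
```

A skewness close to zero indicates a roughly symmetric distribution, which is one of the properties of the normal distribution. The log transform only works on strictly positive values, so real data may need a different transformation.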

In data science, where do we see the uniform distribution?

In data science, you will often see the uniform distribution when working with random numbers. This is because the uniform distribution is a good way to generate random numbers that are evenly spread across a given range.

One sampling method, called reservoir sampling, uses uniform random draws to keep a fixed-size sample from streaming data, giving every item seen so far the same chance of being kept.

The idea is that once you’ve drawn enough random samples from your streamed data, the distribution of your sample will resemble the original (and unknown) distribution of the stream.
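A common way to implement reservoir sampling is the classic "Algorithm R", sketched below. This is one textbook variant rather than the only approach, and the function name `reservoir_sample` and its parameters are illustrative assumptions.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from an iterable of unknown length."""
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items
            reservoir.append(item)
        else:
            # Pick a slot uniformly from 0..i (inclusive); replace only if it falls in the reservoir
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir

# Usage: sample 5 items from a stream of one million integers, read one at a time
sample = reservoir_sample(iter(range(1_000_000)), k=5, rng=random.Random(0))
print(sample)
```

The uniform draw on each step is what gives every item in the stream the same probability of ending up in the final sample, without ever holding the whole stream in memory.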

