Generating data using np.random.normal()

Einblick Content Team - April 18th, 2023

When implementing machine learning models, it is sometimes necessary to generate data that follows a particular distribution. For example, we've previously shown how to generate random uniform data using np.random.uniform(). In this post, we'll examine how to use NumPy's random.normal() function, which generates n values from a normal distribution with a given mean and standard deviation.


Basic Syntax: np.random.normal(loc, scale, size)

NOTE: since NumPy version 1.17.0, the recommended approach is to first instantiate a Generator, for example with np.random.default_rng(), and then call the normal() method on that Generator object instead of calling np.random.normal() directly. In the examples below, the Generator is stored in a variable named rng and is created with a fixed random seed so that the results are reproducible.
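The Generator setup described in the note can be sketched as follows. This is a minimal illustration, not the exact setup used for the outputs in this post: the seed value 42 and the names SEED, first_draw, rng_again, and second_draw are assumptions chosen for the example, so the numbers it produces will differ from the ones printed in the examples.

```python
import numpy as np

# Assumption: 42 is an arbitrary seed picked for illustration only.
SEED = 42

# Create a seeded Generator; the examples in this post call it rng.
rng = np.random.default_rng(SEED)

# Draw five values from a normal distribution with mean 0 and std 1.
first_draw = rng.normal(loc=0, scale=1, size=5)

# A second Generator built from the same seed repeats the same sequence,
# which is what makes the results reproducible.
rng_again = np.random.default_rng(SEED)
second_draw = rng_again.normal(loc=0, scale=1, size=5)

print(np.array_equal(first_draw, second_draw))  # True
```

Calling default_rng() without a seed would instead draw fresh entropy from the operating system, so results would change from run to run.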

The normal() function takes three arguments:

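  • loc: the desired mean of the data (defaults to 0.0)
  • scale: the desired standard deviation of the data, which must be non-negative (defaults to 1.0)
  • size: the number of values to generate, or a tuple giving the shape of the output array (defaults to None, which returns a single value)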
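To make the size argument concrete, here is a short sketch of the three common ways to pass it. The seed, the loc and scale values, and the variable names are all assumptions made for illustration.

```python
import numpy as np

# Assumption: arbitrary seed, used only so the sketch is repeatable.
rng = np.random.default_rng(0)

# size omitted: a single scalar value is returned.
single_value = rng.normal(loc=10, scale=2)

# size as an integer: a one-dimensional array with that many values.
vector = rng.normal(loc=10, scale=2, size=1000)

# size as a tuple: an array with that shape (here 3 rows, 4 columns).
matrix = rng.normal(loc=10, scale=2, size=(3, 4))

print(np.ndim(single_value))  # 0
print(vector.shape)           # (1000,)
print(matrix.shape)           # (3, 4)
```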

If we use seaborn's histplot() function to create histograms of the distribution of our two samples, we'll see that they are roughly normal, as expected.

Example 1: rng.normal(loc = 0, scale = 1, size = 500)

# Sample 1: mean = 0, std deviation = 1, 500 numbers
mu1, sigma1 = 0, 1
sample1 = rng.normal(loc = mu1, scale = sigma1, size = 500)
print("Sample 1:")
print(sample1)

Output:

Sample 1:
[ 0.34558419  0.82161814  0.33043708 ... -2.13583105  0.23237325
  0.02812631]
[Figure: NumPy random normal example 1]

Example 2: rng.normal(loc = -5, scale = 3, size = 500)

# Sample 2: mean = -5, std deviation = 3, 500 numbers
mu2, sigma2 = -5, 3
sample2 = rng.normal(loc = mu2, scale = sigma2, size = 500)
print("Sample 2:")
print(sample2)

Output:

Sample 2:
[-9.11102074  1.52679377 -9.16223969 ... -5.22859825 -6.191401
 -4.17513103]
[Figure: NumPy random normal example 2]
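Beyond looking at a histogram, one quick numeric sanity check is to compare a sample's statistics with the requested parameters. The sketch below does this for the second example's settings; the seed and the variable names sample_mean, sample_std, and within_one_sigma are assumptions for illustration, so the exact numbers will not match the output shown above.

```python
import numpy as np

# Assumption: arbitrary seed, not the one used for the output in this post.
rng = np.random.default_rng(42)

mu2, sigma2 = -5, 3
sample2 = rng.normal(loc=mu2, scale=sigma2, size=500)

# Sample statistics; ddof=1 gives the sample standard deviation.
sample_mean = sample2.mean()
sample_std = sample2.std(ddof=1)

# For a normal distribution, roughly 68% of values fall within
# one standard deviation of the mean.
within_one_sigma = np.mean(np.abs(sample2 - mu2) <= sigma2)

print(f"mean: {sample_mean:.3f} (requested {mu2})")
print(f"std deviation: {sample_std:.3f} (requested {sigma2})")
print(f"share within one std deviation: {within_one_sigma:.1%}")
```

With only 500 values, the sample mean and standard deviation will be close to, but not exactly equal to, the requested values; larger samples tend to land closer.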

