Generating data using np.random.normal()

Einblick Content Team - April 18th, 2023

At times, it is necessary to generate data according to a particular distribution when implementing various machine learning models. For example, we've previously shown how to generate random uniform data using np.random.uniform(). In this post, we’ll examine how to use NumPy’s random.normal() function, which generates n values from a normal distribution with a given mean and standard deviation.

Check out the code examples in the canvas below, or keep reading for a step-by-step walkthrough.

Basic Syntax: np.random.normal(loc, scale, size)

NOTE: since NumPy version 1.17.0, the recommended approach is to first instantiate a Generator with default_rng() and then call the Generator's normal() method, rather than calling numpy.random.normal() directly. The examples in this post use a Generator named rng, created with a fixed random seed so that the results are reproducible.
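As a minimal sketch of the Generator approach described in the note: the seed value 42 and the variable names below are assumptions for illustration, since the post does not state which seed it used.

```python
import numpy as np

# Seed 42 is an arbitrary, assumed value; any fixed integer gives reproducible draws.
rng = np.random.default_rng(seed=42)

# Draw five values from a normal distribution with mean 0 and standard deviation 1.
draws = rng.normal(loc=0, scale=1, size=5)

# A second Generator built from the same seed reproduces the same values.
rng_again = np.random.default_rng(seed=42)
draws_again = rng_again.normal(loc=0, scale=1, size=5)

print(np.array_equal(draws, draws_again))  # prints True
```

Re-creating the Generator with the same seed is what makes the results repeatable; continuing to draw from the same rng object yields new values each time.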

The normal() function takes three arguments:

  • loc: the desired mean of the data
  • scale: the desired standard deviation of the data (must be non-negative)
  • size: the number of values to generate, given as an integer or as a tuple describing the output shape
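The three arguments can be sketched as follows. The seed, the variable names, and the values of loc and scale are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, assumed for this sketch

# With no arguments, the defaults are loc=0.0, scale=1.0, size=None (a single value).
single = rng.normal()

# An integer size returns a 1-D array with that many values.
vector = rng.normal(loc=10, scale=2, size=1000)

# A tuple size returns an array with that shape.
matrix = rng.normal(loc=10, scale=2, size=(3, 4))

print(vector.shape, matrix.shape)
# The sample mean and standard deviation should land close to loc and scale.
print(vector.mean(), vector.std())
```

The sample statistics will not equal loc and scale exactly; they get closer as size grows.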

If we use seaborn's histplot() function to plot a histogram of each of the two samples generated in the examples that follow, we'll see that they are roughly normal, as expected.
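A histogram is the visual check. As a complement that needs only NumPy, the sketch below bins a sample with np.histogram() and compares it against the rule of thumb that roughly 68% of normal values fall within one standard deviation of the mean and roughly 95% within two. The seed and variable names are assumptions, and this is a rough sanity check, not a formal test of normality.

```python
import numpy as np

rng = np.random.default_rng(123)  # arbitrary seed, assumed for this sketch
sample = rng.normal(loc=0, scale=1, size=500)

# Fraction of values within 1 and 2 standard deviations of the mean (here 0 and 1).
within_1 = np.mean(np.abs(sample) <= 1)
within_2 = np.mean(np.abs(sample) <= 2)
print(f"within 1 std: {within_1:.2f}, within 2 std: {within_2:.2f}")

# Bin counts: for a roughly normal sample, the middle bins hold the most values.
counts, edges = np.histogram(sample, bins=10)
print(counts)
```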

Example 1: np.random.normal(loc = 0, scale = 1, size = 500)

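import numpy as np

# Seed 42 is an assumed value; the post does not show its seed, so your output will differ
rng = np.random.default_rng(42)

# Sample 1: mean = 0, std deviation = 1, 500 numbers
mu1, sigma1 = 0, 1
sample1 = rng.normal(loc = mu1, scale = sigma1, size = 500)
print("Sample 1:")
print(sample1)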

Output:

Sample 1:
[ 0.34558419  0.82161814  0.33043708 ... -2.13583105  0.23237325
  0.02812631]
[Figure: NumPy random normal example 1]

BONUS: Try using Generative AI

In Einblick, we've implemented an AI agent, called Einblick Prompt, which can create data workflows from as little as one sentence. Prompt is powered by OpenAI and LangChain, but tailored for the data domain, so you can ask Prompt to complete data tasks, like generating synthetic data, for you. Then, you can run the code immediately in our AI-native data notebooks. Check out how we used Prompt below:

Using Generative AI in Einblick

  1. Open an Einblick canvas (you can fork the one below)
  2. Right-click anywhere in the canvas > Select Prompt
  3. Type in: "Generate 500 random normally distributed values with mean 0, standard deviation 1, and plot the values."
  4. Select the Python cell > Shift + Enter to run the code

Test out different prompts, and get results in seconds!

Example 2: np.random.normal(loc = -5, scale = 3, size = 500)

# Sample 2: mean = -5, std deviation = 3, 500 numbers
mu2, sigma2 = -5, 3
sample2 = rng.normal(loc = mu2, scale = sigma2, size = 500)
print("Sample 2:")
print(sample2)

Output:

Sample 2:
[-9.11102074  1.52679377 -9.16223969 ... -5.22859825 -6.191401
 -4.17513103]
[Figure: NumPy random normal example 2]
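To see how loc and scale act on the distribution, the sketch below compares a direct normal() call with standard normal draws that are scaled by the standard deviation and then shifted by the mean. The seed and the names direct, z, and shifted are assumptions for illustration. Both samples should have a mean near -5 and a standard deviation near 3.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed, assumed for this sketch
mu2, sigma2 = -5, 3

# Draw directly from a normal distribution with mean -5 and standard deviation 3.
direct = rng.normal(loc=mu2, scale=sigma2, size=500)

# Draw from the standard normal (mean 0, std 1), then scale and shift.
z = rng.standard_normal(size=500)
shifted = mu2 + sigma2 * z

print(direct.mean(), direct.std())
print(shifted.mean(), shifted.std())
```

The two samples contain different values because they come from successive draws, but they follow the same distribution.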
