
Generating Fake Data in Development to Populate Your Database


Being able to create dozens or thousands of records to populate your database in development has a lot of advantages.


After I get my database schema to a somewhat stable state, one of the first things I do is create scripts to automatically generate realistic fake data for most of my tables and fields.

I’ve been doing that for 5+ years now and I always find it to be worth it.

Populating Fake Data VS Seeding

Before we continue, let’s go over the difference between these 2 terms.

To me, seeding is something you would do in development and production, such as seeding an initial admin user so you don’t have to go into the database to manually create one.

But, in this article, we’re not talking about seeding (which is worth doing btw).

Today we’re talking about generating heaps of random / fake data, which is useful in development and also for making an app feel alive when demoing it to clients.

Why Is Generating Fake Data Worth It?

The amount of extra code you’ll need to write to pull this off isn’t a lot. It’s still extra code, but it’s so worth it.

Wishful Thinking

Imagine if you could type something like myapp add users and 5 seconds later you have 187 fake users generated in your database.

Each user would have their own random created at time, username, name, email address and whatever profile information you want.

But it wouldn’t be fully random, like having a name of Jud6Hn-z. Instead, each field would look like what it represents. So when it came to email addresses, you would get a real looking email address.

Then, if you ran myapp add users again, the previously generated users would be removed and a new set of users would be generated.

Or perhaps you could run myapp add all and now instead of generating only users, it generated users, invoices, course lessons or whatever you need for your application.
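Here's a minimal sketch of what that wishful command could look like in Python. Everything in it is an assumption for illustration: the users table layout, the --count and --db options, the dev.db file name and the tiny name pools, which are a standard library stand-in for a real faker library. The part worth noticing is that each run deletes the previously generated users before inserting a new set.

```python
import argparse
import random
import sqlite3
from datetime import datetime, timedelta

# Tiny hand-picked pools standing in for a faker library's data sets.
FIRST_NAMES = ["Alice", "Bruno", "Chen", "Dana", "Elif", "Farid"]
LAST_NAMES = ["Nguyen", "Okafor", "Petrov", "Quinn", "Rossi", "Silva"]


def fake_user(rng, index):
    """Build one real looking user as (username, name, email, created_at)."""
    first, last = rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES)
    username = f"{first}.{last}{index}".lower()  # the index keeps usernames unique
    created_at = datetime.now() - timedelta(days=rng.randint(0, 365))
    return (
        username,
        f"{first} {last}",
        f"{username}@example.com",
        created_at.isoformat(timespec="seconds"),
    )


def add_users(conn, count, seed=None):
    """Remove previously generated users, then insert `count` new ones."""
    rng = random.Random(seed)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "username TEXT PRIMARY KEY, name TEXT, email TEXT, created_at TEXT)"
    )
    conn.execute("DELETE FROM users")  # wipe the previous run's users
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        [fake_user(rng, i) for i in range(count)],
    )
    conn.commit()


def main(argv=None):
    parser = argparse.ArgumentParser(prog="myapp")
    sub = parser.add_subparsers(dest="command", required=True)
    add = sub.add_parser("add", help="generate fake data for a resource")
    add.add_argument("resource", choices=["users"])
    add.add_argument("--count", type=int, default=187)
    add.add_argument("--db", default="dev.db")  # hypothetical dev database file
    args = parser.parse_args(argv)

    conn = sqlite3.connect(args.db)
    add_users(conn, args.count)
    total = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
    print(f"Generated {total} fake {args.resource}")
    conn.close()


# Same as running: myapp add users --db :memory:
main(["add", "users", "--db", ":memory:"])  # prints "Generated 187 fake users"
```

Supporting something like myapp add all would mostly be a matter of adding more resource choices and calling one generator function per table, in an order that respects your foreign keys.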

Viewing Your App in Different States

This is really useful because you can go from a totally empty database to being able to see what your app looks and feels like with a bunch of data.

On the flip side, you can also reset your database and then look at how your app behaves when it’s empty. That’s really important because when no data exists for a resource, it’s a good idea to give friendly hints and links on how to add data for that resource.

But unless you routinely view your app in these states, it’s easy to overlook doing this. Without an ability to generate fake data on the fly, switching between these states becomes really tedious, which is why it’s often skipped.

This strategy is especially useful when you’re designing your app because now you can generate enough data to trigger things like pagination. It may also help you uncover UI issues. For example, maybe you didn’t anticipate 38 comments being loaded in a sidebar, which makes it look weird, so now you limit it to 10 with a “read more” link.

Generating Fake Data with Most Languages

Most popular programming languages have a “faker” library to help generate fake data.

For example, with Python’s Faker library you could put in fake.past_date(start_date="-30d") to generate a date between 30 days ago and yesterday.

You can generate everything from address fields to license plates to lorem ipsum to entire profiles, and it’s easy to create your own types if you need something very specific. There are 20+ existing types to choose from with the Python library.

Generating an entire random profile with 1 line of code (after the import and setup):
from faker import Faker
fake = Faker()
fake.profile()
# { "address": "86523 Steven Square\nBurnston, IN 67952",
#   "birthdate": datetime.date(1916, 8, 28),
#   "blood_group": "B-",
#   "company": "Stanley, Mitchell and Collins",
#   "current_location": (Decimal("-38.2116335"), Decimal("145.281777")),
#   "job": "Engineer, production",
#   "mail": "kerrjoel@hotmail.com",
#   "name": "John Trujillo",
#   "residence": "889 Benjamin Islands Suite 753\nWest Craig, VT 89686",
#   "sex": "M",
#   "ssn": "208-74-8526",
#   "username": "annemoss",
#   "website": ["http://www.example.com/"] }

Pretty neat!

We use the Python Faker package in my Build a SAAS App with Flask course to generate fake users, payment invoices and bets (part of a game we create).

What’s your favorite faker library? Let me know below!
