We had yet another hackathon at work. This time around, I wanted to do something with Python. Since we are short on test data, I decided to write a script that generates oodles of fake test data using a Python library called Faker. Faker ships with a number of built-in providers for different types of data, so it can generate fake addresses, names, dates, phone numbers, and more.

This simple code block:

import faker

fake = faker.Faker()

print("Name:", fake.name())
print("Address:", fake.address())
print("Phone:", fake.phone_number())
print("Email:", fake.email())
print("Job:", fake.job())
print("Company:", fake.company())
print("Color:", fake.color_name())
print("Barcode:", fake.ean13())

Will output something like this (the values are random, so they change on every run):

Name: Maria Abbott
Address: 49011 Estes Underpass Apt. 467
West James, TN 88410
Phone: 867-249-9467x07995
Email: treynolds@yahoo.com
Job: Bookseller
Company: Cruz Group
Color: MediumSpringGreen
Barcode: 5396015004856

I also used images from https://thispersondoesnotexist.com for sample portraits. The images on this website are generated by a GAN (generative adversarial network) and do not depict real people. That makes them ideal for test data that cannot contain any real personal information. The following image is an example of one of these fake portraits:

[Image: fakeperson]
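
Grabbing a handful of portraits can be scripted too. This is only a rough sketch, not the exact script from the hackathon. It assumes that requesting the site's root URL returns a new JPEG each time and that a browser-like User-Agent header is needed, and I have not confirmed either. The function names and the file naming scheme are made up for illustration:

```python
import urllib.request
from pathlib import Path

PORTRAIT_URL = "https://thispersondoesnotexist.com"


def fetch_portrait_bytes(url=PORTRAIT_URL, timeout=10):
    """Download one image. Assumes the URL responds with raw image bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def save_portraits(count, out_dir, fetch=fetch_portrait_bytes):
    """Save `count` portraits into `out_dir` and return the written paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index in range(count):
        target = out_path / f"portrait_{index:03d}.jpg"
        target.write_bytes(fetch())
        saved.append(target)
    return saved


# Example call (needs network access, so it is left commented out):
# save_portraits(3, "portraits")
```

Passing the download step in as `fetch` means the file handling can be tried out with a stub function without touching the network.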

Between Faker and the generated portraits, I was able to easily generate sets of fake test data that could be used with our system, which is much better than trying to come up with test data manually. The project gave me an opportunity to get more familiar with Python and helped with our lack of test data at work.