Snowfakery vs Smock-It: What’s Best for Salesforce Test Data Generation?
Preparing large-scale test data for Salesforce environments isn’t just tedious; it can take 4 to 6 hours, sometimes even longer.
Imagine your team is stuck generating relational or conditional test data while crucial deadlines slip by. That’s valuable time lost, time that could be spent optimizing application performance, testing new features, or speeding up deployments.
But it’s not just about time. You also have to ensure data security and compliance in the data you generate. You certainly don’t want to put your customers’ data at risk by hosting it on a third-party server or cloning it into higher-level sandboxes.
Thankfully, there are online services, tools, utilities, and frameworks that let you generate realistic, look-alike test data (bulk data too) for your testing needs, and for FREE. Today, we’re going to compare two such tools: Snowfakery and Smock-It.
While Snowfakery, an open-source tool from Salesforce, has been a go-to for some time, Smock-It is a new CLI sf-plugin designed specifically for generating privacy-compliant, realistic test data for Salesforce environments.
Let’s check how these two tools fare against one another.
Quick Outline
What is Snowfakery?
How does Snowfakery generate test data?
What is Smock-It?
How does Smock-It generate test data?
Feature comparison: Snowfakery vs Smock-It
Making the right choice
What is Snowfakery?
Snowfakery is an open-source test data generation tool (part of the CumulusCI project), perfect for anyone developing apps and needing synthetic test data to validate app functionality. Unlike many tools, Snowfakery takes a template-driven approach using YAML files, making it easy to define complex data patterns, hierarchical relationships, and custom rules.
How to Install Snowfakery?
Getting started with Snowfakery is straightforward. First, you need to have Python and pip (Python's package manager) installed on your machine; then install the tool by running: pip install snowfakery. Note that while Snowfakery is compatible with various IDEs, for this guide, we'll use Code Builder, which comes preconfigured with the necessary tools.
Prerequisites
Python: Ensure you have Python 3 installed. You can download it from the official Python website.
pip: This should come with your Python installation, but if not, you can install it separately.
Steps to Generate Synthetic Test Data With Snowfakery
1. Create a YAML Template
The YAML template acts as a blueprint for the data generation process. You specify the object name, the fields you want to populate, and the number of records to create. You can also use random dummy data generators for names, dates, numbers, etc.
Let’s check a quick example to understand test data generation using Snowfakery: a template that declares an Account object with count: 10 and a Name field filled by fake: company.
In this template:
count: 10 specifies that 10 Account records will be generated.
fake: company generates a random company name for each record.
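The template described above can be sketched as follows. This is a minimal reconstruction, assuming a recipe with a single Account object; the file name account.yml and the helper write_recipe are illustrative, and the recipe lives in a Python string only so that it can be written to disk in one step.

```python
from pathlib import Path

# Assumed recipe: one Account template, 10 records, fake company names.
ACCOUNT_RECIPE = """\
- object: Account
  count: 10
  fields:
    Name:
      fake: company
"""


def write_recipe(folder, name="account.yml"):
    """Write the recipe text into `folder` and return the file path."""
    path = Path(folder) / name
    path.write_text(ACCOUNT_RECIPE, encoding="utf-8")
    return path
```

Calling write_recipe(".") drops account.yml into the current folder, ready for the next step.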
2. Run Snowfakery
Once the template is ready, you execute Snowfakery using the command line: snowfakery account.yml
The above command tells Snowfakery to read the account.yml file and generate 10 Account records. By default, the records are printed to the terminal; writing them to a CSV file is covered in the next step.
3. Data Output
After choosing CSV output (for example, by adding the --output-format csv and --output-folder options to the command), run Snowfakery again. You’ll find that it generates both the CSV file and a metadata file in the specified directory.
Note: Snowfakery generates the data in CSV, JSON, SQL, and TXT formats. This data can be reviewed and loaded into Salesforce using Salesforce Data Loader or Salesforce CLI (which is what we’re using).
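If you drive Snowfakery from a script, the run-and-output steps can be wrapped in a small helper. The sketch below only assembles the command line; the helper name snowfakery_command and the folder name output are assumptions made for illustration, while --output-format and --output-folder are Snowfakery's own options.

```python
import shlex


def snowfakery_command(recipe, output_format=None, output_folder=None):
    """Build the argument list for a Snowfakery run."""
    command = ["snowfakery", str(recipe)]
    if output_format:
        command += ["--output-format", output_format]
    if output_folder:
        command += ["--output-folder", str(output_folder)]
    return command


# "output" is a hypothetical folder name.
csv_run = snowfakery_command("account.yml", "csv", "output")
print(shlex.join(csv_run))
```

The resulting list can be handed to subprocess.run(csv_run, check=True) on a machine where Snowfakery is installed.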
Complex Test Data Generation With Snowfakery
a) Hierarchical Data Generation
In Snowfakery, you can define relationships between objects using the friends key. This is particularly useful when you need to link a parent object to its child objects. For example, if you have an account that has multiple contacts, you can nest a Contact template under the Account’s friends key and point its AccountId field at the parent with reference: Account.
With a Contact count randomized between 1 and 5, each Account can have up to 5 Contacts, and each Contact is linked to its Account through the reference.
Once the YAML file is generated, simply run the command: snowfakery complex_data.yml
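The parent-child setup described above might look like the recipe below. Treat it as a sketch under assumptions: the field choices and the 1-to-5 range are illustrative, and the helper save_recipe exists only to write the text to complex_data.yml.

```python
from pathlib import Path

# Assumed recipe: an Account with 1 to 5 Contacts nested under `friends`.
HIERARCHY_RECIPE = """\
- object: Account
  fields:
    Name:
      fake: company
  friends:
    - object: Contact
      count:
        random_number:
          min: 1
          max: 5
      fields:
        LastName:
          fake: last_name
        AccountId:
          reference: Account
"""


def save_recipe(folder, name="complex_data.yml"):
    """Write the hierarchy recipe into `folder` and return the file path."""
    path = Path(folder) / name
    path.write_text(HIERARCHY_RECIPE, encoding="utf-8")
    return path
```

Because Contact sits under the Account's friends key, reference: Account resolves to the Account that was just generated.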
b) Utilizing Random Functions for Data Generation in Snowfakery
One of the exciting features of Snowfakery is its ability to use random functions to enhance the diversity of the test data you generate. Instead of hardcoding values, you can introduce variability, making your generated data more realistic and useful for testing purposes.
To start utilizing random functions, consider how you can apply the random_choice function in your YAML configuration. This function allows you to specify a set of choices and randomly select one each time data is generated. For example, if you want to generate a random number of contacts for each account (e.g., 1, 3, or 5), you can list those values under random_choice in the Contact’s count.
You can also set probabilities for different outcomes, for instance a 30% chance of generating one contact, a 30% chance for three contacts, and a 10% chance for five contacts, by attaching a probability to each choice.
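To see what uniform and weighted selection mean in practice, here is a plain-Python analogy. It is not Snowfakery code: it uses the standard library's random.choice and random.choices, and the names CONTACT_COUNTS, WEIGHTS, and pick_contact_count are made up for this sketch. Python treats the weights as relative, so they do not need to add up to 100.

```python
import random

CONTACT_COUNTS = [1, 3, 5]
WEIGHTS = [30, 30, 10]  # relative weights taken from the example above


def pick_contact_count(rng, weights=None):
    """Pick a contact count uniformly, or by weight when weights are given."""
    if weights is None:
        return rng.choice(CONTACT_COUNTS)
    return rng.choices(CONTACT_COUNTS, weights=weights, k=1)[0]


rng = random.Random(42)  # seeded so repeated runs give the same sequence
uniform_picks = [pick_contact_count(rng) for _ in range(5)]
weighted_picks = [pick_contact_count(rng, WEIGHTS) for _ in range(5)]
print(uniform_picks, weighted_picks)
```

With these weights, one and three contacts are each three times as likely as five.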
c) Picklist Handling
As we dive deeper into Snowfakery, let's explore how we can leverage object relationships to create more complex data structures. To illustrate the power of relationships, consider a scenario where each account can have multiple contacts and opportunities.
You can define these relationships in your YAML file. Consider a script that defines fake Opportunity records with structured relationships to Accounts. In this script:
4 records are created for the Opportunity object.
The Opportunity name is dynamically generated using the Account name (e.g., "opp ${{Account.Name}}").
StageName is the picklist field with random choices such as Closed Won, In Progress, and New values.
Amount: Uses the fake data generator to create a random number between 10,000 and 50,000.
AccountId: Uses a reference to Account, ensuring Opportunities are correctly associated with existing Account records.
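Putting the points above together, a recipe along these lines would produce the described Opportunities. It is a sketch, not the original script: nesting the Opportunity under the Account's friends key and using random_number for Amount are assumptions, and the helper stage_names is only there to read the picklist values back out of the text.

```python
# Assumed recipe: 4 Opportunities linked to one generated Account.
OPPORTUNITY_RECIPE = """\
- object: Account
  fields:
    Name:
      fake: company
  friends:
    - object: Opportunity
      count: 4
      fields:
        Name: opp ${{Account.Name}}
        StageName:
          random_choice:
            - Closed Won
            - In Progress
            - New
        Amount:
          random_number:
            min: 10000
            max: 50000
        AccountId:
          reference: Account
"""


def stage_names(recipe_text):
    """Collect the list items under random_choice (the picklist values)."""
    lines = recipe_text.splitlines()
    start = next(i for i, line in enumerate(lines) if line.strip() == "random_choice:")
    names = []
    for line in lines[start + 1:]:
        if not line.strip().startswith("- "):
            break
        names.append(line.strip()[2:])
    return names


print(stage_names(OPPORTUNITY_RECIPE))
```

Note the double braces in ${{Account.Name}}: that is how Snowfakery marks a formula inside a field value.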
What is Smock-It?
Smock-It is a command-line tool that generates realistic, synthetic test data for Salesforce environments. It enables developers, QA professionals, and Salesforce admins to test functionalities and workflows without using real customer data, ensuring privacy protection.
The best part about Smock-It? It can tackle limitations of Snowfakery and other big Salesforce test data automation tools such as:
Dependent picklist handling
Automatic field inclusion
Automatic relationship handling
Direct record insertion
Conditional test data generation
Before we compare the key features of these Salesforce synthetic test data generators, let’s take a quick look at Smock-It’s data generation process and some unique features it offers.
How to Install Smock-It?
Smock-It can be installed easily via CLI. Follow these simple steps to get started:
1. Prerequisites
Before installing Smock-It, make sure you have the following so that it installs and runs successfully.
Salesforce CLI
Node.js (v18.0.0 or later)
Mockaroo API key (free, single-step process)
2. Installing Smock-It via SF CLI
Run the following command in your terminal: sf plugins install smock-it
3. Verify Installation
To confirm that Smock-It is installed successfully, run: sf plugins
You should see Smock-It listed among the installed plugins. Alternatively, you can check the version by running: sf Smock-It --version
4. Authenticate with Your Salesforce Org (Optional but Recommended)
To use Smock-It for inserting test data, authenticate with your Salesforce org: sf org login web
After logging in, you can set an alias to avoid re-authenticating every time: sf alias set myOrg=username@yourOrg.com
Now, you can use -a myOrg instead of logging in each time.
5. Run a Test Command to Ensure Everything Works
Try generating test data using a sample command; if it runs successfully, Smock-It is installed and ready to use: sf data generate -t myTemplate -a myOrg
How to Generate Test Data with Smock-It?
Mock data generation with Smock-It is fairly easy. It begins with a list of questions. Based on the questions and user response, a template is generated which can be reused or customized for future data generation needs.
1. Template Creation
Check out the use case below to understand template creation with our mock data generation tool - Smock-It.
Use case: Create 200 records for Account, Contact, and Opportunity with output in DI and JSON formats.
Steps to Execute
Run the command sf template init on the CLI, and it will prompt you with a questionnaire.
Provide Input for the questionnaire
Provide a template name (e.g., account_creation): accountDataTemplate
Exclude namespace(s) (comma-separated, e.g., mynamespaceA, mynamespaceB): N/A
Select output format [csv, json, di]: di, json
Choose a language for test data: en
Specify test data count (e.g, 5) (default: 1): 200
List Objects(API names) for data creation (default: lead): account, contact, opportunity
Customize settings for individual sObjects? (Y/n) (default: n): n
Validate added sObjects and fields from your org?(Y/n) (default: n): n
Read through our Init Questionnaire doc for more info:
https://github.com/concretios/smock-it/blob/main/INIT_QUESTIONNAIRE.MD
2. Setting Up the Environment Variable
To begin generating data, we first need to set the Mockaroo API key as an environment variable. Below is how you can do it for Windows and macOS/Linux.
For Windows (PowerShell):
$env:MOCKAROO_API_KEY="your_mockaroo_api_key"
For macOS/Linux:
export MOCKAROO_API_KEY="your_mockaroo_api_key"
📌 Note: You can obtain your Mockaroo API key by signing up at Mockaroo.
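If you wrap data generation in a script, it helps to fail early when the key is missing. The check below is a minimal sketch; get_mockaroo_api_key is a hypothetical helper, not part of Smock-It, and it only reads the MOCKAROO_API_KEY variable set above.

```python
import os


def get_mockaroo_api_key(env=None):
    """Return the Mockaroo API key, or raise if it has not been set."""
    env = os.environ if env is None else env
    key = env.get("MOCKAROO_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MOCKAROO_API_KEY is not set; export it before generating data."
        )
    return key
```

Calling get_mockaroo_api_key() with no arguments reads the real environment; passing a dictionary makes the check easy to try out.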
3. Data Generation Command
Once the setup is complete, executing the data generation command is straightforward. For example:
sf data generate --template demo_data_template --alias myOrg
This command instructs Smock-It to generate data based on the specified template.
Validating Template Schema
After running the command, Smock-It validates the objects and fields defined in the template against the authenticated Salesforce org. If any issues are detected, they are reported in real time, allowing for immediate corrections.
Viewing Generated Data
Once the data generation process is complete, users can view the results directly in Salesforce. The generated records will include all necessary fields, ensuring that they are realistic and compliant with data privacy regulations like GDPR and CCPA.
Complex Data Generation with Smock-It
a) Conditional Data Generation
Smock-It takes conditional data generation to the next level. With Smock-It you can define specific fields (check the fields to consider and fields to exclude parts) where conditions can be applied.
i) You can set dependent picklist values conditionally as per your requirement. Similarly, you can also do this for other fields such as email, phone, etc.
"fieldsToConsider": {
"email": ["smockit@gmail.com", "test@gmail.com"],
"phone": ["9787677887", "7768997766"]
}
As defined in the schema above, the email and phone fields are given fixed sets of values, so the generated test data contains only these values for the respective fields.
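The effect of those fixed value lists can be modelled in a few lines of Python. This is only an illustration of the behaviour described, not Smock-It's implementation; generate_records and the seed parameter are made up for the sketch.

```python
import random

# Same values as the schema above.
FIELDS_TO_CONSIDER = {
    "email": ["smockit@gmail.com", "test@gmail.com"],
    "phone": ["9787677887", "7768997766"],
}


def generate_records(count, fields_to_consider, seed=None):
    """Build `count` records, drawing each field only from its allowed values."""
    rng = random.Random(seed)
    return [
        {field: rng.choice(values) for field, values in fields_to_consider.items()}
        for _ in range(count)
    ]


records = generate_records(5, FIELDS_TO_CONSIDER, seed=7)
print(records[0])
```

However many records you generate, no email or phone outside the two lists will ever appear.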
ii) You can also define which fields you wish to exclude from your generated records, for example leaving the fax fields out of record generation by listing them under fieldsToExclude.
b) Dependent Picklist Handling
There’s hardly any tool in Salesforce that effectively handles dependent picklists.
A dependent picklist in Salesforce is a field that dynamically changes its available values based on the selection of another picklist field. It is commonly used to enforce data consistency and conditional selections.
For example:
A Country picklist controls which State options appear.
If Country = USA, the State picklist should only show California, Texas, and New York.
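The Country and State rule above boils down to a lookup from the controlling value to its allowed dependent values. The sketch below models that idea in plain Python; the names STATES_BY_COUNTRY, allowed_states, is_valid_pair, and random_pair are assumptions for illustration, and a real org defines its own dependencies.

```python
import random

# Controlling value -> allowed dependent values (USA entry from the example above).
STATES_BY_COUNTRY = {
    "USA": ["California", "Texas", "New York"],
}


def allowed_states(country):
    """Return the State values that are valid for the given Country."""
    return STATES_BY_COUNTRY.get(country, [])


def is_valid_pair(country, state):
    """Check that a Country/State combination respects the dependency."""
    return state in allowed_states(country)


def random_pair(rng):
    """Pick a Country, then a State that is valid for it."""
    country = rng.choice(sorted(STATES_BY_COUNTRY))
    return country, rng.choice(allowed_states(country))


print(random_pair(random.Random(3)))
```

A generator that picks the two fields independently can produce invalid pairs; picking the dependent value from the controlling value's list avoids that.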
Consider a template whose fields to consider part lists three different dependent picklist fields with the specific values we want to insert.
In this example:
We have a "fieldsToConsider" section where we define custom values for dependent picklists.
"dp-stage__c": ["Negotiation"] ensures that all generated records have the "Negotiation" stage. Similarly, "dp-year__c": ["2024"] fixes the year field.
c) Handling Left Fields
Similar to Snowfakery, Smock-It also offers automatic data generation for unspecified fields.
By setting pickLeftFields = true, for instance, you can ensure that all fields NOT listed in fieldsToConsider or fieldsToExclude are still populated with randomly generated data, removing the need to define every field explicitly.
If you don’t need to generate unspecified fields, simply set pickLeftFields = false.
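The selection rule can be written out as a short function. This is a model of the behaviour described above, not Smock-It's code; fields_to_generate and the sample field names are hypothetical, and letting an excluded field win over a considered one is an assumption.

```python
def fields_to_generate(all_fields, fields_to_consider, fields_to_exclude, pick_left_fields):
    """Return the fields that end up populated, in their original order."""
    selected = []
    for name in all_fields:
        if name in fields_to_exclude:
            continue  # excluded fields are never populated
        if name in fields_to_consider or pick_left_fields:
            selected.append(name)
    return selected


# Hypothetical field list for illustration.
sample_fields = ["name", "email", "phone", "fax", "website"]
print(fields_to_generate(sample_fields, {"email", "phone"}, {"fax"}, True))
print(fields_to_generate(sample_fields, {"email", "phone"}, {"fax"}, False))
```

With pickLeftFields on, name and website are filled alongside email and phone; with it off, only the considered fields remain.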
d) Relationship Handling (up to 2 Levels) + Direct Insertion
While most mock data generation tools, including Snowfakery, use a multi-step approach to map relationships, Smock-It does it in a single step, without you having to specify the parent field name or reference ID as Snowfakery requires.
In our example run, Smock-It created record IDs for all objects. Since there was no contact or account present in the org, Smock-It handled all of the fields and created an account.
This happens in Smock-It in three stages.
First, Smock-It checks whether a parent record (here, an account) is already present in your org. Second, it either uses the accounts it finds or, if none is present, creates one. Third, it does the same for the grandparent; this relationship handling goes up to 2 levels.
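A find-or-create lookup of this kind can be modelled with an in-memory stand-in for the org. Everything in the sketch is an assumption made for illustration: the PARENT_OF chain, the ensure_parent helper, the record shape, and the Id format are hypothetical and do not describe Smock-It's internals.

```python
# Hypothetical lookup chain: child -> parent -> grandparent.
PARENT_OF = {"Case": "Contact", "Contact": "Account"}


def ensure_parent(org, sobject, depth=0, max_depth=2):
    """Return the Id of a parent record for `sobject`, creating one if needed.

    `org` maps object names to lists of records; lookups stop after `max_depth` levels.
    """
    parent = PARENT_OF.get(sobject)
    if parent is None or depth >= max_depth:
        return None
    existing = org.setdefault(parent, [])
    if existing:
        return existing[0]["Id"]  # reuse a record that is already in the org
    record = {
        "Id": f"{parent}-{len(existing) + 1}",
        "ParentId": ensure_parent(org, parent, depth + 1, max_depth),
    }
    existing.append(record)
    return record["Id"]


empty_org = {}
print(ensure_parent(empty_org, "Case"), empty_org)
```

Starting from an empty org, one call creates both a Contact and the Account it points to; a second call reuses them instead of creating duplicates.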
Here’s the screenshot of our org before record generation.
Let’s refresh our org and see how many records Smock-It generated. As we can see below, Smock-It generated 11 contacts: 10 plus 1 extra (see contact no. 4) for the self-lookup.
Comparison: Snowfakery vs Smock-It
Feature | Smock-It | Snowfakery |
---|---|---|
Auto Relationship Handling | ✅ | ❌ |
Conditional data handling | ✅ | ✅ |
Dependent picklist handling | Selects real-time value | Req. multiple steps (via if, when) |
Output formats | DI, JSON, CSV | CSV, JSON, SQL, TXT |
Pricing | FREE | FREE |
Auto inclusion of required fields | ✅ | ❌ |
Direct Record Insert | ✅ | ❌ |
Template validation | Validates objects and related fields | ❌ |
Record limit | 200,000/day | 1,000,000/day |
Making the Right Choice!
Both Snowfakery and Smock-It are powerful tools for generating synthetic Salesforce test data, but they cater to different needs.
Snowfakery is ideal for generating large volumes of test data (up to 1 million records per day) and offers YAML-driven templates for customization. However, it requires manual setup for relationships, dependent picklists, and required fields, making it better suited for advanced users comfortable with scripting.
Smock-It, on the other hand, provides a simpler and more automated CLI-based approach, allowing direct record insertion with automatic relationship handling, required field inclusion, and real-time dependent picklist selection. While its record limit is lower (200,000/day), it simplifies data generation for Salesforce admins and developers who need quick and efficient test data creation without extensive scripting.
In short, choose Snowfakery for high-volume, complex data needs and YAML-based customization, and Smock-It for its ease of use, automation, and direct Salesforce data insertion.