I was initially struggling with how to easily create realistic Hubspot sample data for testing, ideally with less work than juggling data between API calls.
For anyone else in a similar spot, here's a walkthrough for generating a large volume of realistic, associated records in the CRM using the Hubspot admin UI and Mockaroo.com (an online data generation tool).
The video above and steps below detail creating a set of associated Companies, Contacts, and Deals – but a similar approach should work for other object types as well.
High-Level Steps (for example, to create an associated set of Companies, Contacts, and Deals):
1. Determine data points. For each object type:
   - export example record(s) as a CSV with ALL properties included
   - trim columns within the CSV to include only the desired properties
   - re-import the trimmed CSV into Hubspot to see whether any desired properties cannot be set by the user
   - update the trimmed CSV to remove any un-importable properties, and save it as a reference for the modeling steps
2. Prepare helper data sets
   - create a dataset with a list of unique company names (these will serve as seeds and keys for modeling other data; you can create this list manually or with Mockaroo)
   - create datasets for any other desired properties that expect a value from a prescribed set of options (e.g. "Deal Stage", "Buying Role", etc.). You can look up the available options for these values using the Properties API if needed (https://developers.hubspot.com/docs/api/crm/properties)
3. Model data
   - create COMPANY records, one for each name, with the company properties determined in Step 1 and the desired data shaping/modeling via Mockaroo
   - create the desired number of DEAL records with the deal properties determined in Step 1 and the desired data shaping/modeling via Mockaroo, associating each with a company name (using the company helper data set from Step 2)
   - create the desired number of CONTACT records with the contact properties determined in Step 1 and the desired data shaping/modeling via Mockaroo, associating each with a company name (using the company helper data set from Step 2)
4. Import data
   - import Companies as 'one object' (Companies)
     - This will create a set of company records in Hubspot that the next imports will associate with
   - import Contacts as 'multiple objects' (Contacts, Companies)
     - This should associate the newly imported Contacts with the earlier imported Company records
   - import Deals as 'multiple objects' (Deals, Companies)
     - This should associate the newly imported Deals with the earlier imported Company records
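For anyone who would rather script the modeling step than use Mockaroo, here is a minimal sketch of the same idea in plain Python: start from a list of company names, then give every contact and deal row a company name column that the multi-object imports can match on. The column headers, deal stage values, name lists, and file names are all illustrative assumptions, not Hubspot's required import headers, so map them to real properties in the import wizard.

```python
import csv
import random

# Hypothetical helper data (Step 2); replace with your own lists.
COMPANY_NAMES = ["Acme Test Co", "Globex Sample Ltd", "Initech Demo Inc"]
DEAL_STAGES = ["appointmentscheduled", "qualifiedtobuy", "closedwon"]  # assumed values
FIRST_NAMES = ["Ana", "Ben", "Chloe", "Dev"]
LAST_NAMES = ["Lopez", "Nguyen", "Okafor", "Smith"]


def slug(name):
    """Turn a company name into a fake domain label."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_companies(names):
    """One company row per seed name."""
    return [{"Company name": n, "Company domain": f"{slug(n)}.example.com"} for n in names]


def build_contacts(names, per_company, rng):
    """A fixed number of contacts per company, each carrying the company name."""
    rows = []
    for name in names:
        for i in range(per_company):
            first, last = rng.choice(FIRST_NAMES), rng.choice(LAST_NAMES)
            rows.append({
                "First name": first,
                "Last name": last,
                # The index keeps emails unique even when names repeat.
                "Email": f"{first}.{last}.{i}@{slug(name)}.example.com".lower(),
                "Company name": name,  # column used to associate with the company
            })
    return rows


def build_deals(names, count, rng):
    """Deals spread randomly across companies and stages."""
    rows = []
    for i in range(count):
        company = rng.choice(names)
        rows.append({
            "Deal name": f"{company} - Deal {i + 1}",
            "Deal stage": rng.choice(DEAL_STAGES),
            "Amount": rng.randrange(1000, 50000, 500),
            "Company name": company,
        })
    return rows


def write_csv(path, rows):
    """Write a list of dict rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so reruns produce the same files
    write_csv("companies.csv", build_companies(COMPANY_NAMES))
    write_csv("contacts.csv", build_contacts(COMPANY_NAMES, 5, rng))
    write_csv("deals.csv", build_deals(COMPANY_NAMES, 20, rng))
```

Running it writes three CSV files that can be imported in the same order as Step 4: companies first, then contacts and deals as multi-object imports.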
Example Generated Data
(CSV files that would be imported in step 4 above)
Deals (500, various stages, all associated with companies)
Example Input Data Sets
(CSV files prepared in step 2 above and used in the Mockaroo modeling in step 3. For sets with many options, like Web Technologies and Industry, it may make sense to duplicate the set and trim it to a subset of meaningful values, so that repeat values and patterns are more likely to show up across the generated dummy data.)
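A rough sketch of building and trimming such an option set in Python follows. The sample payload only imitates the shape I would expect back from the Properties API for an enumeration property (an `options` list with `label`, `value`, and `hidden` keys), so treat that shape as an assumption and check it against a real response. `option_values` and `trim_options` are hypothetical helper names.

```python
import json
import random

# Assumed response shape for an enumeration property; verify against a real call.
SAMPLE_PROPERTY_JSON = """
{
  "name": "industry",
  "type": "enumeration",
  "options": [
    {"label": "Accounting", "value": "ACCOUNTING", "hidden": false},
    {"label": "Banking", "value": "BANKING", "hidden": false},
    {"label": "Computer Software", "value": "COMPUTER_SOFTWARE", "hidden": false},
    {"label": "Retail", "value": "RETAIL", "hidden": false},
    {"label": "Legacy Option", "value": "LEGACY", "hidden": true}
  ]
}
"""


def option_values(property_payload):
    """Return the internal values of all options that are not marked hidden."""
    return [
        option["value"]
        for option in property_payload.get("options", [])
        if not option.get("hidden", False)
    ]


def trim_options(values, keep, seed=0):
    """Pick a reproducible subset so generated rows repeat values more often."""
    rng = random.Random(seed)
    return sorted(rng.sample(values, min(keep, len(values))))


if __name__ == "__main__":
    values = option_values(json.loads(SAMPLE_PROPERTY_JSON))
    # Print one value per line, ready to paste into a helper CSV.
    for value in trim_options(values, keep=3):
        print(value)
```

The trimmed list can then be saved as a one-column helper CSV and used as the source of values for that property during modeling.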
4 years on from the original post and I'm trying to follow the original instructions. I've had success importing contacts and companies, but when it comes to importing the Deals, Hubspot complains that there is no Record ID field. I tried arbitrarily creating a Record ID column and adding dummy data, but that didn't work either.
Has anybody else been able to get around this please?
The provided screenshot shows that you are not mapping the Record ID column, so those values are not being imported. Is there a reason? You mentioned that you had created Deal IDs up front.
If you like, we can have a short Teams call to help you here.
We are trying to import a batch of dummy data so we can use it to see how Hubspot works and can test various scenarios without having to build a large, real data set.
However, the import functionality does not seem to support uploading dummy data for deals: when I uploaded the spreadsheet above, it complained that there were no Record IDs for the deals. I then added some arbitrary ID numbers, and it complained that these IDs did not correspond to anything in Hubspot.
Do you know a way that we might be able to get around this?
We have created a Python script which generates an extensive Excel file, encompassing Companies, Contacts, Deals, Products and Tickets for G7 countries (although this list is easily expandable). This script can generate thousands of records, thereby providing a comprehensive and immersive experience for the client.
Feel free to amend the script to cater to your specific needs. You might choose to generate more fields, other records, or even Custom Objects. The possibilities are extensive, and the flexibility of this tool allows us to tailor each presentation to the unique needs of our clients.
Here is the code we have developed. We firmly believe in the power of sharing knowledge and resources, and hope this tool serves you well in your engagements.
Here is the Python code:
# This code has been created by Michael Kupermann (michael.kupermann@codersunlimited.com or michael@kupermann.com)
# The purpose of this code is to generate dummy data that simulates a realistic dataset for HubSpot CRM.
# This data can then be used for demonstrations, testing, or other purposes that require a representative dataset.
# You need to amend the HubSpot Sales and Service Pipelines before you import the data.
#
# Required Packages:
# 1. Faker: This package is used to generate the fake data for our dataset.
# 2. Pandas: This package is used to handle the data in a tabular format and to write the data to an Excel file.
# 3. datetime: This module is part of the Python standard library (no installation needed).
#    It is used to generate realistic date data for the 'close_date' field.
#
# To install the necessary packages, you can use pip, the Python package installer.
# Open your terminal (or command prompt on Windows), and enter the following commands:
# pip install faker
# pip install pandas
#
# If you're using a Jupyter notebook, you can prefix these commands with an exclamation mark:
# !pip install faker
# !pip install pandas
from faker import Faker
import pandas as pd
from datetime import datetime, timedelta
# Function to generate data for a given country. Here 100 companies with 10 contacts, deals, tickets for each company