Share Your Work

Clayton_
Contributor

How to create Hubspot test / dummy data

I was initially struggling to find an easy way to create realistic Hubspot sample data for testing, ideally something that involves less work than juggling data between API calls.

 

For anyone else in a similar spot, here's a walkthrough for how to generate a large volume of realistic and associated records in the CRM using the Hubspot admin UI and Mockaroo.com (an online data generation tool):

 

https://youtu.be/9po9MBKjTHY

 

The video above and steps below detail creating a set of associated Companies, Contacts, and Deals – but a similar approach should work for other object types as well.

 

High-Level Steps (for example to create an associated set of Companies, Contacts, Deals) –

 

  1. Determine data points; for each object type —
    • export example record(s) as a CSV with ALL properties included
    • trim columns within CSV to include only desired properties 
    • re-import trimmed CSV into Hubspot to see if any desired properties cannot be set by user
    • update trimmed CSV to remove any un-importable properties - save this as reference for modeling steps
  2. Prepare helper data sets
    • create a dataset with list of unique company names (these will serve as seeds and keys for modeling other data; can create this list manually or with Mockaroo).
    • create datasets for any other desired properties that expect a value from a prescribed set of options (e.g. "Deal Stage", "Buying Role", etc). You can look up the available options for these properties using the Properties API if needed (https://developers.hubspot.com/docs/api/crm/properties)
  3. Model data  –
    1. create COMPANY records, one for each name with the company properties determined in Step 1 and desired data shaping/modeling via Mockaroo
    2. create desired number of DEAL records with the deal properties determined in Step 1 and desired data shaping/modeling via Mockaroo – also associating each to a company name (using company helper data set from Step 2).
    3. create desired number of CONTACT records with the contact properties determined in Step 1 and desired data shaping/modeling via Mockaroo – associating each to a company name (using company helper data set from Step 2).
  4. Import data
    1. import Companies - as 'one object' (Companies)
      • This will create a set of company records in Hubspot that next imports will associate with
    2. import Contacts - as 'multiple objects' (Contacts, Companies)
      • This should associate the newly imported Contacts with earlier imported Company records
    3. import Deals - as 'multiple objects' (Deals, Companies)
      • This should associate the newly imported Deals with earlier imported Company records
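
Because the Contact and Deal imports in Step 4 rely on matching Company records that were imported first, it can help to check the files against each other before uploading. The snippet below is a minimal sketch of such a check, not part of the original walkthrough. The column header "Company name" and the function name `missing_company_keys` are assumptions for illustration; use whichever column you map as the shared company key.

```python
import csv
import io

def missing_company_keys(companies_csv, child_csv, key="Company name"):
    """Return company names referenced in child_csv that are absent from companies_csv."""
    company_names = {
        row[key].strip() for row in csv.DictReader(io.StringIO(companies_csv))
    }
    missing = set()
    for row in csv.DictReader(io.StringIO(child_csv)):
        name = row[key].strip()
        # A blank key is treated as an intentionally unassociated record
        if name and name not in company_names:
            missing.add(name)
    return sorted(missing)

# Tiny inline samples standing in for the generated CSV files (hypothetical data)
companies = "Company name,Industry\nAcme Ltd,Technology\nGlobex,Finance\n"
contacts = (
    "First name,Last name,Company name\n"
    "Ada,Lovelace,Acme Ltd\n"
    "Alan,Turing,\n"
    "Grace,Hopper,Initech\n"
)

print(missing_company_keys(companies, contacts))  # ['Initech']
```

An empty result suggests every non-blank company key in the Contacts or Deals file has a counterpart in the Companies file.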

 

Example Generated Data

(CSV files produced in Step 3 and imported in Step 4 above)

 

  • Companies (200 unique)
  • Contacts (400, portion associated with companies)
  • Deals (500, various stages, all associated with companies)
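
For readers who want a scripted stand-in for the Mockaroo modeling, here is a rough sketch that produces a data set of the same shape: 200 unique companies, 400 contacts with only a portion associated, and 500 deals all associated. It uses only the Python standard library. The column headers, the 70% association rate, the placeholder names, and the function names are all assumptions, not values from the walkthrough.

```python
import csv
import io
import random

def generate_dataset(n_companies=200, n_contacts=400, n_deals=500,
                     contact_assoc_rate=0.7, seed=42):
    """Build company, contact and deal rows that share a company-name key."""
    rng = random.Random(seed)
    stages = ["Appointment Scheduled", "Qualified To Buy", "Presentation Scheduled"]
    # Unique company names act as the seed/key for the other objects
    companies = [{"Company name": f"Sample Company {i:03d}"}
                 for i in range(1, n_companies + 1)]
    names = [c["Company name"] for c in companies]
    contacts = [
        {
            "Email": f"contact{i}@example.com",
            "First name": f"Test{i}",
            # Only a portion of contacts get a company; the rest stay blank
            "Company name": rng.choice(names) if rng.random() < contact_assoc_rate else "",
        }
        for i in range(1, n_contacts + 1)
    ]
    deals = [
        {
            "Deal name": f"Deal {i:03d}",
            "Deal stage": rng.choice(stages),
            "Amount": rng.randint(1000, 50000),
            "Company name": rng.choice(names),  # every deal is associated
        }
        for i in range(1, n_deals + 1)
    ]
    return companies, contacts, deals

def to_csv(rows):
    """Serialize a list of dict rows to CSV text, with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

companies, contacts, deals = generate_dataset()
print(len(companies), len(contacts), len(deals))  # 200 400 500
```

Each list can be passed to `to_csv` and saved as its own file, so that Companies can be imported first and the other two files afterwards.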

 

Example Input Data Sets

(CSV files prepared in Step 2 above and used in the Mockaroo modeling in Step 3. For sets with many options, like Web Technologies and Industry, it may make sense to duplicate the set and trim the copy to a subset of meaningful values, so that repeat values and patterns are more likely to appear across the generated dummy data.)
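
The trimming idea above can be sketched in a few lines of Python. The payload below is a hypothetical example shaped like a property definition with an `options` list of `label`/`value` pairs; treat that shape and the sample values as assumptions and check them against the actual Properties API response for your portal. The helper names `option_values` and `trim_options` are made up for this illustration.

```python
import random

def option_values(property_payload):
    """Extract the internal values from a property definition's options list."""
    return [opt["value"] for opt in property_payload.get("options", [])]

def trim_options(values, keep, seed=0):
    """Keep a random subset so generated records repeat values more often."""
    rng = random.Random(seed)
    return rng.sample(values, min(keep, len(values)))

# Hypothetical payload; real option lists are usually much longer
industry_property = {
    "name": "industry",
    "options": [
        {"label": "Accounting", "value": "ACCOUNTING"},
        {"label": "Banking", "value": "BANKING"},
        {"label": "Computer Software", "value": "COMPUTER_SOFTWARE"},
        {"label": "Real Estate", "value": "REAL_ESTATE"},
        {"label": "Retail", "value": "RETAIL"},
        {"label": "Utilities", "value": "UTILITIES"},
    ],
}

all_values = option_values(industry_property)
subset = trim_options(all_values, keep=3)

# Draw values for ten dummy records from the trimmed subset only
rng = random.Random(1)
sample = [rng.choice(subset) for _ in range(10)]
print(subset)
print(sample)
```

With only three values in play, ten generated records are guaranteed to contain repeats, which is the kind of pattern the trimmed helper set is meant to produce.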

 

8 Replies
MKupermann
Contributor

Hi guys! 

We have created a Python script which generates an extensive Excel file, encompassing Companies, Contacts, Deals, Products and Tickets for G7 countries (although this list is easily expandable). This script can generate thousands of records, thereby providing a comprehensive and immersive experience for the client.

Feel free to amend the script to cater to your specific needs. You might choose to generate more fields, other records, or even Custom Objects. The possibilities are extensive, and the flexibility of this tool allows us to tailor each presentation to the unique needs of our clients.

Here is the code we have developed. We firmly believe in the power of sharing knowledge and resources, and hope this tool serves you well in your engagements.

Here is the Python code:

# This code has been created by Michael Kupermann (michael.kupermann@codersunlimited.com or michael@kupermann.com)
# The purpose of this code is to generate dummy data that simulates a realistic dataset for HubSpot CRM.
# This data can then be used for demonstrations, testing, or other purposes that require a representative dataset.
# You need to amend the HubSpot Sales and Service Pipelines before you import the data.
#
# Required Packages:
# 1. Faker: This package is used to generate the fake data for our dataset.
# 2. Pandas: This package is used to handle the data in a tabular format and to write the data to an Excel file.
# 3. openpyxl: Pandas uses this package to write .xlsx files.
# The datetime module, used to generate realistic dates for the 'close_date' field, is part of the
# Python standard library and does not need to be installed.
#
# To install the necessary packages, you can use pip, the Python package installer.
# Open your terminal (or command prompt on Windows), and enter the following commands:
# pip install faker
# pip install pandas
# pip install openpyxl
#
# If you're using a Jupyter notebook, you can prefix these commands with an exclamation mark:
# !pip install faker
# !pip install pandas
# !pip install openpyxl

from faker import Faker
import pandas as pd
from datetime import datetime, timedelta

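# Function to generate data for a given country. By default: 100 companies with 10 rows each,
# where every row holds one contact, one deal, one product and one ticket.
# Note: deals_per_company and products_per_deal are accepted but not used by the loops below.
def generate_data(country, company_rows=100, contacts_per_company=10, deals_per_company=10, products_per_deal=10):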
    # Set the locale for Faker based on the country
    if country == "Germany":
        faker = Faker("de_DE")
    elif country == "United States":
        faker = Faker("en_US")
    elif country == "France":
        faker = Faker("fr_FR")
    elif country == "Italy":
        faker = Faker("it_IT")
    elif country == "Japan":
        faker = Faker("ja_JP")
    elif country == "United Kingdom":
        faker = Faker("en_GB")
    elif country == "Canada":
        faker = Faker("en_CA")
    elif country == "Austria":
        faker = Faker("de_AT")
    elif country == "Switzerland":
        faker = Faker("de_CH")
    else:
        # Fall back to Faker's default locale for countries not listed above
        faker = Faker()

    # Create a dictionary to hold the data
    data = {
        "company_name": [],
        "company_domain": [],
        "company_industry": [],
        "company_address": [],
        "company_country": [],
        "contact_firstname": [],
        "contact_lastname": [],
        "contact_email": [],
        "contact_phone": [],
        "contact_address": [],
        "contact_country": [],
        "contact_function": [],
        "contact_department": [],
        "deal_name": [],
        "deal_stage": [],
        "deal_amount": [],
        "deal_type": [],
        "deal_source": [],
        "close_date": [],
        "ticket_title": [],
        "ticket_status": [],
        "ticket_priority": [],
        "product_name": [],
        "product_price": [],
        "product_description": [],
        "product_sku": [],
        "product_quantity": [],
    }

    # Loop to generate data for each company
    for _ in range(company_rows):
        company_name = faker.company()
        company_domain = faker.domain_name()
        company_industry = faker.random_element(["Technology", "Healthcare", "Finance", "Real Estate"])
        company_address = faker.address().replace('\n', ', ')
        company_country = country

        # Loop to generate data for each contact
        for _ in range(contacts_per_company):
            contact_firstname = faker.first_name()
            contact_lastname = faker.last_name()
            contact_email = faker.email()
            contact_phone = faker.phone_number()
            contact_address = faker.address().replace('\n', ', ')
            contact_country = country
            contact_function = faker.job()
            contact_department = faker.random_element(["Sales", "Marketing", "Human Resources", "Engineering"])

            # Append generated company and contact data to the lists in the dictionary
            data["company_name"].append(company_name)
            data["company_domain"].append(company_domain)
            data["company_industry"].append(company_industry)
            data["company_address"].append(company_address)
            data["company_country"].append(company_country)
           
            data["contact_firstname"].append(contact_firstname)
            data["contact_lastname"].append(contact_lastname)
            data["contact_email"].append(contact_email)
            data["contact_phone"].append(contact_phone)
            data["contact_address"].append(contact_address)
            data["contact_country"].append(contact_country)
            data["contact_function"].append(contact_function)
            data["contact_department"].append(contact_department)

            # Generate deal and product data
            data["deal_name"].append(f'Deal-{faker.uuid4()}')
            data["deal_stage"].append(faker.random_element(["Appointment Scheduled", "Qualified To Buy", "Presentation Scheduled", "Decision Maker Brought-In"]))
            data["deal_amount"].append(faker.random_int(min=1000, max=50000))
            data["deal_type"].append(faker.random_element(["New Business", "Existing Business"]))
            data["deal_source"].append(faker.random_element(["Direct Traffic", "Organic Search", "Paid Search", "Social Media"]))
            data["close_date"].append((datetime.today() + timedelta(days=faker.random_int(min=1, max=90))).date())

            # Generate product data
            data["product_name"].append(f'Product-{faker.uuid4()}')
            data["product_price"].append(faker.random_int(min=10, max=1000))
            data["product_description"].append(faker.catch_phrase())
            data["product_sku"].append(faker.random_int(min=10000, max=99999))
            data["product_quantity"].append(faker.random_int(min=1, max=100))

            # Generate ticket data
            data["ticket_title"].append(f'Ticket-{faker.uuid4()}')
            data["ticket_status"].append(faker.random_element(["New", "Waiting on contact", "Waiting on us", "Closed"]))
            data["ticket_priority"].append(faker.random_element(["Low", "Medium", "High"]))

    # Convert the data dictionary to a pandas DataFrame
    df = pd.DataFrame(data)
    return df

# Define the list of countries for which we want to generate data
# (the G7 countries plus Austria and Switzerland)
g7_countries = ["Canada", "France", "Germany", "Italy", "Japan", "United Kingdom", "United States", "Austria", "Switzerland"]

# Generate one DataFrame per country and combine them into a single result.
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
result = pd.concat([generate_data(country) for country in g7_countries], ignore_index=True)

# Write the generated data to an Excel file
result.to_excel(r"C:\~\~\~\hubspot_dummy_data.xlsx", index=False)

 

kvlschaefer
Community Manager

Hi @MKupermann,

 

Thank you for sharing!

 

Best,

Kristen


Did you know that the Community is available in other languages?
Join regional conversations by changing your language settings !
CCanepa
Participant

Thanks @Clayton_, this was super helpful.

Since posting this article I was wondering if you've seen this HubSpot-provided demo data generator tool?

https://www.hubspot.com/solutions-partner-resource-center/demo-account-resources

 

TChilds
Participant | Partner

@Clayton_ 👌 clutch move to help the community power through details and start pumping out results.

dennisedson
HubSpot Product Team

Love this @Clayton_ !

I have added this to the Developer Board landing page!

Keep it coming!

GrantCarlile
Top Contributor | Diamond Partner

Well done @dennisedson, correct response : ) 

Makes me happy to see someone work hard, clearly communicate, and the community elevate the work #loveit

Again, well done @Clayton_.

 


GrantCarlile
Top Contributor | Diamond Partner

@Clayton_ , this is great. Thanks for sharing. I love to have random movie sets or The A-Team as contacts for small projects : ) 

What you have here is a great rundown! Looking forward to incorporating some of your work here.

kvlschaefer
Community Manager

Thanks for sharing, @Clayton_ ! This is fantastic 😄 

