JLacey9
Participant

Ability to identify duplicate contacts based on custom Contact property value

Hi,

 

We have a custom Contact property in HubSpot that stores the value of a unique identifier property in our database of record. These values are pushed into HubSpot contacts via one-way sync from our database on a regular basis, but we do not enforce unique values for this property in HubSpot.

 

I stumbled upon one edge case where two HubSpot contacts have the same value for this custom property, so I want to find out whether this is a wider issue, i.e. whether there are other groups of contacts across our HubSpot instance that share a value for this property.

 

We have over 3.5M contacts in HubSpot, so I tried creating a report so that I could group by this value and filter to identify how many duplicates exist. I could not find a way to do this and found this community thread that suggests we cannot group by in HubSpot reports: https://community.hubspot.com/t5/HubSpot-Ideas/generate-a-report-of-duplicates/idc-p/376552/highligh...

 

I also do not want to export contact data (especially because HubSpot does not allow me to exclude PII like 'name' when exporting a list of Contacts). If I could only export this single custom property then this could be an option for us so that we can use an external method to identify duplicate values.
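
For illustration, the external check described here could be as small as the following Python sketch. It assumes the identifier values are already available as (record ID, value) pairs with no other contact data; how those pairs are obtained is outside the sketch, and the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def find_duplicate_values(rows):
    """Group contact IDs by identifier and keep identifiers shared by 2+ contacts.

    rows: iterable of (contact_id, identifier) pairs.
    Contacts with an empty or missing identifier are skipped.
    """
    groups = defaultdict(list)
    for contact_id, identifier in rows:
        if identifier is None or str(identifier).strip() == "":
            continue
        # Compare values as trimmed text so 1001 and "1001 " land in one group.
        groups[str(identifier).strip()].append(contact_id)
    return {value: ids for value, ids in groups.items() if len(ids) > 1}


# Hypothetical sample data: (HubSpot record ID, custom identifier value)
sample_rows = [
    ("101", "1001"),
    ("102", "1002"),
    ("103", "1001"),
    ("104", ""),
    ("105", "1002"),
    ("106", "1003"),
]

duplicates = find_duplicate_values(sample_rows)
for value, contact_ids in duplicates.items():
    print(value, contact_ids)
```

Grouping in a single pass keeps memory proportional to the number of contacts that have an identifier, which should be workable for a few million rows.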

 

I also tried exploring the duplicate management tool, and found in HubSpot's documentation that we can only identify duplicates based on certain standard Contact property values: https://knowledge.hubspot.com/records/deduplication-of-records#duplicate-management-tool

 

Is there any other way natively in HubSpot to identify duplicate contacts based on a custom Contact property?

 

Thank you!

1 Accepted solution
karstenkoehler
Solution
Hall of Famer | Partner

Hi @JLacey9,

 

There are some great third-party solutions that integrate with HubSpot (Koalify, Insycle), which you're probably already aware of, but I wanted to mention them anyway. At 3.5 million contacts, however, some of them get prohibitively expensive.

 

Currently, there isn't an out-of-the-box solution for what you're trying to do. HubSpot's duplicate detection is not that advanced.

 

Still, when I read your post I had an idea that leverages a new product update (beta). You could do the following:

  1. Set up an association label for contact <> contact called 'Duplicate': https://knowledge.hubspot.com/object-settings/create-and-use-association-labels
  2. Enroll in the workflow beta for associations: https://knowledge.hubspot.com/workflows/manage-crm-record-associations-with-workflows (by clicking the menu in the top right corner in HubSpot > Product updates > search for associations)
  3. Create a contact-based workflow that enrolls all contacts where the identifier is known, then use the 'Create associations' workflow action to associate contacts based on the matching identifier and apply the association label 'Duplicate'
  4. Create a calculation property on the contact object that counts (method: count) how many contacts the source record is associated with where the association label is 'Duplicate': https://knowledge.hubspot.com/properties/create-calculation-properties

And voilà, you can now filter for contacts where this calculation property is greater than or equal to 1.

 

I'm not sure how the beta deals with cases where there are more than two records sharing your identifier. That's something you'd have to test. Nevertheless, this should flag all contacts where there is at least one duplicate.
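
To make the intended outcome concrete, here is a small Python model of the count that the calculation property would hold. It is only a sketch under one assumption, which is exactly the part that still needs testing: that every contact ends up associated with every other contact sharing its identifier. It does not call HubSpot, and the function name and sample data are hypothetical.

```python
from collections import defaultdict


def duplicate_association_counts(identifier_by_contact):
    """Return, per contact, how many OTHER contacts share its identifier.

    This models the calculation property (method: count) under the assumption
    that each contact is associated with all other contacts that match.
    """
    groups = defaultdict(list)
    for contact_id, identifier in identifier_by_contact.items():
        if identifier:  # mirrors the "identifier is known" enrollment filter
            groups[identifier].append(contact_id)

    counts = {}
    for contact_id, identifier in identifier_by_contact.items():
        # A contact without an identifier is never enrolled, so its count is 0.
        counts[contact_id] = len(groups[identifier]) - 1 if identifier else 0
    return counts


# Hypothetical sample: one pair, one group of three, one unique, one unknown
contacts = {
    "c1": "1001",
    "c2": "1001",
    "c3": "1002",
    "c4": "1003",
    "c5": "1003",
    "c6": "1003",
    "c7": None,
}

counts = duplicate_association_counts(contacts)
# The list filter described above: calculation property >= 1
flagged = [contact_id for contact_id, n in counts.items() if n >= 1]
print(flagged)
```

If the beta only associates each record with a single match, the counts for groups of three or more would be lower than in this model, but every duplicated contact would still have a count of at least 1 as long as it receives one association.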

 

Have a great weekend!

Karsten Köhler
HubSpot Freelancer | RevOps & CRM Consultant | Community Hall of Famer

Book a consultation with Karsten

 

Did my post help answer your query? Help the community by marking it as a solution.

6 Replies
RBozeman
Participant

Hey @JLacey9,

 

Thank you for the mention, @karstenkoehler. I wanted to share some more context about Insycle, as we have some very advanced bulk deduplication (and automation) features that sound like they would be helpful, particularly for a database of that size.

 

Here is a quick breakdown of our Merge Duplicates tool:

 

  • Works for all primary record types (contacts, companies, deals, tickets, etc.) and we can add custom objects as well if need be. 
  • Use any field in your database as a potential unique ID matching field. For example, you can use similar or exact matching on a full name, company name, email domain, or mailing address, whichever fields make the most sense for your use case.
  • Set rules for determining the master record that all duplicates will merge into, such as the earliest created record, the highest deal amount, or the most recently updated record.
  • Set rules for retaining data down to the individual field level. For example, you could instruct Insycle to append data to specific fields during the merge so that it is not lost, or set rules for defining which record to keep the data from (keep the notes from the record with a deal in an active pipeline stage, etc.).
  • Fully automate your deduplication templates. Run them on a set schedule or inject them into Workflows so that deduping happens automatically after your contact (or any record type) is created.
  • Preview your deduplication runs to see how they work and ensure accuracy before pushing the update live. 
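
As a generic illustration of the master-record and field-retention rules in the list above, here is a short Python sketch. It is not Insycle's implementation or API; the rule choices (earliest created record wins, notes are appended, blank fields are filled from the other records), the function name, and the sample records are all assumptions made for the example.

```python
from datetime import date


def merge_duplicates(records, append_fields=("notes",)):
    """Merge a group of duplicate records into one dict.

    Master rule: the earliest created record wins.
    Field rules: fields in append_fields are concatenated; any other field
    is copied from a duplicate only if the master's value is empty.
    """
    master = min(records, key=lambda r: r["created"])
    merged = dict(master)
    for record in records:
        if record is master:
            continue
        for field, value in record.items():
            if field in append_fields:
                if value:
                    parts = [merged.get(field), value]
                    merged[field] = "\n".join(p for p in parts if p)
            elif not merged.get(field):
                merged[field] = value
    return merged


# Hypothetical duplicate pair sharing one identifier
records = [
    {"id": "201", "created": date(2023, 5, 1), "email": "a@example.com", "notes": "Called in May"},
    {"id": "202", "created": date(2022, 1, 15), "email": "", "notes": "Original signup"},
]

merged = merge_duplicates(records)
print(merged)
```

Previewing the merged result on a copy like this, before changing anything live, is the same idea as the preview step in the last bullet.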

If you have any questions don't hesitate to shoot me a DM, happy to help. 

Jonas_De_Mets
Top Contributor

Thanks for mentioning Koalify @karstenkoehler.
I also love the creative approach using the association labels 💡

@JLacey9 your use case mirrors the exact scenario I encountered with a reverse ETL setup, which inspired the creation of Koalify. Our plugin is fully integrated with HubSpot, providing an almost native experience.

Here’s how it works:

  1. Create a rule based on your unique identifier property.
  2. Highlight duplicate pairs in HubSpot using our Koalify properties.
  3. Automatically merge these duplicates via a workflow action.

For datasets over 1 million records, our pricing can be expensive, but I’m happy to discuss options to find a solution that works for you.

Hope this helps!

Jonas De Mets
RevOps & Co-Founder @ Koalify

Connect via LinkedIn


Did my reply help answer your question? Please mark it as a solution.

NBaptista
Member

Hi Karsten! Could steps 1 to 3 also be used during contact creation to check directly whether a matching contact already exists?

JLacey9
Participant

Hi @karstenkoehler thank you for your quick solution - this is great to know more about association labels and the new beta workflow action!

 

I didn't mention this in my original post, but the custom Contact property that I want to detect duplicates on is a number property, and the article mentions that only single-line text, multi-line text, or phone number properties can be used when matching records to create associations. When testing this solution, I added a step that copies the value from the number field into a new single-line text field, and I used that field in the 'Create associations' action in the workflow.

 

Then when I tested the workflow with the 'create association' action, I got a "BLOCK" server response:

[Screenshot: "BLOCK" server response (JLacey9_0-1720533550367.png)]

I am not sure if this could have something to do with Sandbox.

 

If I can get past this error then I believe this solution will work for us! Thanks again for your help.

 

karstenkoehler
Hall of Famer | Partner

@JLacey9 this one is probably most easily resolved with HubSpot support. Could you check with them what triggers the "BLOCK" server response?

 

Nice one on the single-line text, that's what I would've proposed, too.

Karsten Köhler
HubSpot Freelancer | RevOps & CRM Consultant | Community Hall of Famer

Book a consultation with Karsten

 

Did my post help answer your query? Help the community by marking it as a solution.
