More efficient way to find and merge duplicate contacts


The Sales CRM works as intended mainly for lead capture from a web form, where a single email address is provided to create the contact record.

You're thinking of the CRM as a Sales and Marketing tool, which it is, but it's also just a CRM: connected to Gmail, it gets new records daily from people who have multiple email addresses. Moreover, just importing addresses from other databases can result in multiple Contacts being created for the various email addresses someone might have.

Take me as an example: I have 12 different email addresses. When I take on a new client or job, I'll have another. That makes 13 different records. They are all just me, and it should be easier to discover that each of those email addresses belongs to the same person and to merge them.


What needs to improve
Today we have to go into a Contact record and use the Merge function to find the other emails and merge them. That's a manual process, and a pain for large Contact databases.

The Contact list should have a "Find and Dedup" option. The CRM should find likely duplicates based not on the email identifier but on other personal identifiers: same name, similar name in the same location, etc. Flag those in the list and make it easy to check the duplicate Contacts and "Merge" them.

This really needs to be part of the platform, because growing companies and teams with many people using Gmail for outreach will constantly generate new Contact records that can go unnoticed as duplicates of existing records.

I'm not sure whether you think you have solved this request by building a function for merging two records and a function for allowing contacts to have multiple email addresses. However, this issue is not resolved. It should not be difficult to build an algorithm and a report that search for "likely" duplicates. I would like to be able to add custom properties to the list of properties it looks at.


For example, the default properties an algorithm could consider might be first name, last name, phone number, mobile phone number, address, IP address, and email address. I would also like to add custom properties to the list for consideration, such as loyalty number, club membership number, and contact properties holding record IDs from imports from other systems. I'd like to see a list of pairs of contacts that the system says look similar, each with an option to merge or ignore. If I select merge, I'd like to be taken to the merge screen popup for confirmation. If I select ignore, I'd like those two names to be removed from the list and remembered as not being the same person.
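To make the idea concrete, here is a rough sketch in Python of what such a report could do. It is only an illustration, not how HubSpot works: the property names, the `likely_duplicates` function, and the "at least two shared properties" rule are all my own assumptions.

```python
from itertools import combinations

# Hypothetical internal property names; a real portal would use its own.
DEFAULT_PROPERTIES = ["firstname", "lastname", "phone", "mobilephone",
                      "address", "ip_address", "email"]


def normalize(value):
    """Lower-case and trim a value so trivial differences do not hide a match."""
    return str(value).strip().lower() if value else ""


def shared_properties(a, b, properties):
    """Return the properties on which both contacts carry the same non-empty value."""
    return [p for p in properties
            if normalize(a.get(p)) and normalize(a.get(p)) == normalize(b.get(p))]


def likely_duplicates(contacts, properties, ignored=frozenset(), min_shared=2):
    """Yield (id_a, id_b, shared) for pairs agreeing on at least min_shared properties.

    `ignored` holds frozensets of contact IDs the user already marked as "not equal",
    so those pairs never show up in the list again.
    """
    for a, b in combinations(contacts, 2):
        if frozenset((a["id"], b["id"])) in ignored:
            continue
        shared = shared_properties(a, b, properties)
        if len(shared) >= min_shared:
            yield a["id"], b["id"], shared


# Made-up sample data, with a custom "loyalty_number" property added to the list.
contacts = [
    {"id": 1, "firstname": "Pat", "lastname": "Lee",
     "email": "pat@work.example", "loyalty_number": "L-77"},
    {"id": 2, "firstname": "pat", "lastname": "LEE",
     "email": "pat@home.example", "loyalty_number": "L-77"},
    {"id": 3, "firstname": "Sam", "lastname": "Lee", "email": "sam@work.example"},
]
props = DEFAULT_PROPERTIES + ["loyalty_number"]
pairs = list(likely_duplicates(contacts, props))
```

Here `pairs` contains only contacts 1 and 2, which share first name, last name, and loyalty number. Passing `ignored={frozenset((1, 2))}` would drop that pair, which is the "remember as not equal" behavior. Comparing every pair is quadratic, so a real implementation on a large database would presumably group records by a key first.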


@seobrien - Can you please provide an update on the status of this request?


We also need this functionality in order to use the HubSpot CRM efficiently. We pitch and work with companies and contacts that don't have email addresses (e.g. Floral Managers in grocery stores). We need to be able to de-dupe on their names, companies, addresses and/or phone numbers.


I feel for everyone here as I've had the same headache for months.  I want to offer a glimmer of hope: It seems that a third-party provider will have a solution for this some time this year:


After we integrated Salesforce CRM with HubSpot, we had to manage our duplicates in Salesforce better. Even after successfully removing 100% of the duplicates in our CRM and switching to a Salesforce data model in which contacts can have multiple accounts, we were still left with 2000+ duplicate contacts in a 60K database. This has led to syncing issues. We're now having to go through a list of 2000 records and manually click to merge each one, from the dupe contact in HubSpot that holds the old or secondary email address into the primary contact that is currently referenced and syncing in HubSpot. Just being able to run a daily process, for example via a smart list that groups potential dupes based on some search pattern such as first and last name, would allow us to analyze and merge many dupes in bulk.


After a very good first impression of this CRM, the inability to easily ignore, remove, or merge duplicates has stopped my evaluation of the system in its tracks. Having this function available at import, and especially in the contacts and companies lists, is very important, perhaps critical.


Any update on this? Doing my quarterly duplicate pull and having a painful reminder of how tedious this is with HubSpot. Please, this has been "in planning" for over a year, can we at least get an update on where this feature is in the pipeline? 


Yes, I work for this company, but, hey, if it can help someone here, I'm willing to try:


I'm from Insycle and we're a HubSpot Certified Connect Partner. We developed a tool that surfaces duplicates by any field and provides a flexible way to merge them. I think it can help with some of your pains.


Last week we published a guest post on the HubSpot blog that you may find interesting: "Data Duplication and HubSpot: Dealing With Duplicates and the Impact They Have on Your Business"


Don't hate on us before you try 🙂

What's the status of this?

My team and I removed more than 10k duplicates in Groupon's CRM using the following method. We did that outside of the CRM. I believe HubSpot should build a solution that does something similar but directly within the system, like some Salesforce plugins do.


1/ Dealing with "exact match" duplicates

- HubSpot currently makes sure the email is unique at each contact creation.

--> HubSpot should do exactly the same for the phone number.
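One caveat is that phone numbers only match exactly once they are normalized, since the same number gets typed in many formats. A minimal sketch of that in Python follows. The `normalize_phone` and `group_by_phone` names are hypothetical, and treating every 10-digit number as belonging to default country code "1" is purely an assumption for the example.

```python
import re
from collections import defaultdict


def normalize_phone(raw, default_country_code="1"):
    """Reduce a phone number to digits only so formatting differences disappear.

    Assumption: a bare 10-digit number is a national number that belongs to
    default_country_code. Real data needs proper per-country handling.
    """
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) == 10:
        digits = default_country_code + digits
    return digits


def group_by_phone(contacts):
    """Return {normalized_phone: [contact ids]} for numbers used by 2+ contacts."""
    groups = defaultdict(list)
    for contact in contacts:
        key = normalize_phone(contact.get("phone"))
        if key:  # contacts without a phone number are never grouped together
            groups[key].append(contact["id"])
    return {phone: ids for phone, ids in groups.items() if len(ids) > 1}


# Made-up sample data: contacts 1 and 2 hold the same number in different formats.
contacts = [
    {"id": 1, "phone": "(555) 010-2000"},
    {"id": 2, "phone": "+1 555 010 2000"},
    {"id": 3, "phone": "555-010-2001"},
    {"id": 4, "phone": None},
]
exact_phone_duplicates = group_by_phone(contacts)
```

With this data, `exact_phone_duplicates` groups contacts 1 and 2 under the key "15550102000" and leaves 3 and 4 alone.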


2/ Dealing with "close match" duplicates

These are contacts that are pretty similar in names, emails, phones, etc. For this, there is only one solution: fuzzy matching. HubSpot should fuzzy match contacts continuously, on various properties. The results should be presented in 2 screens:

- 1st screen: all close matches

- 2nd screen: close matches to be checked. It's then up to the admin (or their team) to review them and, for each pair, to "Pass" it or to "Merge" it.
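As an illustration of the fuzzy match step, here is a small Python sketch using the standard library's `difflib.SequenceMatcher`. This is not the method we used at Groupon nor anything HubSpot provides. The `close_matches` function, the `name` field, and the 0.85 threshold are assumptions chosen for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a 0..1 similarity score between two strings, ignoring case and padding."""
    a, b = (a or "").strip().lower(), (b or "").strip().lower()
    if not a or not b:
        return 0.0  # two empty values must not count as a perfect match
    return SequenceMatcher(None, a, b).ratio()


def close_matches(contacts, field="name", threshold=0.85):
    """List (id_a, id_b, score) for pairs whose `field` values are close, best first."""
    results = []
    for a, b in combinations(contacts, 2):
        score = similarity(a.get(field), b.get(field))
        if score >= threshold:
            results.append((a["id"], b["id"], round(score, 2)))
    return sorted(results, key=lambda r: r[2], reverse=True)


# Made-up sample data.
contacts = [
    {"id": 1, "name": "Jon Smith"},
    {"id": 2, "name": "John Smith"},
    {"id": 3, "name": "Jane Doe"},
    {"id": 4, "name": ""},
    {"id": 5, "name": ""},
]
matches = close_matches(contacts)   # feeds the 1st screen: all close matches
decisions = {}                      # filled from the 2nd screen: pair -> "pass" or "merge"
to_review = [m for m in matches if (m[0], m[1]) not in decisions]
```

Only the "Jon Smith" / "John Smith" pair clears the threshold here. Once the admin records a decision for a pair in `decisions`, it drops out of `to_review`. The same scoring could be run on emails or phones, and the threshold would need tuning against real data.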





We have reference numbers on our raw data list prior to import, because our old CRM was able to match these against what was already uploaded and update the details on the contact with the same reference as what's on the import. That HubSpot doesn't do this is narrow-minded, as most other CRM systems we use do!
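For anyone unsure what I mean, the behavior we had in our old CRM looks roughly like the Python sketch below. It is an illustration only: the `import_with_reference` function and the `reference_number` property are made-up names, not a HubSpot feature or API.

```python
def import_with_reference(existing, rows, key="reference_number"):
    """Update contacts whose reference number is already known, create the rest.

    `existing` is a list of contact dicts and is modified in place.
    Returns (created, updated) counts.
    """
    index = {c[key]: c for c in existing if c.get(key)}
    created = updated = 0
    for row in rows:
        ref = row.get(key)
        if ref and ref in index:
            # Only overwrite with values the import actually provides.
            index[ref].update({k: v for k, v in row.items() if v not in (None, "")})
            updated += 1
        else:
            contact = dict(row)
            existing.append(contact)
            if ref:
                index[ref] = contact
            created += 1
    return created, updated


# Made-up sample data.
existing = [{"reference_number": "R-100", "name": "Acme Floral", "phone": ""}]
rows = [
    {"reference_number": "R-100", "name": "", "phone": "555-0100"},
    {"reference_number": "R-200", "name": "Bloom Co", "phone": ""},
]
created, updated = import_with_reference(existing, rows)
```

After the import, the R-100 contact keeps its name and gains the phone number, and R-200 is created as a new contact, so no duplicate appears.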


So far I have seen two third-party companies that have already created a solution to this problem through an integration with HubSpot. How is it that HubSpot itself cannot solve this problem?


It is crazy, we really need this feature in HS. The third-party tools are a pain to use.


This would be great if HubSpot added a feature like this! My account has over 6,000 companies in the database, and I am estimating about half of them are duplicates. It would take too much time to search and do this manually so hopefully this will become a Beta soon!

Hi HubSpot community, 

This is something the Product and Engineering team is beginning to research and plan, and we hope to deliver a solution in the coming months.


I've been using Insycle; they recently came out with a tool to bulk merge based on properties. Deduply claims to have a solution coming really soon as well.


In a perfect world, HubSpot would use the stored cookie data to help identify potential duplicates.