More efficient way to find and merge duplicate contacts

The Sales CRM functions more as intended for lead capture from a web form where one email would be provided to create the contact record.  

You're thinking in terms of the CRM as a Sales and Marketing tool, which it is, but it's also just a CRM: Connected to Gmail it's getting new records daily from people that have multiple email addresses.  Moreover, just doing an import of addresses from other databases can result in multiple Contacts being created for the various email addresses someone might have.

Take me as an example: I have 12 different email addresses.  When I take a new client or job, I'll have another.  That makes 13 different records. They are all just me, and it should be easier to discover that each of those email addresses belongs to the same person and merge them.

 

What needs to improve
Today we have to go into a Contact record and use the Merge function to find the other emails and merge them.  That's a manual process, and a pain for large Contact databases.

The Contact list should have a "Find and Dedup" option.  The CRM should find likely duplicates not based on the email identifier but other personal identifiers: same name, similar name in same location, etc.  Flag those in the list and make it easy to check the duplicate Contacts and "Merge" them.
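
To make the "same name in same location" idea concrete, here is a minimal sketch. It is only an illustration of the request, not how any CRM actually does it; the field names (`name`, `city`, `email`) and the sample records are made up.

```python
from collections import defaultdict

def normalize(text):
    # Lowercase and collapse whitespace so "Jane  Doe" and "jane doe" compare equal.
    return " ".join(text.lower().split())

def find_likely_duplicates(contacts):
    """Group contacts that share a normalized name and city, ignoring email."""
    groups = defaultdict(list)
    for contact in contacts:
        key = (normalize(contact["name"]), normalize(contact["city"]))
        groups[key].append(contact["email"])
    # Only keys with more than one record are duplicate candidates.
    return {key: emails for key, emails in groups.items() if len(emails) > 1}

# Hypothetical sample data: two records for the same person, one unrelated record.
contacts = [
    {"name": "Jane Doe", "city": "Boston", "email": "jane@work.example"},
    {"name": "jane  doe", "city": "boston", "email": "jdoe@home.example"},
    {"name": "John Roe", "city": "Austin", "email": "john@corp.example"},
]
candidates = find_likely_duplicates(contacts)
print(candidates)
```

Each group that comes back is a set of emails to flag in the list for a human to check and merge. An exact key like this would miss "similar" names, which would need fuzzy matching on top.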

This really needs to be done as part of the platform as growing companies, teams with many people using Gmail as part of their outreach, etc. will constantly result in new Contact records that can go unnoticed as duplicates of existing records.

HubSpot updates
Jul 30, 2019

Hi Daniel -- I've reached out to schedule a call for us. Thanks again

Jul 30, 2019

Hi folks -- thank you for all of the continued feedback on the duplicate management tool! We're thrilled that this has been as useful as we'd hoped in helping businesses keep their data clean and take advantage of a unified customer database for marketing, sales, and customer service teams.

 

We're proud to build an open platform to support our integration partners like Dedupely and Insycle. While our partner products can deliver differentiated value, the native duplicate management tool is developed entirely by HubSpot.

 

How we identify potential duplicates: Under the hood, this tool uses machine learning (ML) to identify contacts and companies that you are likely to merge. At its core, ML helps automate tasks by analyzing examples of a task to evaluate new tasks to complete. In the case of duplicate management, the task is merging contacts. By analyzing past merges, ML algorithms identify other pairs of contacts that have a high likelihood of being merged based on your behavior, which are shown to you in the duplicate management tool.

Today, the ML models for contact and company merge suggestions are based on contact and company properties including name, email, phone number, company name, and company industry (determined by HubSpot Insights). We plan to expand the properties that the ML models consider to other properties, but to be transparent, we're more likely to add default contact properties (e.g. contact activity data) before custom portal properties in the near term.

 

Because this is a machine learning product, it doesn't rely on strict rules like exact matches on phone number or name; by not relying on rules, the duplicate suggestions can more closely mimic our own human understanding of challenges like several contacts with the same business phone number or several contacts with the same generic name, like Kevin.  As you merge and dismiss pairs, you provide feedback to the tool to help improve the accuracy of our merge recommendations based on patterns in the contact and company properties.
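
As a rough illustration of why a strict "same phone number" rule misfires, here is a small sketch. It is not HubSpot's model: the hand-picked `WEIGHTS` are a hypothetical stand-in for what a trained model would learn, the contacts are invented, and string similarity comes from Python's standard `difflib`.

```python
from difflib import SequenceMatcher

# Hypothetical hand-picked weights standing in for what a trained model would learn.
WEIGHTS = {"name": 0.5, "email": 0.3, "phone": 0.2}

def similarity(a, b):
    # Ratio in [0, 1]; 1.0 means the two strings are identical.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def duplicate_score(c1, c2):
    """Weighted sum of per-field similarity for two contact dicts."""
    return sum(w * similarity(c1[f], c2[f]) for f, w in WEIGHTS.items())

# Invented contacts: Alice and Bob share a front-desk number; alice2 is Alice's other record.
front_desk = "555-0100"
alice = {"name": "Alice Nguyen", "email": "alice@shop.example", "phone": front_desk}
bob = {"name": "Bob Ortiz", "email": "bob@shop.example", "phone": front_desk}
alice2 = {"name": "Alice Nguyen", "email": "a.nguyen@mail.example", "phone": "555-0199"}

# A strict rule flags Alice and Bob because the phone numbers are identical.
strict_rule_flags_pair = alice["phone"] == bob["phone"]
# The combined score ranks Alice's second record above the coworker.
score_prefers_true_duplicate = duplicate_score(alice, alice2) > duplicate_score(alice, bob)
print(strict_rule_flags_pair, score_prefers_true_duplicate)
```

The point is only that combining several fields lets a shared business number count as weak evidence rather than a verdict.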

 

This tool is still under active development, and you should expect improvements around both the accuracy of the suggested merges and the merge experience throughout the rest of the year. 

Jul 12, 2019

Thank you! We're proud to be building an all-on-one platform to allow businesses to combine HubSpot's suite of tools with their choice of powerful integrations, like Dedupely. It takes a village to keep CRM data clean to enable millions of businesses to grow better. 

- Kevin Walsh (senior product manager @ HubSpot)

Jun 13, 2019

Thank you so much @ck2018 ! We appreciate the kind words, and we appreciate being able to work with great, value-adding partners like Insycle. We're eager to continue building a strong, loveable platform for engineering teams like Insycle to help us help millions of businesses grow better.

 

- Kevin Walsh (senior product manager @ HubSpot)

changed to: Delivered
Jun 12, 2019

Hi HubSpot Community,


Thank you all for the continued feedback on this post. Your continued insights into the tools needed to help your businesses grow better are exactly what we need to build great products. 

Today, I'm excited to let you know that you’ve got a brand new tool that finds duplicate contact and company data in HubSpot. No extra spreadsheets, tools, or costs. So you’ll be more efficient, and your customers will have more frictionless experiences with your brand. This tool is now available for all professional and enterprise customers - full details on how to use it can be found in this knowledge article


This product leverages machine learning to consider data such as name, email(s), IP-derived country, phone number, zip code, and company name when comparing two objects. When you accept (merge) or reject (dismiss) a pair as duplicates, you’re providing feedback to the model to help it improve over time. We're likely to add more data to the model in the future.

 
Again, thank you all for your continued feedback on this idea. Your use-cases, examples, and urgency help us build better products. Happy deduping! 

Apr 14, 2019

Hi all -- We're excited that we're getting closer to being able to release the new tool to all customers. We're working to scale its processing, so that it works for everyone, including customers with large numbers of contacts. 

 

The link in BB1's post will only work if your portal has been accepted into the beta program. If you are interested in becoming an early user - please fill out the beta form here. We will be in touch if you are a good fit.

 

We won't be accepting every submission into the beta, but we will reach out to submissions that are a good fit for the early version of the tool.

 

Thank you

Jan 8, 2019

Hello HubSpot community - I wanted to reiterate that this tool is available from HubSpot in private beta. Currently, the beta supports contact duplicate identification. We hope to introduce company duplicate identification in the coming weeks as well. 

 

If you are interested in becoming an early user - please fill out the beta form here. We will be in touch if you are a good fit.

 

Note: We will not be accepting every submission into the beta, but we will reach out to submissions that are a good fit for the early version of the tool. 

changed to: In Beta
Nov 20, 2018

Hello HubSpot community - I'm excited to let you know we have an early version of this tool available in a private beta. If you are interested in becoming an early user of this product - please fill out the beta form here and we will be in touch if you are a good fit.

 

Note: We will not be accepting every submission into the beta,  but we will be reaching out to submissions that are a good fit for the early version of the tool.  

Oct 29, 2018

If you are interested in becoming an early beta tester of this product once it is developed - please fill out the beta form here.

changed to: In Planning
Oct 29, 2018

Hi HubSpot community, 

This is something the Product and Engineering team is beginning to research and plan. We hope to deliver a solution in the coming months. 

changed to: Investigating
Apr 3, 2017

Thanks for adding this idea. Helping customers identify and merge duplicate contacts and companies is something that we are putting a lot of thought into. We appreciate the feedback and examples; this is extremely useful.

141 Replies
Regular Contributor

Native CRM duplicate identification features are never 100%. I get it, it's resource heavy and very hard to get right. HubSpot has bigger fish to fry and good on them. I've been working hands on in the data cleansing industry for years and can tell you I've had more than my share of failures. Big data handling and all its nuances is really tricky. This is why I founded Dedupely.

We've just added Merge Rules and CSV Backups to Dedupely. Combine that with Automatic Merging and the ability to create multiple filters to catch different types of duplicates (by different fields). This month, we've made the find matches feature 50x faster, even for enormous databases doing fuzzy matching.

 

I know there are other 3rd party integrations that de-dup. Dedupely's difference is we specialize in record deduplication and have battle tested almost every case out there (We've failed forward and managed to come out very strong).

 

That said, we've been accepted to HubSpot Connect and are waiting for our status update.

New Member

It would be great if when searching for the Secondary company to merge, we could search by company ID, not just by company name.  This would help those of us doing data cleaning projects where we may have 3 companies of the same name but each with a unique company ID to select the correct Secondary company we want to merge with the Primary

New Contributor

I have 2 suggestions for the duplicate contacts:

1) Show other fields like owner, lifecycle stage or custom fields to make merge decision easier

2) Let us filter the list of contacts to merge, showing a pair if either one of the two contacts matches the filter

 

I am really excited about this feature. Hope HubSpot can improve it soon.

Occasional Contributor

Congrats on the new tool Dylan and everyone!

 

HubSpot is indeed a loveable and open platform. It has evolved from an “all-in-one” suite into an “all-on-one” platform, providing customers the flexibility and autonomy to choose the integrations that work best for them, and everyone benefits.

 

I thought it would be useful to share about different alternatives for dealing with duplicates because different companies have different needs. If you’re struggling on a large scale and need flexible ways to identify and bulk merge duplicates, one-off or automatically every day, I invite you to check out Insycle, here is a video.

 

Insycle, a HubSpot Premier Partner, is a suite of tools for managing and working with data easily at scale. It has modules for deduping records (including preview mode, master selection rules, reports and more), fixing people’s name capitalization, formatting phone numbers, standardizing inconsistent titles, importing using append mode, and many more features you could use directly or set up to run automatically to achieve better results for your Sales, Marketing, and Customer Success teams.

 

Ultimately we’re all aligned about helping HubSpot customers Grow Better!

 

Insycle is a HubSpot Premier Partner, Salesforce AppExchange Partner, Intercom Partner, and more.

Resident Expert

@Dylan Very exciting. At this time I am not ready for enterprise or pro. Thank you @ck2018 for providing your solution as well.

 

Yeah HubSpot.

HubSpot Product Team

Thank you so much @ck2018 ! We appreciate the kind words, and we appreciate being able to work with great, value-adding partners like Insycle. We're eager to continue building a strong, loveable platform for engineering teams like Insycle to help us help millions of businesses grow better.

 

- Kevin Walsh (senior product manager @ HubSpot)

Regular Contributor

Wait! Is this SOLVED NOW?! I saw the updates to review duplicates. It still has some issues but is FAR better than what was previously available. 

Regular Contributor

I'm obviously missing something.  I don't see any improvements.  I have Contact A that I want to merge with Contact B.  Contact A is the primary...and I'm adding Contact B's info. 

 

I go to Contact A's Contact Record, select Merge, select Contact B and if Contact B has any activity more current than Contact A or maybe they both got an email at the same time...Contact B's info prevails.

 

What am I missing?  We have Sales Professional.

 

 

Regular Contributor

Never mind... The merge feature is still a separate but related issue.

 

Sigh...

Regular Contributor

Yep - the merge is still lacking. But at least there is an improved way to FIND them. 

New Member

With the new Merge Duplicates feature in HubSpot - is there a way to filter the duplicates, e.g. by State, so my reps can sort the duplicates in our system?

Advisor | Partner

Now that this tool is delivered, I'd like to suggest some improvements to it.

 

  1. Once you've dismissed a "potential duplicate", it should not be suggested again when the list is refreshed
  2. Provide a search bar and filter options to drill down the list of "potential duplicates"
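
Suggestion 1 could be sketched roughly like this. It is a hypothetical illustration, not HubSpot's implementation: `SuggestionList`, `pair_key`, and the contact IDs are made-up names. The idea is to remember dismissed pairs without regard to order and filter them out on every refresh.

```python
def pair_key(id_a, id_b):
    # frozenset makes (101, 202) and (202, 101) the same key.
    return frozenset((id_a, id_b))

class SuggestionList:
    """Hypothetical holder for the pairs a user has dismissed."""

    def __init__(self):
        self.dismissed = set()

    def dismiss(self, id_a, id_b):
        self.dismissed.add(pair_key(id_a, id_b))

    def refresh(self, candidate_pairs):
        """Return candidate pairs minus any the user has already dismissed."""
        return [p for p in candidate_pairs if pair_key(*p) not in self.dismissed]

suggestions = SuggestionList()
suggestions.dismiss(101, 202)
# The dismissed pair comes back in the opposite order on the next refresh.
fresh = suggestions.refresh([(202, 101), (101, 303)])
print(fresh)  # [(101, 303)]
```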

Thank you!

New Member

Is there a way to get a .csv file of all duplicates? I need to make the files accessible to multiple teams so they can let me know which of the duplicates is correct.

Regular Contributor

Big congrats on launching this. It's been a long awaited native feature by many. Right now we're excited to see how Dedupely is being used in tandem with the new native merge feature. This helps the community power up their data cleansing and makes HubSpot a no brainer for companies that want a full solution.

HubSpot Product Team

Thank you! We're proud to be building an all-on-one platform to allow businesses to combine HubSpot's suite of tools with their choice of powerful integrations, like Dedupely. It takes a village to keep CRM data clean to enable millions of businesses to grow better. 

- Kevin Walsh (senior product manager @ HubSpot)

Advisor

@_Kevin A couple of other things.  From the capture below, this is one that HubSpot has identified as a duplicate.  How is this determination being made?  The names, email addresses and phone numbers don't match.  How are you assured that these are one and the same person?  I have literally hundreds that are like this.

 

Also, one of the properties shown is Last Activity Date.  This is used for contact with a customer, meeting booked, call made, etc.  B2C companies like us sell directly to customers and rarely have person-to-person contact. This criterion is useless to us, and probably to many others.  This should be a selectable field, so each business can choose one that is appropriate.

 

Thanks in advance.  Scott

 

Annotation 2019-07-18 114239.png

Occasional Contributor

Phone numbers!!! I have a similar case except with phone calls. I have an API connection to import calls from my tracked marketing numbers (which in many cases are enhanced to include caller ID with associated addresses, though many contact names are still "unknown"), and the problem I am having is that the merge duplicate feature will not identify a dupe by the phone number (more unique and accurate than a name or partial match on an email).

 

Makes no sense that phone number fields aren't considered, as I get more than one call from the same number on many occasions and it creates a contact for every call. I implemented a filter so it doesn't duplicate the number if it already exists, but since my API import/export only goes out once a day with the tool I'm using to manage the API, if I get duplicate calls in the same day, it pushes all calls as new contacts even with the same exact info. I thought the merge duplicate would give me a way to manage this but unfortunately, as it stands, it is a very half baked feature. 
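
For the same-day duplicate calls described above, one possible workaround is to dedupe each batch by normalized phone number before it is pushed. This is a sketch under assumptions: the record shape, `normalize_phone`, and `dedupe_calls` are hypothetical, not part of HubSpot's API or any import tool, and the last-10-digits rule assumes North American-style numbers.

```python
import re

def normalize_phone(raw):
    # Keep digits only, then compare on the last 10 so "+1 (555) 010-0123"
    # and "555.010.0123" match. Assumes North American-style numbers.
    digits = re.sub(r"\D", "", raw)
    return digits[-10:]

def dedupe_calls(calls, existing_numbers):
    """Return one record per new phone number; skip numbers already in the CRM."""
    seen = {normalize_phone(n) for n in existing_numbers}
    unique = []
    for call in calls:
        key = normalize_phone(call["phone"])
        if key not in seen:
            seen.add(key)  # later calls from this number in the same batch are dropped
            unique.append(call)
    return unique

# Invented daily batch: two calls from one number, one new caller, one known caller.
batch = [
    {"phone": "+1 (555) 010-0123", "name": "Unknown"},
    {"phone": "555.010.0123", "name": "Unknown"},
    {"phone": "555-010-0456", "name": "Pat Lee"},
    {"phone": "555-010-0789", "name": "Unknown"},
]
to_push = dedupe_calls(batch, existing_numbers=["(555) 010-0789"])
print([c["phone"] for c in to_push])
```

Only the first call from each new number survives, so repeat callers within one day no longer create extra contacts.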

 

Things I would like to see happen with this feature:

- Use phone numbers to find duplicates! Consider all contact info including addresses, mobile numbers, emails, etc.

- Allow configurable merge duplicate filters, like having the tool search custom fields that you set.

- As mentioned above, a find and merge action on each contact is a no brainer!

- Improve accuracy! Some suggestions are completely bogus.

 

Look forward to the improvements as this seems like a mandatory feature for a system that replaces a spreadsheet, which could be easily programmed to do these exact things with little effort (as mentioned, independent devs have been making these kinds of tools themselves for a while!)

Advisor

Excellent ideas.   (especially #4!!!) Maybe @_Kevin can give us an update with any new developments.

 

Scott

Resident Expert

@daniel_voiss I feel your pain, however, a phone merge would hurt us. We deal with small businesses and everyone has the same phone number.