
4 October 2016

Data-matching: principles to consider when planning your project

Content team

If you're in business, you'll probably agree you'd make little progress without keeping records on your prospects, customers, suppliers, counterparties and the like. But for many, maintaining data integrity is a perennial problem.

Help is at hand from large, centralised and continually refreshed databases like ours, which can bring out the best in your own data, as outlined in our recent white paper, CRM integration: enriching, refreshing and centralising your data for B2B sales and marketing success.

But data integration isn't limited to business development. It's useful for credit risk, third-party due diligence and all manner of other B2B activities.

And it begins with data-matching.

Where it all starts

Data-matching is the first practical step of almost all projects aimed at integrating one database with another.

The main objective of the exercise is to peg entities on your database to entity ID numbers on a larger, more accurate and more up-to-date external database.

By doing this, your entities can be enriched and refreshed at source – or at least cleaned and enhanced on a one-time basis, if you don't proceed with full integration.

At Bureau van Dijk, we can help you with this. We achieve it through a combination of software automation and manual checks, all of it carried out after we sign a non-disclosure agreement.

Using SQL queries or other selection methods, you typically export some or all of your data from your database into a .csv file or similar spreadsheet format, which you send to us. You can include whichever fields of data you like.

For a customer relationship management system (CRM), for example, that might be company name, country, other address fields, contact details and anything else you think will help to determine, for each record, whether the company on your database is the same as the company with the same or a similar name on ours.


With this exported file, we then prepare your data, ready to be put through our matching software.

You can specify which countries in our dataset you want to match your data against, how fields in each dataset should correlate with each other, and other bespoke criteria.

The technical term for this part of the process is "mapping", and it's conceptually similar to the address line-matching step you take when mail-merging in Microsoft Word by importing data from Excel.
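To make the idea of mapping concrete, here is a minimal sketch in Python. It assumes a small exported file with made-up column headers and renames them to a set of hypothetical target field names. The headers, the target names and the `map_fields` helper are all illustrative assumptions; they don't describe our matching software or any real schema.

```python
import csv
import io

# Hypothetical mapping from the column headers in your export to the
# field names of the reference dataset. These names are assumptions
# made for illustration only.
FIELD_MAP = {
    "Account Name": "company_name",
    "Country": "country",
    "Phone": "telephone_number",
}

def map_fields(row, field_map=FIELD_MAP):
    """Rename mapped columns and drop any column that has no mapping."""
    return {
        target: (row.get(source) or "").strip()
        for source, target in field_map.items()
    }

# A tiny in-memory stand-in for an exported .csv file.
export = io.StringIO(
    "Account Name,Country,Phone,Owner\n"
    "Acme Widgets Ltd,GB,,J Smith\n"
)

mapped_rows = [map_fields(row) for row in csv.DictReader(export)]
```

In this sketch the unmapped "Owner" column is simply left out, and a blank phone number comes through as an empty string, ready for the matching rules to inspect.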

Once we've mapped the data fields, we work with you to specify rulesets to determine whether specific entities from each dataset are true matches with each other.

These rulesets, which the software applies algorithmically, combine conditionals such as "IF" and "WHERE" with specified percentages of required accuracy. In lay terms, an example would be: "if 'country' is a 100% match and 'company name' is an 85% match, where the 'telephone number' is blank, consider this a match".
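The example rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, which is not necessarily how our matching software scores names, and the field names and the `is_match` helper are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-100 similarity score between two strings, ignoring case."""
    return 100 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def is_match(yours, theirs, name_threshold=85):
    """Apply the example rule: country exact, name at least 85%, phone blank."""
    country_exact = yours["country"].strip().lower() == theirs["country"].strip().lower()
    name_close = similarity(yours["company_name"], theirs["company_name"]) >= name_threshold
    phone_blank = not yours.get("telephone_number", "").strip()
    return country_exact and name_close and phone_blank

# Illustrative records only; neither comes from a real dataset.
your_record = {"company_name": "Acme Widgets Ltd", "country": "GB", "telephone_number": ""}
reference_record = {"company_name": "Acme Widgets Limited", "country": "GB"}

matched = is_match(your_record, reference_record)
```

A real ruleset would chain several rules like this one, typically falling back to looser name thresholds only when more fields agree.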

Customisation and beyond

You can customise these rulesets yourself, or we can help. It's less complicated than it sounds and, in any case, we can advise you on the process.

As soon as we establish a true match, we attach a BvD ID to the entity in your database, inextricably linking it to the corresponding entity in Bureau van Dijk's database, along with all its mapped fields. So if any mapped field in the entity in our database is changed, the change is automatically reflected in your CRM or other database.

An extension of this process is the identification of duplicates on your database.

Records that turn out to be duplicates are merged, while those that aren't are kept as legitimately separate records. You can specify where the line is drawn for companies that have a number of headquarters. It's also easy to see which entities are no longer in business.
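One simple way to picture duplicate detection, once records carry a matched ID, is to group them by that ID. The Python sketch below assumes that two records pegged to the same entity ID are candidate duplicates; that assumption, the `bvd_id` and `record_key` field names and the ID values are all illustrative, and the real process may apply further rules before merging.

```python
from collections import defaultdict

def group_by_entity_id(records, id_field="bvd_id"):
    """Group record keys by matched entity ID; unmatched records are skipped."""
    groups = defaultdict(list)
    for rec in records:
        entity_id = rec.get(id_field)
        if entity_id:
            groups[entity_id].append(rec["record_key"])
    return dict(groups)

def find_duplicates(records, id_field="bvd_id"):
    """Return only the entity IDs that more than one record is pegged to."""
    grouped = group_by_entity_id(records, id_field)
    return {eid: keys for eid, keys in grouped.items() if len(keys) > 1}

# Illustrative CRM records with made-up IDs.
crm_records = [
    {"record_key": 1, "company_name": "Acme Widgets Ltd", "bvd_id": "ID-001"},
    {"record_key": 2, "company_name": "ACME Widgets Limited", "bvd_id": "ID-001"},
    {"record_key": 3, "company_name": "Zenith Foods SA", "bvd_id": "ID-002"},
    {"record_key": 4, "company_name": "Unmatched Co", "bvd_id": None},
]

duplicates = find_duplicates(crm_records)
```

Here records 1 and 2 would be flagged as candidates for merging, while record 3 stays separate and record 4 is left alone because it has no matched ID.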

We also feed a huge quantity of additional detailed information from our databases into the corresponding records in your CRM. This "enrichment" is discussed elsewhere in our white paper, CRM integration: enriching, refreshing and centralising your data for B2B sales and marketing success.

Once integrated, whenever you or your users attempt to add a new company to your database, the link with our database will automatically flag up suggestions for companies that could be a match, which users can then select if appropriate. This stops them from entering duplicates.
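The suggestion step can be pictured with a short Python sketch. It assumes a small in-memory list of reference names and uses the standard library's `difflib.get_close_matches` to rank candidates; the `suggest_matches` helper, the names and the cutoff are hypothetical, and a live integration would query the external database rather than a local list.

```python
from difflib import get_close_matches

def suggest_matches(typed_name, reference_names, limit=3, cutoff=0.6):
    """Return up to `limit` reference names that resemble what the user typed."""
    # Compare in lower case, but return the names in their original form.
    lookup = {name.lower(): name for name in reference_names}
    hits = get_close_matches(typed_name.lower(), list(lookup), n=limit, cutoff=cutoff)
    return [lookup[hit] for hit in hits]

# Illustrative reference names only.
reference_names = ["Acme Widgets Limited", "Apex Holdings Ltd", "Zenith Foods SA"]

suggestions = suggest_matches("Acme Widgets Ltd", reference_names)
```

The user would then pick the right suggestion, or confirm that the company is genuinely new, before the record is saved.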

You can also do bulk uploads by querying our databases with selection criteria relevant to your business and transferring large datasets, covering many companies at once, into your database.

While it's an important application, you're not limited to matching data from CRMs.

For example, a credit analyst might want to peg our financial strength information on companies they deal with to their own information on those companies' payment histories with them, adding extra layers of complementary data for more well-rounded decision-making.

Or a compliance researcher could hook up corresponding information on the directors or owners of companies they work with to see if they appear in any PEPs and Sanctions lists.

The only limit is the breadth and depth of our databases – and we currently hold information on more than 200 million companies around the world.

In conclusion – and the next step in your journey

  • Enriching your data
  • Refreshing it
  • Broadening your dataset to spot useful patterns

These are all good reasons to integrate your databases with external ones, whatever your areas of expertise and business needs.

And the general concepts and benefits are simple to grasp.

But, as you might expect, you need to take great care when examining the detail and getting to work on your own data.

So do contact us if you'd like further help.

You can also download a printable PDF of this article for your colleagues.


Content team, Bureau van Dijk

