
Features

Enrich your data

Cross-reference your data sets, build bridges between them, and complement them with your organization's internal repositories or with external information such as Open Data or commercial databases.


Complete your data on millions of lines

Add value to your data with one of several enrichment methods, and select the strategy that best suits your current context and needs:

  • enrichment using repositories,

  • enrichment by join between two sources,

  • enrichment using fuzzy logic and/or phonetics.


Enrich your data with repositories

Repositories are, in essence, enrichment data.

Data enrichment from repositories enables you to cross-reference and supplement your data with internal or external information. This is an important step in data quality enhancement.

Enhancing the content of your data in just a few clicks, without ever writing a single line of code, is one of the strengths and distinctive features of Tale of Data.

Take advantage of all available repositories

Cross-reference information from different sources:

  • your organization's internal repositories, produced by other departments,

  • or repositories external to your organization. These may be commercial databases or open data sets. Tale of Data provides its users with a wide range of public data in its catalog (SIRENE, IBAN, LEI databases, etc.).

Repository-based reconciliation in Tale of Data allows you to use several matching strategies:

  • Exact matches: the ideal case. The two sources share common information, so you can easily create a bridge between your data and the repository.

  • Fuzzy matches (e.g. phonetic, similarity rate): the option to choose if your data is likely to contain spelling inaccuracies.

Make the most of your repositories with Tale of Data

Tale of Data repositories offer two major advantages:

  1. they can be shared with other users,

  2. they offer excellent performance because they are automatically indexed.

 

Information can be retrieved in a few milliseconds from a repository containing hundreds of millions of lines. This makes it possible to enrich data en masse in a very short time.
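Why an indexed repository answers exact-match lookups so quickly can be illustrated with a minimal sketch. This is not Tale of Data's implementation, which is not described here; the repository rows, field names and the `build_index` and `enrich` helpers are all hypothetical. An in-memory dictionary stands in for the index.

```python
# Hypothetical repository rows; field names are illustrative only.
repository = [
    {"company_id": "A001", "name": "Acme", "city": "Paris"},
    {"company_id": "B002", "name": "Globex", "city": "Lyon"},
]


def build_index(rows, key):
    """Map each key value to its row so a lookup does not scan every row."""
    return {row[key]: row for row in rows}


def enrich(records, index, key, fields):
    """Copy the requested fields from the matching repository row, if any."""
    enriched = []
    for record in records:
        match = index.get(record.get(key))
        extra = {field: (match[field] if match else None) for field in fields}
        enriched.append({**record, **extra})
    return enriched


index = build_index(repository, "company_id")
records = [
    {"company_id": "B002", "amount": 120},
    {"company_id": "Z999", "amount": 75},  # no match in the repository
]
result = enrich(records, index, "company_id", ["name", "city"])
```

The index is built once and then reused for every record, which is what keeps the cost per lookup low even when the repository is large. Records without a match keep their own fields and receive empty enrichment values.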


Join enrichment

Join enrichment is a solution that enables multiple files to be assembled using a common key.

The ease of use of this function offers the user a wealth of combinations:

  • join types (see illustration opposite)

  • join conditions: equal, different, greater than (strictly or not), less than (strictly or not), ...

Thanks to this function, you can easily enrich your data with additional information from different sources, without having to write a script. In fact, Tale of Data lets you cross-reference Excel or CSV files with data from a database in a single process.

This feature enables non-technical users to process large datasets quickly and efficiently, with no programming skills required.
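For readers curious about what a join type and a join condition mean in practice, here is a minimal sketch in plain Python. It only illustrates the concepts; the `join` helper, the sample rows and the field names are hypothetical and do not reflect how Tale of Data performs joins internally.

```python
def join(left, right, condition, how="inner"):
    """Nested-loop join: pair rows for which condition(left_row, right_row) holds.

    how="inner" keeps only left rows that found a match.
    how="left" keeps every left row and fills right-hand fields with None
    when nothing matches.
    """
    right_fields = sorted({key for row in right for key in row})
    joined = []
    for left_row in left:
        matches = [row for row in right if condition(left_row, row)]
        if matches:
            # On shared field names, the left-hand value is kept.
            joined.extend({**row, **left_row} for row in matches)
        elif how == "left":
            blanks = {field: None for field in right_fields}
            joined.append({**blanks, **left_row})
    return joined


# Hypothetical sample data.
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 150},
    {"order_id": 11, "customer_id": 3, "amount": 40},
]
customers = [
    {"customer_id": 1, "name": "Alice"},
    {"customer_id": 2, "name": "Bob"},
]
perks = [{"min_amount": 100, "perk": "free shipping"}]


def same_customer(order, customer):
    """Join condition 'equal' on a common key."""
    return order["customer_id"] == customer["customer_id"]


def reaches_threshold(order, perk):
    """Join condition 'greater than or equal'."""
    return order["amount"] >= perk["min_amount"]


inner_rows = join(orders, customers, same_customer)
left_rows = join(orders, customers, same_customer, how="left")
perk_rows = join(orders, perks, reaches_threshold)
```

The inner join drops the order whose customer is unknown, the left join keeps it with an empty name, and the last call shows a condition other than equality.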


Fuzzy logic enrichment

Fuzzy logic is a complementary method to join-based data enrichment.

While a join strategy requires a common key between your datasets, fuzzy logic frees you from this constraint.

Reconcile and enrich records whose values are similar rather than identical, still without writing a line of code.

Approximate spelling (one or more differences), phonetics, ignoring case, accents or spaces: whatever strategy and function you use, Tale of Data detects 'approximate' terms and correlates data from different sources, even without a common key.
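The idea of ignoring case, accents and spaces before comparing approximate spellings can be sketched with Python's standard library. This is an illustration only: the `normalize`, `similarity` and `best_match` helpers are hypothetical, and `difflib.SequenceMatcher` is a generic similarity measure standing in for whatever matching functions Tale of Data actually uses.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Strip accents, ignore case and remove all whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_accents = "".join(
        char for char in decomposed if not unicodedata.combining(char)
    )
    return "".join(without_accents.casefold().split())


def similarity(a, b):
    """Similarity ratio between 0 and 1 computed on the normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(value, candidates):
    """Return the candidate most similar to value (no common key needed)."""
    return max(candidates, key=lambda candidate: similarity(value, candidate))


# Differences in case, accents and spacing disappear after normalization.
same = similarity("Société Générale", "SOCIETE  GENERALE")

# A misspelled name is still reconciled with its closest candidate.
closest = best_match("Jon Smiht", ["John Smith", "Jane Smythe"])
```

Normalizing first means that purely cosmetic differences never lower the score, so the similarity measure only has to deal with genuine spelling variations.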


The advantage of the confidence index in matching

The confidence index measures the reliability of a fuzzy join. This index ranges from 0 to 1.

  1. If the index equals 1, the join between your two sources is 100% reliable, and all joined fields are identical.

  2. If the index is between 0.85 and 0.99, the reconciliations proposed by the solution should be studied and the decision taken on a case-by-case basis. It makes sense to group them together for review.

  3. Finally, if the index is less than 0.85, the join is unreliable. There are large differences between the reconciled fields, and studying them is unlikely to be worthwhile. Tale of Data lets you leave these records unmatched.


A high index still deserves a second look. A single-letter discrepancy is sometimes a data entry error, but in other cases it is a genuine difference. This is the case, for example, with Vitalis and Vitalys: if you reconcile on the name alone, the confidence index will be high even though they are two different companies.


The confidence index therefore helps the user to make easier reconciliation decisions.
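The thresholds above can be turned into a simple triage rule, sketched below. Several things are assumptions here: the `triage` function and its labels are hypothetical, any index from 0.85 up to (but not including) 1 is treated as "review", and `difflib.SequenceMatcher` is used only as a stand-in score, since the formula behind Tale of Data's confidence index is not given.

```python
from difflib import SequenceMatcher


def triage(index):
    """Turn a confidence index (0 to 1) into a reconciliation decision."""
    if not 0.0 <= index <= 1.0:
        raise ValueError("confidence index must be between 0 and 1")
    if index == 1.0:
        return "accept"   # fields are identical
    if index >= 0.85:
        return "review"   # decide case by case
    return "reject"       # too different to be worth studying


# Stand-in score for two company names that differ by a single letter.
score = SequenceMatcher(None, "vitalis", "vitalys").ratio()
decision = triage(score)
```

With this stand-in score, Vitalis and Vitalys land in the "review" group rather than being matched automatically, which is where a person can spot that they are two different companies.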


Discover how to enrich your data with Open Data on Tale of Data. Explore the infinite possibilities for your business strategies.


Exploit the full potential of your data by scheduling a demonstration
