Fight against the financing of terrorism, Cybersecurity, Internal threats

Internal threats: sensitive-information leak prevention

Our client, one of the largest private banking players in Europe, wanted to minimize the risk of sensitive information leaking (identities, financial transactions, etc.). Since this type of leak is most often due to internal malicious acts, the Information Systems Security Manager wanted to exhaustively identify the sensitive information present in the bank’s information system in order to increase the level of protection.


Two questions therefore arose:

  1. Where exactly is the sensitive data stored? In which databases, tables and columns? And in which files (e.g. Excel files and other listings scattered across the internal network)?

  2. What types of sensitive data are involved?

Solution provided by Tale of Data

Our “Mass Data Discovery” technology enabled us to automatically scan:

  • All relational databases

  • All shared network disks: all directories, and their sub-directories, were searched for Excel, CSV, XML or JSON files

  • The CRM and the content management systems (SharePoint)
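
As an illustration of the network-disk part of the scan, here is a minimal sketch of a recursive file search. It is not Tale of Data's implementation: the function name `find_candidate_files` and the exact extension list (in particular `.xls` and `.xlsx` for Excel) are assumptions made for this example.

```python
from pathlib import Path

# File types named in the text; mapping "Excel" to .xls/.xlsx is an assumption.
TARGET_EXTENSIONS = {".xls", ".xlsx", ".csv", ".xml", ".json"}


def find_candidate_files(root):
    """Recursively yield files under `root` whose extension is in TARGET_EXTENSIONS."""
    for path in Path(root).rglob("*"):
        # Compare extensions case-insensitively so "REPORT.XLSX" is also found.
        if path.is_file() and path.suffix.lower() in TARGET_EXTENSIONS:
            yield path
```

Calling `list(find_candidate_files(some_share))` on a mounted share (a hypothetical path) would return every matching file in that directory and all of its sub-directories.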


Each record in each table was analyzed for sensitive data: last names, first names, addresses, e-mail addresses, telephone numbers, bank account numbers, etc.


The results were aggregated at the field level (whether the source was a database, an Excel file or a CSV listing): at the end of the scan we knew, for example, the exact number of last names present in any given Excel file on the bank’s network drives.
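
The record-by-record analysis and field-level aggregation described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual detection logic: the `DETECTORS` patterns are deliberately naive regular expressions, and `scan_csv` is a hypothetical helper that works on CSV text only.

```python
import csv
import io
import re
from collections import Counter, defaultdict

# Illustrative patterns only; production-grade detectors would be far more thorough.
DETECTORS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?\d[\d .-]{7,}\d$"),
    "iban": re.compile(r"^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$"),
}


def scan_csv(text):
    """Return {column name: Counter({data type: number of matching values})}."""
    counts = defaultdict(Counter)
    for row in csv.DictReader(io.StringIO(text)):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # skip missing cells and overflow fields
            value = value.strip()
            for data_type, pattern in DETECTORS.items():
                if pattern.match(value):
                    counts[column][data_type] += 1
    return counts
```

Each value is tested against every detector, and hits are counted per column, so the result says both where sensitive data sits and what type it is.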


The data scan (a “bottom-up” approach) gave the chief information security officer (CISO) an exhaustive inventory of the sensitive data and its exact location.

The scan report allowed security teams to greatly minimize the risk of data leaks:

  • By tracking down potentially malicious SQL queries that had previously been considered harmless, i.e. any SQL query fetching columns that appear in the list of sensitive columns established by the data scan.

  • By systematically auditing access to network directories that the scan revealed to contain sensitive data, something the teams had not known before the scan.

  • By verifying the effectiveness of the anonymization procedures: cross-referencing anonymized files with a list of known customers (using Tale of Data fuzzy joins) should not produce any match. Any match means that the anonymization process must be reworked.

  • By controlling the risk of information leaks over time through regular scans, up to several times a day: new listings can appear on the network for only a few hours just before a leak.
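
Two of the checks listed above, flagging SQL queries that touch sensitive columns and verifying anonymization by fuzzy cross-referencing, can be sketched as follows. Both functions are hypothetical illustrations: the query check is a naive identifier match (it does not parse SQL and will miss `SELECT *`), and Python's standard `difflib.SequenceMatcher` stands in for Tale of Data's fuzzy joins, whose actual algorithm is not described here. The `0.85` threshold is an arbitrary assumption.

```python
import re
from difflib import SequenceMatcher


def sensitive_columns_in_query(query, sensitive_columns):
    """Return the sensitive column names that appear as identifiers in the query text."""
    identifiers = {t.lower() for t in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", query)}
    return sorted(c for c in sensitive_columns if c.lower() in identifiers)


def fuzzy_matches(anonymized_values, known_customers, threshold=0.85):
    """Return (anonymized value, customer, score) triples at or above the threshold."""
    matches = []
    for value in anonymized_values:
        for customer in known_customers:
            # ratio() is a similarity score between 0.0 and 1.0.
            score = SequenceMatcher(None, value.lower(), customer.lower()).ratio()
            if score >= threshold:
                matches.append((value, customer, score))
    return matches
```

A non-empty result from `sensitive_columns_in_query` marks a query for review, while a non-empty result from `fuzzy_matches` suggests that the anonymization left recognizable customer data behind.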

Other scenarios are possible. Do not hesitate to contact us to discuss your business cases.