
Monitoring and improving data quality from sensors

Sensor data (IoT), financial transactions, plant information, anomaly detection


The need


Our client, an industrial group with hundreds of subsidiaries worldwide, wanted to control and improve the quality of PI (plant information) data from sensors at production sites.

There were several aims:

- Have PI nomenclatures (Assets, Attributes, Tags) that follow clear naming rules and are free of duplicates, to enable better tag reuse and cross-site analysis.

- Set up a high-performance monitoring system for PI Tags (= time series): real-time detection of missing or aberrant data, identification of faulty sensors, etc.

- Supply teams of Data Scientists with reliable data, an essential prerequisite for building coherent, high-performance predictive models (forecasting, predictive maintenance, etc.).

Proposed solution:


Harmonization of sensor nomenclature:

Tale of Data automatically matches texts (name, description, etc.) with spelling differences using advanced fuzzy matching algorithms: phonetics (English/French), consonant (or vowel) frequency, word fragmentation (N-Gram), and automatic word weighting: less discriminating words are given a lower weight.
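To make the idea of N-gram matching with automatic word weighting more concrete, here is a minimal sketch in Python. It is not Tale of Data's implementation: the function names, the use of character trigrams with a Dice coefficient, and the IDF-style weighting formula are all assumptions chosen for illustration, and phonetic matching is left out.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded at both ends."""
    pad = "#" * (n - 1)
    padded = f"{pad}{word.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Dice coefficient between the n-gram sets of two words (0.0 to 1.0)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def word_weights(descriptions):
    """IDF-style weights: words found in many descriptions get a lower weight."""
    total = len(descriptions)
    doc_freq = Counter(w for d in descriptions for w in set(d.lower().split()))
    return {w: math.log((1 + total) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def directed_similarity(a, b, weights):
    """Weighted average, over the words of `a`, of their best n-gram match in `b`."""
    words_a, words_b = a.lower().split(), b.lower().split()
    if not words_a or not words_b:
        return 0.0
    num = den = 0.0
    for wa in words_a:
        weight = weights.get(wa, 1.0)  # unknown words get a neutral weight
        num += weight * max(ngram_similarity(wa, wb) for wb in words_b)
        den += weight
    return num / den


def weighted_similarity(a, b, weights):
    """Symmetric score: mean of the two directed similarities."""
    return 0.5 * (directed_similarity(a, b, weights) + directed_similarity(b, a, weights))


# Hypothetical tag descriptions, invented for this example.
descriptions = [
    "Pump inlet temperature sensor",
    "Pump inlet temprature sensor",   # same tag with a spelling difference
    "Compressor outlet pressure sensor",
    "Boiler outlet pressure sensor",
]
weights = word_weights(descriptions)
close = weighted_similarity(descriptions[0], descriptions[1], weights)
far = weighted_similarity(descriptions[0], descriptions[3], weights)
```

In this sketch the word "sensor", which appears in every description, weighs less than "temperature", so the misspelled pair scores much higher than two descriptions that only share the generic word. A pair scoring above a chosen cutoff would then be proposed as a duplicate candidate.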

Monitoring sensor data with Tale of Data's time series analysis algorithms:

- Determination, by sensor type, of appropriate alert thresholds for measured values (temperature, pressure, etc.): these thresholds were obtained by running an automatic analysis over several years of historical data.

- Determination, by sensor type, of appropriate alert thresholds for time gaps between two measurements: these thresholds were obtained by running an automatic analysis over several years of historical data.

- Automatic alerts when thresholds are exceeded or data is missing.
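The three monitoring steps above can be sketched as follows. This is an illustrative example only, not the algorithm used in the project: deriving value bounds from interquartile fences, deriving the maximum time gap as a multiple of the median gap, and the parameters `k` and `factor` are all assumptions. In practice the thresholds would be computed separately for each sensor type from its own history.

```python
from statistics import median, quantiles


def value_thresholds(history, k=3.0):
    """Derive (low, high) alert bounds from historical values using IQR fences."""
    q1, _, q3 = quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def gap_threshold(timestamps, factor=5.0):
    """Derive the maximum allowed gap as a multiple of the median historical gap."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return factor * median(gaps)


def find_alerts(readings, bounds, max_gap):
    """Return (timestamp, kind) alerts for out-of-range values and long gaps."""
    low, high = bounds
    alerts = []
    previous_ts = None
    for ts, value in readings:
        if previous_ts is not None and ts - previous_ts > max_gap:
            alerts.append((ts, "missing_data"))
        if not low <= value <= high:
            alerts.append((ts, "aberrant_value"))
        previous_ts = ts
    return alerts


# Hypothetical history for one sensor type (temperatures, timestamps in seconds).
history_values = [20.0, 20.5, 21.0, 21.5, 22.0, 20.2, 21.3, 20.8]
history_timestamps = [0, 60, 120, 180, 240]

bounds = value_thresholds(history_values)
max_gap = gap_threshold(history_timestamps)

# New readings: one aberrant value, then a long silence before the last point.
readings = [(0, 21.0), (60, 95.0), (120, 20.9), (900, 21.1)]
alerts = find_alerts(readings, bounds, max_gap)
```

With this sample data, the reading of 95.0 falls outside the derived bounds and the 780-second silence exceeds the derived maximum gap of 300 seconds, so both are reported.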



The harmonization of wording and deduplication has enabled the creation of a shared PI metadata repository: Assets, Attributes, Tags.

This PI metadata repository, with clear naming rules, has opened up many new possibilities:

- Consistent system representation: same set of attributes for elements representing the same type of equipment, with standardized names, descriptions and units of measurement.
- Facilitation of "multi-point" analyses: standardized metadata enable time series to be aggregated or compared, whether for monitoring, reporting or predictive analysis (Machine Learning).

By analyzing time series, we were able to put a fully automated monitoring system into production in just a few weeks, continuously analyzing data from tens of thousands of sensors.

Alerts on very specific conditions have been set up (sensors emitting erroneous values or showing anomalies in the time intervals between two measurements). These alerts can be reconfigured at any time by business users, without writing code.
