Unidata has implemented the streaming mode of data quality management
Corporate
24 August 2017

Unidata has traditionally paid special attention to data quality. The platform provides unified filtering, cleaning, and normalization of data from different sources, validation of data against specified criteria, and enrichment of data from internal and external sources.

The latest version of the platform can process records received through SOAP and REST requests from external information systems, apply data quality rules to them, and return the processed records. The records themselves are not saved in the platform. The new capabilities allow external enterprise information systems to verify data in real time, check tens of millions of records in batch mode on a schedule, and track how the quality of a record changes over time.

There are two processing modes: streaming and online. In streaming mode, a single data verification request triggers one-time processing of a large number of records. The result contains the corrected records and a list of errors, if any. Information about the errors found is also stored in the platform database and can be retrieved by record identifier. In online mode, the platform responds synchronously to a request to verify a single record. The result contains the corrected record and, if any were found, a list of errors.
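
The difference between the two modes can be illustrated with a minimal Python sketch. It is not Unidata's implementation or API: the rule functions, the record fields (id, name, email), and the names verify_online, verify_stream, and error_store are all hypothetical, chosen only to show one record being checked synchronously versus many records being processed one at a time while only their keys and errors are kept.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, str]
# A rule takes a record and returns (corrected record, list of error messages).
Rule = Callable[[Record], Tuple[Record, List[str]]]


def normalize_name(record: Record) -> Tuple[Record, List[str]]:
    """Hypothetical cleaning rule: trim and title-case the 'name' field."""
    fixed = dict(record)
    fixed["name"] = record.get("name", "").strip().title()
    return fixed, []


def require_email(record: Record) -> Tuple[Record, List[str]]:
    """Hypothetical validation rule: the 'email' field must contain '@'."""
    errors = [] if "@" in record.get("email", "") else ["email: invalid format"]
    return record, errors


# Assumed rule set; a real deployment would configure its own rules.
RULES: List[Rule] = [normalize_name, require_email]


def verify_online(record: Record) -> Tuple[Record, List[str]]:
    """Online mode: one record in, corrected record and its errors out."""
    errors: List[str] = []
    for rule in RULES:
        record, found = rule(record)
        errors.extend(found)
    return record, errors


def verify_stream(
    records: Iterable[Record], error_store: Dict[str, List[str]]
) -> Iterator[Tuple[Record, List[str]]]:
    """Streaming mode: process records one at a time without keeping them.

    Only the record key and its errors are written to error_store, which
    stands in here for the platform database.
    """
    for record in records:
        corrected, errors = verify_online(record)
        if errors:
            error_store[record["id"]] = errors
        yield corrected, errors


# Usage: run a small batch, then look up errors by record identifier.
store: Dict[str, List[str]] = {}
batch = [
    {"id": "1", "name": "  ada lovelace ", "email": "ada@example.org"},
    {"id": "2", "name": "bob", "email": "bob"},
]
results = list(verify_stream(batch, store))
```

In this sketch the streaming function is a generator, so the batch is never held in memory as a whole, and after the run store contains an entry only for the record that failed a check.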

"We are sure that this approach will be in wide demand on the market," says Sergei Kuznetsov, General Director of Unidata. "The point is this: the streaming mode assumes one-time processing of a large number of records. The Unidata platform does not load all incoming data into its storage; it processes the data on the fly. After the input data has been processed, the Unidata storage contains only statistics about which records did not pass quality checks. The platform stores only the original keys of the processed data and does not store the records in full."