Peter Benson, founding Director of the Electronic Commerce Code Management Association (ECCMA), where he is currently CTO, has published a whitepaper, ‘Managing a Data Cleansing Project.’ Data cleansing appears ‘deceptively simple,’ and while common sense will see cleansers through a small project, more structured processes and knowledge are required for large-scale initiatives. The 64-page whitepaper offers insightful definitions of terms such as cataloguing (a synonym for cleansing), master data, metadata and more, leveraging the ISO 8000 quality standards and best practices.
‘Quality’ means data that meets requirements so IT can be used to verify data entry against external references. Cleansing also includes structuring data and adding context through an ‘ontology,’ here taken to mean an assembly of ‘a data dictionary, classifications, data requirement statements and rendering guides.’ Other terms (metadata, class, classification and property) are given equally pragmatic definitions with examples.
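To make the four-part ‘ontology’ more concrete, here is a minimal Python sketch of how a record might be checked against a data requirement statement and then rendered. It is an illustration only and is not taken from the whitepaper: the class names, field names, the ‘bearing’ example and its properties are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataRequirement:
    """Hypothetical data requirement statement for one class of item."""
    item_class: str
    required_properties: list[str]


@dataclass
class Ontology:
    """Hypothetical container for the four parts named in the text."""
    dictionary: dict[str, str] = field(default_factory=dict)        # term -> definition
    classifications: dict[str, str] = field(default_factory=dict)   # class -> parent class
    requirements: dict[str, DataRequirement] = field(default_factory=dict)
    rendering_guides: dict[str, str] = field(default_factory=dict)  # class -> text template


def missing_properties(record: dict, ontology: Ontology) -> list[str]:
    """Return the required properties that are absent or empty in the record."""
    requirement = ontology.requirements[record["class"]]
    return [p for p in requirement.required_properties if not record.get(p)]


def render(record: dict, ontology: Ontology) -> str:
    """Build a description from the rendering guide for the record's class."""
    return ontology.rendering_guides[record["class"]].format(**record)


# Illustrative (made-up) content for a single class.
ontology = Ontology(
    dictionary={"bore_diameter_mm": "Inner diameter of the bearing, in millimetres"},
    classifications={"bearing": "mechanical component"},
    requirements={
        "bearing": DataRequirement("bearing", ["bore_diameter_mm", "seal_type"])
    },
    rendering_guides={"bearing": "BEARING, BALL: {bore_diameter_mm} mm bore"},
)

record = {"class": "bearing", "bore_diameter_mm": 25}
print(missing_properties(record, ontology))
print(render(record, ontology))
```

Under these assumptions, a record ‘meets requirements’ when the list of missing properties is empty. A real cleansing project would presumably also validate values, units and identifiers against an external reference.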
The whitepaper has a tendency to resemble a standards ‘smorgasbord’ with references to Federal, NATO, UNPC, NCS, ISO and the ECCMA Open Technical Dictionary (eOTD). The latter is a super-registry of terminology where concepts are assigned a public domain identifier. eOTD is a key enabler of quality in that it provides a standardized superset of industry and technical terminology. Users can ask for terms to be added to the registry for free and the eOTD is updated in under a day.
The whitepaper discusses roles and responsibilities for clean-up, with potential candidates filling the C-Suite. Benson thinks that, ultimately, responsibility for corporate data rests with the CEO, who should delegate it to a ‘master data quality oversight committee.’ Data management is fundamentally no different from physical inventory management and merits similar attention from both board and bean counters. More from eccma.org.
© Oil IT Journal - all rights reserved.