You have probably already spent a considerable amount of time cleaning up data. Perhaps you did this with a text editor or, for the more sophisticated, with UNIX shell tools such as grep and awk.
double entries
Typical tasks include searching for double entries in databases or flat files, changing well names to match the 'official' standard, or editing company information after a merger. Such humble clean-up work took nearly 40 man-years of effort in NAM's SAP migration project, so anything that helps in this context is a potential money-saver.
regular expression
Quillion's Q-Clean is a user-friendly shell around a regular expression generator (a cross-platform generalization of the UNIX grep command). It lets the user split database fields, fix the case of text fields or homogenize units, so that a column containing successive values of, say, .75cm, 1", 2in. can be parsed and split into value and unit-of-measure pairs. Q-Clean works on any DAO or ODBC compliant database (which means just about anything) and a three-hour learning time is claimed. More from www.quillion.com.
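To give a feel for the kind of regular expression such a tool might generate, here is a minimal Python sketch that splits the example values above into value and unit pairs. It is an illustration only, not Q-Clean's own output or method: the pattern, the `UNIT_ALIASES` table and the `split_measure` function are all hypothetical names chosen for this example, and the pattern only covers the units shown (cm, in, in. and the inch mark).

```python
import re

# Hypothetical pattern: a number (leading digits optional, so ".75" is accepted)
# followed by a unit token. Only the units from the example are covered.
MEASURE = re.compile(
    r'^\s*(?P<value>\d*\.?\d+)\s*(?P<unit>cm|in\.?|")\s*$',
    re.IGNORECASE,
)

# Hypothetical alias table mapping each spelling to one standard unit name.
UNIT_ALIASES = {"cm": "cm", "in": "in", "in.": "in", '"': "in"}


def split_measure(raw):
    """Return (value, unit) for a raw field, or None if it does not parse."""
    match = MEASURE.match(raw)
    if match is None:
        return None  # leave unparsed fields for manual review
    value = float(match.group("value"))
    unit = UNIT_ALIASES[match.group("unit").lower()]
    return value, unit


if __name__ == "__main__":
    for raw in [".75cm", '1"', "2in."]:
        print(raw, "->", split_measure(raw))
```

Returning `None` for fields that do not match, rather than guessing, keeps the doubtful entries visible so that they can be checked by hand.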
Copyright © 2000 The Data Room - all rights reserved. © Oil IT Journal - all rights reserved.