As data warehouses grow in volume and complexity, their data may separate into two classes: active and inactive.
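As a rough illustration of the active/inactive split, the sketch below sorts units of warehouse data by how recently they were accessed. The `Segment` type, the segment names and sizes, and the 30-day threshold are all assumptions made for the example; the article does not prescribe a cutoff.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Segment:
    """A hypothetical unit of warehouse data, such as a table partition."""
    name: str
    size_gb: float
    last_accessed: date


def split_by_activity(segments, today, inactive_after_days=30):
    """Return (active, inactive) lists based on days since last access.

    The 30-day default is an assumed threshold, not a recommendation.
    """
    cutoff = today - timedelta(days=inactive_after_days)
    active = [s for s in segments if s.last_accessed >= cutoff]
    inactive = [s for s in segments if s.last_accessed < cutoff]
    return active, inactive


def total_gb(segments):
    """Sum the sizes of the given segments in gigabytes."""
    return sum(s.size_gb for s in segments)


# Invented sample data for illustration only.
today = date(2024, 6, 30)
segments = [
    Segment("orders_current", 50.0, date(2024, 6, 29)),
    Segment("orders_2022", 450.0, date(2024, 4, 1)),
    Segment("orders_2021", 500.0, date(2024, 3, 15)),
]

active, inactive = split_by_activity(segments, today)
print(f"active: {total_gb(active)} GB, inactive: {total_gb(inactive)} GB")
```

In practice the last-access date would come from the warehouse's own usage statistics, and the inactive list would be the candidate set for archiving or near-line storage.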
For example, a terabyte of data may have 50 GB that are actively used and 950 GB that are accessed perhaps only once a month or once a quarter. The organization pays the same for the data regardless of how frequently it is used. The data warehouse administrator can either archive the inactive data or place it in near-line storage. Separation is defined by accessing the inactive data, moving it to near-line storage, and then deleting it from the data warehouse. While all data warehouses face separation, the degree of separation varies among warehouses, based on these factors:
Critical Success Factors
There are three critical success factors that each company needs to identify before moving forward with the issue of data quality:
The senior management commitment to maintaining the quality of corporate data can be achieved by instituting a data administration department that oversees the management of corporate data. The role of this department will be to establish data management standards, policies, procedures, and guidelines pertaining to data and data quality.
Data Quality
In addition to referring to the usefulness of the data, data quality has to be defined as data that meets the following five criteria:
The definition of data quality must include the degree of quality that is required for each element being loaded into the data warehouse. If, for example, customer addresses are stored, it might be acceptable that the four-digit extension to the zip code, or the three-digit extension to a postal code, is missing. However, the street address, city, and state or province are of much higher importance. This parameter must be identified by each individual company and for each item that is used in the data warehouse.
A third factor that needs to be considered is the quality assurance of data. Since data is moved from transactional/legacy systems to the data warehouse, the accuracy of this data needs to be verified and corrected if necessary, and this will often involve cleansing of existing data. Since no company is able to rectify all of its unclean data, procedures have to be put in place to ensure data quality at the source.
Modify Business Processes
This task can only be achieved by modifying business processes and designing data quality into the system. By identifying every data item and its usefulness to the ultimate users of this data, data quality requirements can be established. One might argue that this is too costly, but it has to be kept in mind that increasing the quality of data as an after-the-fact task is five to ten times more costly than capturing it correctly at the source.
If companies want to use a data warehouse for competitive advantage and reap its benefits, the issue of data quality is extremely important. Only when data quality is recognized as a corporate asset by every member of the organization will the benefits of data warehousing and CRM initiatives be realized.
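The per-element quality levels discussed above, using the customer address example, could be written down as explicit rules and checked before a record is loaded. The following is a minimal sketch under stated assumptions: the field names, the two-level `required`/`optional` scheme, and the `check_record` helper are hypothetical, and each company would define its own levels.

```python
# Hypothetical requirement levels for a customer address record.
REQUIRED = "required"
OPTIONAL = "optional"

# Assumed field names; the levels follow the address example in the text.
ADDRESS_RULES = {
    "street": REQUIRED,
    "city": REQUIRED,
    "state_or_province": REQUIRED,
    "zip_extension": OPTIONAL,
}


def check_record(record, rules=ADDRESS_RULES):
    """Return (errors, warnings) listing missing or blank fields.

    Missing required fields are errors; missing optional fields
    are only warnings.
    """
    errors, warnings = [], []
    for field, level in rules.items():
        value = record.get(field)
        if value is None or not str(value).strip():
            if level == REQUIRED:
                errors.append(field)
            else:
                warnings.append(field)
    return errors, warnings


# Invented sample record: the city is blank and the extension is absent.
record = {"street": "12 Main St", "city": "", "state_or_province": "ON"}
errors, warnings = check_record(record)
print("errors:", errors)
print("warnings:", warnings)
```

Running such a check where the data is captured, rather than after loading, is one way to apply the point about ensuring quality at the source.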
The copyright of the article Separation of Warehouse Data in Customer Relations is owned by Duane Sharp. Permission to republish Separation of Warehouse Data in print or online must be granted by the author in writing.