A marketing database may contain extensive tables of customer data from purchasing records, lifestyle data, and more advanced demographic data such as census records.
Not all of this data is required on a regular basis, and whatever is not needed should be filtered out of the query tables. It is not always necessary to mine the contents of an entire table to identify useful information. Potential sources of data that may be useful in a data mining application include:
Data Problems

‘Dirty’ or inaccurate data in the mining data store must be avoided if results are to be accurate and useful. Many data mining tools include a system log or other graphical interface tool to identify erroneous data in queries, but every effort should be made before this stage to ensure that erroneous data never reaches the mining database.

Discovery-driven Systems

In a data mining environment, the data warehouse, query generators, and data interpretation components are combined with discovery-driven systems to provide the capability to automatically reveal important yet hidden data. The following tasks need to be completed to make full use of data mining:
Creating Models

This task makes use of data warehouse contents to generate a model that predicts desired behaviour automatically. Traditional models rely on statistical techniques such as linear and logistic regression, while discovery-driven systems generate models that are accurate and also more comprehensive. For example, in an investment environment, a discovery-driven model can predict the performance of a particular stock.

Analyzing Links

The goal of link analysis is to establish relevant connections between database records. An example is the analysis of items that are usually purchased together, such as a washer and dryer. Such analysis can lead to more effective pricing and selling strategies.

Segmenting Databases

When segmenting databases, collections of records with common characteristics or behaviors are identified. One example is the analysis of sales for a certain time period to detect patterns in customer purchasing behavior. This is an ideal task for a data warehouse.

Detecting Deviations

The fourth and final task, detecting deviations, is the opposite of database segmentation. In this process, the goal is to identify records that vary from the norm or lie outside of any particular cluster of records with similar characteristics.

Modeling Techniques

There are several modeling techniques that make data mining activities more productive:
The creation of a predictive model is facilitated by numerous statistical techniques and by various forms of visualization that make it easier for the user to recognize patterns.

With supervised induction, classification models are created from a set of records referred to as the ‘training set.’ This method makes it possible to move from the descriptors of the training set to general conclusions. The advantage of this technique is that the patterns are based on local phenomena, whereas statistical measures check for conditions that are valid for an entire population.

Association discovery allows for the prediction of the occurrence of some items in a set of records if other items are also present. For example, it is possible to identify the relationship among different medical procedures by analyzing claim forms submitted to an insurance company, enabling a prediction to be made (within a certain margin of error) on a specific treatment protocol.

Sequence discovery aids the data miner by providing information on a customer’s behavior over time, since buying patterns tend to follow a cyclic routine. The detection of such a pattern is especially important to catalog companies, because it helps them to target their potential customer base more effectively with specialized advertising catalogs or promotional flyers.
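As a rough illustration of association discovery, the sketch below counts how often pairs of items appear together in a set of records and reports simple rules of the form "if A is present, B is likely present." The article does not describe a specific algorithm, so this is only one possible approach: the function name `association_rules`, the two thresholds, and the sample purchase records are all assumptions invented for this example.

```python
from collections import Counter
from itertools import combinations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Hypothetical helper for illustration; thresholds are arbitrary defaults.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate items within one record
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all records containing both items
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            # Share of records with the antecedent that also hold the consequent.
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return rules


# Hypothetical purchase records, invented for illustration only.
transactions = [
    ["washer", "dryer", "detergent"],
    ["washer", "dryer"],
    ["washer", "detergent"],
    ["dryer", "washer"],
    ["detergent"],
]

rules = association_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

Support measures how common the pair is across all records, while confidence plays the role of the "margin of error" mentioned above: it estimates how reliably the second item accompanies the first. Only pairs are considered here to keep the sketch short; larger item sets would need a more complete method.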
The copyright of the article Analyzing Customer Databases in Customer Relations is owned by Duane Sharp. Permission to republish Analyzing Customer Databases in print or online must be granted by the author in writing.