
Validating test


A data mining model is reliable if it generates the same type of predictions or finds the same general kinds of patterns regardless of the test data that is supplied. For example, a model generated for a store that used the wrong accounting method would not generalize well to other stores, and therefore would not be reliable.
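One way to make the idea of reliability concrete is to score the same model on several test sets and look at how much the scores differ. The following is a minimal sketch in plain Python, not a feature of Analysis Services; the threshold model, the foot-traffic data, and the helper names `accuracy` and `reliability_report` are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def accuracy(predict, rows):
    """Fraction of (feature, label) rows that predict() labels correctly."""
    return sum(1 for x, y in rows if predict(x) == y) / len(rows)


def reliability_report(predict, test_sets):
    """Score one model on several test sets and summarise the spread."""
    scores = [accuracy(predict, rows) for rows in test_sets]
    return {
        "scores": scores,
        "mean": mean(scores),
        "spread": max(scores) - min(scores),  # large spread = less reliable
    }


# Hypothetical model: predicts "high" sales when foot traffic exceeds 100.
def predict(traffic):
    return "high" if traffic > 100 else "low"


# Hypothetical test data for two stores (foot traffic, actual sales level).
store_a = [(120, "high"), (80, "low"), (150, "high"), (60, "low")]
store_b = [(120, "low"), (80, "low"), (150, "high"), (60, "low")]

report = reliability_report(predict, [store_a, store_b])
print(report)
```

A model that scores well on one store but noticeably worse on another shows a large spread, which suggests that it is accurate for the first store but does not generalize.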


If you identify changes in the data first, it can be very easy to rationalize why these changes should be obvious, even if you would never have thought of them before the experiment.

The following code example uses the derived class TextBox and validates an e-mail address that the user enters. If the e-mail address is not in the standard format (containing "@" and "."), the validation fails, an ErrorProvider icon is displayed, and the event is canceled.

No single comprehensive rule can tell you when a model is good enough, or when you have enough data. Accuracy is a measure of how well the model correlates an outcome with the attributes in the data that has been provided. Particularly in the phase of exploration and development, you might decide to accept a certain amount of error in the data, especially if the data is fairly uniform in its characteristics.
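The e-mail format check mentioned earlier in this section (an address must contain "@" and ".") can be sketched outside Windows Forms. The following is a minimal Python sketch, not the TextBox and ErrorProvider code that the text refers to. The function names `looks_like_email` and `validate_email_field` are hypothetical, and the extra positional rules (something before "@", something between "@" and ".", something after ".") are an assumption about what "standard format" means.

```python
def looks_like_email(address: str) -> bool:
    """Return True if address has text, then "@", then text, ".", and text."""
    at = address.find("@")
    if at < 1:  # no "@" at all, or nothing before it
        return False
    # Look for a "." that leaves at least one character after the "@".
    dot = address.find(".", at + 2)
    # The "." must exist and must not be the last character.
    return dot != -1 and dot < len(address) - 1


def validate_email_field(text: str):
    """Return None when the input is accepted, else an error message.

    The message stands in for the error icon, and a non-None result
    stands in for canceling the event.
    """
    if looks_like_email(text):
        return None
    return 'E-mail address must be in a valid format (for example "name@example.com").'


print(validate_email_field("someone@example.com"))
print(validate_email_field("someone.example.com"))
```

This check is deliberately loose. It rejects obviously malformed input, but it does not confirm that the address exists or follows the full address syntax.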

For example, a model that predicts sales for a particular store based on past sales can be strongly correlated and very accurate, even if that store consistently used the wrong accounting method.

This section introduces some basic concepts related to model quality, and describes the strategies for model validation that are provided in Microsoft Analysis Services.

All of these methods are useful in data mining methodology and are used iteratively as you create, test, and refine models to answer a specific problem.

These metrics do not aim to answer the question of whether the data mining model answers your business question; rather, they provide objective measurements that you can use to assess the reliability of your data for predictive analytics, and to guide your decision of whether to use a particular model or to iterate on the development process.

The topics in this section provide an overview of each method and walk you through the process of measuring the accuracy of models that you build using SQL Server Data Mining.

Therefore, measurements of accuracy must be balanced by assessments of reliability.