<Dd> $$\text{Rejection Region} = \frac{t_{\alpha/2}\,(n-1)}{\sqrt{n}\,\sqrt{n-2+t_{\alpha/2}^{2}}};$$ </Dd> <P> where $t_{\alpha/2}$ is the critical value from the Student t distribution with n − 2 degrees of freedom, n is the sample size, and s is the sample standard deviation. To determine whether a value is an outlier, calculate $\delta = |(X - \operatorname{mean}(X)) / s|$. If δ > Rejection Region, the data point is an outlier. If δ ≤ Rejection Region, the data point is not an outlier. </P> <P> The modified Thompson Tau test finds one outlier at a time: only the point with the largest value of δ is tested, and it is removed if it is an outlier. When a data point is found to be an outlier, it is removed from the data set and the test is applied again with a new average and rejection region. This process continues until no outliers remain in the data set. </P> <P> Some work has also examined outliers for nominal (or categorical) data. In the context of a set of examples (or instances) in a data set, instance hardness measures the probability that an instance will be misclassified, $1 - p(y \mid x)$, where y is the assigned class label and x represents the input attribute values for an instance in the training set t. Ideally, instance hardness would be calculated by summing over the set of all possible hypotheses H: </P>
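<P> The iterative modified Thompson Tau procedure described earlier in this section can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names (<code>t_critical</code>, <code>rejection_region</code>, <code>modified_thompson_tau</code>) and the default α = 0.05 are assumptions made for the example, and the Student t critical value is approximated numerically with the standard library only (in practice a statistics library such as SciPy's <code>scipy.stats.t.ppf</code> would typically be used instead). </P>

```python
import math
import statistics


def _t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def _t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] plus the 0.5 below zero."""
    h = x / steps
    total = _t_pdf(0.0, df) + _t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Two-sided critical value t_{alpha/2} found by bisection (illustrative)."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while _t_cdf(hi, df) < target:  # widen the bracket until it contains the root
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if _t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def rejection_region(n, alpha=0.05):
    """Rejection region for sample size n, with n - 2 degrees of freedom."""
    t = t_critical(alpha, n - 2)
    return t * (n - 1) / (math.sqrt(n) * math.sqrt(n - 2 + t * t))


def modified_thompson_tau(data, alpha=0.05):
    """Remove one outlier at a time until the most extreme point passes the test.

    Returns (kept_values, removed_outliers).
    """
    values = list(data)
    outliers = []
    while len(values) >= 3:  # n - 2 degrees of freedom requires n >= 3
        mean = statistics.mean(values)
        s = statistics.stdev(values)  # sample standard deviation
        if s == 0:
            break  # all values identical, nothing to test
        candidate = max(values, key=lambda x: abs(x - mean))
        delta = abs(candidate - mean) / s
        if delta <= rejection_region(len(values), alpha):
            break  # the most extreme point is not an outlier, so stop
        values.remove(candidate)
        outliers.append(candidate)
    return values, outliers
```

<P> Calling <code>modified_thompson_tau(sample)</code> on a list of numbers returns the retained values and the removed points in the order they were rejected. The mean, standard deviation, and rejection region are recomputed after every removal, as the test requires. </P>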

Which of the following best defines the removal of outliers?