<P> Naive interpretation of statistics derived from data sets that include outliers may be misleading. For example, if one is calculating the average temperature of 10 objects in a room, and nine of them are between 20 and 25 degrees Celsius but an oven is at 175 °C, the median of the data will be between 20 and 25 °C, but the mean temperature will be between 35.5 and 40 °C. In this case, the median better reflects the temperature of a randomly sampled object (but not the temperature in the room) than the mean does; naively interpreting the mean as "a typical sample", equivalent to the median, is incorrect. As this case illustrates, outliers may indicate data points that belong to a different population than the rest of the sample set. </P> <P> Estimators capable of coping with outliers are said to be robust: the median is a robust statistic of central tendency, while the mean is not. However, the mean is generally a more precise estimator. </P> <P> In the case of normally distributed data, the three-sigma rule means that roughly 1 in 22 observations will differ from the mean by twice the standard deviation or more, and 1 in 370 will deviate by three times the standard deviation. In a sample of 1000 observations, the expected number of observations deviating from the mean by more than three times the standard deviation is therefore about 2.7. The presence of up to five such observations is within the range of what can be expected: five is less than twice the expected number and within two standard deviations of it (see Poisson distribution), so it does not indicate an anomaly. If the sample size is only 100, however, just three such outliers are already reason for concern, being more than 11 times the expected number.
</P> <P> In general, if the nature of the population distribution is known a priori, it is possible to test whether the number of outliers deviates significantly from what can be expected: for a given cutoff of a given distribution (so that each sample falls beyond the cutoff with probability p), the number of outliers among n samples will follow a binomial distribution with parameters n and p, which can generally be well approximated by the Poisson distribution with λ = pn. Thus if one takes a normal distribution with a cutoff 3 standard deviations from the mean, p is approximately 0.3%, and for 1000 trials one can approximate the number of samples whose deviation exceeds 3 sigmas by a Poisson distribution with λ = 3. </P>
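<P> The outlier-count test described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive procedure: it assumes normally distributed data with a two-sided cutoff at k standard deviations, and the function names (normal_tail_probability, poisson_tail, binomial_tail) are hypothetical helpers introduced only for this example. </P>

```python
import math


def normal_tail_probability(k):
    """Two-sided probability that a normal observation lies more than
    k standard deviations from the mean (assumes normality)."""
    return math.erfc(k / math.sqrt(2))


def poisson_tail(m, lam):
    """P(X >= m) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(m))
    return 1.0 - below


def binomial_tail(m, n, p):
    """P(X >= m) for X ~ Binomial(n, p), the exact model for the count."""
    below = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m))
    return 1.0 - below


# Cutoff at 3 standard deviations: p is roughly 0.27%.
p = normal_tail_probability(3)

# The two cases from the text: 5 outliers in 1000 samples, 3 in 100.
for n, observed in [(1000, 5), (100, 3)]:
    lam = n * p  # expected number of outliers, the Poisson parameter
    print(
        f"n={n}: expected {lam:.2f}, "
        f"P(at least {observed}) Poisson={poisson_tail(observed, lam):.4f}, "
        f"binomial={binomial_tail(observed, n, p):.4f}"
    )
```

<P> Under these assumptions, seeing five or more 3-sigma outliers in 1000 observations has a probability of about 14%, which is unremarkable, whereas seeing three or more in 100 observations has a probability well under 1%. The exact binomial tail and its Poisson approximation agree closely in both cases. </P>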
