What is the Outlier Formula?

An outlier is a data point in a sample, observation, or distribution that lies outside the overall pattern. A commonly used rule treats a data point as an outlier if it lies more than 1.5 × IQR below the first quartile or above the third quartile, where IQR (the interquartile range) is the difference between the third and first quartiles.

Said differently, low outliers lie below Q1 – 1.5 × IQR, and high outliers lie above Q3 + 1.5 × IQR.

One needs to calculate the median and the quartiles Q1 and Q3, and from them the IQR.

The outlier formula is represented as follows,

The Formula for Q1 = ¼ (n + 1)th term

The Formula for Q3 = ¾ (n + 1)th term

The Formula for IQR = Q3 – Q1
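The positional formulas above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the helper names `term_at` and `quartiles` are hypothetical, and resolving a fractional position by linear interpolation between the neighbouring terms is an assumption (for a .5 position it reduces to the simple average of the two terms).

```python
import math


def term_at(sorted_values, position):
    """Return the value at a 1-based, possibly fractional, position.

    Assumption: a fractional position such as 2.5 is resolved by linear
    interpolation between the two neighbouring terms.
    """
    lower = math.floor(position)
    frac = position - lower
    lower_value = sorted_values[lower - 1]  # convert 1-based to 0-based
    if frac == 0:
        return lower_value
    upper_value = sorted_values[lower]
    return lower_value + frac * (upper_value - lower_value)


def quartiles(values):
    """Return (Q1, Q2, Q3) using the (n + 1)-based positions."""
    data = sorted(values)
    n = len(data)
    if n < 3:
        # For n < 3 the Q1 and Q3 positions fall outside the data.
        raise ValueError("need at least 3 observations")
    q1 = term_at(data, (n + 1) / 4)
    q2 = term_at(data, (n + 1) / 2)
    q3 = term_at(data, 3 * (n + 1) / 4)
    return q1, q2, q3
```

For the nine values 10, 2, 4, 7, 8, 5, 11, 3, 12, `quartiles` returns 3.5, 7, and 10.5.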


Step by Step Calculation of Outlier

Example

Consider a data set of the following numbers: 10, 2, 4, 7, 8, 5, 11, 3, 12. You are required to calculate all the Outliers.

  • Step 1: First, calculate the quartiles Q1 and Q3 and the interquartile range, IQR = Q3 – Q1.
  • Step 2: Now calculate the value IQR × 1.5.
  • Step 3: Subtract the value calculated in Step 2 from Q1.
  • Step 4: Add the value calculated in Step 2 to Q3.
  • Step 5: Create the range from the values calculated in Step 3 and Step 4.
  • Step 6: Arrange the data in ascending order.
  • Step 7: Check whether any values lie below or above the range created in Step 5.
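The steps above can be sketched as follows. The function names `iqr_fences` and `find_outliers` and the multiplier parameter `k` are hypothetical, introduced only for illustration. The sketch relies on Python's standard `statistics.quantiles` with `method="exclusive"`, which places the quartiles at the (n + 1)-based positions and interpolates between neighbouring terms, and it takes the interquartile range as Q3 - Q1.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (lower bound, upper bound) for the k * IQR rule."""
    # quantiles() sorts the data internally and returns [Q1, Q2, Q3].
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the observations lying outside the fences, in ascending order."""
    low, high = iqr_fences(values, k)
    return [x for x in sorted(values) if x < low or x > high]


# Hypothetical data, chosen only to show a flagged value.
print(find_outliers([2, 3, 4, 5, 7, 8, 10, 11, 12, 40]))  # prints [40]
```

Keeping the fences in their own function makes it easy to inspect the range before deciding what to do with the flagged values.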

Solution:

First, we need to arrange data in ascending order to find the median.

2, 3, 4, 5, 7, 8, 10, 11, 12

Since the number of observations is odd, which is 9, the median would lie in the 5th position, which is 7, and the same will be Q2 for this example.

Therefore, the calculation of Q1 is as follows –

Q1 = ¼ (9 + 1)

= ¼ (10)

Q1 will be – 

Q1 = 2.5th term

It means that Q1 is the average of the 2nd and 3rd position of the observations, which is 3 and 4 here, and an average of the same is (3+4)/2 = 3.5.

Therefore, the calculation of Q3 is as follows –

Q3 = ¾ (9 + 1)

= ¾ (10)

Q3 will be – 

Q3 = 7.5th term

It means that Q3 is the average of the 7th and 8th position of the observations, which is 10 and 11 here, and an average of the same is (10+11)/2 = 10.5.

Low outliers lie below Q1 – 1.5 × IQR, and high outliers lie above Q3 + 1.5 × IQR. Here, IQR = Q3 – Q1 = 10.5 – 3.5 = 7.

So, the lower bound is 3.5 – (1.5 × 7) = -7 and the upper bound is 10.5 + (1.5 × 7) = 21.

Since no observations lie below -7 or above 21, we don’t have any outliers in this sample.

Example of Outlier Formula in Excel (with Excel Template)

Below is given data to calculate the outlier.

The number of observations here is 25, and our first step is to arrange the raw data in ascending order.

Median will be –

The median value = ½ (n+1)th term

= ½ (25+1)

= ½ (26)

= 13th term

The Q2, or median, is 68.00; 50% of the observations lie below it.

Q1 will be –

Q1 = ¼ (n+1)th term

= ¼ (25+1)

= ¼ (26)

= 6.5th term, which is equivalent to 7th term

The Q1 is 56.00, which marks the bottom 25%

Q3 will be –

Finally, Q3 = ¾ (n+1)th term

= ¾ (26)

= 19.50 term

Here the average needs to be taken, which is of 19th and 20th terms, which are 77 and 77, and the average of same is (77+77)/2 = 77.00

The Q3 is 77, which marks the top 25%

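Now, low outliers lie below Q1 – 1.5 × IQR, and high outliers lie above Q3 + 1.5 × IQR. Here, IQR = Q3 – Q1 = 77 – 56 = 21.

Low Range –

56 – (1.5 × 21) = 24.5

High Range –

77 + (1.5 × 21) = 108.5

Any observation below 24.5 or above 108.5 is an outlier; if all 25 observations fall within this range, there are no outliers.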

Relevance and Uses

The outlier formula is important to know because a single extreme value can skew the data. Take the observations 2, 4, 6, and 101. The average of these values is 28.25, but 75% of the observations lie below 7. Hence, relying on the average alone would lead to an incorrect decision about the observations of this sample.

One can notice here that 101 appears to be an outlier, and if it is removed, the average would be 4, which reflects that the remaining observations lie around 4. Hence, it is very important to conduct this calculation to avoid drawing misleading information from the data. Statisticians around the world widely use it when conducting research.
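The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are hypothetical, and the value 101 is dropped by hand here rather than detected by any rule.

```python
import statistics

observations = [2, 4, 6, 101]

# Average of all four observations, pulled upward by 101.
mean_with_outlier = statistics.mean(observations)

# Share of observations that lie below 7.
share_below_7 = sum(x < 7 for x in observations) / len(observations)

# Average after removing the suspected outlier by hand.
without_101 = [x for x in observations if x != 101]
mean_without_outlier = statistics.mean(without_101)

print(mean_with_outlier)     # 28.25
print(share_below_7)         # 0.75
print(mean_without_outlier)  # 4
```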

This article is a guide to the Outlier Formula. Here, we discuss a step-by-step calculation of outliers and some practical examples in Excel.