18. Analysis of errors (3) – Deviation and Standard Deviation

Our last post was all about studying different parameters and formulas. In this post, let us look closely at two of the most important of those parameters: deviation and standard deviation.

A] Deviation – Deviation is a simple concept to understand. It tells us how much an individual value in a given set deviates from the average value. Consider the following –

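[Figure 181: heights of the five girls, in feet – Girl 1: 4, Girl 2: 5, Girl 3: 4, Girl 4: 6, Girl 5: 5]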

Heights of five girls are mentioned above.

Mean = (4+5+4+6+5)/5 = 24/5 = 4.8
∴ The average height is 4.8 feet, so the normal height for this group is around 4.8 feet. How much does Girl 4 deviate from the normal?
Deviation = 6 − 4.8 = 1.2 feet. So Girl 4 is 1.2 feet taller than the average girl in this group. Thus, deviation tells us how far a value lies from the average value of the set.
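The same arithmetic can be sketched in a few lines of Python. This is only an illustration; the variable names (`heights`, `mean_height`, `deviations`) are my own and the heights are the five values used in the mean above.

```python
# Heights in feet, in the order Girl 1 .. Girl 5 (values from the example).
heights = [4, 5, 4, 6, 5]

# Mean = sum of the values divided by how many values there are.
mean_height = sum(heights) / len(heights)

# Deviation of each value = value - mean.
# A positive deviation means "taller than average", a negative one "shorter".
deviations = [round(h - mean_height, 1) for h in heights]

print(mean_height)  # 4.8
print(deviations)   # [-0.8, 0.2, -0.8, 1.2, 0.2]
```

Girl 4 shows up as +1.2, and the girls who are 4 feet tall show up as −0.8, i.e. below the average.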

Don't we use this concept in our daily lives? Consider a student scoring 89% in his exams. Wow! 89% is a very good score! But what if we later come to know that the majority of the students in that class scored 98%? Suddenly 89% does not seem like such a good score. Why? Because it deviates from the average score in the negative direction!

Conversely, consider a student scoring 67%. This is not a very good score, right? Later we find out that the average score in the class was 40% – suddenly this student seems to be doing pretty well for himself! See, studying the mean and deviations is important!

B] Standard deviation – This is a very useful concept, though slightly harder to understand than deviation. As mentioned earlier, standard deviation is the ‘mean of the mean’: roughly, the typical size of the deviations of the values from their mean. To understand this concept, we need to revise what we learnt earlier about the NORMAL DISTRIBUTION CURVE in post 14.

Standard deviation is a quantity which tells us how closely our values are clustered around the mean, or how widely they are dispersed from it. If most values are near the mean, we get a tall and steep bell-shaped curve. If the values are spread out, we get a flatter, more spread-out curve, as follows –

 

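[Figure 182: a tall, steep bell curve (values clustered near the mean) next to a flatter, more spread-out one]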
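To put rough numbers on "tall" versus "spread out", here is a small sketch using Python's standard `statistics.NormalDist`. The two standard deviations (1 and 3) are made-up illustrative values, not taken from the figure.

```python
from statistics import NormalDist

# Two normal curves with the same mean but different standard deviations.
# sigma = 1 and sigma = 3 are arbitrary values chosen for illustration.
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=3)

# Height of each bell curve at its peak (the peak sits at the mean).
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

print(round(peak_narrow, 3))  # 0.399
print(round(peak_wide, 3))    # 0.133
```

The curve with the smaller standard deviation is three times taller at its peak. The total area under both curves is the same, so a taller curve has to be narrower.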

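We have already studied the formulas for standard deviation (s/σ) in an earlier post:
$s = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N-1}}$ (for a sample, with sample mean $\bar{x}$) and $\sigma = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}}$ (for a whole population, with population mean $\mu$), where $N$ is the number of values.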

 

Consider three sets of values –

SET 1 ⇒ {0, 13, 15, 0}   Mean = 7   Standard Deviation = 8.124 ≈ 8.1
SET 2 ⇒ {0, 8, 15, 5}    Mean = 7   Standard Deviation = 6.272 ≈ 6.3
SET 3 ⇒ {6, 6, 8, 8}     Mean = 7   Standard Deviation = 1.15  ≈ 1.2

(Note – I have simply plugged the respective values into the formula for standard deviation (s) above.)
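As a check, here is the calculation for SET 1 written out. It assumes the sample formula for s, i.e. the one with N − 1 in the denominator, since that is the version that reproduces the value 8.124 quoted above.

```latex
\bar{x} = \frac{0 + 13 + 15 + 0}{4} = 7

s = \sqrt{\frac{(0-7)^2 + (13-7)^2 + (15-7)^2 + (0-7)^2}{4-1}}
  = \sqrt{\frac{49 + 36 + 64 + 49}{3}}
  = \sqrt{66} \approx 8.124
```

The other two sets work the same way: the sums of squared deviations are 118 for SET 2 and 4 for SET 3.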

If we take a look at the values above, we see that though the mean for all three sets is the same, the standard deviation values differ. So the mean tells us where the centre of the set of values is, and the standard deviation tells us how the values in the set are spread around the mean. In the first set, all the values are far away from the mean value (7), so the standard deviation is large. In the next set, the values are slightly closer to 7, so the standard deviation is slightly smaller. In the third set, all the values are close to the mean, so the standard deviation is very small (just 1.15!). We should expect a tall bell-shaped curve for SET 3 (low standard deviation) and a spread-out one (high standard deviation) for SET 1.
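The three sets can also be checked with Python's standard `statistics` module. This is a minimal sketch; `stdev` is the sample standard deviation (N − 1 in the denominator), which I am assuming is the formula behind the values quoted above because it reproduces them. The names `sets` and `results` are my own.

```python
from statistics import mean, stdev

# The three sets of values from the example.
sets = {
    "SET 1": [0, 13, 15, 0],
    "SET 2": [0, 8, 15, 5],
    "SET 3": [6, 6, 8, 8],
}

# For each set, store (mean, sample standard deviation).
results = {name: (mean(values), stdev(values)) for name, values in sets.items()}

for name, (m, s) in results.items():
    print(f"{name}: mean = {m:.1f}, s = {s:.3f}")
# SET 1: mean = 7.0, s = 8.124
# SET 2: mean = 7.0, s = 6.272
# SET 3: mean = 7.0, s = 1.155
```

Same centre, very different spread: that is exactly the information the mean alone cannot give us.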

To understand this better, see the curve below –

 

[Figure 183: Gaussian curve]

 

 

In the above curve –

The purple region is the area within one standard deviation of the mean (μ ± 1σ). This is where most values in the set lie, so it shows where the majority of the population lies. These values deviate only a little from the mean.

The green region is the area between one and two standard deviations away from the mean (out to μ ± 2σ). This is where values which are somewhat farther from the mean lie. For a normal distribution curve, this area is smaller than the purple area.

The blue region is the area between two and three standard deviations away from the mean (out to μ ± 3σ). The values in this area are very far from the mean, so this region represents extreme conditions, which are very rare. Thus, the area of this region is very small (as such values occur very infrequently).

Area under the purple region ≈ 68%, i.e. about 68% of the values in the given set lie in this region. These values are considered to be normal/average values.
Area under the purple region + area under the green region ≈ 95%, i.e. about 95% of the values in the given set lie within these two regions.
Area under the purple region + area under the green region + area under the blue region ≈ 99.7%, i.e. almost all the values in the given set lie within these three regions. The remaining roughly 5% of the area outside the purple and green regions, most of it in the blue region, belongs to values which depart strongly from the normal.
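These percentages can be checked numerically for an ideal normal distribution. The sketch below uses Python's standard `statistics.NormalDist`; the helper name `fraction_within` is my own.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist()

def fraction_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * fraction_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

Real data sets only follow these numbers approximately, and only when they are close to normally distributed.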

Do you know how the discovery of the Higgs boson was presented at CERN? It was as follows –

“We have observed a new boson with a mass of 125.3 ± 0.6 GeV at 4.9σ significance!”
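To get a feel for how far out 4.9σ is, we can ask how much of an ideal normal curve lies more than 4.9 standard deviations above the mean. This is only a rough illustration under the assumption of a perfect normal distribution; it is not a description of how the experiments actually computed their significance.

```python
from statistics import NormalDist

sigma_level = 4.9  # the significance quoted in the announcement

# By symmetry, the area above +4.9 sigma equals the area below -4.9 sigma.
one_sided_tail = NormalDist().cdf(-sigma_level)

print(f"{one_sided_tail:.1e}")  # about 4.8e-07
```

That is roughly one chance in two million, which is why a result this many standard deviations from the expected value is taken so seriously.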

This is how the concept of standard deviation is used in the scientific world. I hope this post has made the concept clear. In my next post, we start discussing the concept of significant numbers.

Be a perpetual student of life and keep learning …

Good day !

 

 

