First-hand tools that give first-hand information about the data:
- Central tendency of data (Mean, median, mode, geometric mean, harmonic mean etc.)
- Variation in data (variance, standard deviation, standard error, mean deviation etc.)
Central tendency
Gives an idea about the mean value of the data: around what value are the data clustered?
Data: x1, x2, ..., xn
x : data vector
Arithmetic mean:
mean(x)
Geometric mean:
prod(x)^(1/length(x))
(length(x) is equal to the number of elements in x)
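The two commands above can be tried on a small vector. This is a minimal sketch: the values in x are made up for illustration, and the log-scale form in the last step is an equivalent way of writing the geometric mean for positive data, not something taken from the notes.

```r
# Hypothetical data vector, chosen only for illustration
x <- c(2, 4, 8)

# Arithmetic mean: (2 + 4 + 8) / 3
am <- mean(x)

# Geometric mean: (2 * 4 * 8)^(1/3)
gm <- prod(x)^(1/length(x))

# Equivalent log-scale form (positive data only);
# it avoids overflow in prod(x) when x is long
gm_log <- exp(mean(log(x)))

print(am)
print(gm)
print(gm_log)
```

For positive data the geometric mean never exceeds the arithmetic mean, so gm should print a smaller value than am here.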
Median :-
Value such that the number of observations above it is equal to the number of observations below it.
median(x)
Example :-
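A small illustration with made-up numbers (the vectors below are assumptions, not data from the notes). With an odd number of observations the median is the middle value after sorting; with an even number, median() returns the average of the two middle values.

```r
# Hypothetical vectors, chosen only for illustration
x_odd  <- c(7, 1, 5)      # sorted: 1, 5, 7
x_even <- c(7, 1, 5, 3)   # sorted: 1, 3, 5, 7

# Middle value of the sorted data
med_odd <- median(x_odd)

# Average of the two middle values, (3 + 5) / 2
med_even <- median(x_even)

print(med_odd)   # 5
print(med_even)  # 4
```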
Variability
Spread and scatter of the data around any point, preferably the mean value.
Data set 1: 360, 370, 380
mean = (360 + 370 + 380) /3 = 370
Data set 2: 10, 100, 1000
mean = (10 + 100 + 1000) /3 = 370
How to differentiate between the two data sets?
x : data vector
Variance:
var(x)
Standard deviation: positive square root of the variance
sqrt(var(x))
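Applying these commands to Data set 1 and Data set 2 shows how the two can be told apart even though their means agree. This is a sketch; the variable names d1 and d2 are chosen here for illustration.

```r
# The two data sets from the notes
d1 <- c(360, 370, 380)
d2 <- c(10, 100, 1000)

# Both means are 370
print(mean(d1))
print(mean(d2))

# Variances differ sharply: 100 for d1, 299700 for d2
v1 <- var(d1)
v2 <- var(d2)
print(v1)
print(v2)

# Standard deviations: 10 for d1, about 547.4 for d2
s1 <- sqrt(v1)
s2 <- sqrt(v2)
print(s1)
print(s2)
```

R also provides sd(x), which returns the same value as sqrt(var(x)).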
Another variant of the variance:
var(x) uses the divisor n - 1. If we want the divisor to be n, then use
((n - 1)/n) * var(x)
where n = length(x)
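As a quick check, the rescaled value can be compared with the variance computed directly with divisor n. This is a sketch reusing Data set 1; the names var_n and var_direct are chosen here for illustration.

```r
x <- c(360, 370, 380)
n <- length(x)

# Rescale R's var(), which divides by n - 1, so that the divisor becomes n
var_n <- ((n - 1) / n) * var(x)

# Direct computation: average squared deviation from the mean
var_direct <- mean((x - mean(x))^2)

print(var_n)
print(var_direct)
```

Both lines should print the same value, 200/3.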
Range:
maximum(x1, x2, ..., xn) - minimum(x1, x2, ..., xn)
max(x) - min(x)
Interquartile range:
Third quartile(x1, x2, ..., xn) - First quartile(x1, x2, ..., xn)
IQR(x)
Quartile deviation:
[Third quartile(x1, x2, ..., xn) - First quartile(x1, x2, ..., xn)]/2
= Interquartile range/2
IQR(x)/2
Example :-
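A small illustration with made-up numbers (the vector below is an assumption, not data from the notes). Note that IQR() relies on R's default quantile definition, so other quartile conventions may give slightly different values on other data.

```r
# Hypothetical data vector, chosen only for illustration
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Range: largest minus smallest observation
rng <- max(x) - min(x)

# Interquartile range: third quartile minus first quartile
iqr <- IQR(x)

# Quartile deviation: half of the interquartile range
qd <- IQR(x) / 2

print(rng)  # 8
print(iqr)  # 4
print(qd)   # 2
```

R's own range(x) returns the pair (minimum, maximum) rather than their difference, which is why the notes compute max(x) - min(x).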