Descriptive Statistics
Once data have been collected, we need to summarize the information they contain. This purpose is served to some extent by classifying the data in the form of a frequency distribution and by using various graphs. When the data relate to a variable, the process of summarization can be taken a long step further by using certain descriptive measures. The aim is to focus on certain features of the data which describe their nature in a general way. The two most important features are central tendency and dispersion.
Measures of Central Tendency
Quite often the data show a tendency, notwithstanding their variability, to cluster around a central value. In such a case, it is legitimate to use a single value, the central value, to represent the whole set of figures. Such a representative or typical value of a variable is called a measure of central tendency.
Mean
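The mean is obtained by dividing the sum of the given values by their number. The mean of X is given by
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ (for non-frequency data)
$\bar{X} = \frac{1}{n}\sum_{i=1}^{k} X_i f_i$, where $n = \sum_{i=1}^{k} f_i$ (for ungrouped frequency data, in which the distinct value $X_i$ occurs $f_i$ times)
For grouped frequency data, the same formula is applied with $X_i$ taken as the mid-value of the i-th class, which gives an approximate mean.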
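The three cases of the mean can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function names and the sample figures are assumptions made up for the example.

```python
def mean_raw(values):
    """Mean of non-frequency data: sum of the values divided by their number."""
    return sum(values) / len(values)


def mean_frequency(values, freqs):
    """Mean of ungrouped frequency data: sum of X_i * f_i divided by n = sum of f_i."""
    n = sum(freqs)
    return sum(x * f for x, f in zip(values, freqs)) / n


def mean_grouped(classes, freqs):
    """Approximate mean of grouped data, using each class mid-value as X_i.

    `classes` is a list of (lower boundary, upper boundary) pairs.
    """
    mids = [(lower + upper) / 2 for lower, upper in classes]
    return mean_frequency(mids, freqs)


# Illustrative (made-up) data
print(mean_raw([2, 4, 4, 5, 10]))                               # 25 / 5 = 5.0
print(mean_frequency([1, 2, 3], [2, 5, 3]))                     # 21 / 10
print(mean_grouped([(0, 10), (10, 20), (20, 30)], [2, 5, 3]))   # 160 / 10 = 16.0
```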
Median
If the given values of X are arranged in an increasing or decreasing order of magnitude, the middlemost value in the arrangement is called the median. The median may alternatively be defined as a value of X such that half of the given values of X are smaller than or equal to it and half are greater than or equal to it.
When n is odd, the middlemost value is the (n+1)/2-th value. When n is even, any number between the n/2-th and (n/2+1)-th values of X in the arrangement satisfies the definition. However, for definiteness, the arithmetic mean of the n/2-th and (n/2+1)-th values is accepted as the median of X.
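The odd and even cases can be sketched in Python as below. The function name `median_raw` and the sample values are assumptions chosen for illustration; note that Python lists are indexed from 0, so the (n+1)/2-th value in the text sits at index n // 2 when n is odd.

```python
def median_raw(values):
    """Median of non-frequency data."""
    ordered = sorted(values)          # arrange in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the (n+1)/2-th value (1-based) is at 0-based index n // 2
        return ordered[mid]
    # n even: average of the n/2-th and (n/2 + 1)-th values (1-based)
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_raw([7, 1, 5]))      # sorted: 1, 5, 7 -> 5
print(median_raw([7, 1, 5, 3]))   # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```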
For grouped frequency data, the formula for the median is as follows:
$M_i = X_l + \frac{n/2 - n_l}{f_0} \times c$
$X_l$ denotes the lower class boundary of the class containing the median, and $n_l$ is the cumulative frequency corresponding to $X_l$, i.e. the total frequency of all classes below the median class.
$f_0$ = the frequency of the class interval containing the median.
$c$ = width of the class interval containing the median.
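A hedged Python sketch of the grouped-median formula follows. It assumes the median class is the first class whose cumulative frequency reaches n/2, and it assumes classes are given as (lower boundary, upper boundary) pairs in increasing order. The function name and the data are hypothetical.

```python
def median_grouped(classes, freqs):
    """Median of grouped data: X_l + (n/2 - n_l) / f_0 * c."""
    n = sum(freqs)
    cumulative = 0                      # n_l: total frequency below the current class
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and cumulative + f >= n / 2:
            width = upper - lower       # c: width of the median class
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one observation")


# Illustrative (made-up) frequency table: n = 30, so n/2 = 15
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]
# Cumulative frequencies are 5, 13, 25, 30, so the median class is (20, 30)
# with X_l = 20, n_l = 13, f_0 = 12, c = 10.
print(median_grouped(classes, freqs))   # 20 + (15 - 13) / 12 * 10, about 21.67
```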
Mode
The mode of a variable is the value of the variable having the highest frequency.
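For non-frequency data, the mode can be found by counting occurrences, as in the sketch below. Returning every value that ties for the highest frequency is an assumption of this example, since the definition above does not say how ties are handled. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([4, 1, 4, 2]))       # 4 occurs twice -> [4]
print(modes([1, 2, 2, 3, 3]))    # 2 and 3 both occur twice -> [2, 3]
```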
For grouped frequency data
$\text{Mode} = X_l + \frac{f_0 - f_{-1}}{2f_0 - f_{-1} - f_1} \times c$
$X_l$ denotes the lower class boundary of the modal class, i.e. the class with the highest frequency.
$f_0$ denotes the highest frequency, i.e. the frequency of the modal class.
$f_{-1}$ denotes the frequency of the class preceding the modal class.
$f_1$ denotes the frequency of the class succeeding the modal class.
$c$ = width of the modal class.
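The grouped-mode formula can be sketched in Python as below. Several choices here are assumptions of the example rather than part of the text: a missing neighbour (when the modal class is first or last) is given frequency 0, the first class with the highest frequency is taken as the modal class, and a zero denominator raises an error. Names and data are hypothetical.

```python
def mode_grouped(classes, freqs):
    """Mode of grouped data: X_l + (f_0 - f_prev) / (2*f_0 - f_prev - f_next) * c."""
    i = freqs.index(max(freqs))                       # first class with the highest frequency
    lower, upper = classes[i]
    f0 = freqs[i]
    f_prev = freqs[i - 1] if i > 0 else 0             # assumption: 0 if no preceding class
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0  # assumption: 0 if no succeeding class
    denominator = 2 * f0 - f_prev - f_next
    if denominator == 0:
        raise ValueError("formula undefined: modal class does not stand out from its neighbours")
    return lower + (f0 - f_prev) / denominator * (upper - lower)


# Illustrative (made-up) frequency table; the modal class is (20, 30)
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 8, 12, 5]
print(mode_grouped(classes, freqs))   # 20 + (12 - 8) / (24 - 8 - 5) * 10, about 23.64
```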
Measures of Dispersion
To give a proper idea of the overall nature of the given values of a variable, it is necessary, besides mentioning the average value, to state how scattered or dispersed the values are about that average.
Range
The simplest measure is range. It is defined as the difference between the highest and the lowest given values.
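As a small illustration in Python (the function name and the numbers are made up for the example):

```python
def value_range(values):
    """Range: difference between the highest and the lowest given values."""
    return max(values) - min(values)


print(value_range([12, 7, 3, 18, 9]))   # 18 - 3 = 15
```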
Variance and Standard Deviation
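The variance of X is defined as follows:
$\mathrm{Var}(X) = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$ (for non-frequency data)
$\mathrm{Var}(X) = \frac{1}{n}\sum_{i=1}^{k} (X_i - \bar{X})^2 f_i$, where $n = \sum_{i=1}^{k} f_i$ (for frequency data)
The standard deviation is the positive square root of the variance:
$\mathrm{SD}(X) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2}$ (for non-frequency data)
$\mathrm{SD}(X) = \sqrt{\frac{1}{n}\sum_{i=1}^{k} (X_i - \bar{X})^2 f_i}$ (for frequency data)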
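Both forms of the variance and standard deviation can be sketched with one Python function, treating non-frequency data as the special case where every frequency is 1. The function names and data are assumptions for illustration. The divisor is n, as in the formulas above; this matches `statistics.pvariance` in the Python standard library, whereas `statistics.variance` divides by n - 1.

```python
import math


def variance(values, freqs=None):
    """Variance with divisor n; `freqs` is optional (defaults to 1 per value)."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n


def std_dev(values, freqs=None):
    """Standard deviation: positive square root of the variance."""
    return math.sqrt(variance(values, freqs))


# Illustrative (made-up) data: mean 5, squared deviations sum to 32
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(raw))   # 32 / 8 = 4.0
print(std_dev(raw))    # 2.0

# The same data written as a frequency table gives the same result
print(variance([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))   # 4.0
```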
