Organising Data by Grouping - Stem and Leaf Diagram
The simplest method is a stem and leaf diagram.
It shows the distribution of the data.
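As a rough illustration, a stem and leaf diagram for two-digit whole numbers could be built like this. The data values and the function name stem_and_leaf are made up for the example; the tens digit is taken as the stem and the units digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem) and list the units digits (leaves) in order."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 23 -> stem 2, leaf 3
        rows[stem].append(leaf)
    return {stem: rows[stem] for stem in sorted(rows)}

# Made-up data, for illustration only.
data = [23, 31, 25, 47, 38, 21, 34, 36, 42, 29]
diagram = stem_and_leaf(data)

for stem, leaves in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Each printed row is one stem, so the length of a row shows how many values fall in that group.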
Grouped Frequency distributions
These are used for summarising large sets of data.
There are no fixed rules for selecting the groupings (as a guide, use between 5 and 15 groupings, with equal class widths if possible).
You often have to use the upper class boundaries (u.c.b.) and the lower class boundaries (l.c.b.) of the intervals to make sure there are no gaps.
In a histogram the frequency is represented by the area of the bars.
Area is proportional to Frequency.
Frequency density = Frequency / Class width
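A minimal sketch of the frequency density calculation, using class boundaries so there are no gaps. The class boundaries and frequencies below are made up for illustration, and the function name is hypothetical.

```python
def frequency_density(frequency, lower_boundary, upper_boundary):
    """Frequency density = frequency / class width, with width taken from the class boundaries."""
    class_width = upper_boundary - lower_boundary
    return frequency / class_width

# (lower class boundary, upper class boundary, frequency) - made-up values.
classes = [(0, 10, 5), (10, 15, 10), (15, 20, 12), (20, 40, 8)]
densities = [frequency_density(f, lo, hi) for lo, hi, f in classes]

for (lo, hi, f), d in zip(classes, densities):
    print(f"{lo}-{hi}: frequency {f}, frequency density {d}")
```

Multiplying each density by its class width gives back the frequency, which is why the area of each bar is proportional to the frequency.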
Extra Histogram Questions
A histogram represents the age group distribution of people buying a magazine in a newsagent. There were 15 people aged 15-19 and the height and width of the rectangle are 8cm and 1cm respectively.
a) If there were 20 people aged 35-49, what is the height of the rectangle representing this group?
Sketch the 15-19 rectangle and label it 8 cm high and 1 cm wide. Its area of 8 cm squared represents 15 people,
so 1 cm squared represents 15/8 people and 1 person is represented by 8/15 cm squared. Also, 1 cm of width represents a class width of 5.
The 35-49 class has a class width of 15, so its width = 3 cm. Area = 8/15 x 20 = 32/3 cm squared, so 3h = 32/3 and h = 32/9 = 3 5/9 cm.
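The working for part a) can be checked with exact fractions. This sketch assumes, as the working does, that the 15-19 class has a class width of 5 and the 35-49 class has a class width of 15; the variable names are just for illustration.

```python
from fractions import Fraction

# The 15-19 bar is 8 cm high and 1 cm wide and represents 15 people.
area_per_person = Fraction(8 * 1, 15)   # cm squared per person
cm_per_unit_width = Fraction(1, 5)      # 1 cm of width covers a class width of 5

# The 35-49 class: class width 15 (assumed, as in the working), 20 people.
width = 15 * cm_per_unit_width          # width of the bar in cm
area = area_per_person * 20             # area of the bar in cm squared
height = area / width                   # height of the bar in cm

print(width, area, height)
```

The height comes out as 32/9 cm, which is 3 and 5/9 cm.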
Organising Data by Ordering - Median and Quartiles
In an ordered set of data the median is the middle value and it is an average.
The range is a measure of spread and is the highest value - the lowest value.
The interquartile range is the upper quartile - the lower quartile (Q3 - Q1) and measures the spread of the middle 50% of the data (it ignores extreme values or anomalies).
For n values (a list or ungrouped frequency table):
To find Q2
Find (n+1)/2 to find the position of Q2 in the data.
To find Q1
Find n/4.
If it is a decimal, then round up to the next whole number to get the position of Q1 in the data.
If it is a whole number, then Q1 is the mean of the value at this position and the next value in the data.
To find Q3
Find 3n/4.
Then proceed as for Q1.
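The rules for Q1, Q2 and Q3 can be sketched as below. The function names and the two data sets are made up for illustration. For Q2 the sketch assumes the usual convention that a position ending in .5 means taking the mean of the values either side.

```python
import math

def value_at(sorted_data, position):
    """Apply the Q1/Q3 rule to a 1-based position such as n/4 or 3n/4."""
    if position == int(position):
        k = int(position)
        # Whole number: mean of the value at this position and the next value.
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Decimal: round up to the next whole-number position.
    return sorted_data[math.ceil(position) - 1]

def quartiles(data):
    """Return (Q1, Q2, Q3) for a list of values."""
    s = sorted(data)
    n = len(s)
    q1 = value_at(s, n / 4)
    q3 = value_at(s, 3 * n / 4)
    # Q2 is at position (n + 1) / 2; a .5 position averages the two middle values.
    pos = (n + 1) / 2
    q2 = (s[math.floor(pos) - 1] + s[math.ceil(pos) - 1]) / 2
    return q1, q2, q3

# Made-up data sets, for illustration only.
even_result = quartiles([2, 4, 5, 7, 8, 10, 11, 13])      # n = 8
odd_result = quartiles([1, 3, 4, 6, 7, 9, 12, 15, 20])    # n = 9
print(even_result, odd_result)
```

With n = 8, n/4 = 2 is a whole number, so Q1 is the mean of the 2nd and 3rd values. With n = 9, n/4 = 2.25 rounds up, so Q1 is the 3rd value.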
Deciles and Percentiles
These are found in a similar way to Q1 and Q3.
Deciles:
3rd decile: D3= (3n/10)
Percentile:
92nd percentile: P92= (92n/100)
Then proceed as with Q1 and Q3.
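A rough sketch of the same rule applied to deciles and percentiles. The function name quantile and the data are made up for illustration; k is the decile or percentile number and parts is 10 or 100. It is only meant for k smaller than parts.

```python
def quantile(data, k, parts):
    """Value at position k*n/parts using the round-up / mean-with-next rule (for 0 < k < parts)."""
    s = sorted(data)
    whole, remainder = divmod(k * len(s), parts)
    if remainder == 0:
        # Whole-number position: mean of this value and the next one.
        return (s[whole - 1] + s[whole]) / 2
    # Decimal position: round up, so use 1-based position whole + 1.
    return s[whole]

# Made-up data: 20 values 5, 10, ..., 100.
data = [5 * i for i in range(1, 21)]
d3 = quantile(data, 3, 10)      # position 3n/10 = 6
p92 = quantile(data, 92, 100)   # position 92n/100 = 18.4
print(d3, p92)
```

Here 3n/10 = 6 is a whole number, so D3 is the mean of the 6th and 7th values, while 92n/100 = 18.4 rounds up to the 19th value.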
Box Plots
Two basic comparisons to make when comparing box plots are the interquartile range and the median.
Outliers
Outliers are the extreme high or low values.
There are lots of rules for identifying them.
A common rule is that a value is an outlier if it is:
<Q1 - (1.5(Q3-Q1))
>Q3 + (1.5(Q3-Q1))
It all depends on how big the interquartile range is.
On a box plot diagram an outlier is marked with a cross.
When the highest value is an outlier, the upper whisker ends at the next lowest number that is inside the boundary.
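A minimal sketch of the 1.5 x IQR rule. The data, the quartile values and the function names are made up for illustration; Q1 = 13 and Q3 = 19 are what the n/4 and 3n/4 rules give for this data set.

```python
def outlier_limits(q1, q3):
    """Return the (lower, upper) boundaries Q1 - 1.5(Q3 - Q1) and Q3 + 1.5(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def split_outliers(data, q1, q3):
    """Separate the values outside the boundaries from the values inside them."""
    low, high = outlier_limits(q1, q3)
    outliers = [x for x in data if x < low or x > high]
    inside = [x for x in data if low <= x <= high]
    return outliers, inside

# Made-up data with Q1 = 13 and Q3 = 19.
data = [3, 12, 14, 15, 17, 18, 20, 41]
limits = outlier_limits(13, 19)
outliers, inside = split_outliers(data, 13, 19)

# The whiskers of the box plot end at the extreme values that are inside the boundaries.
whisker_ends = (min(inside), max(inside))
print(limits, outliers, whisker_ends)
```

In this example 3 and 41 would be marked with crosses, and the whiskers would stop at 12 and 20.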
Linear Interpolation
If we are working with data that is already grouped then we do not have the actual values and so can only find an estimate for the median and quartiles.
This method is called linear interpolation.
To find the median:
Work out which class the median is in.
Work out how far into the interval (which is x) the median is and then calculate
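One common way to carry out this estimate is sketched below. It is an assumption, not taken from these notes: it uses the n/2 position for grouped data and the form lower class boundary + (how far into the class / class frequency) x class width. The class boundaries, frequencies and function name are made up for illustration.

```python
def interpolated_median(classes):
    """Estimate the median from (lower boundary, upper boundary, frequency) classes in order."""
    n = sum(freq for _, _, freq in classes)
    target = n / 2          # assumed position of the median for grouped data
    cumulative = 0          # total frequency of the classes before the current one
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            into_class = target - cumulative   # how far into this class the median lies
            return lower + into_class * (upper - lower) / freq
        cumulative += freq
    raise ValueError("no data in any class")

# Made-up grouped data: 4 values in 0-10, 10 values in 10-20, 6 values in 20-30.
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
estimate = interpolated_median(classes)
print(estimate)
```

Here n = 20, so the median is the 10th value, which is 6 values into the 10-20 class of 10 values, giving 10 + (6/10) x 10 = 16.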