16) Working with data Flashcards

(35 cards)

1
Q

Name the different statistical diagrams ?

A
  • Cumulative frequency graphs
  • Histograms
  • Box-and-whisker plot
  • Stem-and-leaf diagrams
  • Frequency polygons
  • Scatter diagrams
2
Q

How do you draw a cumulative frequency graph ?

A
  • Calculate the cumulative frequency of each data range
  • Plot each point at the upper bound of its range on the x axis
  • The x axis is the variable, the y axis is the cumulative frequency
  • Draw a curve joining the points smoothly
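As a rough illustration of the first two steps, the sketch below builds the points to plot from a grouped frequency table. The function name and the data are hypothetical, chosen only for this example.

```python
from itertools import accumulate

def cumulative_frequency_points(upper_bounds, frequencies):
    """Pair each class's upper bound with the running total of frequencies."""
    running_totals = list(accumulate(frequencies))
    return list(zip(upper_bounds, running_totals))

# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40
upper_bounds = [10, 20, 30, 40]
frequencies = [4, 9, 5, 2]

points = cumulative_frequency_points(upper_bounds, frequencies)
print(points)  # [(10, 4), (20, 13), (30, 18), (40, 20)]
```

The last running total equals the total frequency, which is a quick check that the table has been added up correctly.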
3
Q

How do you find the median, LQ and UQ using a cumulative frequency graph ?

A
  • Median : Halve the total cumulative frequency and find the corresponding x value
  • LQ : Divide the total cumulative frequency by 4 and find the corresponding x value
  • UQ : Multiply the total cumulative frequency by 3/4 and find the corresponding x value
4
Q

How do you find a given percentile using a cumulative frequency graph ?

A
  • E.g. 90th percentile : 0.9 × total cumulative frequency, then find the corresponding x value
  • E.g. 45th percentile : 0.45 × total cumulative frequency, then find the corresponding x value
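Reading a percentile off the graph can be approximated numerically. The sketch below assumes the plotted points are joined by straight lines rather than a smooth curve, so its answers will differ slightly from a hand-drawn graph. The helper names and the data are hypothetical.

```python
def read_off(points, target_cf):
    """Return the x value where the cumulative frequency reaches target_cf,
    joining consecutive plotted points with straight lines."""
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1:
            if cf1 == cf0:
                return x0
            return x0 + (target_cf - cf0) * (x1 - x0) / (cf1 - cf0)
    raise ValueError("target is outside the plotted range")

def percentile(points, p):
    """pth percentile: p% of the total cumulative frequency, read across to x."""
    total = points[-1][1]
    return read_off(points, p / 100 * total)

# Hypothetical points (x, cumulative frequency), starting at the lowest bound
points = [(0, 0), (10, 4), (20, 13), (30, 18), (40, 20)]

print(percentile(points, 25))  # lower quartile
print(percentile(points, 50))  # median
print(percentile(points, 75))  # upper quartile
print(percentile(points, 90))  # 90th percentile
```

The median and quartiles are simply the 50th, 25th and 75th percentiles, so one function covers all of them.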
5
Q

How do you find the IQR using a cumulative frequency graph ?

A
  • Find the corresponding UQ and LQ values
  • UQ - LQ = IQR
6
Q

What is the equation for histograms ?

A

Frequency density = frequency ÷ class width
7
Q

How do you draw a histogram diagram ?

A
  • Calculate the frequency density from the data provided
  • The x axis is the x variable groups, the y axis is the frequency density
  • Draw a column for each group : its width is the width of the group and its height is the frequency density
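The first step can be sketched in a few lines, using frequency density = frequency ÷ class width. The function name and the table are hypothetical.

```python
def frequency_densities(class_bounds, frequencies):
    """Frequency density = frequency / class width, for each class."""
    return [
        frequency / (upper - lower)
        for (lower, upper), frequency in zip(class_bounds, frequencies)
    ]

# Hypothetical grouped data with unequal class widths
class_bounds = [(0, 10), (10, 15), (15, 20), (20, 40)]
frequencies = [5, 10, 15, 10]

densities = frequency_densities(class_bounds, frequencies)
print(densities)  # [0.5, 2.0, 3.0, 0.5]
```

Because height × width gives back the frequency, the area of each column represents the frequency of its group.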
8
Q

What is the assumption you make when calculating the mean and standard deviation from a histogram ?

A

Every value in a group is assumed to lie at the centre (midpoint) of that group

9
Q

How do you draw a box-and-whisker plot ?

A
  • Find the median, LQ, UQ, maximum and minimum
  • Mark each value with a vertical line above the scale
  • There is no y axis, the x axis is the scale for the variable x
  • Join the LQ and UQ lines to make a box shape, then draw whiskers from the box to the minimum and maximum
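The five values needed for the plot can be found from raw data. This sketch assumes the "median of each half" convention for the quartiles (the overall median is left out of both halves when the number of values is odd). Other conventions exist and can give slightly different quartiles. The function name and the scores are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Minimum, LQ, median, UQ and maximum, using median-of-halves quartiles."""
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    # Skip the middle value when the number of values is odd
    upper_half = data[half + 1:] if len(data) % 2 else data[half:]
    return {
        "min": data[0],
        "LQ": median(lower_half),
        "median": median(data),
        "UQ": median(upper_half),
        "max": data[-1],
    }

# Hypothetical test scores
scores = [12, 15, 17, 20, 22, 25, 28, 31, 35]
summary = five_number_summary(scores)
print(summary)
```

For these scores the box runs from 16 to 29.5 with the median line at 22, and the whiskers reach 12 and 35.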
10
Q

What does each section of a box-and-whisker plot represent ?

A

25% of the data

11
Q

If a box-and-whisker plot showed the scores students got when doing different tests, what piece of information is required to determine which test was easier ?

A

To determine this you would need to make a direct comparison with the same cohort of students taking both exams

12
Q

What piece of information does a box-and-whisker plot not show ?

A
  • It does not show how many individual values are found in the data
  • Eg. If it shows the scores obtained by students on an exam, it does not show the number of students taking the exam
13
Q

How do you compare two data sets presented on a box-and-whisker plot ?

A
  • Comment on the average : The median of A is greater than the median of B, meaning on average X is greater in A than in B
  • Comment on the spread : The range / IQR of A is greater than that of B, meaning A has a greater spread of X
14
Q

What is one piece of information missing from a box-and-whisker plot ?

A

The number of individual pieces of data in each section
Eg. When showing scores attained on an exam, it doesn’t show the number of students taking the exam

15
Q

How do you draw a stem-and-leaf diagram ?

A
  • Make a key
  • Organise the data in size order
  • Write each value on the diagram as a stem and a leaf, with the leaves in order along each row
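A small sketch of these steps, assuming two-digit whole numbers with the tens digit as the stem and the units digit as the leaf. The function names, the key and the data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem), with units digits (leaves) in order."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)

def render(rows):
    """Return the diagram as text, one row per stem, with a key on top."""
    lines = ["Key: 2 | 3 means 23"]
    for stem in range(min(rows), max(rows) + 1):
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return "\n".join(lines)

# Hypothetical data
values = [23, 31, 25, 47, 38, 31, 29, 44]
rows = stem_and_leaf(values)
print(render(rows))
```

Sorting first means the leaves come out in order, and stems with no values still get an empty row so the shape of the distribution is not distorted.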
16
Q

How do you compare two data sets on a stem-and-leaf diagram ?

A
  • Comment on the average : A is greater on average
  • Comment on the spread : The spread is similar for A and B/ A has a greater spread than B
  • Comment on the shape of the distribution : The distribution of A is more symmetrical
17
Q

What are the advantages of a stem-and-leaf diagram over a box-and-whisker plot ?

A
  • Shows individual data values
  • Shows the shape of the distribution
  • Allows you to easily determine the mean and the mode
18
Q

What are the advantages of a box-and-whisker plot over a stem-and-leaf diagram ?

A
  • Easier to find the median, UQ, LQ
  • Easier to calculate the IQR
  • Allows outliers to be easily identified
  • Allows distribution to be compared between multiple groups / more than two
19
Q

What is the equation for the standard deviation ?
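No answer text is shown for this card. As a hedged reference, a commonly used form is the population standard deviation below, assuming n data values x with mean x̄. Some courses divide by n − 1 instead, so check which convention applies.

```latex
\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}
       = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}
```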

20
Q

How do you comment on the mean and standard deviation of two data sets ?

A
  • A has a greater mean than B so on average X is greater in A
  • A has a greater standard deviation than B so has a greater spread of X than B/ B has a smaller standard deviation so has more consistent X values than A
21
Q

If one data set contains an extreme value, comment on the validity of the comparison between the means of the two data sets ?

A
  • This is not an accurate comparison
  • The mean for data set A is affected / skewed by one extreme or outlier resulting in a low/high mean
22
Q

What may be another reason the standard deviations of two data sets cannot be compared ?

A

Without a common scale, the standard deviations of two data sets cannot be compared (e.g. the SD of points scored on a test cannot be compared with the SD of percentages)

23
Q

What is the equation for the standard deviation using frequency tables ?
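No answer text is shown for this card. As a hedged reference, the population form for a frequency table is given below, assuming each value x (or class midpoint, for grouped data) occurs with frequency f and the mean is x̄ = Σfx / Σf. A course that divides by n − 1 would use a slightly different form.

```latex
\sigma = \sqrt{\frac{\sum f (x - \bar{x})^2}{\sum f}}
       = \sqrt{\frac{\sum f x^2}{\sum f} - \bar{x}^2}
```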

24
Q

What is the equation for the mean using frequency tables ?
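No answer text is shown for this card. As a hedged reference, the usual formula is given below, assuming each value x (or class midpoint, for grouped data) occurs with frequency f.

```latex
\bar{x} = \frac{\sum f x}{\sum f}
```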

25
Q

Explain why the mean and standard deviation calculated from frequency tables are only estimates ?

A

They are only estimates since we have assumed that all the data in each group is at the centre of the group, rather than using the actual data values
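The midpoint assumption can be seen directly in a calculation. This sketch estimates the mean and standard deviation from a grouped table, assuming the population form of the standard deviation (dividing by the total frequency). The function name and the table are hypothetical.

```python
from math import sqrt

def grouped_estimates(class_bounds, frequencies):
    """Estimate the mean and SD by treating every value as its class midpoint."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_bounds]
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    mean_of_squares = sum(f * x * x for f, x in zip(frequencies, midpoints)) / n
    sd = sqrt(mean_of_squares - mean ** 2)
    return mean, sd

# Hypothetical grouped data
class_bounds = [(0, 10), (10, 20), (20, 30)]
frequencies = [2, 5, 3]

mean, sd = grouped_estimates(class_bounds, frequencies)
print(mean, sd)
```

The actual values inside each class never enter the calculation, only the midpoints 5, 15 and 25, which is why the results are estimates.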
26
Q

How do you increase the accuracy of the estimated mean and standard deviation from frequency tables ?

A
  • Use more groups
  • Measure more accurately
27
Q

What do these scatter diagrams show ?
28
Q

What does each r value mean with regard to scatter diagrams ?

A
  • r = -1 : perfect (strongest possible) negative correlation
  • r = 0 : no correlation
  • r = 1 : perfect (strongest possible) positive correlation
29
Q

What can be another interpretation of r = 0 ?

A
  • This does not mean there is no relationship between the two variables, it just means there is no linear relationship
  • This means there may be a non-linear relationship
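The meaning of r = 1, r = -1 and r = 0 can be checked numerically. The sketch below computes the product moment correlation coefficient from its definition. The function name and the data are hypothetical, and the last data set is chosen so that y = x² exactly.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 6, 8, 10]))   # points on a rising straight line
print(pearson_r(xs, [10, 8, 6, 4, 2]))   # points on a falling straight line

# A perfect non-linear relationship: y = x squared, symmetric about x = 0
curve_xs = [-2, -1, 0, 1, 2]
curve_ys = [x * x for x in curve_xs]
print(pearson_r(curve_xs, curve_ys))     # 0.0, despite the exact relationship
```

In the last case y is completely determined by x, yet r is 0 because the relationship is not linear.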
30
Q

Why may the r value be inaccurate for a scatter diagram ?

A

There may be two separate groups within the data
31
Q

Explain why drawing a conclusion from a scatter diagram may be inaccurate when r = ±1 ?

A
  • Correlation does not mean causation
  • The correlation
32
Q

How can you estimate values using a scatter diagram that shows a correlation ?

A

You can draw a line of best fit / a regression line
33
Q

What is interpolation / extrapolation ?

A
  • Interpolation : Using a line of best fit to estimate a value within the data range
  • Extrapolation : Using a line of best fit to estimate a value outside of the data range
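The difference can be illustrated with a least-squares line of best fit, which is one common way to fit the line (a line drawn by eye would give slightly different estimates). The function names and the data are hypothetical.

```python
def line_of_best_fit(xs, ys):
    """Least-squares regression line of y on x: returns (gradient, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept

def estimate(xs, ys, x_new):
    """Estimate y at x_new and say whether it is interpolation or extrapolation."""
    gradient, intercept = line_of_best_fit(xs, ys)
    inside = min(xs) <= x_new <= max(xs)
    kind = "interpolation" if inside else "extrapolation"
    return gradient * x_new + intercept, kind

# Hypothetical paired data, x from 1 to 5
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

print(estimate(xs, ys, 3.5))  # x inside the data range
print(estimate(xs, ys, 10))   # x outside the data range
```

Both estimates come from the same line. The only difference is whether the x value lies inside the range of the data the line was fitted to.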
34
Q

Comment on the validity of interpolation / extrapolation ?

A
  • Interpolation is likely to be valid / accurate
  • Extrapolation is not likely to be accurate / valid, since the relationship may not continue outside the data range
35
Q

What is considered an outlier ?

A
  • More than 1.5 × IQR away from the nearest quartile
  • More than 2 SD from the mean
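Both rules can be applied directly to a data set. In this sketch the quartiles are passed in as given values, and the population standard deviation is assumed for the second rule. The function names and the scores are hypothetical.

```python
from statistics import mean, pstdev

def iqr_outliers(values, lq, uq):
    """Values more than 1.5 x IQR below the LQ or above the UQ."""
    iqr = uq - lq
    low, high = lq - 1.5 * iqr, uq + 1.5 * iqr
    return [v for v in values if v < low or v > high]

def sd_outliers(values):
    """Values more than 2 standard deviations from the mean."""
    centre, sd = mean(values), pstdev(values)
    return [v for v in values if abs(v - centre) > 2 * sd]

# Hypothetical test scores with LQ = 16 and UQ = 29.5 given
scores = [12, 15, 17, 20, 22, 25, 28, 31, 70]

print(iqr_outliers(scores, 16, 29.5))  # [70]
print(sd_outliers(scores))             # [70]
```

The two rules agree here, but they will not always flag the same values, because an extreme value also inflates the mean and standard deviation used by the second rule.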