What are raw scores?
The data as gathered from participants: all the numbers that have not yet been organized, graphed, or cleaned up.
WHY not use raw data?
* Finding a pattern in raw data is difficult
* We want to visualize and summarize the data
* We also need to inspect for outliers and data entry errors
What are the steps to create a frequency distribution table and a grouped frequency table?
Principles to keep in mind for a Grouped Table:
a) determine the full range of the data and include the values that have zero frequency (Full range = Top Value - Bottom Value + 1, e.g. 8 - 3.5 + 1 = 5.5)
b) aim for approx. 5-10 intervals (no fewer than 5, no more than 15)
c) for continuous data, use lower and upper limits (the lowest and highest possible values)
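A minimal sketch of the full-range calculation from principle (a), using the example values 8 and 3.5 given above. The function name `full_range` is hypothetical and only for illustration.

```python
def full_range(lowest, highest):
    """Full range as described in principle (a): top value - bottom value + 1."""
    return highest - lowest + 1

# Example values from the notes: top value 8, bottom value 3.5
print(full_range(3.5, 8))  # 5.5
```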
Frequency Distribution Table
GROUPED FREQUENCY DATA
GROUPED FREQUENCY TABLE (the data initially)
GROUPED FREQUENCY TABLE - for continuous data
HISTOGRAM - for continuous data
What is a PIE CHART?
Used when you want to show proportions of a whole.
What is a BAR GRAPH?
Visual depictions of data when the independent variable is nominal and the dependent variable is a scale variable (interval or ratio).
EX: develop a chart demonstrating the cost of tuition (dep. variable) for 3 types of schools - public, semi-public, & private (indep. variable)
TWO WAYS:
1st way
2nd way
What is a SCATTERPLOT?
Used to depict the relationship between 2 scale variables
ex: amount of abdominal fat & dementia symptoms
What is a HISTOGRAM?
Histogram bar graph
A histogram is a bar graph of data that shows the frequency of each value of a variable. Same info as a frequency table, but visualised differently.
What is the Biased Scale Lie?
What is the Sneaky Sample Lie?
What is an Interpolation Lie?
What is an Extrapolation Lie?
What is an Inaccurate Value Lie?
All of these need to have representative sampling.
What is a normal distribution?
A graph showing the typical bell curve: symmetric, with the peak in the middle, meaning most of the participants' scores were in the middle of the graph.
How do positively skewed distributions and negatively skewed distributions deviate from a normal distribution?
Instead of a ‘normal’ graph with the bell in the middle, a skewed distribution has a tail to one side. It is non-normal and non-symmetrical.
POSITIVE — generally has ‘floor’ effects
NEGATIVE — generally has ‘ceiling’ effects
What is the benefit of creating a visual distribution of data rather than simply looking at a list of the data?
to look at the shape of the distribution
What is a floor effect and how does it affect a distribution?
A situation in which a constraint prevents a variable from taking values below a certain point. It pushes the bulk of the distribution to the LEFT side of the graph, leaving a tail to the right (positive skew).
CALCULATING STATS:
What is 63 out of 1264 as a percentage?
What is 2 out of 88 as a percentage?
What is 7 out of 39 as a percentage?
What is 122 out of 300 as a percentage?
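The four conversions above can be checked with a short sketch. The helper name `percent` is hypothetical, and results are shown to two decimal places.

```python
def percent(part, whole):
    """Convert a count out of a total into a percentage."""
    return part / whole * 100

for part, whole in [(63, 1264), (2, 88), (7, 39), (122, 300)]:
    print(f"{part} out of {whole} = {percent(part, whole):.2f}%")
# 63 out of 1264 = 4.98%
# 2 out of 88 = 2.27%
# 7 out of 39 = 17.95%
# 122 out of 300 = 40.67%
```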
What type of variable (nominal, ordinal, scale) are these data as counts?
What kind of variable are they as percentages?
Report these to only 2 decimal places:
1888.999
2.6454
0.0833
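As a quick check, the three values above can be formatted to two decimal places with Python's standard string formatting. This is only a sketch of the rounding step, not a required method.

```python
values = [1888.999, 2.6454, 0.0833]

# Format each value with exactly two digits after the decimal point
rounded = [f"{v:.2f}" for v in values]
print(rounded)  # ['1889.00', '2.65', '0.08']
```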
On a test of marital satisfaction, scores could range from 0 to 27:
1. What is the full range of data, according to the calculation procedure described in this chapter?
2. What would the interval size be if we wanted six intervals?
3. List the 6 intervals.
If you have data that range from 2 - 68 and you want seven intervals in a grouped frequency table, what would the intervals be?
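The two interval exercises above (scores from 0 to 27 with six intervals, and data from 2 to 68 with seven intervals) can be sketched as follows. The function name `grouped_intervals` is hypothetical. The sketch assumes two conventions that the notes do not spell out: the interval size is the full range divided by the desired number of intervals, rounded up to a whole number, and the first interval starts at the multiple of the interval size at or below the lowest score. With other data these conventions can produce one more interval than requested.

```python
import math

def grouped_intervals(lowest, highest, n_intervals):
    """Return (interval_size, list of (low, high) intervals) for whole-number data."""
    full_range = highest - lowest + 1                 # top - bottom + 1
    size = math.ceil(full_range / n_intervals)        # assumption: round up to a whole number
    start = (lowest // size) * size                   # assumption: start at a multiple of size
    intervals = []
    low = start
    while low <= highest:                             # keep going until the top score is covered
        intervals.append((low, low + size - 1))
        low += size
    return size, intervals

print(grouped_intervals(0, 27, 6))
# (5, [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24), (25, 29)])
print(grouped_intervals(2, 68, 7))
# (10, [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69)])
```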
A grouped frequency table has the following intervals:
30-44
45-59
60-74
If converted into a histogram, what would the midpoints be?
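A small sketch of the midpoint calculation for the three intervals listed above. It assumes the midpoint is taken halfway between an interval's lower bound and the lower bound of the next interval (so 30-44 is treated as running from 30 up to 45). If the listed endpoints were averaged instead, the results would be 37, 52, and 67.

```python
intervals = [(30, 44), (45, 59), (60, 74)]

# Interval size: distance between consecutive lower bounds (15 here)
size = intervals[1][0] - intervals[0][0]

# Assumption: midpoint = lower bound + half the interval size
midpoints = [low + size / 2 for low, _ in intervals]
print(midpoints)  # [37.5, 52.5, 67.5]
```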
Referring to the grouped frequency table (2.6), how many countries had at least 30 volcanoes?
Referring to the histogram (2.1), how many countries had one or two volcanoes?
If the average person convicted of murder killed only 1 person, serial killers would create what kind of skew?
Would the data for number of murders by those convicted of the crime be an example of a floor effect or a ceiling effect?
A researcher collects data on the ages of university students. As you have probably observed, the distribution of ages clusters around 19 - 22 yrs, but there are extremes on both the low end (high school prodigies) and the high end (non-traditional students returning to school):
If you have an Instagram account, you are allowed to follow up to 7500 other accounts. At that point, Instagram cuts you off, and you have to unfollow people to add more. Imagine you collected data from Instagram users at your university about the number of accounts each one follows:
APPLYING THE CONCEPTS:
Frequency tables, histograms, and the National Survey of Student Engagement: The National Survey of Student Engagement (NSSE) surveys U.S. first-year university students and seniors about their level of engagement in campus and classroom activities that enhance learning. Hundreds of thousands of students at almost 1000 schools have completed surveys since 1999, when the NSSE was first administered. Among the many questions, students are asked how often they have been assigned a paper of 20 pages or more during the academic year. For a sample of 19 institutions classified as national universities that made their data publicly available through the U.S. News & World Report Web site, here are the percentages of students who said they were assigned between 5 and 10 twenty-page papers:
0 5 3 3 1 10 2
2 3 1 2 4 2 1
1 1 4 3 5
a. Create a frequency table for these data. Include a third column for percentages.
b. For what percentage of these schools did exactly 4% of the students report that they wrote between 5 and 10 twenty-page papers that year?
c. Is this a random sample? Explain your answer.
d. Create a histogram of grouped data, using six intervals.
e. In how many schools did 6% or more of the students report that they wrote between 5 and 10 twenty-page papers that year?
f. How are the data distributed?
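One possible way to build the frequency table for part (a) is sketched below, using the 19 values listed above. Listing every value from the highest down to the lowest, including those with zero frequency, follows the full-range principle stated earlier in these notes; the variable names are illustrative only.

```python
from collections import Counter

# The 19 percentages listed in the exercise
percentages = [0, 5, 3, 3, 1, 10, 2,
               2, 3, 1, 2, 4, 2, 1,
               1, 1, 4, 3, 5]

counts = Counter(percentages)   # missing values count as 0
n = len(percentages)

# One row per value across the full range, highest first
table = []
for value in range(max(percentages), min(percentages) - 1, -1):
    freq = counts[value]
    table.append((value, freq, freq / n * 100))

print(f"{'Value':>5} {'Freq':>5} {'Percent':>8}")
for value, freq, pct in table:
    print(f"{value:>5} {freq:>5} {pct:>8.2f}")
```

Read off this table, 2 of the 19 schools (about 10.53%) have a value of exactly 4, and only one school (the value 10) is at 6 or above.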