Statistics Flashcards

(55 cards)

1
Q

What is the most important measure of dispersion? (3)

A
  • Standard deviation
  • The bigger the standard deviation the bigger the dispersion of data
  • Used to analyse the variability of returns
2
Q

What is discrete data? (6)

A
  • Finite or countable set of values
  • Cannot be subdivided
  • Whole numbers
  • Categorical - yes/no
  • Ordinal - first class, second class
  • Descriptive - measurements are summarised
3
Q

What is continuous data? (3)

A
  • Measured on a continuous scale
  • Can take any value within a range, to any number of decimal places
  • Values can be subdivided
4
Q

What is the limitation of continuous data? (1)

A
  • Accuracy depends on the measuring equipment
5
Q

What is a population? (3)

A
  • Set of items with desired characteristics
  • Forms a complete data set
  • Can be time consuming to collect
6
Q

What is a sample? (1)

A
  • A subset of a population
7
Q

What is a random sample? (1)

A
  • Each item in a population has an equal chance of being selected
8
Q

What are the forms of non-random sampling? (6)

A
  • Quota sampling
  • Stratified sampling
  • Snowball
  • Judgement
  • Systematic
  • Convenience
9
Q

What is quota sampling? (1)

A
  • Categorises individuals into groups until each reaches a ‘quota’
    e.g. 50 boys and 50 girls
10
Q

What is stratified sampling? (3)

A
  • Population is divided into smaller groups called ‘strata’
  • Once divided, a sample is taken from each group in the proportion in which it exists in the population, e.g. 20% Afro Caribbean, 10% Asian in a school
  • Designed to reduce sampling error
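
Proportional allocation across strata can be sketched in Python as below. The helper name `stratified_sample`, the group labels and the pupil list are illustrative assumptions, and members are drawn at random within each group.

```python
import random
from collections import defaultdict

def stratified_sample(population, key, sample_size, seed=0):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in population:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can make the total differ slightly.
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical school of 100 pupils: 70 in group A, 20 in group B, 10 in group C.
pupils = [("A", i) for i in range(70)] + [("B", i) for i in range(20)] + [("C", i) for i in range(10)]
sample = stratified_sample(pupils, key=lambda p: p[0], sample_size=10)
counts = {g: sum(1 for p in sample if p[0] == g) for g in "ABC"}
print(counts)
```

Each group appears in the sample in the same 70/20/10 proportion as in the population.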
11
Q

What is systematic sampling? (1)

A
  • Every nth record of a population is selected from a random starting point, e.g. every 5th student
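
A minimal Python sketch of the idea, assuming a hypothetical register of 50 students and an illustrative helper name `systematic_sample`:

```python
import random

def systematic_sample(records, step, seed=0):
    """Take every `step`-th record, starting from a random position in the first interval."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting point in 0..step-1
    return records[start::step]

students = list(range(1, 51))  # hypothetical register of 50 students
chosen = systematic_sample(students, step=5)
print(chosen)
```

Only the starting point is random; after that the selection follows a fixed interval.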
12
Q

What is convenience sampling? (1)

A
  • Choosing the sample that is easiest and quickest to collect information from, e.g. asking the first person you see for feedback
13
Q

What is snowball sampling?

A
  • Starts with a few random participants
  • Relies on referrals and recruitment
    e.g. one Bangladeshi refers another in Brick Lane
14
Q

What is judgement sampling? (2)

A
  • Using judgement to choose the sample that is most relevant to, and representative of, the population
  • Also called purposive sampling
    e.g. software developer feedback for new IT systems
15
Q

How is continuous data presented? (4)

A
  • Time series graphs
  • Histograms
  • Semi-log graphs
  • Scatter graphs
16
Q

What is a semi-log graph? (3)

A
  • Illustrates rate of change
  • Y axis is logarithmic, X axis is linear
  • Steeper the gradient, quicker the growth. Flatter the gradient, slower the growth
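
Why a log scale shows the rate of change can be sketched in Python. The 10% growth rate and starting value of 100 are assumptions chosen for illustration.

```python
import math

# Hypothetical value growing at a constant 10% per period.
values = [100 * 1.10 ** t for t in range(6)]

# On a linear y axis the step between periods keeps getting bigger.
linear_steps = [b - a for a, b in zip(values, values[1:])]

# On a logarithmic y axis every step is the same size,
# so constant percentage growth plots as a straight line.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(values, values[1:])]

print([round(s, 2) for s in linear_steps])
print([round(s, 4) for s in log_steps])
```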
17
Q

What is a time series graph? (3)

A
  • Most commonly used in finance (share prices)
  • Shows data over time
  • Chronological order
18
Q

What is a histogram? (2)

A
  • Shows frequency density and the distribution of data across intervals
  • Area of the bar represents the frequency (not the height of the bar)
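
The link between bar height and bar area can be sketched in Python, using the usual definition frequency density = frequency / class width. The class boundaries and frequencies are made-up values.

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 50, 12)]

for lower, upper, freq in classes:
    width = upper - lower
    density = freq / width   # bar height
    area = density * width   # bar area, which recovers the frequency
    print(f"{lower}-{upper}: width={width}, density={density:.2f}, area={area:.0f}")

densities = [freq / (upper - lower) for lower, upper, freq in classes]
```

The last two classes hold the same frequency, but the wider class gets a lower bar.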
19
Q

How is discrete data presented? (2)

A
  • Pie charts
  • Bar charts
20
Q

How is the slice in a pie chart calculated? (1)

A
  • (n / total) × 360, giving the angle of the slice in degrees
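
A short Python sketch of the slice calculation, with hypothetical category counts:

```python
# Hypothetical category counts.
counts = {"Equities": 50, "Bonds": 30, "Cash": 20}
total = sum(counts.values())

# Angle of each slice in degrees: (n / total) * 360.
angles = {name: n / total * 360 for name, n in counts.items()}
print(angles)
```

The angles always add up to 360 degrees.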
21
Q

What are the 3 central tendencies and what are their measures of dispersion? (3)

A
  • Mean - standard deviation
  • Median - Interquartile range (upper quartile minus the lower quartile)
  • Mode - Range
22
Q

What is the mean? (4)

A
  • Its measure of dispersion is the standard deviation
  • Simple arithmetic mean
  • Used when values are added together (not compounded)
  • x̄ = Σx / n
23
Q

What is the problem with standard deviation? (2)

A
  • When calculated from a sample, it does not capture the entire population
  • The sample may miss extreme values
24
Q

What is the problem with mode and range? (3)

A
  • There may be more than one mode - bi-modal, tri-modal
  • May be no mode at all
  • Range can be distorted by extreme values
25
What are the assumptions of a normal distribution curve? (3)
  • Mean, median and mode are all equal
  • The larger the SD, the larger the dispersion of data
  • When extreme conditions occur more often than the normal curve predicts, the actual curve will have ‘fat tails’
26
What are the empirical assumptions of the normal distribution curve for standard deviation? (5)
  • 50% of observations fall on each side of the mean (the curve is symmetrical)
  • Approx 68% of observations fall within 1 standard deviation either side of the mean
  • Approx 95% fall within 2 standard deviations either side of the mean
  • Approx 99.7% fall within 3 standard deviations either side of the mean
  • Calculated from the area under the curve
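
These percentages can be checked from the area under the normal curve. The sketch below uses Python's standard `math.erf`. The helper name `share_within` is an illustrative assumption.

```python
import math

def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```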
27
What is the limitation of sample standard deviation? (1)
  • A sample may not include the extreme values present in the population
28
Why is there an n-1 in sample SD? (2)
  • Dividing by n-1 instead of n makes the sample SD slightly larger than the population formula would give
  • This allows for variability in the population that the sample does not capture
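
The effect of the n-1 divisor can be seen with Python's standard `statistics` module, where `pstdev` divides by n and `stdev` divides by n-1. The observations are made-up values.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

population_sd = statistics.pstdev(sample)  # divides by n
sample_sd = statistics.stdev(sample)       # divides by n - 1

print(population_sd, sample_sd)
```

On the same data, the n-1 version is always the larger of the two.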
29
What are the assumptions of skewed distribution curves? (2)
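  • If the peak sits to the left with a long tail to the right, the curve is positively skewed. Order from lowest to highest: mode, median, mean
  • If the peak sits to the right with a long tail to the left, the curve is negatively skewed. Order from lowest to highest: mean, median, mode (alphabetical order)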
30
What is geometric mean? (2)
  • The average rate of change over a given period
  • Used to measure compound growth
31
What is covariance? (4)
  • Measure of the relationship between two variables, e.g. two share prices
  • If the variables move in the same direction, there is a positive covariance relationship
  • If the variables move in opposite directions, there is a negative covariance relationship
  • If the variables move independently of each other, the covariance is zero
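
The sign of the covariance can be illustrated with a small Python sketch. It uses the sample formula (dividing by n-1), which is an assumption here, and the return series are made up.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length series (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical returns (%) for three shares.
share_a = [1.0, 2.0, 3.0, 4.0, 5.0]
share_b = [2.0, 4.0, 6.0, 8.0, 10.0]  # moves with A
share_c = [5.0, 4.0, 3.0, 2.0, 1.0]   # moves against A

cov_ab = covariance(share_a, share_b)
cov_ac = covariance(share_a, share_c)
print(cov_ab, cov_ac)
```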
32
What is correlation coefficient? (1)
  • Measures the strength and direction of the linear relationship between two variables, such as two share prices, on a scale from -1 to +1
33
Does correlation imply causation?
  • No. The correlation could be driven by a third factor, such as market conditions, e.g. war
34
What is autocorrelation? (1)
  • The correlation of a series with its own past values, which can be used to help predict the future price of an asset
35
How do you calculate standard deviation?
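
A minimal Python sketch of the usual steps: find the mean, square each deviation from it, average the squares, then take the square root. The helper name `standard_deviation`, the `sample` flag and the data are illustrative assumptions.

```python
import math

def standard_deviation(values, sample=True):
    """Standard deviation; divides by n - 1 for a sample, n for a population."""
    n = len(values)
    mean = sum(values) / n                             # 1. find the mean
    squared_devs = [(x - mean) ** 2 for x in values]   # 2. square each deviation
    divisor = n - 1 if sample else n
    variance = sum(squared_devs) / divisor             # 3. average the squares
    return math.sqrt(variance)                         # 4. take the square root

observations = [1, 2, 3, 4, 5]  # hypothetical data
print(standard_deviation(observations, sample=False))
print(standard_deviation(observations, sample=True))
```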
36
How do you calculate bivariate linear regression?
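
A minimal least-squares sketch in Python, fitting the line y = intercept + slope × x. The helper name `fit_line` and the data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: x = market return (%), y = share return (%).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(xs, ys)

# x = 6 lies outside the observed range, so this prediction is an extrapolation.
prediction = intercept + slope * 6
print(slope, intercept, prediction)
```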
37
How do you calculate the geometric mean?
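
A minimal Python sketch for periodic returns: multiply the growth factors (1 + r), take the nth root, then subtract 1. The helper name `geometric_mean_return` and the returns are illustrative assumptions.

```python
import math

def geometric_mean_return(returns):
    """Average compound rate of return per period from a list of periodic returns."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical annual returns: +10%, -5%, +20%.
annual = [0.10, -0.05, 0.20]
g = geometric_mean_return(annual)
arithmetic = sum(annual) / len(annual)
print(f"geometric: {g:.4%}, arithmetic: {arithmetic:.4%}")
```

Compounding at the geometric mean for three years reproduces the actual total growth, which the arithmetic mean would overstate.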
38
What is primary data? (2)
  • Data collected directly
  • Collected with a specific purpose in mind
39
What is secondary data? (2)
  • Data collected indirectly, by someone else
  • Not collected with the current purpose in mind
40
What is a parameter? (1)
  • A number used to describe a population, e.g. the mean of a population
41
What is a frequency distribution table? (2)
  • Shows the number of times each value or category has occurred
42
What is a relative frequency distribution table? (2)
  • Shows each category's frequency in comparison with the total frequency
  • Each frequency is expressed as a proportion of the whole
43
What is a cumulative frequency distribution table? (1)
  • Shows the number of times something occurs up to and including the category under investigation
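
The three kinds of frequency table can be built side by side in a short Python sketch. The survey answers are made-up values.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical survey answers on an ordered 1-4 scale.
answers = [1, 2, 2, 3, 3, 3, 4, 4, 2, 3]

counts = Counter(answers)
categories = sorted(counts)
frequency = [counts[c] for c in categories]
relative = [f / len(answers) for f in frequency]  # share of the total
cumulative = list(accumulate(frequency))          # running total up to each category

for row in zip(categories, frequency, relative, cumulative):
    print(row)
```

The relative frequencies sum to 1 and the last cumulative figure equals the number of observations.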
44
What are the 3 types of correlation? (3)
  • Perfectly negative (-1) - movement in the opposite direction
  • Uncorrelated (0) - movements are independent of each other
  • Perfectly positive (+1) - movement in the same direction and proportion
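
The three cases can be reproduced with a small Python sketch of the Pearson correlation coefficient. The series are made-up values chosen to hit each case.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

base = [1, 2, 3, 4, 5]
positive = correlation(base, [2, 4, 6, 8, 10])  # same direction, same proportion
negative = correlation(base, [10, 8, 6, 4, 2])  # opposite direction
none = correlation(base, [2, 1, 3, 1, 2])       # no linear relationship in this sample
print(positive, negative, none)
```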
45
What is the relationship between correlation and diversification?
  • The lower the correlation of returns, the better the diversification
46
How is the line of best fit calculated? (4)
  • Using the least squares method
  • BLUE - best linear unbiased estimator
  • Minimises the sum of the squared errors
  • How well the resulting line fits the data is measured by R^2
47
What is R squared? (2)
  • Coefficient of determination
  • Ranges from 0 to 1 (0% to 100%)
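
A minimal Python sketch of R squared as 1 minus the ratio of unexplained to total variation. The helper name `r_squared` and the data are illustrative assumptions.

```python
def r_squared(ys, predicted):
    """Coefficient of determination: 1 - (unexplained variation / total variation)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

actual = [2.0, 4.0, 6.0, 8.0]
perfect_fit = [2.0, 4.0, 6.0, 8.0]  # predictions match every point
mean_only = [5.0, 5.0, 5.0, 5.0]    # predicting the mean explains nothing

best = r_squared(actual, perfect_fit)
worst = r_squared(actual, mean_only)
print(best, worst)
```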
48
What is multivariate linear regression? (2)
  • Adds more independent variables (x) to predict the dependent variable (y)
  • Adding more factors generally increases R squared (it cannot decrease)
49
What is data mining? (3)
  • Sorts through large data sets to identify patterns and relationships
  • Often goes beyond regression analysis
  • Machine learning is typically applied
50
What is interquartile range? (4)
  • Measure of dispersion for the median
  • IQR = 3rd quartile - 1st quartile
  • The spread of the middle 50% of items in a data set
  • Not distorted by extreme values, as the top and bottom quarters of the data are excluded
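
The robustness to extreme values can be sketched in Python with the standard `statistics.quantiles` function. Quartile conventions vary between textbooks; the `"inclusive"` method used here is one of them, and the data are made up.

```python
import statistics

def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 900]  # hypothetical extreme value

print(iqr(data), max(data) - min(data))
print(iqr(with_outlier), max(with_outlier) - min(with_outlier))
```

The extreme value changes the range sharply but leaves the IQR unchanged.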
51
What are scatter diagrams? (2)
  • Identify any correlation between 2 variables
  • Identify patterns
52
What is extrapolation? (2)
  • Conclusions drawn about values outside the observed data range
  • Predicting beyond the range based on existing data
53
What is interpolation? (1)
  • Conclusions drawn from within the data range
54
Why do we use bivariate linear regression? (1)
  • To predict the dependent variable (y-axis) from the independent variable (x-axis)
55
What is the rule with R squared? (1)
  • The higher the number, the greater the predictive power of the regression