C5 - Statistical Data Flashcards

(51 cards)

1
Q

What is the main measure of the underlying risk

A

Exposure measure

2
Q

In addition to the exposure measure, what other elements need to be considered

A

Risk factors

Additional considerations that help define a risk

3
Q

What would the risk factors be called if they were used to determine the level of premium to charge

A

Rating factors

e.g. a rating factor of 1.5 means premium = base premium × 1.5

4
Q

What are 4 common rating factors used by household insurers

A
  1. Age of the policyholder
  2. Number of bedrooms
  3. Number of bathrooms
  4. Number of floors
5
Q

What is the premium called before and after a rating factor has been considered

A

Before = Base premium

After = Risk-adjusted premium
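
As a rough illustration of the two premiums above, the sketch below applies a single multiplicative rating factor to a base premium. The function name and the base premium of 400 are assumptions made for illustration; the factor of 1.5 echoes the rating factor card.

```python
def risk_adjusted_premium(base_premium: float, rating_factor: float) -> float:
    """Apply one multiplicative rating factor to a base premium."""
    return base_premium * rating_factor


# Hypothetical base premium of 400 with a rating factor of 1.5
print(risk_adjusted_premium(400.0, 1.5))  # 600.0
```

In practice several rating factors may apply to one policy; this sketch deliberately shows only one.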

6
Q

Other than exposure measure and risk factors, what 3 other things need to be considered when assessing a risk

A
  1. The coverage of the policy
  2. External/Environmental factors
  3. Economic factors (e.g. inflation)
7
Q

How do insurers account for changes over time when assessing historic premiums and claims

A

They adjust them to account for the perceived increase/decrease in overall risk present at the time to make the data more appropriate for comparison

8
Q

What are 4 important attributes when assessing large volumes of data

A
  1. The minimum value within the set
  2. The maximum value within the set
  3. The spread of data
  4. The averages contained within
9
Q

What are the 3 most common measures of average

A
  1. Arithmetic mean
  2. Median
  3. Mode
10
Q

Why are averages important

A

They allow us to summarise large data sets and represent them by a single point.

This makes them a more manageable point of reference

11
Q

What is the calculation for arithmetic mean

A

sum(values) / number of values

The typical calculation of average
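
A minimal sketch of this calculation follows. The claim amounts are hypothetical values chosen so the arithmetic is easy to check by hand.

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("cannot take the mean of an empty list")
    return sum(values) / len(values)


# Hypothetical claim amounts: the total is 2000 across 5 claims
claims = [200, 350, 500, 150, 800]
print(arithmetic_mean(claims))  # 400.0
```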

12
Q

What is another way of saying average

A

A measure of central tendency

13
Q

What is the median

A

The value exactly halfway through a list of values arranged in ascending order

Where there’s an even number of values it is the mean of the two central values
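
Both cases in this definition (odd and even counts) can be sketched as follows. The function name and sample values are assumptions for illustration only.

```python
def median(values):
    """Middle value of the sorted list, or the mean of the two middle values."""
    if not values:
        raise ValueError("cannot take the median of an empty list")
    ordered = sorted(values)  # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: single central value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two


print(median([5, 1, 3]))     # 3
print(median([5, 1, 3, 7]))  # 4.0
```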

14
Q

What is the mode

A

The value which occurs most frequently in the dataset
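
One possible way to find the mode is to count each value and pick the most frequent. The sketch below uses Python's `collections.Counter`; the sample values are hypothetical. Where two values tie, this version simply returns the one that appears first in the data.

```python
from collections import Counter


def mode(values):
    """Return the most frequently occurring value."""
    value, _count = Counter(values).most_common(1)[0]
    return value


print(mode([1, 3, 1, 2, 6, 5, 1, 3, 2]))  # 1
```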

15
Q

In what 2 circumstances would the mode be a more suitable measure than the mean or median

A
  1. Where data is dominated by outliers
  2. For preferences (i.e. the most popular)
16
Q

In what circumstance is the median an inappropriate measure

A

Where valuable data is ignored by simply taking the middle value

17
Q

Other than the 3 measures of average, what 4 other measures are useful when assessing a large dataset

A
  1. Maximum value
  2. Minimum value
  3. Range - Difference between largest and smallest value
  4. Variance - A measure of the spread of data
18
Q

What does the standard deviation tell us about the data

A

How far the data varies around the mean

19
Q

What data point would suggest a volatile claims experience

A

A high standard deviation

20
Q

What is the variance also referred to as

A

The standard deviation squared

21
Q
  1. What type of dataset would you expect to be more stable
  2. and why
A
  1. A larger dataset
  2. Outliers have less of an effect on the standard deviation
22
Q

What is the calculation for the standard deviation

A
  • Take the difference of each number from the mean
  • Square these differences
  • Take the mean of the squared differences
  • Take the square root of this mean
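
The four steps above can be sketched in code as follows. This is the population form, which divides by the number of values as the card describes; a sample standard deviation would divide by one fewer. The data set is a hypothetical one chosen so that the result is a round number.

```python
import math


def population_std_dev(values):
    """Standard deviation following the four steps on the card."""
    mean = sum(values) / len(values)
    # Steps 1 and 2: difference of each number from the mean, squared
    squared_diffs = [(x - mean) ** 2 for x in values]
    # Step 3: mean of the squared differences (this is the variance)
    variance = sum(squared_diffs) / len(values)
    # Step 4: square root of the variance
    return math.sqrt(variance)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with a mean of 5
print(population_std_dev(data))  # 2.0
```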
23
Q

What is the calculation for the variance

A
  • Take the difference of each number from the mean
  • Square these differences
  • Take the mean of the squared differences
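
These steps can be written compactly as a formula. The symbols are assumptions introduced here rather than taken from the cards: x_i is each value, n is the number of values, x-bar is the mean, and sigma squared is the variance. This is the population form, which divides by n as the steps describe.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2, \qquad
\sigma = \sqrt{\sigma^2}
```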
24
Q

When a dataset is large, what is a more time-efficient way of calculating averages

A

Calculating the average from a frequency distribution

25
What is a frequency distribution
A count of how many times each value occurs in the dataset, for example (x = value, f = frequency):
Dataset: 1, 3, 1, 2, 6, 5, 1, 3, 2
  x = 1, f = 3
  x = 2, f = 2
  x = 3, f = 2
  x = 4, f = 0
  x = 5, f = 1
  x = 6, f = 1
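
A frequency distribution for the example values on this card can be built by counting each value. The sketch below is one way to do it with Python's `collections.Counter`, which reports 0 for values that never occur.

```python
from collections import Counter

values = [1, 3, 1, 2, 6, 5, 1, 3, 2]  # the example values from the card
frequency = Counter(values)

# List every x from the smallest to the largest value, including gaps
for x in range(min(values), max(values) + 1):
    print(f"x = {x}, f = {frequency[x]}")
```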
26
How do you get the arithmetic mean from a frequency distribution
  • Multiply each value by its frequency, i.e. fx
  • Sum these products, sum(fx)
  • Divide by the total frequency, sum(f)
Mean = sum(fx) / sum(f)
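
As a worked sketch, the calculation can be applied to the frequency distribution from the previous card (x = 1 to 6). The dictionary layout and function name are assumptions for illustration.

```python
def mean_from_frequencies(frequency):
    """Compute sum(fx) / sum(f) from a mapping of value x to frequency f."""
    sum_fx = sum(x * f for x, f in frequency.items())
    sum_f = sum(frequency.values())
    if sum_f == 0:
        raise ValueError("total frequency must be greater than zero")
    return sum_fx / sum_f


# x -> f, taken from the frequency distribution card
frequency = {1: 3, 2: 2, 3: 2, 4: 0, 5: 1, 6: 1}
print(round(mean_from_frequencies(frequency), 2))  # sum(fx) = 24, sum(f) = 9
```

Here sum(fx) is 24 and sum(f) is 9, so the mean is 24/9, or roughly 2.67, the same as averaging the nine raw values directly.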
27
What are 3 important features of using the arithmetic mean
  1. It incorporates ALL the underlying data
  2. It can be easily distorted by outliers
  3. It can be tricky to use when only a whole number is possible (e.g. 2.4 people living in a home)
28
What can frequency distributions also be used for
To compare different subsets of data, e.g. size of theft claims vs size of injury claims
29
What is a statistical distribution
A listing of all possible values for a measure and an indication of how often they are expected to occur
30
What are the 2 types of distribution
  1. Normal distribution
  2. Poisson distribution
31
What shaped curve is the normal statistical distribution
Bell-shaped
The highest point is at the average, and the curve tails off towards the minimum and maximum
32
What does the Poisson distribution express
The probability of a certain number of events occurring within a given time frame (e.g. goals per season)
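
The card does not give a formula, so the sketch below uses the standard Poisson probability, P(k) = lam^k * e^(-lam) / k!, where lam is the average number of events in the time frame. The average of 2 claims per year is a hypothetical figure.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when lam events are expected on average."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)


lam = 2.0  # hypothetical average of 2 claims per year
for k in range(5):
    print(k, round(poisson_pmf(k, lam), 4))
```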
33
What types of losses are more predictable and have a lower level of uncertainty or variance
High frequency/low severity
34
How do underwriters calculate the expected value of claims
By combining the frequency and severity probabilities
35
What letter is used to denote probability
P
36
What is the relative frequency method of calculating probability
Using historical data to predict the future
If a particular type of property has had 5 losses in the last 100 years, then the associated probability of a loss occurring in a given year is 5/100 = 1/20.
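
The arithmetic in this example can be checked with a short sketch. The function name is an assumption; `Fraction` is used only so the result prints in the same 1/20 form as the card.

```python
from fractions import Fraction


def relative_frequency(losses: int, periods: int) -> Fraction:
    """Observed losses divided by the number of periods observed."""
    return Fraction(losses, periods)


p = relative_frequency(5, 100)  # 5 losses in the last 100 years
print(p)         # 1/20
print(float(p))  # 0.05
```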
37
What method of calculating probability uses underwriter skill and judgement
Subjective probabilities
38
What are the 2 main ways of calculating probability
  1. Relative frequency
  2. Subjective probabilities
39
When are subjective probabilities a more appropriate method of calculating probability
When there is little or no historical data to project forward
40
How does subjective probability calculate probability
By assigning certain probabilities to certain outcomes using the underwriter's judgement
41
What type of tool is used to predict an insurer's exposure to low frequency/high severity claims
A stochastic model
It simulates scenarios of claim frequency, amounts and timing
42
What types of claim arise from a single event
Catastrophe claims
43
Other than low frequency/high severity claims, what is another type of claim that is difficult to predict
Latent claims
These have a long delay between incidence and manifestation, for example asbestosis or pollution-related claims
44
How do you calculate the expected value (or number) of future claims
  • Multiply each outcome by the probability that it will occur
  • Sum these values together
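
A minimal sketch of this calculation follows. The outcomes (number of claims in a year) and their probabilities are hypothetical figures, not taken from the cards.

```python
def expected_value(distribution):
    """Sum of outcome * probability over a mapping of outcome to probability."""
    total_probability = sum(distribution.values())
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities should sum to 1")
    return sum(outcome * p for outcome, p in distribution.items())


# Hypothetical: 70% chance of no claims, 20% of one claim, 10% of two claims
claims_per_year = {0: 0.7, 1: 0.2, 2: 0.1}
print(round(expected_value(claims_per_year), 2))  # 0*0.7 + 1*0.2 + 2*0.1
```

With these figures the expected number of claims is 0.4 per year, even though no single year can produce 0.4 claims.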
45
What does the law of large numbers suggest
That the actual number of events occurring will tend towards the expected number, where there are a large number of similar events
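
One way to see this effect is a small simulation. The sketch below assumes each policy independently suffers a loss with probability 0.05 (a hypothetical figure) and compares the observed loss rate for pools of different sizes. The results depend on the random seed, so no specific output is claimed.

```python
import random


def observed_loss_rate(n_policies: int, p_loss: float, seed: int) -> float:
    """Simulate n independent policies and return the share that had a loss."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    losses = sum(1 for _ in range(n_policies) if rng.random() < p_loss)
    return losses / n_policies


# Larger pools should generally land closer to the expected rate of 0.05
for n in (100, 10_000, 100_000):
    print(n, observed_loss_rate(n, 0.05, seed=42))
```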
46
What is paramount for underwriters to do in order to sustain a common pool
Charge an equitable premium, i.e. one that represents the degree of risk each insured brings to the pool
47
What is the most valuable tool when underwriting homogeneous risks
A large homogeneous dataset to predict the likelihood of future claims
48
What is the biggest barrier to predicting future performance
Poor data quality
49
What are 6 ways in which past data is not indicative of the future
  1. Changes in the underlying risk's coverage
  2. Changes in underwriting policy (e.g. exclusions or policy limits)
  3. Changes in exposure over time
  4. Changes in legislation
  5. Changes in inflation
  6. The accuracy of the data sets
50
Other than data, what is the other important element when considering a risk
Commercial experience from underwriters (UWs), actuaries and statisticians
51
What are 4 examples of how technology has affected data capture and analysis
  1. Data can be captured in real time (improving relevance and accuracy)
  2. Telematics in motor has allowed for more accurate data
  3. Satellites and drones now improve other lines of business
  4. AI now allows for collection and storage of larger sets of data