C5 - Statistical Data Flashcards by Alex Foster

What is the main measure of the underlying risk

Exposure measure

In addition to the exposure measure, what other elements need to be considered

Risk factors

The additional elements that help define a risk

What would the risk factors be called if they were used to determine the level of premium to charge

Rating factors

e.g. a rating factor of 1.5 means premium = base premium * 1.5

What are 4 common rating factors used by household insurers

Age of the policyholder
Number of bedrooms
Number of bathrooms
Number of floors

What is the premium called before and after a rating factor has been considered

Before = Base premium

After = Risk-adjusted premium
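
As a minimal sketch of the step from base premium to risk-adjusted premium, assuming rating factors are applied as simple multipliers (as in the 1.5 example earlier). The function name and the figures are hypothetical, not taken from the course material:

```python
def risk_adjusted_premium(base_premium, rating_factors):
    """Apply each rating factor as a multiplier to the base premium."""
    premium = base_premium
    for factor in rating_factors:
        premium *= factor
    return premium

# Hypothetical figures: a base premium of 200 and a single rating factor of 1.5.
print(risk_adjusted_premium(200.0, [1.5]))  # 300.0
```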

Other than exposure measure and risk factors, what 3 other things need to be considered when assessing a risk

The coverage of the policy
External/Environmental factors
Economic factors (e.g. inflation)

How do insurers account for changes over time when assessing historic premiums and claims

They adjust them to account for the perceived increase/decrease in overall risk present at the time to make the data more appropriate for comparison

What are 4 important attributes when assessing large volumes of data

The minimum value within the set
The maximum value within the set
The spread of data
The averages contained within

What are the 3 most common measures of average

Arithmetic mean
Median
Mode

Why are averages important

They allow us to summarise large data sets and represent them by a single point, giving a more manageable point of reference.

What is the calculation for arithmetic mean

sum(values) / # of values

The typical calculation of average
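
The calculation can be sketched in a few lines. The claim amounts below are illustrative only:

```python
def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)

claims = [100, 200, 300, 400]  # illustrative claim amounts, not real data
print(arithmetic_mean(claims))  # 250.0
```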

What is another way of saying average

A measure of central tendency

What is the median

The value exactly halfway through a list of values arranged in ascending order

Where there’s an even number of values it is the mean of the two central values
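
A short sketch of both cases (odd and even number of values). The function name and sample values are chosen here for illustration:

```python
def median(values):
    """Middle value of the sorted list; mean of the two central values if even."""
    ordered = sorted(values)          # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of central pair

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```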

What is the mode

The value which occurs most frequently in the dataset
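
One possible way to find the mode is to count occurrences with `collections.Counter`. This sketch assumes a single most frequent value; if several values tie, it simply returns the one encountered first:

```python
from collections import Counter

def mode(values):
    """Return the most frequently occurring value."""
    counts = Counter(values)
    value, _count = counts.most_common(1)[0]
    return value

print(mode([1, 3, 1, 2, 6, 5, 1, 3, 2]))  # 1
```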

In what 2 circumstances would the mode be a more suitable measure than the mean or median

Where data is dominated by outliers
For preferences (i.e. the most popular)

In what circumstance is the median an inappropriate measure

Where valuable data is ignored by simply taking the middle value

Other than the 3 measures of average, what 4 other measures are useful when assessing a large dataset

Maximum value
Minimum value
Range - Difference between largest and smallest value
Variance - A measure of the spread of data

What does the standard deviation tell us about the data

How far the data varies around the mean

What data point would suggest a volatile claims experience

A high standard deviation

What is the variance also referred to as

The standard deviation squared

What type of dataset would you expect to be more stable, and why

A larger dataset
Outliers have less of an effect on the standard deviation
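
To illustrate the point, the sketch below places the same single outlier in a small and a large dataset and compares the standard deviations. The figures are made up for illustration, and the population form of the standard deviation is assumed:

```python
import statistics

# Illustrative claim amounts only: the same single outlier (1000) in each set.
small = [100] * 4 + [1000]    # 5 values
large = [100] * 49 + [1000]   # 50 values

small_sd = statistics.pstdev(small)
large_sd = statistics.pstdev(large)
print(small_sd, large_sd)
```

With these figures the standard deviation falls from about 360 in the small set to about 126 in the large one, because the outlier is diluted by more typical values.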

What is the calculation for the standard deviation

Take the difference of each number from the mean
Square these differences
Take the mean of the squared values
Take the square root of this mean
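
The four steps can be sketched as follows. This follows the card literally, dividing by the number of values (the population form); the function name and sample data are chosen here for illustration:

```python
import math

def standard_deviation(values):
    mean = sum(values) / len(values)
    differences = [v - mean for v in values]        # step 1: difference from the mean
    squared = [d ** 2 for d in differences]         # step 2: square each difference
    mean_of_squares = sum(squared) / len(squared)   # step 3: mean of the squares
    return math.sqrt(mean_of_squares)               # step 4: square root

print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```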

What is the calculation for the variance

Take the difference of each number from the mean
Square these differences
Take the mean of the squared values
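
A minimal sketch of these three steps, with a check that the square root of the result matches the standard deviation from Python's `statistics` module (population form assumed). The sample data is illustrative:

```python
import math
import statistics

def variance(values):
    mean = sum(values) / len(values)
    squared_differences = [(v - mean) ** 2 for v in values]
    return sum(squared_differences) / len(squared_differences)

values = [2, 4, 4, 4, 5, 5, 7, 9]
var = variance(values)
print(var)  # 4.0
# The variance is the standard deviation squared, so its root should match pstdev.
print(math.isclose(math.sqrt(var), statistics.pstdev(values)))  # True
```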

When a dataset is large, what is a more time effective way of calculating averages

Calculating the average of frequency distributions

What is a frequency distribution

A count of the number of times each value occurs in the dataset. For example, with x = value and f = frequency:
Dataset: 1, 3, 1, 2, 6, 5, 1, 3, 2
x = 1, f = 3
x = 2, f = 2
x = 3, f = 2
x = 4, f = 0
x = 5, f = 1
x = 6, f = 1
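
The same frequency distribution could be built programmatically. This sketch uses `collections.Counter` and fills in zero frequencies across the observed range so that x = 4 appears with f = 0:

```python
from collections import Counter

data = [1, 3, 1, 2, 6, 5, 1, 3, 2]
counts = Counter(data)

# Include values with zero frequency (such as 4) across the observed range.
frequency = {x: counts.get(x, 0) for x in range(min(data), max(data) + 1)}

for x, f in frequency.items():
    print(f"x = {x}, f = {f}")
```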

How do you get the arithmetic mean from a frequency distribution

- Multiply the frequency by the value, i.e. fx
- Sum all values together, sum(fx)
- Divide by the total frequency, sum(f)
sum(fx)/sum(f)
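
Applied to the example frequency distribution from the previous card (x = 1 to 6), the calculation might look like this:

```python
# Frequency distribution from the earlier example: value x -> frequency f.
frequency = {1: 3, 2: 2, 3: 2, 4: 0, 5: 1, 6: 1}

sum_fx = sum(x * f for x, f in frequency.items())  # 24
sum_f = sum(frequency.values())                    # 9
mean = sum_fx / sum_f

print(sum_fx, sum_f, round(mean, 2))  # 24 9 2.67
```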

What are 3 important features of using arithmetic mean

1. It incorporates ALL the underlying data
2. It can be easily distorted by outliers
3. It can be tricky to use when only a whole number is possible (e.g. 2.4 people living in a home)

What can frequency distributions also be used for

To compare different subsets of data, e.g. size of theft claims vs size of injury claims

What is a statistical distribution

A listing of all possible values for a measure and an indication of how often they are expected to occur

What are the 2 types of distribution

1. Normal distribution
2. Poisson distribution

What shaped curve is the normal statistical distribution

Bell-shaped
The highest point is at the average, and the curve tails off towards the maximum and minimum

What does the Poisson distribution express

The probability of a certain number of events occurring within a given time frame (e.g. goals per season)
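
The card does not give the formula, but the standard Poisson probability of exactly k events, given an average rate per period, is rate^k * e^(-rate) / k!. A minimal sketch, with a hypothetical rate of 2 claims per year:

```python
import math

def poisson_probability(k, rate):
    """Probability of exactly k events when the average per period is `rate`."""
    return (rate ** k) * math.exp(-rate) / math.factorial(k)

# Hypothetical example: an average of 2 claims per year.
for k in range(4):
    print(k, round(poisson_probability(k, 2.0), 4))
```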

What types of losses are more predictable and have a lower level of uncertainty or variance

High frequency/low severity

How do underwriters calculate the expected value of claims

By combining the frequency and severity probabilities

What letter is used to denote probability

What is the relative frequency method of calculating probability

Using historical data to predict the future
If a particular type of property has had 5 losses in the last 100 years, then the associated probability of a loss occurring in a given year is 1/20.
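
The worked example can be checked with exact fractions. The function name is hypothetical:

```python
from fractions import Fraction

def relative_frequency(losses, years):
    """Observed losses divided by the number of years observed."""
    return Fraction(losses, years)

p = relative_frequency(5, 100)
print(p)         # 1/20
print(float(p))  # 0.05
```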

What method of calculating probability uses underwriter skill and judgement

Subjective probabilities

What are the 2 main ways of calculating probability

1. Relative frequency
2. Subjective probabilities

When are subjective probabilities a more appropriate method of calculating probabilities

When there is little or no historical data to project forward

How does subjective probability calculate probability

By assigning certain probabilities to certain outcomes using the underwriter's judgement

What type of tool is used to predict an insurer's exposure to low frequency/high severity claims

A stochastic model
It simulates scenarios of claim frequency, amounts and timing
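
As a very rough sketch of the idea, not a real catastrophe model: the simulation below assumes at most one event per year with a fixed probability and an exponentially distributed claim amount. These modelling choices, the parameter values and the function name are all assumptions made for illustration, and claim timing within the year is not modelled:

```python
import random

def simulate_annual_losses(n_years, event_probability, mean_severity, seed=0):
    """Simulate one loss figure per year under simplified, assumed distributions."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    losses = []
    for _ in range(n_years):
        if rng.random() < event_probability:                   # frequency: does an event occur?
            losses.append(rng.expovariate(1 / mean_severity))  # severity: claim amount
        else:
            losses.append(0.0)
    return losses

# Hypothetical parameters: a 2% chance of an event each year, mean loss of 1,000,000.
losses = simulate_annual_losses(n_years=10_000, event_probability=0.02,
                                mean_severity=1_000_000)
print("average annual loss:", sum(losses) / len(losses))
print("worst simulated year:", max(losses))
```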

What types of claim arise from a single event

Catastrophe claims

Other than low frequency/high severity claims, what is another type of claim that is difficult to predict

Latent claims
Where there is a long delay between incidence and manifestation, for example, asbestosis or pollution-related claims

How do you calculate the expected value (or number) of future claims

- Multiply each outcome by the probability that it will occur
- Sum these values together
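
A minimal sketch of the calculation, using a made-up distribution for the number of claims in a year:

```python
def expected_value(outcomes):
    """Sum of outcome * probability over (outcome, probability) pairs."""
    return sum(value * probability for value, probability in outcomes)

# Hypothetical distribution: 0 claims (50%), 1 claim (25%), 2 claims (25%).
claim_count_distribution = [(0, 0.5), (1, 0.25), (2, 0.25)]
print(expected_value(claim_count_distribution))  # 0.75
```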

What does the law of large numbers suggest

That the actual number of events occurring will tend towards the expected number, where there are a large number of similar events
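
A small simulation can illustrate the tendency. It assumes independent, similar policies that each have the same hypothetical 5% chance of a claim; the function name and parameters are made up for the example:

```python
import random

def observed_frequency(n_policies, claim_probability, seed=0):
    """Fraction of simulated policies that produce a claim."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    claims = sum(1 for _ in range(n_policies) if rng.random() < claim_probability)
    return claims / n_policies

# As the number of policies grows, the observed frequency should settle near 0.05.
for n in (100, 10_000, 100_000):
    print(n, observed_frequency(n, 0.05))
```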

What is paramount for underwriters to do in order to sustain a common pool

Charge an equitable premium, i.e. one that reflects the degree of risk each policyholder brings to the pool

What is the most valuable tool when underwriting homogeneous risks

A large homogeneous dataset to predict the likelihood of future claims

What is the biggest barrier to predicting future performance

Poor data quality

What are 6 ways in which past data is not indicative of the future

1. Changes in the coverage of the underlying risks
2. Changes in underwriting policy (e.g. exclusions or policy limits)
3. Changes in exposure over time
4. Changes in legislation
5. Changes in inflation
6. The accuracy of the data sets

Other than data, what other element is important when considering a risk

Commercial experience from underwriters, actuaries and statisticians

What are 4 examples of how technology has affected data capture and analysis

1. Data can be captured in real time (improving relevance and accuracy)
2. Telematics in motor has allowed for more accurate data
3. Satellites and drones now improve other lines of business
4. AI now allows for collection and storage of larger sets of data

C5 - Statistical Data Flashcards

(51 cards)