QUANT METHODS 1, 3 ... Flashcards

(224 cards)

1
Q

1.1 what is the Real risk-free rate?

A

The theoretical rate on a single-period loan that contains no expectation of inflation and zero probability of default.

In economic terms, it represents time preference: the degree to which current consumption is preferred to equal future consumption.

2
Q

1.1 What is the equilibrium interest rate?

A

Required rate of return for a particular investment

3
Q

1.1 What is the Real interest rate?

A

An interest rate from which the inflation premium has been subtracted (nominal rate minus the inflation premium)

4
Q

1.1 What is the Real rate of return?

A

Referring to an investor’s increase in purchasing power (after adjusting for inflation)

  • Because expected inflation in future periods is not zero, the rates we observe on U.S. Treasury bills (T-bills), for example, are essentially risk-free rates, but not real rates of return.
  • T-bill rates are nominal risk-free rates because they contain an inflation premium
5
Q

1.1 what is the relationship between nominal risk-free rate, real risk-free rate, and expected inflation rate?

A

(1+nominal risk-free rate) = (1+real risk-free rate)(1+expected inflation rate)

OR MORE SIMPLY (and in the curriculum)

nominal risk-free rate ≈ real risk-free rate + expected inflation rate
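A minimal sketch of this card (the rates are assumed for illustration), comparing the exact relationship with the curriculum approximation:

```python
# Exact vs. approximate nominal risk-free rate (illustrative assumed rates)
real_rf = 0.02         # real risk-free rate
exp_inflation = 0.03   # expected inflation rate

# Exact: (1 + nominal) = (1 + real)(1 + expected inflation)
nominal_exact = (1 + real_rf) * (1 + exp_inflation) - 1

# Curriculum approximation: nominal ~ real + expected inflation
nominal_approx = real_rf + exp_inflation

print(round(nominal_exact, 4))   # 0.0506
print(round(nominal_approx, 2))  # 0.05
```

The cross-product term (0.02 × 0.03) is what the approximation drops.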

6
Q

1.1 What are the three types of risk that may increase the required rate of return?

A

Default risk = the risk that a borrower will not make the promised payments in a timely manner

Liquidity risk = the risk of receiving less than fair value for an investment if it must be sold quickly for cash

Maturity risk = as we will see in the FI topic area, the prices of longer-term bonds are more volatile than those of short-term bonds. Longer-maturity bonds have more maturity risk than shorter-term bonds and require a maturity risk premium

SO nominal rate of interest = real risk-free rate + inflation premium + default risk premium + liquidity premium + maturity premium

7
Q

1.1 What is HPR?

A

= The percentage increase in the value of an investment over a given period

8
Q

1.1 What is the arithmetic mean return?

A

The simple average of a series of period returns.

9
Q

1.1 What is the downside of arithmetic mean?

A

It does not reflect multi-period compounding.

Therefore it is most appropriate for single-period returns.

10
Q

1.1 What is the geometric mean return?

A

It is a compound rate. When period rates of return vary from period to period, the geometric mean return will have a value less than the arithmetic mean return.
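A small sketch (the period returns are assumed) showing the geometric mean falling below the arithmetic mean when returns vary:

```python
import math

# Illustrative period returns (assumed): +10%, -5%, +20%
returns = [0.10, -0.05, 0.20]

arithmetic = sum(returns) / len(returns)

# Geometric mean return: compound the (1 + r) terms, then take the n-th root
growth = math.prod(1 + r for r in returns)
geometric = growth ** (1 / len(returns)) - 1

print(round(arithmetic, 4))  # 0.0833
print(round(geometric, 4))
assert geometric <= arithmetic  # holds whenever returns vary
```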

11
Q

1.1 in the light of arithmetic mean’s shortcoming, why is geometric mean a good thing?

A

because it DOES reflect multi-period compounding

therefore it is most appropriate for average return over multiple periods

12
Q

1.1 Fill the gap: The geometric mean is always ______ than or equal to the arithmetic mean

A

less

13
Q

1.1 What is the harmonic mean?

A

Used for certain computations, such as the average cost of shares purchased over time.

But most typically used for calculating ratios. PE ratio etc.

Because of its formula, it gives a lower weight to extreme observations.

14
Q

1.1 What is a caveat with the Harmonic mean and how is it overcome?

A

We can ONLY calculate a harmonic mean of positive numbers. For a set of returns that includes negative numbers, we can treat them in the same way we did with geometric means: use (1 + return) for each period, then subtract 1 from the result.
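A sketch of both ideas on this card and the previous one, with assumed purchase prices and returns:

```python
from statistics import harmonic_mean

# Average cost per share when a fixed dollar amount is invested each period
# (assumed purchase prices of $10 and $20 per share)
prices = [10, 20]
avg_cost = harmonic_mean(prices)
print(round(avg_cost, 2))  # 13.33

# Workaround when returns include negatives: shift to (1 + r), then subtract 1
returns = [0.10, -0.05]
hm_return = harmonic_mean([1 + r for r in returns]) - 1
print(round(hm_return, 4))  # 0.0195
```

Note $13.33 is below the simple average of $15: the harmonic mean gives less weight to the higher price, because fewer shares were bought at it.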

15
Q

1.1 Describe an equation that links the arithmetic mean, harmonic mean, and geometric mean

A
16
Q

1.1 What is the trimmed or winsorized mean?

A

The trimmed mean removes the highest and lowest values from a dataset, while the winsorized mean replaces them with the next-closest values

17
Q

1.1 Summarise the appropriate uses for each type of mean calculation

A
  • Arithmetic mean - include all values, including outliers
  • Geometric mean - compound the rate of returns over multiple periods
  • Harmonic mean - calculate the average share cost from periodic purchases in a fixed money amount
  • Trimmed or winsorized mean - decrease the effect of outliers
18
Q

1.1 What is the most appropriate method when calculating mean annual return

A

Geometric mean, because returns are compounded

19
Q

1.2 What is the Money-weighted return?

A

The Money-weighted return applies the concept of the internal rate of return (IRR) to investment portfolios. An IRR is the interest rate at which a series of cash inflows and outflows sum to zero when discounted to their present value. That is, they have a net present value (NPV) of zero.

The money-weighted rate of return is defined as IRR on a portfolio, taking into account all cash inflows and outflows. The beginning value of the account is an inflow, as are all deposits into the account. All withdrawals from the account are outflows, as is the ending value. (the specific application of IRR to a portfolio)
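A hedged sketch of solving for the money-weighted return numerically. The cash flows are assumed: a $100 deposit at t=0, a $50 deposit at t=1, and an ending value of $175 at t=2; bisection stands in for a financial calculator's IRR routine:

```python
# Money-weighted return as an IRR, solved by bisection (assumed cash flows)
cash_flows = [-100, -50, 175]  # outflows from the investor are negative

def npv(rate, cfs):
    """Net present value of cash flows at periods 0, 1, 2, ..."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cfs))

def irr(cfs, lo=-0.99, hi=1.0, tol=1e-10):
    """Bisection search for the rate where NPV = 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cfs) > 0:
            lo = mid   # NPV still positive: the discount rate must rise
        else:
            hi = mid
    return (lo + hi) / 2

mwr = irr(cash_flows)
print(round(mwr, 4))  # ~ 0.0963
```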

20
Q

1.2 What is the formula for money-weighted return?

A

It is an IRR calculation: set NPV to 0 and solve for the rate. THIS DOES REFLECT THE SIZE OF THE FUND

21
Q

1.2 When solving for MWR (IRR), how can we answer this question in the exam by trial and error?

A

Use one of the answers they give you - normally the middle one. If the NPV equals 0, it is right; if not, decide whether the rate needs to be higher or lower.

22
Q

1.2 What is the Time-weighted rate of return?

A

Measures compound growth: the rate at which $1 compounds over a specified performance horizon. Time-weighting is the process of averaging a set of values over time. The annual time-weighted return is computed by breaking the horizon into subperiods at each external cash flow, calculating the holding period return for each subperiod, and compounding those subperiod returns (taking the geometric mean to annualise).

CRUCIALLY - IT DOES NOT REFLECT THE SIZE OF THE FUND.
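A minimal sketch with assumed account values, showing how subperiod returns are compounded independently of the size of a deposit:

```python
import math

# Assumed example: subperiod values are measured just before each external
# cash flow, which removes the cash flow's effect on the return.
# Year 1: start 100, end 120 (then a deposit occurs)
# Year 2: start 150 (after the deposit), end 180
hpr1 = 120 / 100 - 1   # 20%
hpr2 = 180 / 150 - 1   # 20%

# Compound the subperiod returns, then take the geometric mean to annualise
twr = math.sqrt((1 + hpr1) * (1 + hpr2)) - 1
print(round(twr, 4))  # 0.2
```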

23
Q

1.2 How is Time-weighted rate of return calculated? (step by step)

A
24
Q

1.2 What is the formula we can use for TWR?

A

Essentially a geometric mean in the way that it is calculated

25
1.2 Which measure for return do PMs usually prefer?
NOTE - the time-weighted rate of return is not affected by the timing of cash inflows and outflows. In the investment management industry, time-weighted return is the preferred method of performance measurement, because PMs typically do not control the timing of deposits to and withdrawals from the accounts they manage.

If funds are contributed to an investment portfolio just before a period of relatively poor portfolio performance, the money-weighted rate of return will tend to be lower than the time-weighted rate of return. On the other hand, if funds are contributed at a favorable time (just before a period of relatively high returns), the money-weighted rate of return will be higher than the time-weighted rate of return.

The use of the time-weighted return removes these distortions, and thus provides a better measure of a manager's ability to select investments over the period. If the manager has complete control over money flows into and out of an account, the money-weighted rate of return would be the more appropriate performance measure.
26
1.3 What is the formula for annualised return?
Interest rates and market returns are typically stated as annualised returns, regardless of the actual length of the time period over which they occur. To annualise an HPR that is realised over a specific number of days: annualised return = (1 + HPR)^(365/days) − 1
27
1.3 What is non-annual compounding?
A stated rate by itself DOES NOT REFLECT COMPOUNDING. If we want to compound a semi-annual rate over a year: (1 + semi-annual rate)^2, then subtract 1.
28
1.3 What is the EAR (Effective Annual Rate) if given a stated/quoted interest rate?
If an interest rate is stated as 8% per year, compounded semi-annually: you compound 4% twice, i.e. 1.04 × 1.04 = 1.0816, then subtract 1 to get 8.16%. So EAR = 8.16%.
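The card's example, as a general sketch for any stated rate and compounding frequency:

```python
# EAR from a stated annual rate with m compounding periods per year
# (8% stated, semi-annual compounding, as in the card)
stated = 0.08
m = 2
ear = (1 + stated / m) ** m - 1
print(round(ear, 4))  # 0.0816
```

Setting m = 4 (quarterly) or m = 12 (monthly) raises the EAR by progressively smaller amounts.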
29
1.3 What is the formula for Continuously Compounded rates?
Shows how EAR increases in marginally smaller increments as compounding frequency rises. If we let the number of periods tend to infinity, we find a ceiling for EAR: EAR = e^Rcc − 1. Make sure to decimalise Rcc when calculating.
30
1.3 how do we solve for the quoted continuously compounded rates given actual HPRs?
By taking the natural log of each side of the previous equation: Rcc = ln(1 + HPR)
31
1.3 Why may it be useful to use continuously compounded returns?
1. They are additive over multiple periods.
2. e^Rcc can be used as both a multiplying and a discounting factor - ease of use.
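Both properties can be verified directly (the prices below are assumed):

```python
import math

# Continuously compounded returns are additive across periods
p0, p1, p2 = 100.0, 110.0, 99.0

rcc_1 = math.log(p1 / p0)      # period 1 CC return
rcc_2 = math.log(p2 / p1)      # period 2 CC return
rcc_total = math.log(p2 / p0)  # CC return over the whole span

# Property 1: the subperiod CC returns sum to the total CC return
assert abs((rcc_1 + rcc_2) - rcc_total) < 1e-12

# Property 2: e^Rcc recovers the holding period growth factor
assert abs(math.exp(rcc_total) - p2 / p0) < 1e-12
```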
32
1.3 What is the formula for the present value of a future cash flow:
33
1.3 What formula can you use to calculate continuously compounded returns
and note: the periods are additive. therefore CC return from t=0 to t=1, and t=1 to t=2, is the same as t=0 to t=2 when added.
34
1.3 What is the gross return?
Gross return refers to the total return on a security portfolio before deducting fees for the management and administration of the investment account. Net return refers to the return after these fees have been deducted. Commissions on trades and other costs that are necessary to generate the investment returns are deducted in both gross and net return measures.
35
1.3 What is Pretax nominal return?
Pretax nominal return refers to the return before paying taxes. Dividend income, interest income, short-term capital gains, and long-term capital gains may all be taxed at different rates.
36
1.3 What is After-tax nominal return?
After-tax nominal return refers to the return after the tax liability is deducted.
37
1.3 What is real return?
Real return is nominal return adjusted for inflation. Consider an investor who earns a nominal return of 7% over a year when inflation is 2%. The investor's approximate real return is simply 7 − 2 = 5%. The investor's exact real return is slightly lower: 1.07 / 1.02 − 1 = 0.049 = 4.9%.
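The card's own example, worked both ways:

```python
# Exact vs. approximate real return: 7% nominal, 2% inflation (from the card)
nominal = 0.07
inflation = 0.02

approx_real = nominal - inflation                 # 5%
exact_real = (1 + nominal) / (1 + inflation) - 1  # slightly lower

print(round(approx_real, 3))  # 0.05
print(round(exact_real, 3))   # 0.049
```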
38
1.3 How can we exactly calculate real return?
39
1.3 What does Real return measure?
Real return measures the increase in an investor's purchasing power—how much more goods she can purchase at the end of one year due to the increase in the value of her investments.
40
1.3 What is a leveraged return?
A leveraged return refers to a return to an investor that is a multiple of the return on the underlying asset. The leveraged return is calculated as the gain or loss on the investment as a percentage of an investor's cash investment. An investment in a derivative security, such as a futures contract, produces a leveraged return because the cash deposited is only a fraction of the value of the assets underlying the futures contract. Leveraged investments in real estate are common: investors pay only a portion of a property's cost in cash and borrow the rest.
41
1.3 What is a formula for calculating leveraged return?
Vo = the amount the fund can invest without leverage (the investor's own cash)
Vb = the amount the fund can borrow
rb = interest rate on the borrowed money
r = the return earned by investing the proceeds, including the borrowed money
42
1.3 Different formula for leveraged returns?:
Alternatively, with the variables above: RL = [r × (V0 + VB) − (rB × VB)] / V0, i.e. the return earned on all invested funds, less the interest on the borrowed funds, divided by the investor's own cash V0.
43
3.1 What are measures of central tendency? And a prominent example.
Measures of central tendency identify the center, or average, of a dataset. This central point can then be used to represent the typical, or expected, value in the dataset. The arithmetic mean is the sum of the observation values divided by the number of observations. It is the most widely used measure of central tendency.
44
3.1 What is a sample mean?
An example of an arithmetic mean is a sample mean, which is the sum of all the values in a sample of a population, ΣX, divided by the number of observations in the sample, n. It is used to make inferences about the population mean. The sample mean is expressed as follows:
45
3.1 What is the median?
The median is the midpoint of a dataset, where the data are arranged in ascending or descending order. Half of the observations lie above the median, and half below. To determine the median, arrange the data from highest to lowest (or lowest to highest) value and take the middle observation; if the number of observations is even, take the mean of the middle two. The median is important because the mean can be heavily affected by outliers.
46
3.1 What is the mode? And what does it mean to be unimodal, bimodal, or trimodal?
The mode is the value that occurs most frequently in a dataset. A dataset may have more than one mode, or even no mode. When a distribution has one value that appears most frequently, it is said to be unimodal. When a dataset has two or three values that occur most frequently, it is said to be bimodal or trimodal, respectively.
47
3.1 What are two methods for dealing with outliers?
Trimmed mean, winsorized mean
48
3.1 What is the trimmed mean?
A trimmed mean excludes a stated percentage of the most extreme observations. A 1% trimmed mean, for example, would discard the lowest 0.5% and the highest 0.5% of the observations.
49
3.1 What is a winsorized mean?
Instead of discarding the highest and lowest observations, we substitute a value for them. To calculate a 90% winsorized mean, for example, we would determine the 5th and 95th percentile of the observations, substitute the 5th percentile for any values lower than that, substitute the 95th percentile for any values higher than that, and then calculate the mean of the revised dataset.
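A sketch of both outlier treatments on one assumed dataset (here the "trim"/"winsorize" is just the single lowest and highest observation, for brevity):

```python
# Trimmed vs. winsorized means on an illustrative dataset with an outlier
data = sorted([2, 3, 4, 5, 6, 7, 8, 9, 20, 100])

# Trim: drop the lowest and highest observations
trimmed = data[1:-1]
trimmed_mean = sum(trimmed) / len(trimmed)

# Winsorize: replace the extremes with their nearest remaining neighbours
winsorized = [data[1]] + data[1:-1] + [data[-2]]
winsorized_mean = sum(winsorized) / len(winsorized)

plain_mean = sum(data) / len(data)
print(plain_mean, trimmed_mean, winsorized_mean)  # 16.4 7.75 8.5
```

Both adjusted means sit far below the plain mean, which the single outlier (100) drags upward.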
50
3.1 What is a quantile + examples?
Quantile is the general term for a value at or below which a stated proportion of the data in a distribution lies. Examples of quantiles include the following:
  • Quartile - the distribution is divided into quarters.
  • Quintile - the distribution is divided into fifths.
  • Decile - the distribution is divided into tenths.
  • Percentile - the distribution is divided into hundredths (percentages).
51
3.1 what is the Interquartile range?
The difference between the third quartile (75th percentile) and the first quartile (25th percentile) is known as the interquartile range.
52
3.1 What is a box and whisker plot?
To visualize a dataset based on quantiles, we can create a box and whisker plot. In a box and whisker plot, the box represents the central portion of the data, such as the interquartile range. The vertical line represents the entire range.
53
3.1 How can the Box and whisker Plot be adjusted to show for outliers?
Can add 'fences' positioned at 1.5× the IQR beyond the edges of the box; observations outside the fences are shown as outliers.
54
3.1 What is 'dispersion'?
Dispersion is defined as the variability around the central tendency. The common theme in finance and investments is the tradeoff between reward and variability, where the central tendency is the measure of the reward and dispersion is a measure of risk.
55
3.1 What is the range?
range = maximum value − minimum value
56
3.1 What is the mean absolute deviation?
The mean absolute deviation (MAD) is the average of the absolute values of the deviations of individual observations from the arithmetic mean. The computation of the MAD uses the absolute values of each deviation from the mean because the sum of the actual deviations from the arithmetic mean is zero. Absolute refers to SIZE (rather than sign) - HELPS TO REMEMBER
57
3.1 What is the sample variance?
The sample variance, s², is the measure of dispersion that applies when we are evaluating a sample of n observations from a population. REMEMBER THE (n − 1) in the denominator. Normally you have to take the square root to get back to the sample standard deviation.
58
3.1 What is the sample standard deviation?
The sample standard deviation is the square root of the sample variance. The sample standard deviation, s, is calculated as follows:
59
3.1 What is relative dispersion?
Relative dispersion is the amount of variability in a distribution around a reference point or benchmark.
60
3.1 What is the coefficient of variation?
Relative dispersion is commonly measured with the coefficient of variation (CV): CV = s / mean. CV measures the amount of dispersion in a distribution relative to the distribution's mean. This is useful because it enables us to compare dispersion across different sets of data. In an investments setting, the CV is used to measure the risk (variability) per unit of expected return (mean). ULTIMATELY - CV IS VARIATION PER UNIT OF RETURN. YOU WOULD PREFER THE LOWER CV.
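A small sketch (the return samples are assumed) comparing two funds' risk per unit of return:

```python
from statistics import mean, stdev

# Coefficient of variation: dispersion relative to the mean (assumed samples)
fund_a = [0.08, 0.12, 0.10, 0.06]
fund_b = [0.02, 0.20, 0.15, -0.05]

cv_a = stdev(fund_a) / mean(fund_a)
cv_b = stdev(fund_b) / mean(fund_b)

print(round(cv_a, 3), round(cv_b, 3))
assert cv_a < cv_b  # Fund A: less variation per unit of return, so preferred
```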
61
3.1 What is downside risk?
When we use variance or standard deviation as risk measures, we calculate risk based on outcomes both above and below the mean. In some situations, it may be more appropriate to consider only outcomes less than the mean (or some other specific value) in calculating a risk measure. In this case, we are measuring downside risk.
62
3.1 What is target downside deviation? (Also known as target semideviation).
One measure of downside risk is target downside deviation, which is also known as target semideviation. Calculating target downside deviation is similar to calculating standard deviation, but in this case, we choose a target value against which to measure each outcome and only include deviations from the target value in our calculation if the outcomes are below that target.
63
3.1 What is the formula for calculating target downside standard deviation?
In the data, just ignore all returns that are above the target (eg all data points above 3% return) With the remainder, take the data point and subtract the target, use this summation in the equation.
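The steps above can be sketched as follows (the return series and 3% target are assumed; note n is the total number of observations, not just those below the target):

```python
import math

# Target downside deviation (target semideviation), assumed data
returns = [0.04, -0.02, 0.08, 0.01, -0.05]
target = 0.03
n = len(returns)

# Keep only deviations for outcomes below the target
below = [r - target for r in returns if r < target]

tdd = math.sqrt(sum(d ** 2 for d in below) / (n - 1))
print(round(tdd, 4))  # 0.0482
```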
64
3.2 What is Skewness, or skew?
Skewness or skew, refers to the extent to which a distribution is not symmetrical. Nonsymmetrical distributions may be either positively or negatively skewed and result from the occurrence of outliers in the dataset.
65
3.2 What is an outlier?
Outliers are observations extraordinarily far from the mean, either above or below:
66
3.2 What is a positively skewed distribution?
A positively skewed distribution is characterized by outliers greater than the mean (in the upper region, or right tail). A positively skewed distribution is said to be skewed right because of its relatively long upper (right) tail.
67
3.2 What is a negatively skewed distribution?
A negatively skewed distribution has a disproportionately large amount of outliers less than the mean that fall within its lower (left) tail. A negatively skewed distribution is said to be skewed left because of its long lower tail.
68
3.2 How are the mean, median, and mode, related to each other in a symmetrical distribution?
For a symmetrical distribution, the mean, median, and mode are equal.
69
3.2 How are the mean, median, and mode, related to each other in a positively skewed distribution?
For a positively skewed, unimodal distribution, the mode is less than the median, which is less than the mean. The mean is affected by outliers; in a positively skewed distribution, there are large, positive outliers, which will tend to pull the mean upward, or more positive.
70
3.2 How are the mean, median, and mode, related to each other in a negatively skewed distribution?
For a negatively skewed, unimodal distribution, the mean is less than the median, which is less than the mode. In this case, there are large, negative outliers that tend to pull the mean downward (to the left).
71
3.2 How do we calculate the sample skewness?
Sample skewness is equal to the sum of the cubed deviations from the mean divided by the cubed standard deviation and by the number of observations. Sample skewness for large samples is approximated as follows:
72
3.2 Once calculated, what does the number delivered from the sample skewness formula tell us?
Note that the denominator is always positive, but the numerator can be positive or negative depending on whether observations above the mean or observations below the mean tend to be farther from the mean, on average. When a distribution is right skewed, sample skewness is positive because the deviations above the mean are larger, on average. A left-skewed distribution has a negative sample skewness. If it equals 0, the distribution is not skewed.
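A sketch of the large-sample approximation on an assumed right-skewed dataset:

```python
import statistics

# Sample skewness (large-sample approximation): mean cubed deviation
# divided by the cubed standard deviation (illustrative assumed data)
data = [1, 2, 2, 3, 3, 3, 4, 10]   # one large positive outlier
n = len(data)
m = statistics.mean(data)
s = statistics.pstdev(data)  # 1/n form, consistent with the approximation

skew = (sum((x - m) ** 3 for x in data) / n) / s ** 3
print(round(skew, 3))
assert skew > 0   # right-skewed data gives positive sample skewness
```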
73
3.2 What is kurtosis?
Kurtosis is a measure of the degree to which a distribution is more or less peaked than a normal distribution. It is critical in a risk management setting. Most research about the distribution of securities returns has shown that returns are not normally distributed. Actual securities returns tend to exhibit both skewness and kurtosis. Skewness and kurtosis are critical concepts for risk management because when securities returns are modeled using an assumed normal distribution, the predictions from the models will not take into account the potential for extremely large, negative outcomes.
74
3.2 Normal distribution has a kurtosis (when you calculate it) of...
3
75
3.2 What is Leptokurtic?
Leptokurtic describes a distribution that is more peaked than a normal distribution. We don't like this as there are bigger outliers on both sides. Less chance of a mean sized deviation from the mean. Kurtosis > 3
76
3.2 What is Platykurtic?
platykurtic refers to a distribution that is less peaked, more domed, or flatter than a normal one. (with thinner tails - so prefer) LOOKS LIKE A PLATYPUS BILL. Kurtosis < 3
77
3.2 What is Mesokurtic?
A distribution is mesokurtic if it has the same kurtosis as a normal distribution.
78
3.2 What does it mean for a distribution to exhibit excess kurtosis?
A distribution is said to exhibit excess kurtosis if it has either more or less kurtosis than the normal distribution. Excess kurtosis = kurtosis -3
79
3.2 How much is the kurtosis of all normal distributions when computed? And how does this affect how it is reported?
The computed kurtosis for all normal distributions is three. Statisticians, however, sometimes report excess kurtosis, which is defined as kurtosis minus three. Thus, a normal distribution has excess kurtosis equal to zero, a leptokurtic distribution has excess kurtosis greater than zero, and platykurtic distributions will have excess kurtosis less than zero.
80
3.2 How is sample kurtosis for large samples calculated?
Sample kurtosis for large samples is approximated using deviations raised to the fourth power:
81
3.2 What is a scatter plot?
Scatter plots are a method for displaying the relationship between two variables. With one variable on the vertical axis and the other on the horizontal axis, their paired observations can each be plotted as a single point.
82
3.2 What is a key advantage of scatter graphs?
A key advantage of creating scatter plots is that they can reveal nonlinear relationships, which are not described by the correlation coefficient. Panel C illustrates such a relationship. Although the correlation coefficient for these two variables is close to zero, their scatter plot shows clearly that they are related in a predictable way.
83
3.2 What is Covariance?
Covariance is a measure of how two variables move together. BUT KEEP IN MIND In practice, the covariance is difficult to interpret. The value of covariance depends on the units of the variables. The covariance of daily price changes of two securities priced in yen will be much greater than their covariance if the securities are priced in dollars. Like the variance, the units of covariance are the square of the units used for the data. AND we cannot interpret the relative strength of the relationship between two variables.
84
3.2 What is the sample covariance formula?
85
3.2 What is the problem with calculating covariance?
The raw number does not tell us much; it can be any value. We can only look at the sign to see whether the variables move with each other positively or negatively. This is why we use correlation.
86
3.2 What is correlation?
A standardized measure of the linear relationship between two variables is called the correlation coefficient, or simply correlation.
87
3.2 What is the correlation formula?
In other words: correlation = covariance / (SD_X × SD_Y)
88
3.2 Please describe some properties of the correlation of two variables:
1. Correlation measures the strength of the linear relationship between two random variables.
2. Correlation has no units.
3. The correlation ranges from -1 to +1. That is, -1 ≤ ρXY ≤ +1.
4. If ρXY = 1.0, the random variables have perfect positive correlation: a movement in one random variable results in a proportional positive movement in the other relative to its mean.
5. If ρXY = -1.0, the random variables have perfect negative correlation: a movement in one random variable results in an exact opposite proportional movement in the other relative to its mean.
6. If ρXY = 0, there is no linear relationship between the variables, indicating that prediction of Y cannot be made on the basis of X using linear methods.
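These properties can be checked with a short sketch (the two return series are assumed):

```python
from statistics import mean, stdev

# Sample covariance and correlation for two illustrative return series
x = [0.02, 0.05, -0.01, 0.04]
y = [0.01, 0.06, -0.02, 0.03]

n = len(x)
mx, my = mean(x), mean(y)
cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# correlation = covariance / (sd_x * sd_y): unit-free, bounded in [-1, +1]
corr = cov_xy / (stdev(x) * stdev(y))
print(round(corr, 4))  # ~ 0.973
assert -1 <= corr <= 1
```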
89
3.2 What is spurious correlation?
Spurious correlation refers to correlation that is either the result of chance or present due to changes in both variables over time that is caused by their association with a third variable. For example, we can find instances where two variables that are both related to the inflation rate exhibit significant correlation, but for which causation in either direction is not present. EG: In his book Spurious Correlations, Tyler Vigen presents the following examples. The correlation between the age of each year's Miss America and the number of films Nicolas Cage appeared in that year is 87%.
90
4.1 What is a discrete probability distribution?
A distribution in which the random variable can take on only a countable number of possible outcomes.
91
4.1 What is the expected value? (of a random variable)
The expected value of a random variable is the probability weighted average of the possible outcomes for the variable. The mathematical representation for the expected value of random variable X, that can take on any of the values from x1 to xn, is:
92
4.1 What is a probability tree?
A general framework, called a probability tree, is used to show the probabilities of various outcomes.
93
4.1 What are conditional expected values?
As the name implies, conditional expected values are contingent on the outcome of some other event. An analyst would use a conditional expected value to revise his expectations when new information arrives.
94
4.1 What is Bayes' formula used for?
Bayes' formula is used to update a given set of prior probabilities for a given event in response to the arrival of new information.
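A sketch of an update with Bayes' formula (the probabilities are assumed for illustration): prior P(beat) = 0.4; the stock rises 70% of the time when earnings beat and 30% of the time otherwise; we observe a rise.

```python
# Bayes' formula: posterior = P(new info | event) * P(event) / P(new info)
p_event = 0.4                 # prior probability the firm beats earnings
p_rise_given_event = 0.7      # assumed likelihoods
p_rise_given_no_event = 0.3

# Total probability of observing the new information (a price rise)
p_rise = (p_rise_given_event * p_event
          + p_rise_given_no_event * (1 - p_event))

# Updated (posterior) probability after seeing the rise
p_event_given_rise = p_rise_given_event * p_event / p_rise
print(round(p_event_given_rise, 4))  # 0.6087
```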
95
4.1 What is Bayes' formula? (words)
96
4.1 When simplified, what is Bayes' formula reduced to?
97
5.1 Formula for the expected return of a portfolio:
98
5.1 What is the weight of an asset within the portfolio? Formula:
99
5.1 What is the formula for Expected Return of a portfolio of two assets?
Just weighted return
100
5.1 What is the formula for portfolio variance (and a way of remembering it)?
Variance is an averaged squared distance, so imagine squaring a sum: start from (wA σA + wB σB)² and expand it the way two brackets are multiplied, then attach ρ (the correlation coefficient) to the doubled middle term:

σp² = wA²σA² + wB²σB² + 2 wA wB ρAB σA σB

Then remember the relationship between correlation and covariance: ρAB σA σB = CovAB.
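A numeric sketch of the two-asset formula (weights, sigmas, and correlation are assumed):

```python
import math

# Two-asset portfolio variance and standard deviation (illustrative inputs)
w_a, w_b = 0.6, 0.4
sigma_a, sigma_b = 0.15, 0.25
rho = 0.3   # correlation between the two assets' returns

var_p = (w_a**2 * sigma_a**2
         + w_b**2 * sigma_b**2
         + 2 * w_a * w_b * rho * sigma_a * sigma_b)
sigma_p = math.sqrt(var_p)
print(round(sigma_p, 4))  # ~ 0.1533

# With rho < 1, portfolio risk is below the weighted average of the sigmas
assert sigma_p < w_a * sigma_a + w_b * sigma_b
```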
101
5.1 How does this work for 3 assets? (portfolio variance)
102
5.1 What is the formula for covariance between the return of two assets? (expected)
103
5.1 What is the formula for a sample covariance for a sample of returns data?
104
5.1 What does a covariance matrix look like, and what is the value of the covariance down the diagonal?
the variance. and note - covariance does not depend on order.
105
5.1 What is the formula for the portfolio variance?
All variations of the two
106
5.1 What is joint probability?
The probability of two things happening at once
107
5.1 What is a joint probability function for returns? Show me what it looks like
108
5.1 Once in the joint probability function of returns - what do we need to do to calculate the covariance of asset A and asset B?
1. Calculate the expected values.
2. Use the attached table to calculate each of the covariance terms under each probability.
3. Add them together.
109
5.1 What are the key distances in a. normal distribution?
For a 95% confidence interval, the key distance is 1.96 standard deviations (often rounded to 2). The area under the curve within the interval is the confidence level. NOTE ALSO - for a 5% tail, the distance from the middle is 1.65 standard deviations.
110
5.1 List of standard deviation distances for tail sizes in a normal distribution
LEARN
111
5.1 If given the question 'calculate a 95% confidence interval for the next year's return' given a mean annual return and standard deviation, how would you do it?
112
5.1 what is a z-value?
The number of standard deviations we need to go from the mean to capture the confidence level %. This distance is referred to as a z-value.
113
5.1 What is a standard normal distribution?
a normal distribution that has been standardised so that mean = 0 and standard deviation = 1
114
5.1 What is the formula for calculating z value?
NOTE - this is with the backdrop of being a standard normal distribution. So our calculation is translated into this model when calculating the z value.
115
5.1 What is shortfall risk?
Shortfall risk is the probability that a portfolio value or return will fall below a particular target value or return over a given period.
116
5.1 What is Roy's safety-first criterion?
Roy's safety-first criterion states that the optimal portfolio minimizes the probability that the return of the portfolio falls below some minimum acceptable level. This minimum acceptable level is called the threshold level.
117
5.1 Symbolically, how can Roy's safety-first criterion be stated?
Also think about it like the calculation of a z value
118
5.1 If portfolio returns are normally distributed, how can Roy's safety-first criterion be stated? (clue: SFR)
119
5.1 What are the two steps for choosing among portfolios with normally distributed returns using Roy's safety-first criterion?
But then after the largest ratio has been chosen, this needs to be used in the z-table of negative values to find the percentage value that represents the shortfall risk of a portfolio.
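A sketch of the two steps with assumed portfolio inputs and a 2% threshold return:

```python
# Roy's safety-first ratio: SFR = (E(Rp) - threshold) / sigma_p
# (portfolio means/sds and the threshold are assumed for illustration)
threshold = 0.02
portfolios = {
    "A": {"mean": 0.10, "sd": 0.16},
    "B": {"mean": 0.08, "sd": 0.10},
}

sfr = {name: (p["mean"] - threshold) / p["sd"]
       for name, p in portfolios.items()}

# Step 2: choose the portfolio with the LARGEST ratio
best = max(sfr, key=sfr.get)
print({k: round(v, 2) for k, v in sfr.items()}, best)  # {'A': 0.5, 'B': 0.6} B
```

The shortfall probability of the chosen portfolio is then found from the z-table as the area below -SFR.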
120
6.1 What is a lognormal distribution?
Replace this with 'log is normal' in your head: random variable Y is lognormal if ln(Y) is normal. It is used to model price 'relatives': (Pt/P0)
121
6.1 Please explain the relationship between HPR and the modelling of price relatives on a lognormal distribution.
Because Pt/P0 is the future price over today's price, it equals 1 + HPR (as covered previously). Therefore ln(Pt/P0) = ln(1 + HPR) = Rcc. Continuously compounded returns are NORMALLY DISTRIBUTED, so THE LOGARITHM OF YOUR PRICE RELATIVE IS YOUR CONTINUOUSLY COMPOUNDED RETURN.
122
6.1 Describe the visual difference between a normal and lognormal distribution:
Lognormal is always positive and positively skewed
123
6.1 What do we use the lognormal distribution for?
The lognormal distribution is useful for modeling asset prices if we think of an asset's future price as the result of a continuously compounded return on its current price.
124
6.1 how can we model asset prices if we think of an asset's future price as the result of a continuously compounded return on its current price?
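As a sketch with hypothetical inputs: grow the current price at the continuously compounded rate, Pt = P0 × e^(Rcc × t).

```python
import math

p0 = 100.0        # hypothetical current price
rcc = 0.0953      # hypothetical continuously compounded annual return
t = 2             # years
pt = p0 * math.exp(rcc * t)   # future price under the lognormal model
print(round(pt, 2))  # about 121.0
```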
125
6.1 How do we scale volatility?
Scale up by the square root of time.
126
6.1 How would you do this: "daily volatility of FTSE100 index returns is estimated to be 0.86%. Calculate the annualised estimated volatility of FTSE 100 returns assuming 250 trading days in the year."
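Worked as a sketch in code (annualise by the square root of time):

```python
import math

daily_vol = 0.0086                        # 0.86% daily volatility
annual_vol = daily_vol * math.sqrt(250)   # scale up by sqrt(250 trading days)
print(round(annual_vol, 3))  # 0.136, i.e. about 13.6% annualised
```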
127
6.1 What is a Monte Carlo simulation?
Monte Carlo simulation is a technique based on the repeated generation of one or more risk factors that affect security values, to generate a distribution of security values. 1. For each of the risk factors, the analyst must specify the parameters of the probability distribution that the risk factor is assumed to follow. 2. A computer is then used to generate random values for each risk factor based on its assumed probability distribution. 3. Each set of randomly generated risk factors is used with a pricing model to value the security. 4. This procedure is repeated many times (100s, 1,000s, or 10,000s), and the distribution of simulated asset values is used to draw inferences about the expected (mean) value of the security, and possibly the variance of security values about the mean as well.
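A toy sketch of that procedure: the single risk factor here is an annual return drawn from an assumed normal distribution, and the "pricing model" is simply applying that return to a hypothetical current price.

```python
import random
import statistics

random.seed(42)                       # reproducible random draws
s0, mu, sigma = 100.0, 0.07, 0.20     # hypothetical price and return parameters
n_trials = 10_000

# Repeat many times: draw the risk factor, then "price" the security.
simulated = [s0 * (1 + random.gauss(mu, sigma)) for _ in range(n_trials)]

# Draw inferences from the distribution of simulated values.
print(round(statistics.mean(simulated), 1), round(statistics.stdev(simulated), 1))
```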
128
6.1 What is the Monte Carlo simulation used for?
1. Value complex securities. 2. Simulate the profits/losses from a trading strategy. 3. Calculate estimates of value at risk (VaR) to determine the riskiness of a portfolio of assets and liabilities. 4. Simulate pension fund assets and liabilities over time to examine the variability of the difference between the two. 5. Value portfolios of assets that have nonnormal return distributions.
129
6.1 What are the advantages and limitations of the Monte Carlo simulation?
An advantage of Monte Carlo simulation is that its inputs are not limited to the range of historical data. This allows an analyst to test scenarios that have not occurred in the past. The limitations of Monte Carlo simulation are that it is fairly complex and will provide answers that are no better than the assumptions about the distributions of the risk factors and the pricing/valuation model that is used. Also, simulation is not an analytic method, but a statistical one, and cannot offer the insights provided by an analytic method.
130
6.1 What is resampling?
Resampling is another method for generating data inputs to use in a simulation. Often, we do not (or cannot) have data for a population, and can only approximate the population by sampling from it. (For example, we may think of the observed historical returns on an investment as a sample from the population of possible return outcomes.) To conduct resampling, we start with the observed sample and repeatedly draw subsamples from it, each with the same number of observations. From these samples, we can infer parameters for the population, such as its mean and variance.
131
6.1 What is bootstrap resampling?
In bootstrap resampling, we draw repeated samples of size n from the full dataset, replacing the sampled observations each time so that they might be redrawn in another sample. We can then directly calculate the standard deviation of these sample means as our estimate of the standard error of the sample mean.
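A sketch with hypothetical return data, using `random.choices` (which samples with replacement):

```python
import random
import statistics

random.seed(0)
returns = [0.04, -0.02, 0.07, 0.01, 0.05, -0.03, 0.06, 0.02]  # hypothetical

# Draw repeated samples of size n with replacement, keeping each sample mean.
boot_means = [
    statistics.mean(random.choices(returns, k=len(returns)))
    for _ in range(5_000)
]

# The standard deviation of the sample means estimates the standard error.
se_estimate = statistics.stdev(boot_means)
print(round(se_estimate, 4))
```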
132
6.1 Why is bootstrap resampling different from the Monte Carlo simulation?
Simulation using data from bootstrap resampling follows the same procedure as Monte Carlo simulation. The difference is the source and scope of the data. For example, if a simulation uses bootstrap resampling of historical returns data, its inputs are limited by the distribution of actual outcomes.
133
7.1 What is probability sampling?
Probability sampling refers to selecting a sample when we know the probability of each member of the population being included in the sample.
134
7.1 What is random sampling?
With random sampling, each item is assumed to have the same probability of being selected.
135
7.1 What is simple random sampling?
If we have a population of data and select our sample by using a computer to randomly select a number of observations from the population, each data point has an equal probability of being selected—we call this simple random sampling. If we want to estimate the mean profitability for a population of firms, this may be an appropriate method.
136
7.1 What is nonprobability sampling?
Nonprobability sampling is based on either low cost and easy access to some data items, or on using the judgment of the researcher in selecting specific data items. Less randomness in selection may lead to greater sampling error.
137
7.1 What is systematic sampling?
Another way to form an approximately random sample is systematic sampling—selecting every nth member from a population.
138
7.1 What is stratified random sampling?
Stratified random sampling uses a classification system to separate the population into smaller groups based on one or more distinguishing characteristics. From each subgroup, or stratum, a random sample is taken and the results are pooled. The size of the samples from each stratum is based on the size of the stratum relative to the population. Stratified sampling is often used in bond indexing because of the difficulty and cost of completely replicating the entire population of bonds. In this case, bonds in a population are categorized (stratified) according to major bond risk factors such as duration, maturity, coupon rate, and the like. Then, samples are drawn from each separate category and combined to form a final sample.
139
7.1 what is cluster sampling?
Cluster sampling is also based on subsets of a population, but in this case, we are assuming that each subset (cluster) is representative of the overall population with respect to the item we are sampling. For example, we may have data on personal incomes for a state's residents by county. The data for each county is a cluster.
140
7.1 What is one-stage cluster sampling?
In one-stage cluster sampling, a random sample of clusters is selected, and all the data in those clusters comprise the sample.
141
7.1 what is a two-stage cluster sampling?
In two-stage cluster sampling, random samples from each of the selected clusters comprise the sample. Contrast this with stratified random sampling, in which random samples are selected from every subgroup.
142
7.1 what is convenience sampling?
Convenience sampling refers to selecting sample data based on ease of access, using data that are readily available. Because such a sample is typically not random, sampling error will be greater. An analyst should initially look at the data before adopting a sampling method with less sampling error.
143
7.1 What is judgemental sampling?
Judgmental sampling refers to samples for which each observation is selected from a larger dataset by the researcher, based on one's experience and judgment. As an example, a researcher interested in assessing company compliance with accounting standards may have experience suggesting that evidence of noncompliance is typically found in certain ratios derived from the financial statements. The researcher may select only data on these items. Researcher bias (or simply poor judgment) may lead to samples that have excessive sampling error. In the absence of bias or poor judgment, judgmental sampling may produce a more representative sample or allow the researcher to focus on a sample that offers good data on the characteristic or statistic of interest.
144
7.1 What is the central limit theorem?
The central limit theorem states that, for simple random samples of size n from a population with mean μ and finite variance σ², the sampling distribution of the sample mean Xbar approaches a NORMAL DISTRIBUTION with mean μ and variance σ²/n as n becomes large. AND - the bigger the sample size, the lower the dispersion of the sample mean around the true population mean: an inverse relationship. The dispersion is σ²/n (variance/n), which is the variance of Xbar.
145
7.1 What is the difference between the sample statistic and the true population parameter? eg:
This is the sampling error. The sample statistic (eg sample mean) will bounce around the population mean, and will have a certain volatility as well. remember the video example of the dice averaging. The sample means generated will jump around a distribution with the true population mean in the centre.
146
7.1 What is the Standard error of the Sample mean?
The standard deviation of the distribution of sample means (Xbar) is called the standard error of the sample mean. Often, the true population standard deviation is not known. In these cases, we use the sample standard deviation.
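A sketch of SE = s / √n, using the sample standard deviation because the population sigma is unknown (data are hypothetical):

```python
import math
import statistics

sample = [2.1, 2.5, 1.9, 2.8, 2.3, 2.6]   # hypothetical observations
s = statistics.stdev(sample)               # sample standard deviation
se = s / math.sqrt(len(sample))            # standard error of the sample mean
print(round(se, 3))  # 0.136
```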
147
7.1 What is the relationship between n and the SE of the sample mean?
As n increases, SExbar goes down. If you have more data, more accuracy.
148
7.1 Standard deviation vs. Standard error
- SD is the dispersion of a single observation, X, from a distribution. - SE is the dispersion of the sample mean around the distribution's true population mean, μ
149
7.1 How can our knowledge of the characteristics of normal distributions translate to the Xbar dispersion?
CONFIDENCE INTERVALS work in the same way here.
150
7.1 example of previous card
151
7.1 Describe how and why we use the Student's t-distribution in calculating confidence intervals?
When the standard deviation of the true population is not known, we use Sx in the calculation of SE. The sample mean then follows the t-distribution instead of a normal distribution (with n − 1 degrees of freedom). Therefore, instead of 1.96, we use the t-distribution critical value rather than the z-distribution value to calculate the confidence interval.
152
7.1 Properties of Student's t-distribution (4 points)
- symmetrical (bell shaped) - fatter tails than a normal distribution - defined by a single parameter, degrees of freedom (df), where df = n - 1 - as df increases, the t-distribution approaches the standard normal distribution Notice how on the t-stat table, as the df increase, the critical values fall towards the z-values of the normal distribution.
153
7.1 When can you use z-values when calculating confidence intervals of sample mean?
If n is greater than or equal to 30. BUT only as an approximation.
154
7.1 Confidence intervals for mean - table to remember.
155
7. What is resampling? And provide two examples:
Computational methods to estimate the standard error of the sample mean include the following: - Bootstrapping - Jackknife
156
7. What is Bootstrapping resampling?
Resampling from the original sample with replacement of items when drawn, calculating the sample mean each time; then calculate the sample standard deviation of these sample means (which is the estimate of the standard error).
157
7. What is Jackknife resampling?
Calculate multiple sample means, each with one observation removed; calculate standard deviation of these means.
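A jackknife sketch with hypothetical data: recompute the mean with one observation left out each time, then look at the dispersion of those means.

```python
import statistics

data = [3.0, 5.0, 7.0, 4.0, 6.0]   # hypothetical observations

# One sample mean per observation removed (leave-one-out).
jack_means = [
    statistics.mean(data[:i] + data[i + 1:]) for i in range(len(data))
]
print(jack_means)                        # [5.5, 5.0, 4.5, 5.25, 4.75]
print(round(statistics.stdev(jack_means), 3))
```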
158
7. Example of bootstrapping
159
8.1 What is a hypothesis?
A hypothesis is a statement about the value of a population parameter developed for the purpose of testing a theory or belief. Hypotheses are stated in terms of the population parameter to be tested, like the population mean, µ.
160
8.1 What is the hypothesis testing procedure?
161
8.1 what is the null hypothesis?
The null hypothesis, designated H0, is the hypothesis that the researcher wants to reject. It is the hypothesis that is actually tested and is the basis for the selection of the test statistics. The null is generally stated as a simple statement about a population parameter. It always includes the 'equal to' condition
162
8.1 What is the alternative hypothesis?
The alternative hypothesis, designated Ha, is what is concluded if there is sufficient evidence to reject the null hypothesis and is usually what you are really trying to assess. Why? You can never really prove anything with statistics—when the null hypothesis is discredited, the implication is that the alternative hypothesis is valid.
163
8.1 What is the decision rule?
If the test statistic falls outside the critical values, then the null hypothesis is rejected; if it falls within them, we fail to reject the null hypothesis. Hypothesis testing involves two statistics: 1) the test statistic calculated from the sample data, and 2) the critical value of the test statistic.
164
8.1 What are the critical values I need to remember for hypothesis testing?
165
8.1 What is a test statistic?
A test statistic is calculated by comparing the point estimate of the population parameter with the hypothesized value of the parameter (i.e., the value specified in the null hypothesis). As indicated in the following expression, the test statistic is the difference between the sample statistic and the hypothesized value, scaled by the standard error of the sample statistic:
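A sketch for a one-sample test of the mean, with hypothetical data and H0: μ = 2.0:

```python
import math
import statistics

sample = [2.2, 2.4, 1.9, 2.6, 2.1, 2.5, 2.3, 2.0]   # hypothetical data
hypothesised_mu = 2.0

x_bar = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))   # standard error
t_stat = (x_bar - hypothesised_mu) / se   # (sample statistic - hypothesised) / SE
print(round(t_stat, 2))  # 2.89, compared against a critical value with df = 7
```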
166
8.1 What are the four distributions for test statistics?
- t-distribution - z-distribution (standard normal distribution) - chi-square distribution - F-distribution
167
8.1 what are type I and type II hypothesis testing errors?
168
8.1 What is the power of a test?
The power of a test is the probability of correctly rejecting the null hypothesis when it is false: power = 1 − P(Type II error)
169
8.1 Table of type I and II Errors in Hypothesis Testing.
170
8.1 What is a p-value?
Is the probability of obtaining a test statistic that would lead to a rejection of the null hypothesis, assuming the null hypothesis is true. It is the smallest level of significance for which the null hypothesis can be rejected.
171
8.1 Two-tailed test characteristics
172
8.1 What is a p-value?
The p-value of a test is the probability of getting the test statistic (or a result more extreme) if the null were true. If p-value &lt; significance level -> REJECT (MOST IMPORTANT IDEA). A p-value is the smallest level of significance at which the null can be rejected. (I think this makes things confusing but it is common sense - look at pic)
173
8.2 When hypothesis testing for the difference between means, what is the equation for the test-statistic? (plus explain it)
Remember that s²p (the pooled variance) is calculated like the sample variance formula s1² = sum of squared deviations / (n − 1). So the numerator of s²p is the sum of squared deviations of both samples, over the denominator of both degrees of freedom added together!
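A sketch of the pooled-variance t-statistic (equal population variances assumed; data are hypothetical):

```python
import math

def pooled_t(sample1, sample2):
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)   # squared deviations, sample 1
    ss2 = sum((x - m2) ** 2 for x in sample2)   # squared deviations, sample 2
    sp2 = (ss1 + ss2) / ((n1 - 1) + (n2 - 1))   # pooled variance
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    return (m1 - m2) / se

t = pooled_t([5.0, 6.0, 7.0], [3.0, 4.0, 5.0])
print(round(t, 2))  # 2.45
```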
174
8.2 What null and alternative hypothesis should you apply to the below question?
H0: mean1 − mean2 = 0 Ha: mean1 − mean2 ≠ 0
175
8.2 FOR CLARITY ON TESTS: what are the two approaches to hypothesis testing
A. p-value vs significance level: in this approach, we compare the p-value computed from the data with the significance level in the question. This approach emphasises the strength of evidence. B. Test statistic vs critical value (e.g. z = 1.96): in this approach, you compare the test statistic with the critical value from the sampling distribution. If the test statistic falls in the rejection region (e.g. |z| > 1.96), we reject the null hypothesis. The critical value defines a boundary beyond which results are considered too extreme under H0. SUMMARY: comparing to 0.05 and comparing to 1.96 are two ways of expressing the same decision: one in terms of probability (the p-value), the other in terms of distance from the null (the test statistic).
176
8.3 What is the t-stat formula for the test statistic when we Test for the Mean Difference between Dependent samples from related distributions?
(e.g. the same people taking the same test but under different conditions) d-bar is the mean of all the differences between the two samples. The SE of the differences (the denominator) is calculated in exactly the same way as the SE of the sample mean.
177
8.4 Testing a Single Variance (hypothesis testing). What is the test-statistic formula?
Testing how much squared deviation we have in our distribution. The hypothesis relates to a hypothesised level of the true variance of the distribution. Test statistic: χ² = (n − 1)s² / σ0², where the numerator is built from the sum of squared deviations. It follows a chi-square distribution, which does not go below zero (because variance won't).
178
REMEMBER: critical values come from a table, test statistics come from a calculation!!!!
179
8.5 Testing Two Variances (hypothesis testing) (more straightforward one)
The hypothesis relates to the ratio of variances: F = s1² / s2². It follows an F-distribution (two degrees-of-freedom parameters): n1 − 1 degrees of freedom on the numerator and n2 − 1 on the denominator.
180
8. What is a Parametric / Nonparametric test?
A parametric test relies on assumptions about the distribution of the population and tests the value of a parameter of that distribution; a nonparametric test makes minimal distributional assumptions (e.g. rank-based tests). Everything so far has been parametric.
181
9.1 What is the t-stat formula for the parametric test of correlation?
Basically testing whether the population correlation coefficient is different from zero (so testing for any correlation)
182
9.1 Why in the parametric test of correlation do we have n-2 degrees of freedom?
because when calculating correlation, we base this on covariance. and when calculating covariance, we fix xbar and ybar. Each loses a degree of freedom.
183
9.2 What is the Spearman rank correlation test?
It tests whether two sets of ranks are correlated. This is used in the non-parametric test of rank correlation. It finds the correlation number r that we can then use in the test-statistic formula we used before.
184
9.2 When r is found using the spearman rank correlation test, how do we find a solution (performing the test of independence on the data?)
1. Find critical value in a t-table 2. Calculate t-stat
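The rank correlation itself can be sketched as follows (no tied ranks assumed; data are hypothetical); the resulting r then goes into the usual t-stat formula:

```python
def spearman_r(x, y):
    # r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid with no tied ranks
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, idx in enumerate(order, start=1):
            r[idx] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

print(spearman_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))  # 0.8
```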
185
9.3 What is a contingency table?
A way to store categorical data
186
9.3 In a contingency table, what is the formula to find the expected value within the table IF the value is independent?
187
9.3 What is the test-statistic formula for a contingency table (when testing whether the observed values differ significantly from the expected values)?
Sum over every cell of (observed − expected)² / expected, for a chi-squared statistic
188
9.3 How many degrees of freedom would we have in a contingency table?
df = (rows − 1) × (columns − 1)
189
9.3 How do we then calculate whether the values in the table are independent?
So, we have used expected values, then the chi-square equation to find the chi-square statistic of the table. Then use the degrees-of-freedom equation, together with the significance level, to find the critical value in the chi-square table. Chi-square is a ONE-TAILED TEST (it is a squared-distance test so can only be +ve), SO AT 5% SIGNIFICANCE IT WOULD BE THE 0.95 PART OF THE TABLE.
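The whole procedure on a hypothetical 2x2 table: build each expected cell from row and column totals, sum (observed − expected)²/expected, and note the degrees of freedom.

```python
table = [[30, 20], [20, 30]]    # hypothetical observed counts
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand  # under independence
        chi2 += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)   # (rows - 1) * (cols - 1)
print(round(chi2, 1), df)  # 4.0 1 -> compare with the chi-square critical value
```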
190
10.1 What are the assumptions of linear regression?
191
10.1 What would be a breach of linearity?
NOTE: a breach of linearity would be some kind of line of best fit (LOBF) that is curved/quadratic, where the residuals are therefore not independent of the level of X.
192
10.1 What would be a breach of homoskedasticity?
HETEROSKEDASTICITY: if we don't have homoskedasticity, then our data is not behaving in the same way across all of our regression range, and so the nature of the relationship is changing. We are trying to capture a static equation.
193
10.1 What would be a breach of independence?
That we see, for example, seasonality in our data. This is called AUTOCORRELATION
194
10.1 What would be a breach in residuals being normally distributed?
Note: we don't need normality of the residuals if the sample size is large
195
10.2 What is SSE?
sum of squared errors
196
10.2 What is SSR?
sum of squares of the regression
197
10.2 What is SST?
sum of squares total (is actually the numerator of the sample variance) (so SSE and SSR is breaking SST up into explained and unexplained variation)
198
10.2 Describe SSE SSR and SST on a scatter graph:
199
10.3 What is an ANOVA table?
It is a table that provides an analysis of variance. ANOVA compares variability between groups and variability within groups: an analysis of how good your x variable is at explaining variation in y
200
10.3 Steps for filling in the ANOVA table
1. Add the SumSquares regression and residual, so the SSR and SSE. Add them for a total SST. 2. Establish the degrees of freedom. For the regression, as we have one x variable (normally), this will be 1. (In the example used there are 5 observations, so the total df has to be n − 1 = 4, and therefore those left for the residual are 3.) 3. The Mean Square column (a "variance column"): take column 2 (SumSquares, so SSR or SSE or SST) and divide by column 1 (degrees of freedom). This finds the variance explained by each part, regression and residual. 4. Calculate the F-stat by doing MSR/MSE from column 3. 5. R-squared (coefficient of determination): the level of variation in Y that is explained by the model.
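The arithmetic of those steps, sketched with hypothetical sums of squares (n = 5 observations, one x variable):

```python
ssr, sse = 80.0, 20.0             # hypothetical sums of squares
n, k = 5, 1                       # observations, slope coefficients

sst = ssr + sse                   # step 1: total sum of squares
df_reg, df_resid = k, n - k - 1   # step 2: degrees of freedom (1 and 3)
msr = ssr / df_reg                # step 3: mean square regression
mse = sse / df_resid              #         mean square error
f_stat = msr / mse                # step 4: F-stat
r_squared = ssr / sst             # step 5: coefficient of determination
print(round(f_stat, 1), r_squared)  # 12.0 0.8
```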
201
10.3 How do you calculate the strength of relationship, as measured by the correlation coefficient?
Take the square root of the coefficient of determination (R²). (In simple regression, r takes the sign of the slope coefficient.)
202
10.3 What is the degrees of freedom of a regression line?
n-2
203
10.3 How do we calculate the F-stat within the ANOVA table?
SO WE ONLY GET ONE F-STAT WITHIN THE TABLE
204
10.3 What would be a 'global test' of regression?
H0: all of our coefficients are zero H1: at least one of our coefficients is non-zero
205
10.3 When you have the F-stat, how do we use it to test the hypothesis?
We find the critical F-value from statistical tables (95th percentile - if at 5% significance), and compare to F-stat as normal
206
10.3 How do you calculate the standard error of estimate?
Better understood as the volatility of the residuals: SEE = √MSE
207
10.4 What is a regression coefficient t-test?
Rather than testing the regression as a whole, this is testing a single coefficient. For example, if you know from the global regression test that at least one of the coefficients is non-zero, then it is worthwhile interrogating individually which one it is.
208
10.4 What is the formula for calculating the test-stat for the regression coefficient t-test?
This is pretty much the same as calculating the test statistic for a mean: (estimated coefficient − hypothesised value) / SE of the coefficient
209
10.5 What are prediction intervals?
Intervals around the regression line where, for example, we are 95% sure that we will observe our Y value given a value for X
210
10.5 what is the standard error of the forecast? And formula (but not important to learn lol)
(notation Sf). The unit of distance you need to go on either side of the regression line to capture 95% chance that the true Y will be in that range.
211
10.5 How do we use the standard error of the forecast to create the prediction intervals for a regression line?
Use the Sf and critical value product to create positive and negative boundaries with the predicted value.
212
10.6 Change in a logarithm is...
is the same as the continuously compounded return
213
10.6 Functional form table.
214
11. What is Big Data?
Extremely large and complex datasets
215
11. What are traditional and nontraditional sources of Big Data?
216
11.1 What are the Big V's that are considerable factors that define Big Data?
Volume, velocity, and variety define Big Data; veracity (accuracy and unbiased nature) is also a considerable factor for drawing valid conclusions from it.
217
11.1 what is artificial intelligence?
218
11.1 What is machine learning?
219
11.1 What is supervised learning?
A human labels what goes in and out (the training data is labelled)
220
11.1 What is unsupervised learning?
A human does not label what goes in and out; the algorithm finds structure in unlabelled data
221
11.1 What is deep learning?
Called deep learning because you can have multiple layers
222
11.1 What are the three stages of machine learning?
223
11.1 What is Overfitting and Underfitting within the stages of machine learning?
224
11.2 name some applications of data science