What is a data dictionary and what is its purpose?
1) A map of data assets in which each data item is specified, including its required metadata
2) Valuable for keeping track of data; the effort of creating one is minimized if data editing is organized from the beginning
Data element
Also called a data field. An aspect of an individual or object that can take on varying values among individuals. Every piece of information in a database is a measurement of a data element.
Heteroscedasticity
Describes data that do not have constant variance: the variability of one variable is not consistent across the values of another variable, such as time. In a residual plot, this appears as a fan or cone shape.
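To make the fan shape concrete, here is a minimal sketch using simulated data. The residuals are invented for illustration: their standard deviation is assumed to grow in proportion to x, so the spread in the upper half of the x range should be clearly larger than in the lower half.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is repeatable

n = 2000
# x values evenly spaced from 1 to 10 (hypothetical predictor, e.g. time)
xs = [1 + 9 * i / (n - 1) for i in range(n)]
# Assumption for illustration: residual standard deviation is 0.5 * x
residuals = [rng.gauss(0, 0.5 * x) for x in xs]

half = n // 2
var_low = statistics.variance(residuals[:half])   # residuals at small x
var_high = statistics.variance(residuals[half:])  # residuals at large x

print(f"variance at low x:  {var_low:.2f}")
print(f"variance at high x: {var_high:.2f}")
```

Under constant variance the two numbers would be roughly equal; a large gap between them is the numerical counterpart of the cone in the residual plot.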
Metadata
Descriptions of the fields in the database and their permissible values, as well as how they are created and limitations on their use, if known.
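As an illustration, a data dictionary entry can carry exactly the metadata items listed above. The sketch below is only one possible layout; the class name, field names, and example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    description: str
    permissible_values: str
    how_created: str = "unknown"   # recorded only if known
    limitations: str = "unknown"   # recorded only if known


# Hypothetical fields for an illustrative policy table.
DATA_DICTIONARY = {
    "policy_id": FieldMetadata(
        description="Unique identifier of the policy",
        permissible_values="positive integer",
        how_created="assigned by the policy administration system",
    ),
    "date_of_birth": FieldMetadata(
        description="Policyholder date of birth",
        permissible_values="ISO 8601 date, not in the future",
        how_created="entered by the policyholder at application",
        limitations="not verified against identity documents",
    ),
}


def undocumented_fields(record, dictionary=DATA_DICTIONARY):
    """Return the field names in `record` that the dictionary does not describe."""
    return [name for name in record if name not in dictionary]


record = {"policy_id": 1017, "date_of_birth": "1990-06-15", "region": "NW"}
print(undocumented_fields(record))
```

A check like `undocumented_fields` is one way to notice early that a field has been added to the data without being documented.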
Redundancy
A technique for obtaining high quality data. Ask for the same or similar information at least twice to reduce risk of errors and inaccuracy.
Ex: ask for email twice to make sure it is typed correctly
Ex: ask for age and date of birth
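The age and date of birth example can be sketched as a consistency check. The function names below are hypothetical, and the check assumes age is recorded as completed years on a known reference date.

```python
from datetime import date


def age_on(dob: date, as_of: date) -> int:
    """Completed years between date of birth and the reference date."""
    years = as_of.year - dob.year
    # Subtract one if the birthday has not yet occurred in the reference year.
    if (as_of.month, as_of.day) < (dob.month, dob.day):
        years -= 1
    return years


def is_consistent(stated_age: int, dob: date, as_of: date) -> bool:
    """True when the two redundant answers agree with each other."""
    return stated_age == age_on(dob, as_of)


dob = date(1990, 6, 15)
print(is_consistent(34, dob, as_of=date(2024, 6, 15)))  # answers agree
print(is_consistent(34, dob, as_of=date(2024, 6, 14)))  # answers disagree
```

A mismatch does not say which answer is wrong, only that the record needs follow-up.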
When to implement redundancy
Only when a virtually error-free result is required. Otherwise, the cost might outweigh the benefit.
Skewness. Positive vs negative
Describes a distribution’s departure from symmetry. Negative skew means the left-hand tail is longer. Positive skew means the right-hand tail is longer.
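A minimal sketch of how the sign of skewness can be checked numerically. It uses the moment coefficient of skewness (third central moment divided by the standard deviation cubed), which is one common definition; the sample values are made up.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


right_tailed = [1, 1, 2, 2, 3, 10]       # one large value stretches the right tail
left_tailed = [-10, -3, -2, -2, -1, -1]  # mirror image: long left tail
symmetric = [1, 2, 3, 4, 5]

for name, sample in [("right", right_tailed), ("left", left_tailed), ("sym", symmetric)]:
    print(name, round(skewness(sample), 3))
```

The right-tailed sample gives a positive value, the left-tailed sample a negative value, and the symmetric sample gives zero.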
Why is skewness important?
Stationary
A process is stationary if the parameters of its distribution (for example, mean and variance) are stable over time.
TVaR
Tail value at risk. The expected loss given that the loss falls in the worst (1-alpha) part of the distribution, that is, given that the loss exceeds the VaR at confidence level alpha.
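A minimal empirical sketch of this definition. The conventions are assumptions: losses are positive numbers, the tail is the largest round(n * (1 - alpha)) observations, and VaR is taken as the largest loss just below that tail. Other quantile conventions give slightly different numbers.

```python
def var_and_tvar(losses, alpha):
    """Empirical VaR and TVaR at confidence level alpha from a loss sample."""
    ordered = sorted(losses)
    n = len(ordered)
    k = max(1, round(n * (1 - alpha)))  # number of observations in the tail
    if k >= n:
        raise ValueError("sample too small for this confidence level")
    var = ordered[n - k - 1]            # largest loss outside the tail
    tail = ordered[n - k:]              # the worst (1 - alpha) share of losses
    tvar = sum(tail) / k                # average loss within the tail
    return var, tvar


# Hypothetical loss sample: 1, 2, ..., 100
var_95, tvar_95 = var_and_tvar(range(1, 101), alpha=0.95)
print(var_95, tvar_95)
```

For this sample the tail is the five losses 96 to 100, so VaR is 95 and TVaR is their average, 98. TVaR is never smaller than VaR at the same level, because it averages only losses beyond it.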
Pros and cons of TVaR
VaR
Portfolio VaR
The VaR of the entire portfolio
Individual VaR
The VaR of one asset in the portfolio in isolation
Diversified VaR
The portfolio VaR, taking into account diversification benefits
Undiversified VaR
The sum of the individual VaRs in the portfolio. It equals the portfolio VaR when there are no short positions and all correlations are unity.
Marginal VaR
The change in portfolio VaR resulting from a one-unit increase in the investment in a particular asset
Incremental VaR
The change in portfolio VaR that would result if a given set of investment adjustments were made to the portfolio
Component VaR
A partition of the portfolio VaR that indicates how much the portfolio VaR would change (approximately) if the given asset were deleted from the portfolio
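The VaR variants above can be compared side by side in a small sketch. It assumes the delta-normal (variance-covariance) approach, which these cards do not specify: returns are taken as jointly normal with zero mean, and z = 1.645 corresponds to a 95% confidence level. The positions, volatilities, and correlations are hypothetical.

```python
import math

Z_95 = 1.645  # standard normal quantile for 95% confidence (assumption)


def portfolio_var(positions, vols, corr, z=Z_95):
    """Diversified (portfolio) VaR under the delta-normal assumption."""
    n = len(positions)
    variance = sum(
        positions[i] * positions[j] * corr[i][j] * vols[i] * vols[j]
        for i in range(n) for j in range(n)
    )
    return z * math.sqrt(variance)


def var_breakdown(positions, vols, corr, z=Z_95):
    n = len(positions)
    total = portfolio_var(positions, vols, corr, z)
    sd = total / z  # portfolio standard deviation in currency units
    # Covariance of each asset's return with the portfolio's currency P&L
    cov_w = [
        sum(corr[i][j] * vols[i] * vols[j] * positions[j] for j in range(n))
        for i in range(n)
    ]
    individual = [z * vols[i] * abs(positions[i]) for i in range(n)]
    marginal = [z * cov_w[i] / sd for i in range(n)]          # per unit invested
    component = [marginal[i] * positions[i] for i in range(n)]
    return {
        "portfolio": total,
        "individual": individual,
        "undiversified": sum(individual),
        "marginal": marginal,
        "component": component,
    }


def incremental_var(positions, changes, vols, corr, z=Z_95):
    """Change in portfolio VaR after applying the given position changes."""
    new_positions = [p + c for p, c in zip(positions, changes)]
    return portfolio_var(new_positions, vols, corr, z) - portfolio_var(positions, vols, corr, z)


positions = [2_000_000, 1_000_000]   # currency amounts held in each asset
vols = [0.05, 0.12]                  # return volatilities over the horizon
uncorrelated = [[1.0, 0.0], [0.0, 1.0]]

result = var_breakdown(positions, vols, uncorrelated)
for key, value in result.items():
    print(key, value)
print("incremental:", incremental_var(positions, [10_000, 0], vols, uncorrelated))
```

In this setup the component VaRs add up to the portfolio VaR, and the portfolio VaR is below the undiversified VaR because the two assets are not perfectly correlated. Setting every correlation to 1 makes the two coincide.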
Purpose of VaR
How to calculate VaR
Pros and cons of VaR
Data quality
Refers to data’s “fitness for use.” The ability to fulfill the requirements of intended usage of data in a specific situation
Why is data quality important?
High data quality can be a competitive advantage. Poor data quality can:
1. reduce customer satisfaction
2. reduce employee satisfaction (causing high turnover)
3. breed organizational mistrust
4. make it difficult or impossible to accurately determine the financial position of the business
5. make it difficult or impossible to calculate premium income and required reserves
6. waste time and resources investigating and fixing data issues