Do file
This is where we type all of our code
Allows you to save code so it can be run again later
How to set up a sheet
Run all code in the do-file editor by pressing the Execute (do) button
To run only specific lines, highlight them and press the same button
If no red error message appears in the Results window after running the code, it ran without errors
Browse
Opens a spreadsheet view so we can see all of our data
browse (variables): shows only the specified variables
br is the abbreviation of browse; list several variable names after it to view them together
Code: describe
Describes the data set: variable names, storage types and labels
Helps us understand the variables
int: integer storage type
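A minimal sketch of describe, meant to be run from a do file. It assumes Stata's bundled example data set auto.dta; the notes do not name a data set, so this choice is only for illustration.

```stata
* Assumption: Stata's bundled example data set (not named in the notes)
sysuse auto, clear

describe              // every variable: name, storage type, display format, label
describe price mpg    // only the named variables
```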
Code: list
Shows every observation for every variable
Downside: usually gives us too much information
Code: list (VARIABLES)
Shows all observations, but only for the specified variables
Code: list (VARIABLES) in 1/(NUMBER OF OBSERVATIONS YOU WANT TO BE INCLUDED)
Restricts to certain observations
EXAMPLES
- 1/l: Lists everything, from the first observation to the last
- 70/l: Lists from the 70th observation to the last
(Lowercase L denotes the final observation)
Code: list (variables) if (outcome of interest)==1
The if qualifier restricts the list to observations that satisfy the condition
Needs 2 equals signs (==) to test equality; a single = is for assignment
List code with multiple conditions
!= (value): Not equal to
&: means and
|: means or
== (value): Equal to, e.g. ==0 keeps observations where the variable equals 0
list (letter)*: Lists every variable whose name starts with that letter, e.g. list m* (an if condition can still be added)
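The list variants above can be sketched as follows. The data set (auto.dta) and the particular variables and cut-offs are assumptions chosen for illustration, not taken from the notes.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

list make price in 1/5                            // first five observations
list make price in 70/l                           // 70th observation through the last
list make price if foreign == 1                   // only observations where foreign equals 1
list make price mpg if foreign == 1 & mpg > 30    // both conditions must hold
list make price if price > 12000 | mpg > 35       // either condition may hold
list make rep78 if rep78 != 3 & !missing(rep78)   // not equal to 3, skipping missing values
list m* in 1/3                                    // every variable whose name starts with m
```

The !missing() check in the rep78 line matters because Stata treats a missing numeric value as larger than any number, so rep78 != 3 on its own would also list the missing observations.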
Code: count
Counts the number of observations; add an if condition to count only those satisfying it
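A short sketch of count with and without a condition, again assuming the bundled auto.dta data set and illustrative conditions.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

count                                    // all observations
count if foreign == 1                    // observations satisfying one condition
count if rep78 >= 4 & !missing(rep78)    // two conditions; missing values excluded
display r(N)                             // count stores its result in r(N)
```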
Code: summarize or sum
Descriptive statistics of variables (observations, mean, standard deviation, min, max)
An if condition can be added
, detail: More detailed breakdown (e.g. median, skewness)
(In the example, rep78 has one fewer observation because one observation has a missing value for it)
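A minimal sketch of the summarize variants, assuming the bundled auto.dta data set. In that data set rep78 also has missing values, so its Obs count is lower than for the other variables.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

summarize price mpg           // Obs, Mean, Std. dev., Min, Max
sum rep78                     // Obs counts only non-missing values
sum price if foreign == 1     // restricted with an if condition
sum price, detail             // adds percentiles, skewness, kurtosis
display r(p50)                // the median, stored after the detail option
```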
Code: tabulate/tab (variable of interest)
Frequency table showing how many observations take each value of the variable
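A short sketch of tabulate, assuming the bundled auto.dta data set. The missing option and the two-way table are additions for illustration.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

tabulate rep78            // frequency, percent and cumulative percent per value
tab rep78, missing        // also shows a row for missing values
tab foreign rep78         // two-way table: foreign in rows, rep78 in columns
```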
Code: gen (name of the new variable) = (expression)
Code: replace (variable name) = (expression giving the new values)
How to keep blank (missing) values out of a new variable? Add if !missing(variable) to the gen command so they stay missing
(Example: 1 if rep78 is less than or equal to 2, 0 if rep78 is 3 or greater)
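One way to handle the blank values, sketched under assumptions: the data set is the bundled auto.dta, the names low_rep and low_rep2 are hypothetical, and every non-missing rep78 above 2 is coded 0. Stata treats a missing numeric value as larger than any number, so without the if qualifier the expression rep78 <= 2 would evaluate to 0 for missing observations instead of leaving them blank.

```stata
* Assumption: Stata's bundled example data set; variable names are hypothetical
sysuse auto, clear

* 1 if rep78 <= 2, 0 otherwise; stays missing where rep78 is missing
generate low_rep = (rep78 <= 2) if !missing(rep78)

* Same result built in two steps with replace
generate low_rep2 = 0 if !missing(rep78)
replace low_rep2 = 1 if rep78 <= 2

tab low_rep, missing      // check that the missing values were kept as missing
```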
Code: corr (variables)
Correlation matrix of the listed variables
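A minimal sketch of corr, assuming the bundled auto.dta data set. pwcorr is a related command not mentioned in the notes; it is shown only to contrast how missing values are handled.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

correlate price mpg weight     // correlation matrix
corr price mpg rep78           // drops any observation with a missing value in a listed variable
pwcorr price mpg rep78, obs    // pairwise correlations, with the number of observations used
```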
How to save the new version of data
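Saving can be sketched with the save command. The file name auto_edited.dta and the variable price_k are hypothetical; the replace option overwrites an existing file of the same name, so it is safest to save under a new name and not over the original data.

```stata
* Assumption: Stata's bundled example data set; file and variable names are hypothetical
sysuse auto, clear
generate price_k = price / 1000     // some change worth saving

save "auto_edited.dta", replace     // writes to the current working directory
use "auto_edited.dta", clear        // reload the saved version later
```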
Running a t-test of whether the mean of a variable is equal to a certain value
ttest (variable we are testing) == 0
- Provides the standard deviation (measure of the spread of all values)
- Provides the standard error (measure of the precision of the sample mean across samples)
- Provides the t value (observed mean minus hypothesised mean, divided by the standard error)
- Provides the confidence interval
- Ha: mean != 0: Two-sided alternative. The p-value shown is the probability of getting the test statistic or one more extreme if the null hypothesis is true. If it's less than 0.05 then we reject the null hypothesis.
- Ha: mean < 0: One-sided alternative that the mean is negative (Ha: mean > 0 is the one-sided alternative that the mean is positive)
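A sketch of the one-sample test and the pieces of its output. The data set (auto.dta), the variable mpg and the hypothesised value 20 are assumptions for illustration; the r() names are the results ttest stores.

```stata
* Assumption: Stata's bundled example data set; 20 is an arbitrary hypothesised mean
sysuse auto, clear

ttest mpg == 20

display r(t)                       // t value reported by ttest
display (r(mu_1) - 20) / r(se)     // same value: (observed mean - hypothesised mean) / standard error
display r(p)                       // two-sided p-value   (Ha: mean != 20)
display r(p_l)                     // lower one-sided     (Ha: mean < 20)
display r(p_u)                     // upper one-sided     (Ha: mean > 20)
```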
T-test for a difference in means
ttest (variable we are testing), by(variable that splits the sample into the two groups being compared, e.g. a binary variable)
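A short sketch of the two-group test, assuming the bundled auto.dta data set with the binary variable foreign as the grouping variable. The unequal option is an addition not mentioned in the notes.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

ttest mpg, by(foreign)             // compares mean mpg between the two groups
ttest mpg, by(foreign) unequal     // same test without assuming equal variances
```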
Code: reg (outcome variable) (independent variables), separated by spaces with the outcome variable first
With one independent variable, the constant is worked out as: average value of the outcome minus (coefficient estimate for the independent variable * average value of the independent variable)
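The constant formula can be checked by hand after a regression with one independent variable. A minimal sketch, assuming the bundled auto.dta data set and the hypothetical scalar names ybar and xbar:

```stata
* Assumption: Stata's bundled example data set; scalar names are hypothetical
sysuse auto, clear

regress price mpg                    // outcome first, then the independent variable

quietly summarize price
scalar ybar = r(mean)                // average value of the outcome
quietly summarize mpg
scalar xbar = r(mean)                // average value of the independent variable

display ybar - _b[mpg] * xbar        // constant worked out by hand
display _b[_cons]                    // constant reported by regress
```

With several independent variables the same idea applies, subtracting each coefficient times the average of its variable.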
Code: predict (name of the new variable, e.g. y_hat)
Run after reg; stores the fitted values of the outcome in the new variable
Show properties of regressors
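A sketch of predict after a regression, assuming the bundled auto.dta data set and the hypothetical names y_hat and u_hat. The last two lines show one common check, that the residuals average to zero and are uncorrelated with the regressor; whether this is the property the notes have in mind is an assumption.

```stata
* Assumption: Stata's bundled example data set; variable names are hypothetical
sysuse auto, clear

regress price mpg
predict y_hat                  // fitted values (the default)
predict u_hat, residuals       // residuals: actual minus fitted

summarize u_hat                // mean is zero up to rounding
correlate u_hat mpg            // correlation with the regressor is zero up to rounding
```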
Code: twoway
Creates a graph, e.g. twoway scatter (y variable) (x variable) for a scatter plot
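A minimal sketch of twoway, assuming the bundled auto.dta data set. The plot types (scatter, lfit) and the titles are illustrative choices, not taken from the notes.

```stata
* Assumption: Stata's bundled example data set
sysuse auto, clear

twoway scatter price mpg       // scatter plot: y variable first, then x variable

* Overlay a fitted line on the scatter plot and label the graph
twoway (scatter price mpg) (lfit price mpg), ///
    title("Price against mileage") ytitle("Price") xtitle("Mileage (mpg)")
```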