We have not had an open book exam previously. The assignment, tutorial questions and quizzes are indicative of the types of questions you will see on the final exam. The final 2 questions here are just more examples. The solutions are sketched. The regression question involves downloading data and analysing it. In the final exam there will be 3 or 4 data sets, which will largely be new to you. You will be able to download all data sets (you will not know which one is relevant to you) an hour before your exam begins. Recall there are 4 staggered times (2.00, 2.15, 2.30 and 2.45) for the start of the exam. The reason for this is to minimise technical issues with upload. More on this later.

You will be examined on all 3 levels of the course (i.e. formula sheet, practical application, and what lies beneath, i.e. intuition) as outlined in the Week 12 lecture recording and notes. The course has emphasised all 3 levels, with a very important focus on level 3. You will need access to Excel's Analysis ToolPak (the Data Analysis add-in) for the exam.

You should use these questions as practice: in 2 hours, what can you do?

Best wishes,

The team.

Q1. (a) Explain what the Empirical rule implies about the distribution of X. Draw a diagram to illustrate. Does it matter whether X is approximately Normally distributed or not?

(b) What does the Central limit theorem imply about the distribution of the sample mean? Explain and illustrate. What role does sample size play?

(c) Now assume you wish to test a hypothesis about the population mean. You take a sample of size n=100. Your sample mean is 10 and your sample standard deviation is 2.

i) Use a two-tailed test at the 5% significance level to test the null hypothesis that µ=12. Carefully interpret the result using a diagram. Could your conclusion be incorrect?

ii) How would your conclusion change with a one-tailed test, and with a different significance level?

(d) Now assume X takes on 3 possible values, 1, 2 and 6, each with probability 1/3.

i) Calculate the expected value of X and the standard deviation of X.

ii) If we take a sample (with replacement) of size n=2 from the above population, what is the distribution of the sample mean?
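One way to check a hand calculation for part (d) is to enumerate every possible sample. The sketch below, in Python with exact fractions, lists all 9 equally likely ordered samples of size 2 and tabulates the resulting sample means. It assumes the two draws are independent, which is what sampling with replacement implies. The names are illustrative only.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import sqrt

values = [1, 2, 6]
prob = Fraction(1, 3)  # each value is equally likely

# Part (i): expected value, variance and standard deviation of X.
expected_value = sum(prob * x for x in values)
variance = sum(prob * (x - expected_value) ** 2 for x in values)
std_dev = sqrt(float(variance))

# Part (ii): enumerate all ordered samples of size 2 (with replacement)
# and accumulate the probability attached to each possible sample mean.
sample_mean_dist = defaultdict(Fraction)
for first, second in product(values, repeat=2):
    sample_mean_dist[Fraction(first + second, 2)] += prob * prob

# Mean and variance of the sample mean, computed from its distribution.
mean_of_means = sum(p * m for m, p in sample_mean_dist.items())
var_of_means = sum(p * (m - mean_of_means) ** 2 for m, p in sample_mean_dist.items())

print(f"E[X] = {expected_value}, Var[X] = {variance}, SD[X] = {std_dev:.3f}")
for m in sorted(sample_mean_dist):
    print(f"P(sample mean = {float(m)}) = {sample_mean_dist[m]}")
print(f"E[sample mean] = {mean_of_means}, Var[sample mean] = {var_of_means}")
```

Comparing the last line of output with the first shows the general pattern: the sample mean has the same expected value as X, and its variance is the variance of X divided by n.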

Q2. Download the data – labour_supply.xls

Examine the 2 worksheets: data and data description. All analysis should be carried out using the Analysis ToolPak (Data Analysis) in Excel. You will submit your Excel workbook.

a) The data is already sorted into wives that work, i.e. work=1 (positive hours), and wives that do not, i.e. work=0 (0 hours). In the space below, analyse the data for wives that work and wives that do not. Do you notice any differences between the two groups? Explain carefully. Formally test whether there is a significant difference in the number of children less than age 6 between wives that work and wives that do not. Make sure you interpret your results.

b) Consider the following multiple regression model with the dependent variable (Y) hours in a year and independent variables kidslt6, kidsge6, age, educ, city and nwifeinc.

(i) What are your a priori expected signs for the population parameters? Explain.

(ii) Estimate the above regression model. Summarise your results in traditional form. Interpret your results and carry out any hypothesis tests you consider appropriate/relevant.

c) We used the Analysis ToolPak's random number generation several times in lecture demonstrations. The data you have analysed is a sample drawn from the population. If I could use Excel to generate 1000 samples and estimate 1000 sample regressions, what would I learn? Discuss the intuition.
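The thought experiment in part (c) can be sketched in Python instead of Excel. Everything in the sketch is a simplifying assumption and none of it comes from labour_supply.xls: a single regressor rather than six, a hypothetical population with intercept 2.0 and slope 0.5, Normal errors, and a sample size of 50. The point is only to show what "1000 samples, 1000 regressions" produces.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population parameters (assumptions for illustration only).
TRUE_INTERCEPT = 2.0
TRUE_SLOPE = 0.5
SAMPLE_SIZE = 50
NUM_SAMPLES = 1000


def draw_sample(n):
    """Draw one sample of n observations from the assumed population model."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [TRUE_INTERCEPT + TRUE_SLOPE * x + random.gauss(0, 1) for x in xs]
    return xs, ys


def ols_slope(xs, ys):
    """Least-squares slope for a simple regression of y on x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# One estimated slope per sample: together they trace out the sampling
# distribution of the estimator.
slopes = [ols_slope(*draw_sample(SAMPLE_SIZE)) for _ in range(NUM_SAMPLES)]

print(f"true slope               = {TRUE_SLOPE}")
print(f"mean of estimated slopes = {mean(slopes):.4f}")
print(f"std dev of the estimates = {stdev(slopes):.4f}")
print(f"smallest / largest       = {min(slopes):.4f} / {max(slopes):.4f}")
```

Each sample gives a different estimate, yet the estimates cluster around the population value, and their spread is what the standard error in a single regression output is trying to measure. A histogram of `slopes` would show the sampling distribution directly.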