# Exercise 10: multiple linear regression

 

# The Heart and Estrogen/Progestin Study (HERS) is a clinical trial of hormone therapy for prevention of recurrent heart attacks and death among post-menopausal women with existing coronary heart disease. The HERS data are used in many of the examples in Chapters 3 and 4 of the course textbook by Vittinghoff et al. In this exercise we will study how different variables may influence the blood glucose level of the non-diabetic women in the cohort. In particular, we are interested in whether exercise may help to reduce the glucose level (cf. Section 4.1 in Vittinghoff et al.).

 

# You may read the HERS data into R and extract the women without diabetes with the commands:

hers=read.table("http://www.uio.no/studier/emner/matnat/math/STK4900/data/hers.txt",sep="\t",header=TRUE,na.strings=".")

hers.no=hers[hers$diabetes==0, ]
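
# A side note on the subsetting step above, as a minimal sketch on made-up data.
# The data frame toy.na and its values are hypothetical and are not taken from HERS.
# Logical indexing with == keeps a row of NAs for every missing value in the
# condition, whereas subset() and which() drop such rows. Whether this matters
# for hers depends on whether the diabetes variable has missing values, which
# can be checked with sum(is.na(hers$diabetes)).

toy.na=data.frame(diabetes=c(0,1,NA,0), glucose=c(95,130,101,88))

nrow(toy.na[toy.na$diabetes==0, ])          # 3: the NA row is kept as an all-NA row

nrow(subset(toy.na, diabetes==0))           # 2: subset() drops rows where the condition is NA

nrow(toy.na[which(toy.na$diabetes==0), ])   # 2: which() returns only the TRUE positions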

 

 

# We will start out by investigating (in questions a-c) how the glucose levels differ between women who exercise at least three times a week (coded as exercise=1) and women who exercise less than three times a week (coded as exercise=0).

 

# a)

# Make a summary and boxplot of the glucose levels according to the level of exercise:

summary(hers.no$glucose[hers.no$exercise==0])

summary(hers.no$glucose[hers.no$exercise==1])

boxplot(hers.no$glucose~hers.no$exercise)
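
# The same group-wise summaries can be obtained in one call with tapply().
# The sketch below uses simulated data: the data frame toy.grp, the means and
# standard deviations, and the axis labels are assumptions made for
# illustration only, not properties of the HERS data.

set.seed(1)

toy.grp=data.frame(exercise=rep(c(0,1), each=50), glucose=c(rnorm(50,98,10), rnorm(50,96,10)))

# One summary per exercise group, returned as a list indexed by group
grp.summary=tapply(toy.grp$glucose, toy.grp$exercise, summary)

grp.summary

# Boxplot using the formula interface with a data argument and labelled groups
boxplot(glucose~exercise, data=toy.grp, names=c("< 3 times/week", ">= 3 times/week"), ylab="Glucose level")

# With the real data, the corresponding call would be
# tapply(hers.no$glucose, hers.no$exercise, summary)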

 

# Discuss what the summaries and boxplot tell you.

 

 

# b)

# Test whether the mean glucose level differs between the two exercise groups, and compute a confidence interval for the difference:

t.test(glucose~exercise, var.equal=TRUE, data=hers.no)
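
# To see what t.test() computes when var.equal=TRUE, the pooled two-sample
# t statistic and the 95% confidence interval can be reproduced by hand.
# This is a sketch on simulated data: toy.tt, the group sizes and the
# simulation parameters are assumptions for illustration, not HERS values.

set.seed(2)

toy.tt=data.frame(exercise=rep(c(0,1), times=c(60,40)), glucose=c(rnorm(60,98,10), rnorm(40,96,10)))

g0=toy.tt$glucose[toy.tt$exercise==0]

g1=toy.tt$glucose[toy.tt$exercise==1]

n0=length(g0); n1=length(g1)

# Pooled variance estimate and standard error of the difference in means
sp2=((n0-1)*var(g0)+(n1-1)*var(g1))/(n0+n1-2)

se=sqrt(sp2*(1/n0+1/n1))

# t statistic and 95% confidence interval for mean(group 0) - mean(group 1)
t.hand=(mean(g0)-mean(g1))/se

ci.hand=(mean(g0)-mean(g1))+c(-1,1)*qt(0.975, n0+n1-2)*se

# Compare with the built-in test; both comparisons should return TRUE
fit=t.test(glucose~exercise, data=toy.tt, var.equal=TRUE)

all.equal(unname(fit$statistic), t.hand)

all.equal(as.numeric(fit$conf.int), ci.hand)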

 

# What may you conclude from the test and the confidence interval?

 

 

# c)

# Perform a simple linear regression with glucose level as outcome and exercise as predictor: