Assignment 3
Due: Monday 9th May, 5:00pm
The maximum mark for this assignment is 60. It forms 30% of the final grade for this subject. Your assignment should be submitted via Gradescope as a PDF document.
Please put your student ID number in the header of the document.
Unless you are asked to do so, please do not include any Stata output in your assignment document. Instead, format any results you want to show in a way that would be suitable for inclusion in a report or journal article.
This assignment has 6 short-answer questions. You should attempt all questions.
For questions where you are asked to calculate the answers by hand, please show your workings.
For this assessment, you will need to download and open the files “Assignment3_wound.dta” and “Assignment3_chol.dta” from Canvas. We recommend that you perform all Stata tasks via a do-file (you can use the do-files from the Stata practicals located on Canvas, as a guide).
Section A - Wound Healing Dataset
The Wound Healing Society defines a chronic wound as one that has failed to proceed through an orderly and timely reparative process to produce anatomic and functional integrity within an expected period. Chronic wounds represent a significant annual burden on the Australian health care system, with direct health care costs reaching $2.85 billion. Several factors can interfere with one or more phases of the wound healing process, thus causing improper or impaired wound healing. Such factors include age, stress, diabetes, obesity, medications, alcoholism, smoking, and nutrition. A better understanding of the influence of these factors on repair may lead to therapeutics that improve wound healing and resolve impaired wounds.
For this section of the assignment you will be using the dataset “Assignment3_wound.dta”, a prospective cohort study of 750 wound patients investigating the risk factors for wound infection, where the patients were given uniform treatment over 12 weeks.
Table 1: Description of the variables in the Assignment3_wound.dta dataset
Variable name | Description |
id | Study participant identification number |
age | Age (years) |
sex_male | Sex (0 – Female, 1 – Male) |
bmi | Body mass index (kg/m²) |
smoke | Smoking status (0 – Non-smoker, 1 – Smoker) |
alc | Alcohol consumption per week (ml/week) |
stress | Stress score (units, Range: 0 - No stress, 10 - Maximum stress) |
diab | Type II diabetes (0 – No, 1 – Yes) |
infect | Was the wound infected at any time in twelve weeks? (0 – No, 1 – Yes) |
All variables except wound infection (infect) were measured at hospital admission.
Question 1 [11 marks]
Wound infections are common among smokers. In this question we will explore the association between smoking (smoke) and wound infection (infect).
- Visualise the unadjusted association between smoking (smoke) and wound infection (infect) by completing the 2×2 table below. [2 marks]
Smoking | Wound infection: Yes | Wound infection: No | Total |
Smoker (Group 1) | | | |
Non-smoker (Group 0) | | | |
Total | | | |
- Calculate by hand, the odds of wound infection separately for smokers and non-smokers. [2 marks]
- Calculate by hand, the odds ratio for the association between smoking and wound infection. [1 mark]
- Calculate by hand, a 95% confidence interval for the population odds ratio for the association between smoking and wound infection. [4 marks]
- Interpret the estimated odds ratio for the association between smoking and wound infection and the corresponding 95% confidence interval for the population odds ratio calculated above, and comment on the association between smoking and wound infection. [2 marks]
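As an illustration of the hand calculations in Question 1, the following Python sketch computes the odds in each group, the odds ratio and a 95% confidence interval built on the log odds ratio scale. The cell counts are hypothetical placeholders, not values from Assignment3_wound.dta, and the function name odds_ratio_ci is an assumption made for this example. It assumes the usual large-sample standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d).

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds, odds ratio and confidence interval from a 2x2 table.

    a, b = exposed group with / without the outcome
    c, d = unexposed group with / without the outcome
    z    = normal multiplier (1.96 gives a 95% interval)
    """
    odds_exposed = a / b
    odds_unexposed = c / d
    odds_ratio = odds_exposed / odds_unexposed

    # Large-sample standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    # Build the interval on the log scale, then back-transform
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_exposed, odds_unexposed, odds_ratio, lower, upper


# Hypothetical counts only (NOT taken from the assignment data)
result = odds_ratio_ci(a=30, b=70, c=20, d=80)
print([round(x, 3) for x in result])
```

Because the interval is symmetric on the log scale, it is not symmetric around the odds ratio itself once back-transformed.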
Question 2 [8 marks]
One of the research questions of this study was to explore whether patients with high alcohol consumption were more susceptible to wound infections. Given this was a prospective cohort study with patients followed up for wound infection over 12 weeks, all three measures (risk difference, risk ratio and odds ratio) can be calculated and used to explore this association.
Participants who drink at least the average amount of alcohol per week, estimated to be 187 ml/week, are considered to have a high alcohol consumption. Generate a binary variable for high alcohol consumption named high_alc which categorises alc into two groups; participants with an alcohol consumption of at least 187 ml/week [coded as 1, “High”] and participants with an alcohol consumption of less than 187 ml/week [coded as 0, “Low”]. Use this new binary variable high_alc for alcohol consumption in Question 2.
- Using Stata, obtain the frequency and proportion of patients with wound infection separately for those with a high alcohol consumption and a low alcohol consumption in this study. Write down the Stata command you used to obtain these proportions as well. [3 marks]
- Using Stata, obtain an estimate for the population risk difference in wound infection between those with high alcohol consumption and those with low alcohol consumption, the 95% confidence interval for the population risk difference and the two-sided p-value for the null hypothesis of no difference in the population risk of wound infection between those with high alcohol consumption and those with low alcohol consumption. Write down the Stata command you used to obtain these results as well. [2 marks]
- Interpret the estimated risk difference in wound infection between those with high alcohol consumption and those with low alcohol consumption, the corresponding 95% confidence interval for the population risk difference and the p-value you obtained from Stata, and comment on the association between alcohol consumption and wound infection. [3 marks]
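The dichotomisation described in Question 2 can be sketched in Python as below. This is only an illustration of the coding rule (at least 187 ml/week coded as 1, otherwise 0) and of tabulating infections by group. The data values, the function names dichotomise and infection_by_group, and the use of None for a missing value are all assumptions made for this example, not part of the assignment dataset.

```python
def dichotomise(values, threshold=187):
    """Code values >= threshold as 1 (High) and lower values as 0 (Low).

    Missing values (None) stay missing instead of falling into either group.
    """
    return [None if v is None else int(v >= threshold) for v in values]


def infection_by_group(groups, outcomes):
    """Return {group: (number infected, group size, proportion infected)}."""
    counts = {}
    for g, y in zip(groups, outcomes):
        if g is None or y is None:
            continue  # skip records with a missing group or outcome
        infected, total = counts.get(g, (0, 0))
        counts[g] = (infected + y, total + 1)
    return {g: (k, n, k / n) for g, (k, n) in counts.items()}


# Hypothetical records only (NOT taken from the assignment data)
alc = [0, 120, 187, 300, None]
infect = [0, 0, 1, 1, 0]

high_alc = dichotomise(alc)
print(high_alc)
print(infection_by_group(high_alc, infect))
```

The explicit handling of missing values matters in Stata as well: missing numeric values compare as larger than any number, so an expression such as alc >= 187 evaluates as true for a missing alc unless missing values are excluded.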
Question 3 [10 marks]
The study investigators were interested in analysing the cross-sectional hospital admission data on type II diabetes (diab) and stress scores (stress), and exploring whether patients with type II diabetes had increased stress levels.
- Using Stata, obtain the mean and the standard error of the mean stress score for patients with type II diabetes and patients without type II diabetes in this study. Write down the Stata command you used to obtain these results as well. [2 marks]
| With type II diabetes | Without type II diabetes |
Sample mean (units) | | |
Standard error of the mean (units) | | |
- Calculate by hand and interpret, the difference in the mean stress score between patients with type II diabetes and without type II diabetes in this study. [2 marks]
- Calculate by hand, a 95% confidence interval for the population mean difference in the stress score between patients with type II diabetes and without type II diabetes. [3 marks]
- Calculate by hand and interpret, a two-sided p-value for the null hypothesis that there is no difference in the population mean stress score between patients with type II diabetes and without type II diabetes. [3 marks]
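The hand calculations in Question 3 start from two group means and their standard errors. The Python sketch below shows one common large-sample approach, under stated assumptions: the standard error of the difference is taken as sqrt(se1² + se0²), and the interval and p-value use the standard normal distribution. Your course notes may instead call for a pooled standard deviation or a t-distribution, so treat this as an illustration only. The input numbers are hypothetical and the function names are assumptions for this example.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_difference_summary(mean1, se1, mean0, se0, z=1.96):
    """Difference in means, its SE, a confidence interval and a two-sided p-value.

    Assumes independent groups and a normal approximation.
    """
    diff = mean1 - mean0
    se_diff = math.sqrt(se1 ** 2 + se0 ** 2)
    lower = diff - z * se_diff
    upper = diff + z * se_diff
    z_stat = diff / se_diff
    p_value = 2.0 * (1.0 - normal_cdf(abs(z_stat)))
    return diff, se_diff, lower, upper, z_stat, p_value


# Hypothetical summary statistics only (NOT taken from the assignment data)
out = mean_difference_summary(mean1=5.0, se1=0.3, mean0=4.0, se0=0.4)
print([round(x, 4) for x in out])
```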
Question 4 [15 marks]
Type II diabetes is a known risk factor for wound infection. In this question, first we will explore the association between type II diabetes (diab) and wound infection (infect).
- Visualise using Stata, the association between type II diabetes and wound infection by completing the 2×2 table below. [2 marks]
Type II diabetes | Wound infection: Yes | Wound infection: No | Total |
Yes (Group 1) | | | |
No (Group 0) | | | |
Total | | | |
- Calculate by hand, the risk of wound infection (i.e., the proportion of patients with wound infection) for those with and without type II diabetes. [2 marks]
- Calculate by hand and interpret, the difference in the risk of wound infection between those with type II diabetes and those without type II diabetes in this study. [2 marks]
- Calculate by hand, the standard error for the sample risk difference in wound infection between those with type II diabetes and those without type II diabetes. [3 marks]
- Calculate by hand, a 90% confidence interval for the population risk difference in wound infection between those with type II diabetes and those without type II diabetes. [2 marks]
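The risk, risk difference, standard error and 90% confidence interval asked for in Question 4 can be illustrated with the Python sketch below. The counts are hypothetical placeholders, not values from Assignment3_wound.dta, and the function name risk_difference_ci is an assumption for this example. It assumes the usual large-sample standard error, sqrt(p1(1 − p1)/n1 + p0(1 − p0)/n0), and a normal multiplier of 1.645 for a 90% interval.

```python
import math


def risk_difference_ci(a, n1, c, n0, z=1.645):
    """Risks, risk difference, standard error and confidence interval.

    a, n1 = number with the outcome and group size in Group 1
    c, n0 = number with the outcome and group size in Group 0
    z     = normal multiplier (1.645 gives a 90% interval)
    """
    p1 = a / n1
    p0 = c / n0
    rd = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return p1, p0, rd, se, rd - z * se, rd + z * se


# Hypothetical counts only (NOT taken from the assignment data)
out = risk_difference_ci(a=40, n1=100, c=50, n0=200)
print([round(x, 4) for x in out])
```

Changing z to 1.96 gives the corresponding 95% interval, which is wider than the 90% interval for the same data.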
Smoking is a known risk factor for type II diabetes and smoking further complicates managing type II diabetes and regulating insulin levels. The study investigators were interested in analysing the cross-sectional hospital admission data on smoking (smoke) and type II diabetes (diab).
- Investigate the association between smoking (smoke) and type II diabetes (diab) in this study by calculating the corresponding odds ratio, the 95% confidence interval for the population odds ratio and a two-sided p-value for the null hypothesis of no association between smoking and type II diabetes using Stata. Interpret your results and comment on the association between smoking and type II diabetes. Write down the Stata command you used to obtain your results as well. [4 marks]
Section B - Cholesterol Dataset
High cholesterol is a major health concern in many countries across the globe including Australia and is a leading cause of heart disease. Age, smoking, overweight or obesity, family history, unhealthy diet and lack of physical activity are some of the common risk factors for high cholesterol.
For this section of the assignment you will be using the dataset “Assignment3_chol.dta”, a random sample of 150 participants from a prospective cohort study investigating the effect of overweight/obesity on total cholesterol levels.
Table 2: Description of the variables in the Assignment3_chol.dta dataset
Variable name | Description |
id | Study participant identification number |
age | Age (years) |
sex_female | Sex (0 – Male, 1 – Female) |
bmi_WHO | Body Mass Index category based on WHO classification (1 – <25 kg/m² normal, 2 – 25-29 kg/m² overweight, 3 – ≥30 kg/m² obese) |
smoke | Current smoking status (1 – Non-smoker, 2 – Ex-smoker, 3 – Current smoker (CPD ≤ 20), 4 – Current smoker (CPD > 20)) |
totchol | Total cholesterol level (mmol/L) |
CPD = Cigarettes per day; WHO = World Health Organisation
Question 5 [13 marks]
In this question, we will explore the association between cholesterol (totchol) and body mass index (BMI).
- For the purpose of this question, using the bmi_WHO variable, generate a new binary variable named bmi_WHO_bin coded as follows: participants with a normal BMI [coded as 0, “Normal”] and participants who are overweight or obese [coded as 1, “Overweight/Obese”]. Write down the Stata command you used to generate the new variable. Obtain using Stata, the frequency and proportion of participants who are overweight/obese in this study. [3 marks]
Use this new binary variable bmi_WHO_bin for BMI in Questions 5b and 5c.
- Investigate the association between cholesterol (totchol) and BMI (bmi_WHO_bin) by calculating the corresponding estimated mean difference, a 95% confidence interval for the population mean difference and a two-sided p-value for the null hypothesis of no association between cholesterol and BMI, by conducting an unpaired t-test in Stata. Interpret your results and comment on the association between cholesterol and BMI. Write down the Stata command you used to conduct the unpaired t-test. [5 marks]
- Write an abstract based on the analyses you performed above in Question 5 to explore the association between cholesterol and BMI. Your abstract should include the headings; Background, Methods, Results and Conclusion (max 200 words). [5 marks]
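To show what an unpaired t-test such as the one in Question 5 computes, the Python sketch below works through the equal-variance (pooled) version: the mean difference, the pooled variance, the standard error, the t statistic and the degrees of freedom. The cholesterol values are hypothetical, the function name pooled_t_statistic is an assumption for this example, and the p-value is left out because it needs the t-distribution, which the Python standard library does not provide.

```python
import math
import statistics


def pooled_t_statistic(group1, group0):
    """Mean difference, SE, t statistic and degrees of freedom (equal variances)."""
    n1, n0 = len(group1), len(group0)
    mean_diff = statistics.mean(group1) - statistics.mean(group0)

    # Pooled variance: sample variances weighted by their degrees of freedom
    df = n1 + n0 - 2
    pooled_var = (
        (n1 - 1) * statistics.variance(group1)
        + (n0 - 1) * statistics.variance(group0)
    ) / df

    se = math.sqrt(pooled_var * (1 / n1 + 1 / n0))
    return mean_diff, se, mean_diff / se, df


# Hypothetical total cholesterol values in mmol/L (NOT from the assignment data)
overweight_obese = [5.0, 6.0, 7.0]
normal = [4.0, 5.0, 6.0]
print(pooled_t_statistic(overweight_obese, normal))
```

When comparing against software output, check which group is subtracted from which: Stata's ttest with the by() option reports the difference as the first group minus the second, so the sign can be the reverse of the one computed here.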
Question 6 [3 marks]
Please provide a copy of your Stata do-file for performing the statistical analyses for this assignment, covering both Sections A and B. Do not upload a second file when submitting your assignment. Instead, copy and paste the Stata do-file commands into your assignment Word document before converting it to PDF.