Question 4 (10 marks)
Use the dataset billionaire.xlsx to answer the following questions. The dataset contains the following variables: Nation (country name), Number (number of billionaires), GDP (in billions USD), and Population (in millions).
- Compute GDP per capita, GDP_pc = GDP/Population, and obtain its median. Use the median of GDP_pc to split the countries into two groups: "Rich" (countries with GDP per capita above the median) and "Not_Rich" (countries with GDP per capita equal to or below the median). Find the mean number of billionaires for each group and use Seaborn's barplot to plot these two means. Comment on the results and discuss whether the two confidence intervals (CIs) are "symmetric around the mean value".
- Denote the mean number of billionaires of the Rich group by "meanN-Rich" and that of the Not_Rich group by "meanN-Not_Rich". Show that neither of the following two null hypotheses, (i) meanN-Rich = meanN-Not_Rich and (ii) meanN-Rich = 2 * meanN-Not_Rich, can be rejected at the 10% significance level. Discuss what might contribute to the non-rejections.
- Obtain two scatter plots with fitted lines: Number vs. GDP and Number vs. GDP per capita. Test if (i) Number and GDP are correlated and (ii) Number and GDP per capita are correlated. Comment on the results.
- Estimate the following two multiple regressions: (i) Number on GDP and Population (ii) Number on GDP per capita and Population. Compare and comment on the results of the two regressions.
- Use scatter plots to visualize the fit of the two regression predictions from Part 4 (as in Topic 8). Discuss the role of the United States in the fit.
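The median split in the first part and the test statistic behind the two null hypotheses in the second part can be sketched as follows. This is a minimal illustration using only the Python standard library, not a model answer. The rows are made-up toy values, not taken from billionaire.xlsx, and the helper names split_by_median_gdp_pc and t_stat are hypothetical. The statistic assumes two independent samples with unequal variances (a Welch-style form); the course may intend a different test.

```python
from math import sqrt
from statistics import mean, median, variance

# Hypothetical toy rows: (Nation, Number, GDP in billions USD, Population in millions).
# These values are invented for illustration; they are NOT from billionaire.xlsx.
rows = [
    ("A", 10, 2000.0, 50.0),  # GDP_pc = 40
    ("B", 2, 300.0, 30.0),    # GDP_pc = 10
    ("C", 5, 900.0, 30.0),    # GDP_pc = 30
    ("D", 1, 100.0, 20.0),    # GDP_pc = 5
]


def split_by_median_gdp_pc(rows):
    """Return the median GDP per capita and the billionaire counts per group."""
    gdp_pc = {nation: gdp / pop for nation, _, gdp, pop in rows}
    cutoff = median(gdp_pc.values())
    groups = {"Rich": [], "Not_Rich": []}
    for nation, number, _, _ in rows:
        # Strictly above the median is "Rich"; equal to or below is "Not_Rich".
        label = "Rich" if gdp_pc[nation] > cutoff else "Not_Rich"
        groups[label].append(number)
    return cutoff, groups


def t_stat(rich, not_rich, c=1.0):
    """t statistic for H0: mean(rich) = c * mean(not_rich), unequal variances assumed."""
    diff = mean(rich) - c * mean(not_rich)
    # Var(xbar_R - c * xbar_NR) = s_R^2 / n_R + c^2 * s_NR^2 / n_NR for independent samples.
    se = sqrt(variance(rich) / len(rich) + c**2 * variance(not_rich) / len(not_rich))
    return diff / se


cutoff, groups = split_by_median_gdp_pc(rows)
group_means = {label: mean(values) for label, values in groups.items()}
t_equal = t_stat(groups["Rich"], groups["Not_Rich"], c=1.0)   # hypothesis (i)
t_double = t_stat(groups["Rich"], groups["Not_Rich"], c=2.0)  # hypothesis (ii)
print(cutoff, group_means, t_equal, t_double)
```

Each statistic would then be compared with the two-sided 10% critical value of the appropriate t distribution. With the real data, a large standard error in either group would make both hypotheses hard to reject.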