Python Data - Assignment 2 Questions

Question 3 (8 marks)

The file Party.csv contains data on a sample of 250 voters with tracked variables, including party preference (Party=1 or 0), Age, Female (gender), Married (marital status), Income (in thousands), Education (schooling years), and Religion (Religion=1: religious, and 0: non-religious).

  1. Estimate a logistic regression of Party on Age, Female, Married, Income, Education, and Religion with statsmodels. Discuss the significance of each coefficient and the overall model fit.
  2. Based on the results of Part 1, build the confusion matrix from the in-sample predictions. Compute and discuss the prediction accuracy, precision, and recall.
  3. Based on the results of Part 1, construct two groups of voters: Group A consists of voters with a predicted probability above 75% of voting for Party 1, and Group B consists of voters with a predicted probability above 75% of voting for Party 0. How many voters are in Group A? How many voters are in Group B? Find the 90% confidence interval of the mean income for each of these two groups of voters. Comment on the results.
  4. Perform KMeans clustering with Age, Female, Income, Education, and Religion, and use the Elbow curve to justify that the optimal number of clusters is 3. Form 3 clusters and use a crosstab to check whether the clustering outcome reflects party preference. Comment on the results.
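
The bookkeeping behind Parts 2 and 3 can be sketched in plain Python. This is only an illustrative sketch, not the expected solution: the helper names (confusion_counts, classification_metrics, split_groups), the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for the example. In the actual assignment the probabilities would come from the model fitted in Part 1 (for instance the predict method of a statsmodels Logit result). Note that "over 75% predicted probability of voting for Party 0" is read here as 1 - P(Party=1) > 0.75.

```python
def confusion_counts(y_true, p_hat, threshold=0.5):
    """Return (tn, fp, fn, tp) for 0/1 labels and predicted P(Party=1).

    The 0.5 threshold is an assumed default, not given in the question.
    """
    tn = fp = fn = tp = 0
    for y, p in zip(y_true, p_hat):
        pred = 1 if p >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def classification_metrics(tn, fp, fn, tp):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, precision, recall


def split_groups(p_hat, cutoff=0.75):
    """Return index lists for Group A (P(Party=1) > cutoff)
    and Group B (P(Party=0) = 1 - P(Party=1) > cutoff)."""
    group_a = [i for i, p in enumerate(p_hat) if p > cutoff]
    group_b = [i for i, p in enumerate(p_hat) if (1 - p) > cutoff]
    return group_a, group_b


# Toy data for illustration only; these are NOT values from Party.csv.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
p_hat = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]

counts = confusion_counts(y_true, p_hat)
metrics = classification_metrics(*counts)
group_a, group_b = split_groups(p_hat)
print(counts, metrics, group_a, group_b)
```

Voters whose predicted probability lies between 25% and 75% fall into neither group, so the sizes of Group A and Group B need not add up to 250.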
