7318AFE Business Data Analytics - 24-Hour Assignment Questions

Question 4 (10 marks)

Use loans1000.csv for this question. There are 6 variables in the dataset (see below) -- it is of interest to use [1]-[5] to predict [6]. Set random_state=1234 throughout (when required).

  • [1] credit.policy: 1 if the customer meets the credit underwriting criteria, and 0 otherwise.
  • [2] log.annual.inc: The natural log of the self-reported annual income of the borrower.
  • [3] dti: The debt-to-income ratio of the borrower (amount of debt divided by annual income).
  • [4] fico: The FICO credit score of the borrower.
  • [5] delinq.2yrs: The number of times the borrower had been 30+ days past due on a payment in the past 2 years.
  • [6] not.fully.paid: 1 if the borrower did not fully repay the loan, and 0 otherwise.
  1. How many borrowers do not pay the loan fully? Are the mean values of FICO credit score significantly different between fully-paid borrowers and not.fully.paid borrowers? Justify.
  2. Apply KMeans clustering using [1]-[5] (i.e. credit.policy, log.annual.inc, dti, fico, and delinq.2yrs). Find the optimal number of clusters (with the maximum number of clusters set to 10) without scaling. Justify your answer.
  3. Form 2 clusters (with KMeans) and use a crosstab to examine whether the clustering outcome is in line with the borrowers' "not.fully.paid" status. Comment on the results.
  4. Create a random partition of the loans1000.csv dataset with 70% of observations in the training set and the remaining 30% in the test set. Report the sample mean and standard deviation for each of the variables in both train and test sets, separately. Comment on the results.
  5. Based on the data split of Part 4, apply Decision Tree, Random Forest, and Gradient Boosting (with n_estimators=1000) to the Train set and use the obtained models to predict the Test set. Use accuracy, precision, and recall to evaluate the performance of these models. Comment on the results.
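
The five parts above could be approached along the following lines. This is only a sketch under stated assumptions, not a model answer: it assumes Python with pandas, SciPy, and scikit-learn, and it assumes the column names in loans1000.csv match the variable list exactly. Welch's t-test and the elbow (inertia) heuristic are one reasonable choice each for Parts 1 and 2, not the only ones. The helper names and the make_placeholder_loans function are hypothetical; the placeholder data is random and stands in for the real file only so the code runs on its own. n_estimators=1000 is applied to Gradient Boosting only, which is one reading of Part 5.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1234  # random_state required by the question
FEATURES = ["credit.policy", "log.annual.inc", "dti", "fico", "delinq.2yrs"]
TARGET = "not.fully.paid"


def make_placeholder_loans(n=1000, seed=SEED):
    """Hypothetical random stand-in with the same columns; NOT the real data."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "credit.policy": rng.integers(0, 2, n),
        "log.annual.inc": rng.normal(10.9, 0.6, n),
        "dti": rng.uniform(0, 30, n),
        "fico": rng.integers(612, 828, n),
        "delinq.2yrs": rng.poisson(0.2, n),
        "not.fully.paid": rng.integers(0, 2, n),
    })


def fico_ttest(df):
    """Part 1: number of non-payers and Welch's t-test on mean FICO."""
    paid = df.loc[df[TARGET] == 0, "fico"]
    unpaid = df.loc[df[TARGET] == 1, "fico"]
    t_stat, p_value = stats.ttest_ind(paid, unpaid, equal_var=False)
    return int(df[TARGET].sum()), t_stat, p_value


def elbow_inertias(df, max_k=10):
    """Part 2: within-cluster sum of squares for k = 1..max_k, unscaled."""
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=SEED)
        .fit(df[FEATURES]).inertia_
        for k in range(1, max_k + 1)
    }


def cluster_crosstab(df):
    """Part 3: 2-cluster labels cross-tabulated against the target."""
    km = KMeans(n_clusters=2, n_init=10, random_state=SEED)
    labels = pd.Series(km.fit_predict(df[FEATURES]), index=df.index, name="cluster")
    return pd.crosstab(labels, df[TARGET])


def split_and_summarise(df):
    """Part 4: 70/30 split plus per-variable mean and standard deviation."""
    train, test = train_test_split(df, test_size=0.3, random_state=SEED)
    return train, test, train.agg(["mean", "std"]), test.agg(["mean", "std"])


def evaluate_models(train, test, n_estimators=1000):
    """Part 5: fit three classifiers on train, score them on test."""
    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=SEED),
        "Random Forest": RandomForestClassifier(random_state=SEED),
        "Gradient Boosting": GradientBoostingClassifier(
            n_estimators=n_estimators, random_state=SEED
        ),
    }
    rows = {}
    for name, model in models.items():
        pred = model.fit(train[FEATURES], train[TARGET]).predict(test[FEATURES])
        rows[name] = {
            "accuracy": accuracy_score(test[TARGET], pred),
            "precision": precision_score(test[TARGET], pred, zero_division=0),
            "recall": recall_score(test[TARGET], pred, zero_division=0),
        }
    return pd.DataFrame(rows).T
```

With the real file, the placeholder would be replaced by df = pd.read_csv("loans1000.csv"), and each helper would be called on df in turn. Because fico sits on a far larger scale than the other four variables, unscaled KMeans distances are likely to be dominated by it, which is worth keeping in mind when commenting on Parts 2 and 3.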
