7318AFE Business Data Analytics - 24-Hour Assignment Questions

Question 3 (10 marks)

Use the dataset yelp1000.xlsx to answer this question. The dataset contains the following columns: date, stars, text (restaurant text reviews), cool, useful, and funny.

  1. Create a pie chart of the number of reviews (texts) for each star rating. Find the most frequently used word (excluding stopwords) for each star rating.
  2. Obtain the sentiment scores (compound, neg, neu and pos) for each review (text). Report the most positive text and the most negative one. How many texts have a neu score equal to 1?
  3. Find the interquartile range of compound for each stars group (1 to 5) and make a boxplot of compound using a different color for each stars group. Comment on the outcomes.
  4. Find the total number of reviews in 2011 and 2012, respectively. Find the proportion of reviews with compound below zero in 2011 and 2012, respectively. Compute the 90% confidence interval of that proportion in 2011 and 2012, respectively. Comment on the results.
  5. Use "apply(len)" to create a new column called "length", which is the number of words in the text column. Test whether 60% (60 per cent) of the reviews (texts) have more than 500 words (including spaces). State the null hypothesis and the alternative hypothesis. Comment on the results.
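
The statistical steps in parts 3 to 5 (interquartile range, a 90% confidence interval for a proportion, and a test of whether a proportion equals 60%) can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not a model answer. It assumes the normal-approximation (Wald) interval and a two-sided one-sample z-test, neither of which the question prescribes. The helper names iqr, proportion_ci and proportion_z_test are hypothetical, and all numbers are made up for illustration rather than taken from yelp1000.xlsx. Note also that applying len to a string gives its number of characters, spaces included, not its number of words.

```python
from math import sqrt
from statistics import NormalDist, quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) using the inclusive quartile method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def proportion_ci(successes, n, level=0.90):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.645 for a 90% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def proportion_z_test(successes, n, p0):
    """Two-sided one-sample z-test of H0: p = p0 against H1: p != p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up numbers for illustration only; not taken from yelp1000.xlsx.
compound_scores = [-0.5, 0.0, 0.2, 0.6, 0.9]
spread = iqr(compound_scores)

# Hypothetical: 40 of 400 reviews in one year have compound below zero.
low, high = proportion_ci(successes=40, n=400, level=0.90)

# Hypothetical: 620 of 1000 reviews have length above 500; H0: p = 0.60.
z_stat, p_val = proportion_z_test(successes=620, n=1000, p0=0.60)

print(f"IQR: {spread:.3f}")
print(f"90% CI: ({low:.4f}, {high:.4f})")
print(f"z = {z_stat:.3f}, p-value = {p_val:.4f}")
```

With real data, the counts passed to these helpers would come from the sentiment and length columns, for example the number of rows where compound is below zero within a given year. If the p-value is above the chosen significance level, the null hypothesis that the proportion equals 60% is not rejected.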
