BUS5PA Predictive Analytics

Assistance on Data Analytics Report

Release Date: May 2025
Due Date: 6th June 2025 11:55 PM
Weight: 30%
Format of Submission: A report in PDF format not exceeding 20 pages + SAS files in spk format

Part A - Cluster Analysis (40%)

A wholesale supply company sells four types of dungarees (overalls): fashion jeans, leisure jeans, stretch jeans, and original jeans. The owner of the supply company is interested in identifying the groupings of stores where his products are supplied. To identify such groupings, the owner has selected the DUNGAREE data set, which provides the number of pairs of four different types of dungarees sold at stores over a specific time period. In the DUNGAREE data set, each row represents an individual store. The dataset contains six columns. One column displays the store identification number, and the remaining columns contain the number of pairs of each type of jeans sold. The variables in the dataset are listed below, along with their corresponding roles and levels.

Name      Model Role  Measurement Level  Description
STOREID   ID          Nominal            Identification number of the store
FASHION   Input       Interval           Number of pairs of fashion jeans sold at the store
LEISURE   Input       Interval           Number of pairs of leisure jeans sold at the store
STRETCH   Input       Interval           Number of pairs of stretch jeans sold at the store
ORIGINAL  Input       Interval           Number of pairs of original jeans sold at the store
SALESTOT  Rejected    Interval           Total number of pairs of jeans sold (the sum of FASHION, LEISURE, STRETCH, and ORIGINAL)

 

a. Create a new diagram in your project. Name the diagram Jeans.

b. Define the data set DUNGAREE as a data source. Determine whether the model roles and measurement levels assigned to the variables are appropriate.

c. Examine the distribution of the variables.

  • Are there any unusual data values?
     
  • Are there missing values that should be replaced?
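
As a rough illustration of what step c is asking for, the following Python sketch (outside SAS, standard library only) summarizes one interval variable and flags values beyond the usual 1.5 x IQR fences. The numbers are made up for illustration and are not taken from the DUNGAREE data set; the helper name `summarize` and the 1.5 x IQR rule are assumptions, and `None` is assumed to mark a missing value.

```python
import statistics

def summarize(values):
    """Summarize one interval variable; None marks a missing value."""
    present = [v for v in values if v is not None]
    # Quartiles of the non-missing values (default 'exclusive' method).
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        # Values outside the 1.5 x IQR fences are flagged as unusual.
        "unusual": [v for v in present if v < low or v > high],
    }

# Hypothetical sales counts, NOT real DUNGAREE values.
fashion = [92, 88, 95, None, 90, 91, 310, 89]
summary = summarize(fashion)
print(summary)
```

A flagged value is only a candidate for closer inspection; whether it is an error or a genuinely unusual store is a judgment the report still has to make.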

d. Assign the variable STOREID the model role ID and the variable SALESTOT the model role Rejected. Ensure that the remaining variables have the model role Input and the measurement level Interval. Why should the variable SALESTOT be rejected?
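
One way to explore the question in step d is to check how SALESTOT relates to the four inputs. The Python sketch below uses two made-up store rows (not DUNGAREE values) to confirm that SALESTOT is just the row sum, and to show how including it changes the Euclidean distance between two stores. The helper `distance` and the variable names are assumptions introduced for illustration.

```python
import math

# Hypothetical rows, made up for illustration; NOT taken from DUNGAREE.
stores = [
    {"FASHION": 100, "LEISURE": 50, "STRETCH": 30, "ORIGINAL": 20, "SALESTOT": 200},
    {"FASHION": 40, "LEISURE": 60, "STRETCH": 35, "ORIGINAL": 25, "SALESTOT": 160},
]
inputs = ["FASHION", "LEISURE", "STRETCH", "ORIGINAL"]

# SALESTOT carries no information beyond the four inputs if it is their sum.
is_redundant = all(
    row["SALESTOT"] == sum(row[c] for c in inputs) for row in stores
)

def distance(a, b, columns):
    """Euclidean distance between two stores over the given columns."""
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in columns))

dist_without = distance(stores[0], stores[1], inputs)
dist_with = distance(stores[0], stores[1], inputs + ["SALESTOT"])
print(is_redundant, round(dist_without, 2), round(dist_with, 2))
```

In this toy case the distance grows once SALESTOT is added, even though no new information has entered the comparison.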

e. Add an Input Data Source node to the diagram workspace and select the DUNGAREE data table as the data source.

f. Add a Cluster node to the diagram workspace and connect it to the Input Data node.

g. Select the Cluster node. Leave the default setting as Internal Standardization. What would happen if inputs were not standardized?
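
To get a feel for the question in step g, the sketch below compares how much one variable contributes to the squared Euclidean distance between two stores before and after z-score standardization. It is plain Python, not the Cluster node's own computation; the data are made up, the z-score form (subtract the mean, divide by the sample standard deviation) is an assumption about what standardization means here, and `standardize` and `share_of_first` are hypothetical helper names.

```python
import statistics

def standardize(column):
    """Z-score a column: subtract the mean, divide by the sample std dev."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]

def share_of_first(col_a, col_b, i, j):
    """Fraction of the squared distance between stores i and j due to col_a."""
    da = (col_a[i] - col_a[j]) ** 2
    db = (col_b[i] - col_b[j]) ** 2
    return da / (da + db)

# Hypothetical columns on very different scales; NOT real DUNGAREE values.
original = [1000, 1200, 1100, 1900]
stretch = [10, 60, 12, 58]

raw_share = share_of_first(original, stretch, 0, 1)
std_share = share_of_first(standardize(original), standardize(stretch), 0, 1)
print(round(raw_share, 3), round(std_share, 3))
```

On the raw values the large-scale column accounts for almost all of the distance; after standardization the two columns compete on comparable terms.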

h. Run the diagram from the Cluster node and examine the results. Does the number of clusters created seem reasonable? Discuss using concepts from class discussions, for example what constitutes a cluster and how many clusters are appropriate.
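
One common heuristic for judging a cluster count is to look at how the within-cluster sum of squares (WSS) falls as the number of clusters grows. The sketch below is a minimal k-means in plain Python on made-up two-dimensional points. It is an illustration of that heuristic only: it is not the method the Cluster node uses to pick its number of clusters, and the function `kmeans`, its deterministic starting centers, and the data are all assumptions.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal k-means; returns (centers, within-cluster sum of squares)."""
    # Deterministic start (an assumption): spread centers over the sorted data.
    ordered = sorted(points)
    centers = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    wss = sum(
        math.dist(p, centers[i]) ** 2 for i, g in enumerate(groups) for p in g
    )
    return centers, wss

# Hypothetical points forming three tight groups; NOT real DUNGAREE values.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (0, 10), (0, 11), (1, 10), (1, 11),
]
wss_by_k = {k: kmeans(points, k)[1] for k in range(1, 5)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 2))
```

In this toy data the WSS drops sharply up to three clusters and barely improves with a fourth, which is the kind of "elbow" that can support an argument about a reasonable cluster count.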

j. Use the Segment Profile node to summarize the nature of the clusters. Describe the profiles.
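
A simple way to think about a cluster profile is to compare each cluster's mean on a variable with the overall mean. The Python sketch below does this for made-up rows with hypothetical cluster labels; it is not the Segment Profile node's output, and the ratio-to-overall-mean measure is an assumption chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (cluster label, sales) rows; NOT real DUNGAREE values.
stores = [
    (1, {"FASHION": 90, "ORIGINAL": 10}),
    (1, {"FASHION": 110, "ORIGINAL": 30}),
    (2, {"FASHION": 20, "ORIGINAL": 80}),
    (2, {"FASHION": 20, "ORIGINAL": 120}),
]
variables = ["FASHION", "ORIGINAL"]

# Overall mean of each variable across all stores.
overall = {v: mean(row[v] for _, row in stores) for v in variables}

# Group the rows by cluster label.
by_cluster = defaultdict(list)
for cluster, row in stores:
    by_cluster[cluster].append(row)

# Ratio of cluster mean to overall mean: above 1 means the cluster
# sells more of that type than the average store, below 1 means less.
profile = {
    cluster: {v: mean(r[v] for r in rows) / overall[v] for v in variables}
    for cluster, rows in by_cluster.items()
}
for cluster, ratios in sorted(profile.items()):
    print(cluster, {v: round(x, 2) for v, x in ratios.items()})
```

Ratios like these can be turned into a description of each cluster, for example a cluster of stores that sells far more fashion jeans and far fewer original jeans than the average store.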
