Release Date: May 2025
Due Date: 6th June 2025 11:55 PM
Weight: 30%
Format of Submission: A report in PDF format (not exceeding 20 pages) plus SAS files in SPK format
Part A - Cluster Analysis (40%)
A wholesale supply company sells four types of dungarees (overalls): fashion jeans, leisure jeans, stretch jeans, and original jeans. The owner of the supply company wants to identify groupings among the stores that his products are supplied to. To identify such groupings, the owner has selected the DUNGAREE data set, which records the number of pairs of each of the four types of dungarees sold at each store over a specific time period. In the DUNGAREE data set, each row represents an individual store. The data set contains six columns: one holds the store identification number, four hold the number of pairs of each type of jeans sold, and one holds the total number of pairs sold. The variables in the data set are listed below, along with their corresponding model roles and measurement levels.
| Name | Model Role | Measurement Level | Description |
|---|---|---|---|
| STOREID | ID | Nominal | Identification number of the store |
| FASHION | Input | Interval | Number of pairs of fashion jeans sold at the store |
| LEISURE | Input | Interval | Number of pairs of leisure jeans sold at the store |
| STRETCH | Input | Interval | Number of pairs of stretch jeans sold at the store |
| ORIGINAL | Input | Interval | Number of pairs of original jeans sold at the store |
| SALESTOT | Rejected | Interval | Total number of pairs of jeans sold (the sum of FASHION, LEISURE, STRETCH, and ORIGINAL) |
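The relationship stated for SALESTOT in the table can be checked outside SAS Enterprise Miner before any modelling. The following is a minimal Python sketch of such a check, not part of the required SAS workflow. The two rows are made-up illustrative values, not records from DUNGAREE, and the helper name `salestot_mismatches` is hypothetical.

```python
# Hypothetical rows for illustration only; these are NOT values from DUNGAREE.
rows = [
    {"STOREID": 1, "FASHION": 10, "LEISURE": 20, "STRETCH": 30,
     "ORIGINAL": 40, "SALESTOT": 100},
    {"STOREID": 2, "FASHION": 5, "LEISURE": 0, "STRETCH": 15,
     "ORIGINAL": 25, "SALESTOT": 45},
]

# The four input variables that SALESTOT is described as summing.
INPUTS = ("FASHION", "LEISURE", "STRETCH", "ORIGINAL")


def salestot_mismatches(rows):
    """Return the STOREIDs whose SALESTOT differs from the sum of the inputs."""
    return [
        r["STOREID"]
        for r in rows
        if sum(r[c] for c in INPUTS) != r["SALESTOT"]
    ]


print(salestot_mismatches(rows))
```

An empty list means every row satisfies the stated definition, so SALESTOT carries no information beyond the four inputs.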
a. Create a new diagram in your project. Name the diagram Jeans.
b. Define the data set DUNGAREE as a data source. Determine whether the model roles and measurement levels assigned to the variables are appropriate.
c. Examine the distribution of the variables.
- Are there any unusual data values?
- Are there missing values that should be replaced?
d. Assign the variable STOREID the model role ID and the variable SALESTOT the model role Rejected. Ensure that the remaining variables have the model role Input and the measurement level Interval. Why should the variable SALESTOT be rejected?
e. Add an Input Data Source node to the diagram workspace and select the DUNGAREE data table as the data source.
f. Add a Cluster node to the diagram workspace and connect it to the Input Data node.
g. Select the Cluster node. Leave the default setting as Internal Standardization. What would happen if inputs were not standardized?
h. Run the diagram from the Cluster node and examine the results. Does the number of clusters created seem reasonable? Discuss, drawing on class discussions (for example, what a cluster is and how many clusters you should have).
j. Use the Segment Profile node to summarize the nature of the clusters. Describe the profiles.
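The standardization question in step g can be explored with a small experiment outside SAS Enterprise Miner. The sketch below is in Python and assumes z-score standardization (subtract the mean, divide by the sample standard deviation) and Euclidean distance. Both are assumptions made for illustration, not a description of the Cluster node's internals. The five stores and their sales counts are made up and are not DUNGAREE values. The helper names `standardize`, `distance`, and `nearest` are hypothetical.

```python
import math
import statistics

# Hypothetical sales counts, illustration only (NOT values from DUNGAREE).
# ORIGINAL is on a much larger scale than FASHION.
stores = {
    "A": {"FASHION": 10, "ORIGINAL": 1000},
    "B": {"FASHION": 90, "ORIGINAL": 1050},
    "C": {"FASHION": 12, "ORIGINAL": 1300},
    "D": {"FASHION": 50, "ORIGINAL": 3000},
    "E": {"FASHION": 85, "ORIGINAL": 2900},
}
VARS = ("FASHION", "ORIGINAL")


def standardize(data):
    """Z-score each variable: subtract its mean, divide by its sample std dev."""
    out = {key: {} for key in data}
    for v in VARS:
        values = [row[v] for row in data.values()]
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        for key, row in data.items():
            out[key][v] = (row[v] - mean) / sd
    return out


def distance(p, q):
    """Euclidean distance between two stores over VARS."""
    return math.sqrt(sum((p[v] - q[v]) ** 2 for v in VARS))


def nearest(data, key):
    """Return the store closest to `key`, excluding `key` itself."""
    others = (k for k in data if k != key)
    return min(others, key=lambda k: distance(data[key], data[k]))


print("nearest to A, raw inputs:         ", nearest(stores, "A"))
print("nearest to A, standardized inputs:", nearest(standardize(stores), "A"))
```

On the raw values, store A's nearest neighbour is decided almost entirely by ORIGINAL, because its differences are numerically far larger than those in FASHION. After standardization both variables contribute on a comparable scale, and the nearest neighbour changes.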