CRICOS Code 00098G
INFS3830 – Social Media and Analytics – 2020 T1
2nd Hands-On Assignment
Due on Friday 20th March at 5pm. Report to be submitted on Moodle. This assignment is
worth 10 points (10% of the overall mark).
The purpose of this assignment is to use SAS Contextual Analysis to (1) explore a dataset,
(2) create concepts and categories for it, and (3) analyse it to generate insights from
large, unstructured data.
The dataset has over 10,000 reviews of music albums, taken from Amazon.com. The data
is a small sample of a larger dataset containing 142.8 million reviews, including ratings, text,
and helpfulness votes, spanning May 1996 – July 2014. If you would like more information on this
larger dataset, please visit the website http://jmcauley.ucsd.edu/data/amazon/.
The first step is to explore the data, which includes understanding the music and albums
that are in the dataset. The second step is to create concepts and categories from the
dataset, which organise the data and support your exploration. Finally, through the process
of refining terms, topics, concepts and categories, use the analysis to provide insights into
the dataset. These insights can be wide ranging, from identifying which artists and albums
are popular to finding trends in music and in consumer tastes.
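As a rough illustration of how categories built from concept terms can lead to an insight, the steps above can be sketched in Python. This is not part of the SAS Contextual Analysis workflow and is not required for the assignment. The toy reviews, the field names text and rating, and the three categories are all hypothetical, and the real DIGITAL_MUSIC_3 columns may be named differently.

```python
from collections import defaultdict

# Hypothetical toy records; the real DIGITAL_MUSIC_3 columns may differ.
reviews = [
    {"text": "Great guitar solos and a classic rock sound", "rating": 5},
    {"text": "The beats are weak and the rap lyrics feel lazy", "rating": 2},
    {"text": "Smooth jazz piano, perfect for a quiet evening", "rating": 4},
    {"text": "Loud rock album with a tired guitar riff", "rating": 3},
]

# Hypothetical categories, each defined by a small set of concept terms.
categories = {
    "rock": {"rock", "guitar", "riff"},
    "hip_hop": {"rap", "beats"},
    "jazz": {"jazz", "piano", "saxophone"},
}


def tokenize(text):
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,!?;:").lower() for word in text.split()}


def categorise(text):
    """Return the sorted names of categories whose terms appear in the text."""
    words = tokenize(text)
    return sorted(name for name, terms in categories.items() if words & terms)


def mean_rating_by_category(records):
    """Average the rating of every review matched to each category."""
    ratings = defaultdict(list)
    for record in records:
        for name in categorise(record["text"]):
            ratings[name].append(record["rating"])
    return {name: sum(vals) / len(vals) for name, vals in ratings.items()}


summary = mean_rating_by_category(reviews)
for name in sorted(summary):
    print(f"{name}: {summary[name]:.2f}")
```

Refining the term sets and re-running the summary mirrors, in miniature, the loop of refining concepts and categories until the groupings support a clear observation about the data.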
Your report should have the following components:
A standard cover page;
A detailed summary, maximum of 2 pages, of your exploration, concepts,
categorisations and insights; and
A detailed account, up to 4 pages, documenting the process you used to explore,
conceptualise and analyse the data.
The dataset is labelled DIGITAL_MUSIC_3 and can be found in the shared SAS CA data
folder. Please save your work in the sasdata folder in the subfolder that is your student
number with the title sas_assignment_zID.
Submission Details
Word Limit
There is no word limit per se, just a page limit: 2 pages for the summary and 4 pages for the
documentation. Font should be no smaller than Arial 11, with standard margins. However,
line spacing can be single, 1.15, 1.5, or double. Please note that material exceeding the
page limit will not be considered when grading the assignment.