SIT731

The SIT731 Data Wrangling unit at Deakin University focuses on preparing raw data for analysis using various methodologies and programming techniques, primarily in Python. Students will engage in hands-on tasks to develop a learning portfolio that demonstrates their understanding of data wrangling, including data extraction, cleaning, and exploratory analysis, while adhering to ethical standards. The unit includes structured learning activities, feedback mechanisms, and assessments to ensure students meet the learning outcomes and develop essential skills in data science and artificial intelligence.

Faculty of Science, Engineering and Built Environment

SIT731 Data Wrangling

Deakin University Unit Guide

Trimester 1, 2025
CONTENTS

Welcome
Who is the unit team?
Unit chair details
Other members of the team and how to contact them
Administrative queries
About this unit
Unit development in response to student feedback
Learning Outcomes
Assessing your achievement of the unit learning outcomes
Hurdle requirements
Summative assessment task 1
Your learning experiences in this unit
Educator-facilitated (scheduled) learning activities - on-campus unit enrolment
Educator-facilitated (scheduled) learning activities - online unit enrolment
Typical study commitment
Note
Unit learning resources
Essential learning resources
Recommended learning resources
Where to access unit resources
Key dates for this study period
Unit weekly activities

24 February 2025
Deakin University, Faculty of Science, Engineering and Built Environment
SIT731 Data Wrangling - Trimester 1, 2025

Welcome

Data Science (DS) and Artificial Intelligence (AI) are popular fields for making sense of data that have been collected in large
quantities from various sources. Accurate exploration and modelling using DS and AI rely heavily on appropriately
prepared data. Data wrangling is the process of preparing raw data appropriately for modelling purposes. The aim of this
unit is to learn various data wrangling methodologies and the programming techniques used to perform them. This includes
programming in Python to perform various data wrangling tasks; learning data extraction methods for different
sources; working with different types of data, and storing and retrieving them; applying sampling techniques and inspecting
the data; cleaning the data by identifying outliers/anomalies and handling missing data; transforming, selecting and extracting
features; performing exploratory analysis and visualisation using various tools; summarising data appropriately; and performing basic
statistical analysis and modelling using basic machine learning. Further, techniques for maintaining data privacy and
exercising ethics in data manipulation will be covered in this unit.
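
As a minimal sketch of two of the cleaning steps named above (handling missing data and identifying outliers), the following Python example uses only the standard library. The readings are made up, the function names impute_missing and flag_outliers are hypothetical, and median imputation and Tukey's 1.5 x IQR rule are illustrative choices rather than methods prescribed by the unit.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def flag_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Made-up readings; None marks a missing observation.
raw = [12.0, 14.5, None, 13.2, 98.0, 12.8, None, 13.9]

cleaned = impute_missing(raw)      # missing entries become the median
outliers = flag_outliers(cleaned)  # only the 98.0 reading is flagged

for value, is_outlier in zip(cleaned, outliers):
    print(f"{value:6.2f}  {'outlier' if is_outlier else 'ok'}")
```

In practice the order of these steps and the choice of imputation value depend on the data and the goal of the analysis, so this is one possible arrangement, not a fixed recipe.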
This Unit Guide provides you with the key information about this unit. Please read it carefully and refer to it frequently
throughout the study period. Your unit site also provides information about your rights and responsibilities. We will assume
you have read this before the unit commences, and we expect you to refer to it throughout the study period.

To be successful in this unit, you must:

• read all materials in preparation for your learning activities and follow up each with further study and research on the
topic
• start your assessment tasks well ahead of the due date
• read or listen to all feedback carefully and use it in your future work
• attend and engage in all educator-facilitated (scheduled) learning activities and other learning experiences as part of
the unit design

Who is the unit team?

Unit chair: leads the teaching team and is responsible for overall delivery of this unit
Unit chair details

Name: Nayyar Zaidi


Campus: Melbourne Burwood
Email: [email protected]
Phone: +61 3 9244 5963
Other members of the team and how to contact them

Details of other teaching staff will be provided on the unit site at the start of the trimester.
Administrative queries

• check out the 'SEBE Student Hub' section on your unit site
• contact your Unit Chair or Campus Leader
• drop in or contact Student Central to speak with a Student Adviser

For additional support information, please see the Rights and Responsibilities section under 'Content' in your unit site.
About this unit

Unit development in response to student feedback

Every trimester, we ask students to tell us, through eVALUate, what helped and hindered their learning in each Unit. You are
strongly encouraged to provide constructive feedback for this Unit when eVALUate opens (you will be emailed a link).
In the previous editions of the unit, students have told us that these aspects of the unit have helped them to achieve the
learning outcomes:

• the availability of engaging, step-by-step tutorial videos and many exercises in Python that emphasised developing
real-life data analysis/modelling skills;
• the lecture notes; and
• using OnTrack for task submission and feedback, very clear marking guidelines and obligations, and the involvement of the
unit team.

The following aspects of the unit have been introduced, enhanced or retained in response to feedback from students who
have undertaken this unit in previous trimesters:

• the feedback for this unit has remained positive, with only minor suggestions for improvement, which were
implemented in the course of the unit update for this year.

If you have any concerns about the Unit during the trimester, please contact the unit teaching team - preferably early in the
trimester - so we can discuss your concerns, and make adjustments, if appropriate.
Learning Outcomes

Each Unit in your course is a building block towards Deakin's Graduate Learning Outcomes - not all Units develop and assess
every Graduate Learning Outcome (GLO).
These are the Unit Learning Outcomes (ULOs) for this unit, with their alignment to Deakin Graduate Learning Outcomes (GLOs). At the
completion of this unit, successful students can:

ULO1: Undertake data wrangling tasks by using appropriate programming and scripting languages to extract, clean, consolidate,
and store data of different data types from a range of data sources.
Aligned GLOs: GLO1: Discipline-specific knowledge and capabilities; GLO3: Digital literacy

ULO2: Research data discovery and extraction methods and tools and apply resulting learning to handle extracting data based on
project needs.
Aligned GLOs: GLO3: Digital literacy; GLO5: Problem solving

ULO3: Design, implement, and explain the data model needed to achieve project goals, related security considerations, and the
processes that can be used to convert data from data sources to both technical and nontechnical audiences.
Aligned GLOs: GLO1: Discipline-specific knowledge and capabilities; GLO2: Communication; GLO5: Problem solving

ULO4: Use both statistical and machine learning techniques to perform exploratory analysis on data extracted, and communicate
results to technical and non-technical audiences.
Aligned GLOs: GLO1: Discipline-specific knowledge and capabilities; GLO2: Communication; GLO5: Problem solving

ULO5: Apply and reflect on techniques for maintaining data privacy and exercising ethics in data handling.
Aligned GLOs: GLO8: Global citizenship

Assessing your achievement of the unit learning outcomes

Hurdle requirements

To be eligible to obtain a pass in this unit, students must meet certain milestones as part of the portfolio.
Brief summary of the hurdle requirements:
To be eligible to obtain a pass in this unit, students must meet certain milestones as part of the portfolio.

Rationale:
The pass tasks in this unit provide students with the opportunity to develop and demonstrate achievement of
the Unit Learning Outcomes at the minimum expected standards.

Brief summary of the hurdle requirements:
1. Unit Tasks (Learning Portfolio)
Students are required to submit tasks, collaborate with their tutor to resolve any issues identified, and discuss their
understanding of the associated concepts by each task's indicated due date.

• By the end of week 3, students are required to have completed at least Pass Tasks 1 and 2.
• By the end of week 11, students are required to have submitted all pass tasks to demonstrate achievement of the
unit learning outcomes.

Task discussions must be conducted in seminar time (for on-campus students), via online discussions through MS Teams (for
online students only), or OnTrack. Please ensure that you are enrolled in the correct mode of study.

Tasks may be discussed with staff within the submission period by the corresponding due dates. It is strongly recommended
that tasks are submitted well ahead of these due dates, as completion of the tasks involves submitting work for
assessment, responding to feedback, discussing the task with teaching staff, and ensuring work submitted demonstrates the
required outcomes. In many cases work will need to be corrected and resubmitted, potentially more than once, as part
of this process.

To achieve at least a pass grade, all Pass Tasks must be submitted meeting minimum standards as clearly specified in
the respective task specifications. For higher grades, all Pass Tasks must be Complete, and additional Credit, Distinction, and
High Distinction tasks are also required. Each of these tasks will have an indicated due date.

Rationale:
These tasks are included as hurdle requirements so that students are able to provide evidence of achievement of
these ULOs through their portfolio. The portfolio artefact that they submit is used to measure their performance
against the minimum standards as well as their ability to justify the outcomes that they have achieved through
self-assessment and reflection. The hurdle requirement also provides a mechanism for student-staff interaction to
check progress and address educational and motivational issues before it is too late in the trimester.

Summative assessment (tasks that will be graded or marked)


NOTE: It is your responsibility to keep a backup copy of every assignment and the materials used to develop/complete
it where possible (e.g. written/digital reports, essays, videos, images). In the unusual event that one of your
submissions becomes corrupted, is incorrectly submitted or otherwise lost, you may be asked to submit the backup copy.
Any work you submit may be checked by electronic or other means for the purposes of detecting breaches of academic
integrity such as collusion, plagiarism and contract cheating. You must understand your responsibility to act with honesty and
integrity in your studies as Deakin takes all breaches very seriously. Make sure you read Your rights and responsibilities as a
student in this unit to find out more about academic integrity.

Deakin has a universal assessment submission time of 8 pm AEDT/AEST. A late penalty will apply to assessments submitted
after 11.59 pm AEDT/AEST.
Summative assessment task 1

Title: Learning Portfolio

Brief description of assessment task:
This is an individual task. Students will develop their portfolio weekly by submitting tasks and receiving feedback through
the OnTrack system. At the end of the Trimester these tasks, along with a reflective learning summary report, will form the
submission for grading. Grades awarded will take into consideration the quality of the work submitted, depth of learning
demonstrated, evidence of achievement of unit learning outcomes, timeliness of submissions, and ability to receive and
incorporate feedback.

Tasks in this unit will involve things such as:

• Coding in Python and scripting to perform data wrangling, results analysis, modelling and presenting results professionally
• A report reflecting on the use of different tools for data discovery, extraction, data preparation for modelling, storage and
retrieval, visualisation, and security and privacy issues
• A project report on handling real data
• Presentation of the project results demonstrating various applications of tools and techniques for data wrangling on real data
• Choosing and comparing appropriate data processing methods for the goal at hand, and understanding their limitations
• Critical analysis of and discussion on the obtained results, aimed at both specialist and nonspecialist audiences

Detail of student output:
Students will work through weekly tasks throughout this unit and produce a range of artefacts. This work will be combined
with a reflective learning summary report into the learning portfolio for assessment.

Grading and weighting (% total mark for unit):
80%, marked and graded

This task assesses your achievement of these Unit Learning Outcome(s):
The portfolio must demonstrate that you have achieved all unit learning outcomes by providing evidence and self-reflection
against each outcome.
ULO1: Applying programming, such as in Python, and scripting to perform data wrangling using a range of data types and
sources.
ULO2: Researching different data discovery techniques for extracting data of different types, such as time series, text, web,
tabular, speech, image/video and geodata, using different data discovery methods/tools, including web crawling and JSON, and
storing, retrieving and sharing them in various forms, such as using databases and streams.
ULO3: Reviewing project requirements and available data sources to design data models to support required analysis, and
communicating solutions with stakeholders.
ULO4: Performing exploratory data analysis to verify appropriateness of the data models obtained to meet stakeholder
requirements.
ULO5: Applying and critically reflecting upon techniques for maintaining data privacy and exercising ethics in data handling.

This task assesses your achievement of these Graduate Learning Outcome(s):
GLO1: through performing data wrangling techniques using the Python programming language.
GLO2: through communicating approaches to data wrangling tasks with different stakeholders.
GLO3: through your ability to discover, extract and process data using different tools and methods.
GLO5: by exploring data in a project context to meet a range of stakeholder requirements.
GLO8: by critically reflecting upon professional practice and responsibilities associated with handling potentially sensitive
data.

How and when you will receive feedback on your work:
You will be required to work on and submit tasks for formative feedback regularly, as per the specified period. The teaching
team will then review your progress and provide you with individual feedback to assist you in completing the tasks and
achieving your target grade for the unit.

When and how to submit your work:
At the end of the unit, you will use OnTrack to combine the artefacts you have created and a learning summary report into a
single portfolio for assessment. The portfolio is due 8 pm (AEST) Friday 30 May 2025, Week 12 (Study period).

Your learning experiences in this unit

Educator-facilitated (scheduled) learning activities - on-campus unit enrolment

1 x 3 hour seminar per week, weekly meetings.


Educator-facilitated (scheduled) learning activities - online unit enrolment

Online independent and collaborative learning including 1 x 2 hour online seminar per week, weekly meetings.
Typical study commitment

Students will on average spend 150 hours over the teaching period undertaking the teaching, learning and assessment
activities for this unit.
This will include educator guided online learning activities within the unit site.
The learning and assessment activities for this unit will include watching weekly videos, attending and participating in group
activities (online or in class), completing the learning and assessment tasks, and associated reading and study time.

All unit learning materials are provided in the unit site and in OnTrack. This includes the following resources, organised for 11
weeks of learning to enable you to achieve the unit learning outcomes:

In the unit site:
• Text and videos introducing unit content
• Links to readings and associated texts
• Web links
• Discussion forums

In OnTrack:
• Task sheets
• Task resources, as required
• Individual feedback
• Alignment of tasks to unit learning outcomes
• Visualisations of your progress to help keep you on track

You are expected to complete the prescribed readings, watch the concept and demonstration videos in the unit site, and
complete unit tasks in OnTrack. As you complete the tasks, you will be able to collect evidence for justifying how you have
met the unit learning outcomes through your portfolio. The process of developing your portfolio is simple and easy, so keep
that in mind as you read the assessment instructions below.

Engage with the learning resources each week on the unit site before attending the seminar or weekly group activities online.
These resources will help you prepare for the group activities. Each week’s seminar, or online group activity, will help develop
your understanding of unit concepts and skills.

Unit tasks within the portfolio are designed to help scaffold your learning, as well as helping you demonstrate achievement
of unit learning outcomes. You will have opportunities to receive and incorporate feedback from your tutor, allowing you to
freely ask questions and clarify misunderstandings without the risk of losing marks.

Make sure that you take advantage of the formative feedback process associated with the learning portfolio and weekly
tasks. During this interaction you can ask questions and clarify doubts to make sure that you will be fully prepared for this
assessment.

Note

At Deakin, courses are delivered within a learning environment that provides all students with equitable and consistent
access to facilities, infrastructure, resources, and support to assist student progress and achievement of learning outcomes.

We have introduced new terms to reflect learning activities to enhance your learning experience, aligning with our innovative
DeakinDesign learning principles and practices. The new terms better reflect how teaching teams will guide you through
your learning journey and the types of learning experiences you will have.

‘Lectures’ are the activities where teaching staff engage you through presentations with student participation.

In ‘seminars’, an educator will guide you in a smaller group of students through highly interactive discussions and activities.

Your units may also include ‘practical experiences’ such as ‘laboratory’, ‘workshops’, ‘clinical skills’ and more. These hands-on
activities typically take place in specialised facilities with industry tools, equipment or technology to allow you to apply
your knowledge practically.

Some other terms

If you see a ‘meeting’ in your timetable, this is an optional drop-in session.

‘Assessments’ or ‘team-based learning’ indicate an in-class evaluation of your skills or knowledge. A ‘pre-assessment
practice’ could be scheduled to prepare you for these assessments.

Find out more

Take a look at the Learning activities webpage for a full list of the terminology changes and reasons they were changed.
Unit learning resources

Your unit learning resources can be accessed from your unit site.
There is no prescribed text. Unit materials are provided via the unit site. This includes unit topic readings and references to
further information.

The texts and reading list for SIT731 can be found via the University Library.

Note: Select the relevant trimester reading list. Please note that a future teaching period's reading list may not be available
until a month prior to the start of that teaching period so you may wish to use the relevant trimester's prior year reading list
as a guide only.
Essential learning resources

Essential learning resources will be provided via the unit site. Further, relevant technical documentation will be linked to in
the relevant modules on the unit site.
Recommended learning resources

The Deakin Software Library provides students with access to software that you may need or find useful for your study at
Deakin.

Students looking for additional assistance should consult the following textbooks. These books match some of the content
being covered in the unit:

• Marek Gagolewski, Minimalist Data Wrangling with Python, Zenodo, Melbourne, 2022, URL:
https://datawranglingpy.gagolewski.com/ (open-access textbook)
• Wes McKinney, Python for Data Analysis: Data Wrangling with pandas, NumPy, and Jupyter, O’Reilly, 3rd Edn, 2022
(This book is available online through the Deakin library).

Where to access unit resources

Textbooks can be sourced from various outlets including direct from the publisher, online bookshops, or retailers. Limited
copies of textbooks are also available on loan from the University Library.
Key dates for this study period

Trimester 1 teaching period begins: Monday 3 March 2025
Census date: Monday 31 March 2025
Easter/intra-trimester break: Friday 18 April - Sunday 27 April 2025
Trimester 1 teaching period ends: Friday 23 May 2025
Study period (end-of-unit assessment/examination preparation period): Monday 26 May - Friday 30 May 2025
End-of-unit assessment and examination period: Monday 2 June - Friday 13 June 2025
Inter-trimester break (the period between trimesters): Monday 16 June - Friday 4 July 2025
Unit results released: Thursday 3 July 2025 (10.30 am)

Unit weekly activities

Week 1 (commencing 3 March 2025): MODULE 1: Python Introduction
Week 2# (commencing 10 March 2025): MODULE 2: Data Examination
Week 3 (commencing 17 March 2025): MODULE 3: Data Validation. Identifying and addressing issues in data that could
compromise the accuracy, consistency, and reliability of the analysis. Ensuring data is clean and fit for further analysis.
Assessment due: Hurdle 1: Pass tasks 1-2
Week 4 (commencing 24 March 2025): (continued)
Week 5 (commencing 31 March 2025): (continued)
Week 6 (commencing 7 April 2025): MODULE 4: Special topics in data science: A selected data privacy topic.
Week 7*~ (commencing 14 April 2025): MODULE 5: Data Transformation. Modifying the structure or representation of data to
make it more suitable for statistical analysis or model building.
Week 8 (commencing 28 April 2025): (continued)
Week 9 (commencing 5 May 2025): (continued)
Week 10 (commencing 12 May 2025): (continued)
Week 11 (commencing 19 May 2025): MODULE 6: Special topics in data science: A selected data privacy topic.
Week 12 (commencing 26 May 2025): Study period
Assessment due: Learning portfolio due Friday 30 May
Weeks 13-14^ (commencing 2 June 2025): End-of-unit assessment and exam period

# Labour Day public holiday: Monday 10 March 2025 - University closed


* Easter/Intra-trimester break: Friday 18 April - Sunday 27 April 2025 (between weeks 7 and 8)
~ ANZAC Day public holiday: Friday 25 April 2025 - University closed
^ King's Birthday public holiday: Monday 9 June 2025 - University closed
