Equips students with the tools to build contemporary Big Data processing and analysis systems. Students learn how to design and develop each stage of the machine learning pipeline, from acquiring and cleaning data to analysing and visualising insights obtained from datasets, including natural language datasets.
Topic 1: Introduction to Big Data
Topic 2: Big Data pre-processing
Topic 3: Big Data technologies: Hadoop, Scala and Spark
Topic 4: Using Spark through Python
Topic 5: Dimension reduction
Topic 6: Data visualisation
Topic 7: Managing time-series data
Topic 8: Advanced TensorFlow, NumPy and Pandas functionalities
Topic 9: Introduction to natural language processing and text mining
Topic 10: Processing raw text
Topic 11: Classifying text
Topic 12: Recent advances in Big Data processing
Unit Learning Outcomes express learning achievement in terms of what a student should know, understand and be able to do on completion of a unit. These outcomes are aligned with the graduate attributes. The unit learning outcomes and graduate attributes are also the basis for evaluating prior learning.
On completion of this unit, students should be able to:
- identify, manipulate and apply Big Data storage and processing technologies
- acquire and clean large data sets
- extract features and model large data sets
- design algorithms and architectures for processing large data sets to extract patterns and information
- develop natural language processing algorithms for text mining
- Bird, S, Klein, E & Loper, E, 2009, Natural language processing with Python: Analyzing text with the natural language toolkit, O’Reilly Media.
- McKinney, W, 2014, Python for data analysis: Data wrangling with Pandas, NumPy, and IPython, 1st edn, 3rd Release, O’Reilly Media.