KF7032 - Big Data and Cloud Computing

What will I learn on this module?

In this module you will develop knowledge and skills that will enable you to tackle a realistic big data problem, using some of the principal machine learning techniques and statistical approaches used in big data analysis. You will also learn how to implement your solution using an industry-leading cloud computing provider together with appropriate distributed processing environments.

You will learn how to host multi-terabyte datasets using a cloud service provider. This will include provisioning resources with a commercial cloud provider and then mastering appropriate distributed processing frameworks, such as Hadoop. You will then learn approaches to processing and analysing big data, based on advanced statistical processing, supervised and unsupervised machine learning algorithms and other state-of-the-art big data analytic methods. Such techniques include clustering algorithms, pattern-based information extraction, linear and non-linear regression, and feature-based models. Much work on big data analysis is inevitably statistical, so you will develop the relevant statistical understanding. As data visualization is frequently critical in helping to develop hypotheses about the data, you will also cover and apply problem-relevant 2D and 3D visualization methods where appropriate to the particular datasets.

How will I learn on this module?

You will learn through a combination of methods, including lectures, practical sessions in workshops and guided learning. Topics will normally be introduced in lectures and explored through practical exercises (helping you develop the practical skills needed) and guided learning activities. You will be encouraged to develop independent learning skills and a critical, analytic approach to the big data and cloud computing area.

More specifically, you will work in teams using a leading cloud services provider and big data analysis techniques as the basis of your practical work, giving you immediately marketable skills. Staff will support your learning through verbal feedback on your practical achievements.

All module material will be available on the eLearning Portal (ELP) so that you can access information when you need to. The university library offers support for all students through its catalogue and an Ask4Help Online service.

How will I be supported academically on this module?

Staff will support you in the practical sessions, providing advice and feedback on your progress and engaging in discussion with you to examine your ideas and those of others; your tutors value your input and opinions. You will be strongly encouraged to engage in further study, by yourself or with other students outside of class time, to become an independent learner. This is an essential capability in every area of Computing, whose utility will long outlive the detail of current technical approaches.

This module will use and promote an ELP (Blackboard) based discussion forum. This will be configured to encourage you, other students and academic staff to participate in discussion about the subject matter of the module.

What will I be expected to read on this module?

You will read books, refereed scientific articles and conference papers. You will be expected to go beyond blogs, way beyond web pages, and to develop independent critical research capabilities. This capacity to research and critically analyse formal literature will stand you in good stead when confronted with the swathes of uncritical marketing white papers with which the modern IT professional has to contend.

All modules at Northumbria include a range of reading materials with which students are expected to engage. The reading list for this module can be found at: http://readinglists.northumbria.ac.uk

What will I be expected to achieve?

Knowledge & Understanding:
1. Apply big data analytic algorithms, including those for visualization, and cloud computing techniques to multi-terabyte datasets.
2. Critically assess data analytic and machine learning algorithms to identify those that satisfy given big data problem requirements.

Intellectual / Professional skills & abilities:
3. Critically evaluate and select appropriate big data analytic algorithms to solve a given problem based on critical review of the literature, considering the processing time available and other aspects of the problem.
4. Design and develop advanced big data applications that integrate with third-party cloud computing services, evaluate the environmental and societal impact of these applications, and minimise their adverse impacts.

Personal Values Attributes (Global / Cultural awareness, Ethics, Curiosity) (PVA):

5. Critically assess and interpret primary research to identify its applicability to a given big data problem scenario.

How will I be assessed?

Formative assessment: Lab exercises carried out within weekly workshops will build up to form the basis of the two summative assessments. Feedback will be given during workshop sessions.

Summative assessments
• The first coursework assessment (worth 25%) will be a team assignment and will involve producing background study material and critically reviewing the literature on a big data analytics and cloud computing topic (e.g., crime and big data), examining published work on the topic, technical approaches to the given problem, relevant statistics and other computational methods. Each team will write an academic report (up to 2000 words) with at least a dozen citations and references to scientific conference or journal papers. It will assess MLOs 2, 3 and 5. Using a supplied marking scheme, peers will review other groups' work. Credit will be given for reviewing appropriately, and the average peer review mark will also count towards course credit.

• The second coursework assessment (worth 75%) will be individual work and will involve designing, constructing and critically justifying an appropriate solution for a given big data problem scenario by provisioning and configuring appropriate cloud computing resources. You will need to select and justify appropriate algorithms and methods of visualising the results to best answer the research question on the given topic. It will be submitted in the form of a Jupyter notebook comprising code, outputs, results, graphs, etc., as well as text integrated within the notebook. It will assess MLOs 1, 2, 3 and 4.

Feedback on assessment: Students will be given detailed feedback on the first group assignment, clearly identifying both the weaknesses and the strong points of the work. As this will be submitted around the eighth week of module delivery, the feedback will enable students to identify the areas on which they need to focus their efforts in their individual assignment.





Module abstract

Big Data is the colloquial term used to describe the acquisition of knowledge, insights and understanding gained through identification of patterns in huge, multi-terabyte datasets. In this module you will develop knowledge and skills that will enable you to tackle a realistic big data problem. You will also learn how to implement your solution using an industry-leading cloud computing provider together with appropriate distributed processing environments such as Hadoop. Frequently, a first step in big data analysis is to gain insight by visualizing the data, which may in turn suggest appropriate analytic approaches. You will also learn some of the principal machine learning techniques and statistical approaches used in big data analysis.

Course info

Credits: 20

Level of Study: Postgraduate

Mode of Study: 1 year Full Time

Department: Computer and Information Sciences

Location: City Campus, Northumbria University

City: Newcastle

Start: September 2024


Module Information

All information is accurate at the time of sharing. 

Full-time courses are primarily delivered via on-campus, face-to-face learning but could include elements of online learning. Most courses run as planned and as promoted on our website and via our marketing materials, but if there are any substantial changes (as determined by the Competition and Markets Authority) to a course, or there is the potential that a course may be withdrawn, we will notify all affected applicants as soon as possible with advice and guidance regarding their options. It is also important to be aware that optional modules listed on course pages may be subject to change depending on uptake numbers each year.

Contact time is subject to increase or decrease in line with possible restrictions imposed by the government or the University in the interest of maintaining the health, safety and wellbeing of students, staff and visitors, if this is deemed necessary in future.

