
PhD Programme in Technological Innovation Engineering (Academic Year 2022/2023) - Digital Technologies for Industry 4.0

Big Data Platforms


Credits: 6
Content language: English
Course description

Working with Big Data involves many considerations: how large the datasets are, what kind of analysis will be performed, what results are expected, and so on. This course gives an overview of these aspects and describes the most widely used platforms, organized by the type of problem each one is suited to.

Prerequisites

Course: Introduction to Big Data

Objectives

The course aims to provide a basic understanding of the main issues in managing and analyzing Big Data, and presents the platforms that are currently most widely used.

Program

The course covers the management and analysis of Big Data, with particular attention to the following topics:

- Batch computation vs Streaming

- Real Time Analysis

- Python Pandas

- Jupyter Notebook

- Tidy datasets: R, Pandas and Apache Arrow

- Big Graph Data Processing: Pregel and Giraph

- Apache Spark and Storm

- Cassandra

Book

Course slides

Mining of Massive Datasets - Jure Leskovec, Anand Rajaraman, Jeff Ullman – Cambridge University Press

Big Data, Big Dupe – Stephen Few – Analytics Press

Exercises

Hands-on use of several data analysis platforms with the Python language and Jupyter notebooks.
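
As an indication of the style of exercise, and not taken from the course material, the map-reduce pattern named in Lesson 1 can be sketched in plain Python inside a notebook cell. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample documents are assumptions made for this sketch; a real platform would distribute each phase across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence on an in-memory list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample input, invented for this sketch.
documents = ["big data platforms", "big graph data", "streaming data"]
counts = word_count(documents)
print(counts)
```

Keeping the three phases as separate functions mirrors the split between user code (map and reduce) and the grouping step that a platform normally performs on the user's behalf.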

Professor/Tutor responsible for teaching
Luigi Laura
Video professors
Prof. Marco Pirrone - CKH Innovations Opportunities Development (London)
List of lessons
    •  Lesson n. 1: MAP-REDUCE
Marco Pirrone