Spark SQL Training

Learn Spark SQL, a widely used framework, in a few days and write complex
distributed applications with ease

French / English

Certificate

Next session: August

Overview

This training enables developers and architects to write
complex distributed applications that support faster, better decisions
and real-time actions across a wide variety of use cases, architectures
and industries.

Prerequisites

Good knowledge of the Java language

Knowledge of functional programming and of database management

Goals

Master the fundamental concepts of Spark

Develop applications with Spark Streaming

Write parallel programs with Spark on a cluster

Exploit data with Spark SQL

Get a first introduction to Machine Learning

Training Program

INTRODUCTION TO APACHE SPARK

- History of the Framework
- The different language APIs of Spark (Scala, Python and Java)
- Comparison with the Apache Hadoop environment
- The different modules of Spark

PROGRAMMING WITH RESILIENT DISTRIBUTED DATASETS (RDDs)

- Presentation of RDDs
- Creating, manipulating and reusing RDDs
- Accumulators and broadcast variables
- Using partitions

HANDLING STRUCTURED DATA WITH SPARK SQL

- SQL, DataFrames and Datasets
- The different types of data sources
- Interoperability with RDDs
- Performance of Spark SQL
- JDBC/ODBC server and Spark SQL CLI

SPARK ON A CLUSTER

- The different cluster managers: Standalone, Apache Mesos or Hadoop YARN
- Configuring a cluster in Standalone mode
- Packaging an application with its dependencies
- Deploying applications with spark-submit
- Sizing a cluster

REAL-TIME ANALYSIS WITH SPARK STREAMING

- How it works
- Presentation of Discretized Streams
- The different types of sources
- Working with the API
- Comparison with Apache Storm

HANDLING GRAPHS WITH GRAPHX

- Presentation of GraphX
- The different operations
- Creating graphs
- VertexRDD and EdgeRDD
- Presentation of different algorithms

MACHINE LEARNING WITH SPARK

- Introduction to Machine Learning
- The different classes of algorithms
- Presentation of Spark ML and MLlib
- Implementations of the different algorithms in MLlib