Spark SQL Training


Spark SQL

Learn Spark SQL, a widely used framework, in a few days and write complex
distributed applications with ease

French / English

Certificate


Next session: August

Overview

This training enables developers and architects to write
complex distributed applications that support better, faster decisions
and real-time actions, applied to a wide variety of use cases,
architectures and industries.

Prerequisite

Good knowledge of the Java language
Knowledge of functional programming and of database management

Goals

Master the fundamental concepts of Spark
Develop applications with Spark Streaming
Do parallel programming with Spark on a cluster
Exploit data with Spark SQL
Get a first introduction to Machine Learning

Training Program

INTRODUCTION TO APACHE SPARK
- History of the Framework
- The different language APIs of Spark (Scala, Python and Java)
- Comparison with the Apache Hadoop environment
- The different modules of Spark
PROGRAMMING WITH RESILIENT DISTRIBUTED DATASET (RDD)
- Presentation of RDDs
- Creating, manipulating and reusing RDDs
- Accumulators and broadcast variables
- Using partitions
HANDLING STRUCTURED DATA WITH SPARK SQL
- SQL, DataFrames and Datasets
- The different types of data sources
- Interoperability with RDDs
- Performance of Spark SQL
- JDBC/ODBC server and Spark SQL CLI
SPARK ON A CLUSTER
- The different types of architecture: Standalone, Apache Mesos or Hadoop YARN
- Configuring a cluster in Standalone mode
- Packaging an application with its dependencies
- Deploying applications with spark-submit
- Sizing a cluster
REAL-TIME ANALYSIS WITH SPARK STREAMING
- How it works
- Presentation of Discretized Streams
- The different types of sources
- Working with the API
- Comparison with Apache Storm
HANDLING GRAPHS WITH GRAPHX
- Presentation of GraphX
- The different operations
- Creating graphs
- VertexRDD and EdgeRDD
- Presentation of different algorithms
MACHINE LEARNING WITH SPARK
- Introduction to Machine Learning
- The different classes of algorithms
- Presentation of Spark ML and MLlib
- Implementations of the different algorithms in MLlib

Project

01. Model and render data
02. Create interactive dashboards
03. Securely publish and share these dashboards in Microsoft OneDrive and SharePoint