Apache Spark Training

Apache Spark Training gives you a jump-start into Spark by teaching the core concepts of Big Data processing with Spark. Spark's key feature is in-memory cluster computing, which increases an application's processing speed by keeping working data in memory. Our trainers are here to help you become an expert in Spark.


Apache Spark Training Curriculum

Introduction To Big Data and Spark

In this module of Spark training, learn how to apply data science techniques with parallel programming to explore big (and small) data.

Introduction to Big Data
Challenges with Big Data
Batch vs. Real-Time Big Data Analytics
Batch Analytics – Hadoop Ecosystem Overview
Real-Time Analytics Options
Streaming Data – Storm
In-Memory Data – Spark
What is Spark?
Modes of Spark
Spark Installation Demo
Overview of Spark on a cluster
Spark Standalone Cluster

Spark Baby Steps

In this module, learn how to invoke the Spark shell, build a Spark project with sbt, use distributed persistence, and much more.

Invoking Spark Shell
Creating the Spark Context
Loading a File in Shell
Performing Some Basic Operations on Files in Spark Shell
Building a Spark Project with sbt
Running Spark Project with sbt
Caching Overview
Distributed Persistence
Spark Streaming Overview
Example: Streaming Word Count
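
The first few topics above (creating the Spark context, loading a file, basic operations on it, and caching) can be sketched as one small program. This is a minimal sketch rather than the course's own material: the object name `BabySteps`, the sample text, and the `local[2]` master are assumptions made for illustration. In the Spark shell a context is already provided as `sc`, so only the lines that use it would be typed there.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

object BabySteps {
  // Returns (total number of lines, number of lines containing "spark").
  def run(): (Long, Long) = {
    // Hypothetical app name; local[2] runs Spark in-process with two threads.
    val conf = new SparkConf().setAppName("baby-steps").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Write a small sample file so the example is self-contained.
      val path = Files.createTempFile("sample", ".txt")
      val text = "spark makes big data simple\n" +
        "hadoop stores data on disk\n" +
        "spark keeps data in memory\n"
      Files.write(path, text.getBytes("UTF-8"))

      // Load the file as an RDD of lines and mark it for in-memory caching.
      val lines = sc.textFile(path.toString).cache()

      // The first action reads the file and fills the cache;
      // the second action reuses the cached lines.
      val total = lines.count()
      val withSpark = lines.filter(_.contains("spark")).count()
      (total, withSpark)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (total, withSpark) = run()
    println(s"lines=$total, lines containing 'spark'=$withSpark")
  }
}
```

Caching pays off here only because `lines` is used by two actions; an RDD that is read once gains nothing from `cache()`.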

Playing With RDDs In Spark

The main abstraction Spark provides is a resilient distributed dataset (RDD), which is a collection of elements partitioned across the nodes of the cluster that can be operated on in parallel.

RDDs
Spark Transformations in RDD
Actions in RDD
Loading Data in RDD
Saving Data through RDD
Spark Key-Value Pair RDD
MapReduce and Pair RDD Operations in Spark
Scala and Hadoop Integration Hands-On
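
To make the RDD topics above concrete, here is a minimal sketch of a word count that chains transformations, a key-value (pair) RDD operation, and an action. It assumes Spark 1.3 or later, where pair RDD operations such as `reduceByKey` are available without an extra import; the object name `RddBasics` and the sample input are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddBasics {
  // Counts word occurrences across the given lines.
  def wordCount(sc: SparkContext, lines: Seq[String]): Map[String, Int] = {
    sc.parallelize(lines, numSlices = 2) // RDD split into two partitions
      .flatMap(_.split("\\s+"))          // transformation: lines to words
      .filter(_.nonEmpty)                // transformation: drop empty tokens
      .map(word => (word, 1))            // transformation: build a pair RDD
      .reduceByKey(_ + _)                // pair RDD operation: sum per key
      .collect()                         // action: bring results to the driver
      .toMap
  }

  def main(args: Array[String]): Unit = {
    // Assumed local master so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("rdd-basics").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      wordCount(sc, Seq("to be or not", "to be")).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Transformations are lazy: nothing runs on the partitions until the `collect()` action is called. For results too large for the driver, an action such as `saveAsTextFile` writes each partition out instead.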

Shark – When Spark Meets Hive

Shark is a component of Spark, an open source, distributed, fault-tolerant, in-memory analytics system that can be installed on the same cluster as Hadoop. This module of Spark training gives insight into Shark.

Why Shark?
Installing Shark
Running Shark
Loading of Data
Hive Queries through Spark
Testing Tips in Scala
Performance Tuning Tips in Spark
Shared Variables: Broadcast Variables
Shared Variables: Accumulators
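
The two kinds of shared variables listed above can be sketched together: a broadcast variable ships a read-only lookup table to every executor once, and an accumulator collects a count back on the driver. This is an illustrative sketch, not the course's code. It assumes Spark 2.0 or later for `longAccumulator`; the object name `SharedVariables` and the country-code table are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SharedVariables {
  // Resolves country codes to names and counts the codes that are unknown.
  def resolve(sc: SparkContext, codes: Seq[String]): (Seq[String], Long) = {
    // Broadcast variable: read-only lookup table sent once to each executor.
    val names = sc.broadcast(Map("IN" -> "India", "US" -> "United States"))
    // Accumulator: tasks can only add to it; the driver reads the total.
    val unknown = sc.longAccumulator("unknownCodes")

    val rdd = sc.parallelize(codes)

    // Update the accumulator inside an action (foreach), where Spark
    // applies each task's update exactly once.
    rdd.foreach(code => if (!names.value.contains(code)) unknown.add(1))

    // Use the broadcast table inside a transformation.
    val resolved = rdd.flatMap(code => names.value.get(code).toList).collect().toSeq
    (resolved, unknown.sum)
  }

  def main(args: Array[String]): Unit = {
    // Assumed local master so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("shared-variables").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (resolved, missing) = resolve(sc, Seq("IN", "XX", "US"))
      println(s"resolved=$resolved, unknown codes=$missing")
    } finally {
      sc.stop()
    }
  }
}
```

The accumulator is updated in `foreach` rather than in `flatMap` on purpose: updates made inside transformations may be applied more than once if a task is re-executed.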

  • PRIVATE
  • 10 Days

© 2016, ALL RIGHTS RESERVED.