Spark Tutorials and Insights

Learn how to use Apache Spark with R, Hadoop, Cassandra, and more with Spark tutorials written by domain experts.


Spark tutorials, posts, and more

Big Data Analysis Using PySpark

Learning objectives: (1) Introduction to PySpark; (2) Understanding RDD, MapReduce; (3) Sample Project - Movie Review Analysis. Why Spark: (1) Lightning Fast Processing; (2) Real Time Stream Processing; (3) Ea...

Getting Started with Cassandra and Spark

This tutorial walks through the steps required to install Cassandra and Spark on a Debian system and to get them working together via Scala.

Exploring geographical data using SparkR and ggplot2

This analysis uses SparkR to explore a large dataset, the 2013 American Community Survey, focusing on its geographical features.

Spark & R: data frame operations with SparkR

In this third tutorial (see the previous one) we introduce more advanced SparkSQL concepts with R, as described in the SparkR documentation, and apply them to the 2013 American Community Survey housing data. These concepts relate to data frame manipulation, including data slicing, summary statistics, and aggregations.

Spark & R: Loading Data into SparkSQL Data Frames

In this second Spark & R tutorial, we will read data into a SparkSQL data frame and take a quick look at its schema.

Spark & R: Downloading data and Starting with SparkR using Jupyter notebooks

In this tutorial we will use the 2013 American Community Survey dataset and start up a SparkR cluster using IPython/Jupyter notebooks.

Spark & Python: SQL & DataFrames

This tutorial introduces Spark's SQL and data frame capabilities, which you can use to perform exploratory data analysis.

Building a Movie Recommendation Service with Apache Spark & Flask - Part 2

This Apache Spark tutorial goes into detail on how to use Spark machine learning models, or other kinds of data analytics objects, within a web service. Python makes this task straightforward, thanks to Spark's own Python capabilities and to Python-based frameworks such as Flask.

Spark & Python: MLlib Logistic Regression

In this tutorial, you will learn how to use Spark's machine learning library MLlib to build a Logistic Regression classifier for network attack detection.

Spark & Python: Working with RDDs (I)

This tutorial introduces two different ways of getting data into the basic Spark data structure, RDD.

Spark & Python: MLlib Decision Trees

In this tutorial, you'll learn how to use Spark's machine learning library MLlib to build a Decision Tree classifier for network attack detection, using the complete datasets to test Spark's capabilities at scale.

Spark & Python: MLlib Basic Statistics & Exploratory Data Analysis

In this Spark and Python tutorial, you'll learn more about MLlib basic statistics and exploratory data analysis.

Building a Movie Recommendation Service with Apache Spark & Flask - Part 1

Spark & Python: Working with RDDs (II)

This is a Spark and Python tutorial that teaches you how to work with RDDs (Part II).

Linear Models with SparkR 1.5: Uses and Present Limitations

In this analysis we use SparkR's machine learning capabilities to try to predict property value from other variables in the 2013 American Community Survey dataset.
