ntuananh/CS522_BigData

CS522_BigData

Implement several MapReduce algorithms, including the Pair, Stripe, and Hybrid approaches, for the Word Co-Occurrence and Relative Frequency problems.

Run MapReduce jobs using Spark.

Reference:

Project 1: Pair - Stripe - Hybrid Approach for WordCount

The following MapReduce algorithms are implemented, covering the Word Co-Occurrence and Relative Frequency problems:

  • In-Mapper WordCount
  • Average
  • In-Mapper Average
  • Pair Approach
  • Stripe Approach
  • Hybrid Approach
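
To make the difference between the Pair and Stripe approaches concrete, here is a minimal in-memory sketch in plain Java. It is not the repository's Hadoop code: the class name `CoOccurrenceSketch`, the method names, and the neighbor definition (two words co-occur when they appear at different positions in the same line) are all assumptions made for illustration. The map and reduce steps are folded into a single loop.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from the repository.
public class CoOccurrenceSketch {

    // Pair approach: one key per (word, neighbor) pair, one counter per key.
    // In a real job the mapper would emit ((w, u), 1) and the reducer would sum.
    public static Map<String, Integer> pairCounts(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            String[] words = line.trim().split("\\s+");
            for (int i = 0; i < words.length; i++) {
                for (int j = 0; j < words.length; j++) {
                    // Assumption: every other position in the line is a neighbor.
                    if (i != j) {
                        counts.merge(words[i] + "," + words[j], 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    // Stripe approach: one key per word, whose value is a map of neighbor counts.
    // In a real job the reducer would merge the stripes element-wise.
    public static Map<String, Map<String, Integer>> stripeCounts(List<String> lines) {
        Map<String, Map<String, Integer>> stripes = new TreeMap<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) continue; // skip blank lines
            String[] words = line.trim().split("\\s+");
            for (int i = 0; i < words.length; i++) {
                Map<String, Integer> stripe =
                        stripes.computeIfAbsent(words[i], k -> new TreeMap<>());
                for (int j = 0; j < words.length; j++) {
                    if (i != j) {
                        stripe.merge(words[j], 1, Integer::sum);
                    }
                }
            }
        }
        return stripes;
    }

    // Relative frequency of each neighbor u given w:
    // count(w, u) divided by the total of all counts in w's stripe.
    public static Map<String, Double> relativeFrequencies(Map<String, Integer> stripe) {
        int total = 0;
        for (int c : stripe.values()) {
            total += c;
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : stripe.entrySet()) {
            result.put(e.getKey(), e.getValue() / (double) total);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a b c", "a b");
        System.out.println(pairCounts(lines));
        System.out.println(stripeCounts(lines));
        System.out.println(relativeFrequencies(stripeCounts(lines).get("a")));
    }
}
```

Both methods produce the same counts. The Pair form emits many small records, while the Stripe form emits fewer but larger ones, which also makes the per-word total needed for relative frequency available in one place.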

Prerequisites

Cloudera & Eclipse Setup

Running

  • Run in Eclipse, or
  • Run the bash script file

Project 2: Spark (Scala)

Using Spark, compute the mean and standard deviation of gas consumption in the UK.
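
The computation can be expressed as a map step followed by an associative reduce step, which is the shape a Spark job would take. The sketch below shows that shape in plain Java, without Spark, so it runs on its own. The class `GasStatsSketch`, the `Stats` aggregate, and the sample values are hypothetical and not taken from the project or its dataset. It also assumes the population standard deviation; a sample standard deviation would divide by n - 1 instead.

```java
import java.util.List;

// Hypothetical illustration only; not taken from the repository.
public class GasStatsSketch {

    // Partial aggregate (count, sum, sum of squares) that merges associatively,
    // so partitions can be combined in any order, as in a reduce step.
    public static final class Stats {
        final long count;
        final double sum;
        final double sumSq;

        Stats(long count, double sum, double sumSq) {
            this.count = count;
            this.sum = sum;
            this.sumSq = sumSq;
        }

        // Map step: turn one observation into a partial aggregate.
        static Stats of(double x) {
            return new Stats(1, x, x * x);
        }

        // Reduce step: combine two partial aggregates.
        Stats merge(Stats other) {
            return new Stats(count + other.count, sum + other.sum, sumSq + other.sumSq);
        }

        public double mean() {
            return sum / count;
        }

        // Population standard deviation: sqrt(E[x^2] - (E[x])^2).
        // Math.max guards against a tiny negative value from rounding.
        public double stdDev() {
            double m = mean();
            return Math.sqrt(Math.max(0.0, sumSq / count - m * m));
        }
    }

    public static Stats aggregate(List<Double> values) {
        return values.stream()
                .map(Stats::of)
                .reduce(new Stats(0, 0.0, 0.0), Stats::merge);
    }

    public static void main(String[] args) {
        // Made-up sample values, not real gas consumption data.
        List<Double> values = List.of(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0);
        Stats stats = aggregate(values);
        System.out.println("mean = " + stats.mean());
        System.out.println("stdDev = " + stats.stdDev());
    }
}
```

Carrying the count, sum, and sum of squares together means a single pass over the data yields both statistics.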

Dataset

Prerequisites

Spark & Scala Setup

Running
