shasibhusanJena/Data-Engineering-training


Data-Engineering-training

This repository helps you learn data engineering topics and technologies:

  • Hadoop
    • HDFS (Hadoop Distributed File System), the storage unit
    • MapReduce, the processing unit
    • YARN (Yet Another Resource Negotiator), the resource management unit
  • Hive (data warehouse software built on top of Apache Hadoop that provides data query and analysis using SQL)
  • Spark (an open-source engine for large-scale data processing; it provides an interface for programming clusters with implicit data parallelism and fault tolerance)
  • Cassandra (an open-source, distributed, wide-column NoSQL database management system designed to handle large amounts of data across many servers, providing high availability with no single point of failure)
  • HBase (an open-source, non-relational, distributed database modeled after Google's Bigtable that runs on top of HDFS)
  • Scala
  • PySpark
  • Kafka
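
To make the "processing unit" idea above a little more concrete, here is a minimal sketch of the map, shuffle, and reduce phases as a word count in plain Python. It is only an illustration of the programming model and does not use Hadoop or Spark; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made up for this example, not part of any library or of this repository.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values under their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases; hypothetical helper for this sketch only."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Made-up sample input, standing in for lines read from a file.
    sample = ["Spark runs on YARN", "Hive runs on Hadoop"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data between them; here everything runs in one process so that the three phases are easy to follow.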
