hadoop_syllabus.txt
Introduction to Big Data
What is Big Data?
What are the challenges for processing big data?
What technologies support big data?
The V's of Big Data and its growth
Introduction to Hadoop
An Overview of Hadoop
History of Hadoop
The Hadoop Distributed File System
MapReduce Programming model
Hadoop Ecosystem
Hadoop Cluster Setup
HDFS Design Goals
NameNode (NN), Secondary NameNode (SNN), and DataNodes (DN)
JobTracker (JT) and TaskTracker (TT)
Replica and Block Placement
HDFS commands
Read and Write Flow
Apache MapReduce
Components
Programming Model
Configuring and writing MapReduce jobs in an IDE
Hive
Introduction
Installation and Configuration
Data Types and File Formats
Loading data into an internal table
Loading data into an external table
Views in Hive
Indexes in Hive
Performance tuning in Hive
Pig Latin
Installation and Configuration
What is the Grunt shell?
Command Syntax
Data Model of Pig
Pig script for word count
Java code for running the Pig word count script
Sqoop
Installation and Configuration
Importing data with sqoop-import
Free-form query imports
Exporting data with sqoop-export
Oozie
What is Oozie?
Why do we use it?
Oozie architecture
Oozie action nodes
NoSQL
Introduction and Interaction
Storage Architecture
CRUD Operations
Querying NoSQL Stores
Modifying Data Stores
Indexing
Managing Transactions
NoSQL in the cloud
Parallel Processing
Performance Tuning
Tools and Utilities