
DATA228 Lecture Notes Week 4

The document provides an overview of distributed filesystems, focusing on Hadoop's Distributed File System (HDFS). It outlines key attributes, architecture, and core concepts of HDFS, including its design for large files, replication, and data flow for reads and writes. Additionally, it discusses the limitations of HDFS and its components like Namenodes and Datanodes.

DATA 228

Big Data Technologies and Applications (Fall 2024)

Sangjin Lee
Hadoop: distributed filesystems & HDFS

Chapter 3, “Hadoop: The Definitive Guide,” 4th Edition, Tom White


What is a distributed filesystem?

“A distributed filesystem is a filesystem that enables clients to access file storage from multiple hosts through a computer network as if the user was accessing local storage.”
What is a distributed filesystem?
Key phrases

• A filesystem

• Multiple hosts

• Through a computer network

• As if the user was accessing local storage


What is a distributed filesystem?
More attributes

• Semantics of a filesystem

• Paths, directories, access control, timestamps, etc.

• POSIX compliance?

• Resiliency and fault tolerance are more important than in local filesystems

• Transient network failures

• Data losses
Examples of distributed filesystems

• More traditional: SMB, NFS

• Big-data-driven: HDFS, GFS, MapR File System

• Storage-derived: CephFS, GlusterFS

• Cloud solutions (block-based): EBS (AWS), PD (GCP)

• Cloud solutions (object-based)*: S3 (AWS), GCS (GCP)

• Other vendor solutions: NetApp, Nutanix, Cohesity, …

* Not all object storage systems are filesystems.


Hadoop’s distributed filesystem

• Hadoop provides an abstract (distributed) filesystem API

• Clients of distributed filesystems can interact with them at the abstract level (via URIs)

• HDFS is only one implementation provided by Hadoop out of the box

• Examples

• file:// (local files), hdfs:// (HDFS), s3n:// (S3 “native”), gs:// (GCS), …
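The idea of selecting a filesystem from the URI scheme can be illustrated with a small Python sketch. This is not Hadoop's API (which is Java): the SCHEME_TO_FILESYSTEM table and the describe function are hypothetical names introduced here, and the host and bucket names in the usage lines are made up.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table: URI scheme -> filesystem named on the slide.
SCHEME_TO_FILESYSTEM = {
    "file": "local files",
    "hdfs": "HDFS",
    "s3n": "S3 (native)",
    "gs": "GCS",
}


def describe(uri: str) -> tuple[str, str, str]:
    """Split a URI into (filesystem, authority, path) based on its scheme."""
    parts = urlsplit(uri)
    if parts.scheme not in SCHEME_TO_FILESYSTEM:
        raise ValueError(f"no filesystem registered for scheme {parts.scheme!r}")
    return SCHEME_TO_FILESYSTEM[parts.scheme], parts.netloc, parts.path


# Made-up example URIs; only the scheme decides which filesystem is used.
print(describe("hdfs://namenode.example:8020/user/alice/data.txt"))
print(describe("file:///tmp/data.txt"))
```

The client code stays the same for every backend; only the scheme in the URI changes, which is the point of interacting at the abstract level.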
HDFS
Design

• Geared towards very large files: GBs or TBs

• Streaming data access

• Write-once and read-many-times

• Reading whole files over random seeks

• Commodity hardware

• Highly resilient to individual node failures: multiple replicas, block repairs, rebalancing
HDFS
What HDFS is NOT so good at

• Low-latency data access

• Trade-off between throughput and latency

• Lots of small files: trade-off from architectural and scale considerations

• Multiple writers

• Arbitrary file modifications

• Doesn’t provide full POSIX compliance


HDFS
Core concepts

• Blocks

• Blocks are a useful concept in filesystem implementations

• Local filesystem blocks: commonly 512 B - 8 KB

• Hadoop’s default block size: 128 MB (often much larger in real clusters)

• Implications for small files

• Replication: 3 by default (erasure coding can reduce it)

• Compression: up to users
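A short worked example of the block and replication numbers above, as a Python sketch. The function names are illustrative, and the calculation assumes plain replication (no erasure coding) and no compression.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB   # default block size from the slide
REPLICATION = 3         # default replication factor from the slide


def block_count(file_size: int, block_size: int = BLOCK_SIZE) -> int:
    """Number of blocks one file is split into (ceiling division)."""
    return -(-file_size // block_size)


def raw_bytes(file_size: int, replication: int = REPLICATION) -> int:
    """Raw storage used across the cluster when every block is replicated."""
    return file_size * replication


one_gb = 1024 * MB
print(block_count(one_gb))       # 8 blocks of 128 MB
print(raw_bytes(one_gb) // MB)   # 3072 MB of raw storage for 1024 MB of data
print(block_count(1024))         # a 1 KB file still needs 1 block entry
```

The last line hints at the small-file implication: a tiny file does not occupy a full 128 MB on disk, but it still costs one block's worth of metadata, so a million tiny files mean a million blocks to track.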
HDFS
Architecture

• Namenodes and datanodes
HDFS
Namenode

• “One” for a single cluster

• Namenode manages metadata: metadata for files and directories

• Block locations are reported by datanodes (not persisted by namenode)

• Namenode requires a large amount of memory

• Namenode data (namespace image and the edit log) are written to disk in several locations

• Namenode can be a scalability bottleneck
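To get a feel for why the namenode needs a lot of memory, here is a rough back-of-the-envelope sketch in Python. It assumes the commonly quoted rule of thumb of about 150 bytes of namenode memory per file or block object; that constant is an approximation, not a measured value, and the function name is hypothetical. Directories are ignored for simplicity.

```python
BYTES_PER_OBJECT = 150  # rough rule of thumb; an assumption, not a measurement


def namenode_memory_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Rough estimate: one in-memory object per file plus one per block."""
    objects = num_files + num_files * blocks_per_file
    return objects * BYTES_PER_OBJECT


# One million single-block files: 2 million objects.
print(namenode_memory_bytes(1_000_000))   # 300000000 bytes, about 300 MB
```

Because the estimate scales with the number of files and blocks rather than with the bytes stored, many small files put far more pressure on the namenode than a few large ones holding the same amount of data.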


HDFS
Namenode high availability (HA)

• Redundant storage of the filesystem metadata

• Secondary namenode

• Gets periodic updates from the primary namenode and retains the state

• It can run as a hot standby

• Failover via ZooKeeper

• Fencing

• Client handles it via client library


HDFS
Datanodes

• One per node

• Stores and retrieves blocks (asked by clients and the namenode)

• Verifies blocks’ checksums periodically

• Reports the block list to the namenode
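The periodic checksum verification can be sketched in Python as follows. This is only an illustration of the technique: it uses zlib.crc32 from the standard library as a stand-in checksum, the 512-byte chunk size is an assumption chosen for the sketch, and the function names are hypothetical rather than part of any Hadoop API.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # chunk size assumed for this sketch


def chunk_checksums(data: bytes, chunk: int = BYTES_PER_CHECKSUM) -> list[int]:
    """Compute one CRC-32 per fixed-size chunk of a block."""
    return [zlib.crc32(data[i:i + chunk]) for i in range(0, len(data), chunk)]


def find_corrupt_chunks(data: bytes, expected: list[int],
                        chunk: int = BYTES_PER_CHECKSUM) -> list[int]:
    """Return the indexes of chunks whose checksum no longer matches."""
    actual = chunk_checksums(data, chunk)
    if len(actual) != len(expected):
        raise ValueError("block length changed; cannot compare chunk by chunk")
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


block = bytes(range(256)) * 8        # 2048 bytes -> 4 chunks
stored = chunk_checksums(block)      # checksums saved when the block was written

damaged = bytearray(block)
damaged[600] ^= 0xFF                 # corrupt one byte inside chunk 1
print(find_corrupt_chunks(bytes(damaged), stored))   # [1]
```

Checksumming per chunk rather than per block lets a verifier pinpoint the damaged region; the corrupt replica can then be replaced from a healthy copy.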


HDFS
Data flow: reads

• DistributedFileSystem returns the block locations from the namenode

• Actual reads are done via FSDataInputStream

• Reads go directly to datanodes (not through namenode)
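The read path can be mimicked with a toy in-memory model in Python. ToyNamenode, ToyDatanode, and read_file are hypothetical names for this sketch; they are not the Java DistributedFileSystem or FSDataInputStream classes, and real clients also handle replica selection and failures, which are omitted here.

```python
class ToyNamenode:
    """Holds metadata only: which blocks make up a file and where they live."""

    def __init__(self):
        self.file_blocks = {}       # path -> ordered list of block ids
        self.block_locations = {}   # block id -> list of datanode names

    def get_block_locations(self, path):
        return [(b, self.block_locations[b]) for b in self.file_blocks[path]]


class ToyDatanode:
    """Holds the actual block bytes."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}            # block id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]


def read_file(path, namenode, datanodes):
    """Ask the namenode where the blocks are, then read from datanodes directly."""
    data = b""
    for block_id, locations in namenode.get_block_locations(path):
        node = datanodes[locations[0]]   # simplification: take the first replica
        data += node.read_block(block_id)
    return data


dn1, dn2 = ToyDatanode("dn1"), ToyDatanode("dn2")
dn1.blocks["blk_1"] = b"hello "
dn2.blocks["blk_2"] = b"world"

nn = ToyNamenode()
nn.file_blocks["/demo.txt"] = ["blk_1", "blk_2"]
nn.block_locations = {"blk_1": ["dn1"], "blk_2": ["dn2"]}

print(read_file("/demo.txt", nn, {"dn1": dn1, "dn2": dn2}))   # b'hello world'
```

Note that no file bytes ever pass through ToyNamenode: it answers one metadata question per file, which is what keeps the namenode off the data path.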


HDFS
Data flow: writes

• Client makes a request to write a new file via DistributedFileSystem

• Namenode creates a record of the new file

• Datanodes form a pipeline of writes: blocking operation

• Datanodes report block locations to namenode

• Replica placement

• Rack diversity: same node as client -> off-rack -> same rack (as the second replica, on a different node)
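The write pipeline can be pictured with a minimal Python sketch: each node stores a packet, forwards it to the next node, and the acknowledgement only comes back once every node downstream has stored it. PipelineNode and build_pipeline are hypothetical names for illustration, and the datanode names are made up; this is not Hadoop code.

```python
class PipelineNode:
    """One datanode in a toy write pipeline."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.packets = []

    def write(self, packet: bytes) -> list[str]:
        """Store the packet, forward it, and return acks from tail to head."""
        self.packets.append(packet)
        acks = self.downstream.write(packet) if self.downstream else []
        return acks + [self.name]


def build_pipeline(names):
    """Chain nodes so that names[0] is the head and names[-1] is the tail."""
    node = None
    for name in reversed(names):
        node = PipelineNode(name, node)
    return node


head = build_pipeline(["dn1", "dn2", "dn3"])
print(head.write(b"packet-0"))   # ['dn3', 'dn2', 'dn1']
```

The call to head.write does not return until the last node has stored the packet, which mirrors the blocking nature of the pipeline noted above.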


HDFS
Replica placement
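A minimal Python sketch of a three-replica placement policy follows. It assumes the commonly described default: first replica on the client's node, second on a node in a different rack, third on another node in the same rack as the second. The place_replicas function, the topology layout, and the node and rack names are all hypothetical, and the sketch assumes the client itself runs on a datanode.

```python
import random


def place_replicas(client_node, topology, rng=None):
    """Pick three nodes: client's node, an off-rack node, and its rack neighbor.

    topology maps a rack name to the list of node names in that rack.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}

    first = client_node
    # Candidate racks: not the client's rack, with room for two replicas.
    candidates = [rack for rack, nodes in topology.items()
                  if rack != rack_of[first] and len(nodes) >= 2]
    if not candidates:
        raise ValueError("need another rack with at least two nodes")

    remote_rack = rng.choice(candidates)
    second, third = rng.sample(topology[remote_rack], 2)
    return [first, second, third]


topology = {
    "rack1": ["n1", "n2"],
    "rack2": ["n3", "n4"],
    "rack3": ["n5", "n6"],
}
print(place_replicas("n1", topology, random.Random(0)))
```

Placing two replicas in one remote rack survives the loss of either rack while sending the block across the inter-rack link only once, which is the trade-off behind this layout.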
HDFS
Coherency model

• A file is guaranteed to exist after create()

• File content may not be visible even after the stream is flushed (via flush())

• File content is guaranteed to be visible after hflush()

• File renames or directory renames are NOT atomic
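The visibility rules for flush() and hflush() can be illustrated with a toy Python model. ToyOutputStream is a hypothetical class that only mimics the guarantees listed above; it is not the real FSDataOutputStream, and it models the worst case in which flush() makes nothing visible.

```python
class ToyOutputStream:
    """Toy model of write visibility; not the real Hadoop stream."""

    def __init__(self):
        self._pending = bytearray()   # written by the client, not yet visible
        self._visible = bytearray()   # what a new reader would see

    def write(self, data: bytes) -> None:
        self._pending += data

    def flush(self) -> None:
        # Modeled as giving no visibility guarantee (worst case).
        pass

    def hflush(self) -> None:
        # Everything written so far becomes visible to new readers.
        self._visible += self._pending
        self._pending.clear()

    def visible_to_readers(self) -> bytes:
        return bytes(self._visible)


out = ToyOutputStream()               # the file exists from this point on
out.write(b"content")
out.flush()
print(out.visible_to_readers())       # b'' in this model
out.hflush()
print(out.visible_to_readers())       # b'content'
```

A writer that needs readers to see its progress has to call hflush() at suitable points, accepting the extra cost of each call.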


HDFS
Demo

Exploring filesystem APIs

