
Testing of several distributed
file-systems (HadoopFS, CEPH and GlusterFS)
for supporting the HEP experiments analysis
Giacinto DONVITO
INFN-Bari

1
Agenda
• Introduction on the objective of the test
activities
• HadoopFS
• GlusterFS
• CEPH
• Tests and results
• Conclusions and future works
2
Introduction to the objectives of
the test activities
• The aim of the activity is to verify:
– Performance
– Reliability
– Features
• Considering solutions that provide software
redundancy
– A site could use commodity hardware to achieve a
high level of data availability
• Scalability should be guaranteed up to the order
of a few petabytes
3
Introduction to the objectives of
the test activities
• The focus is to serve typical Tier2/Tier3 sites
for LHC experiments
– Supporting interactive usage
– Running data analysis
– Supporting SRM, gridftp, Xrootd
– Being prepared for the new cloud storage
techniques
• Open Source solutions
4
HadoopFS

• Apache Hadoop Distributed File System:


– Open-source
– Developed in Java
– Large dataset
– Fault tolerant
– Commodity hardware
– High throughput
– Scalable
– Rack awareness
5
HadoopFS

6
HadoopFS

• "The primary objective of HDFS is to store


data reliably even in the presence of
failures" (Hadoop documentation)
– File are split in chunk (default 64MB)
• dfs.blocksize
– Placement policy (default):
• 3 replicas
– 1 replica in the local rack
– 2 replicas in the remote rack
• dfs.replication
7
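As a minimal sketch of how these defaults can be inspected and overridden from the command line (file paths and values are illustrative; dfs.blocksize and dfs.replication are the standard HDFS property names):

# print the configured block size, in bytes
hdfs getconf -confKey dfs.blocksize
# upload a file with a non-default 256 MB block size
hadoop fs -D dfs.blocksize=268435456 -put bigfile.dat /data/bigfile.dat
# change the replication factor of an existing file to 3 and wait for completion
hdfs dfs -setrep -w 3 /data/bigfile.dat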
HadoopFS functionality test
• We executed several tests to check the
behavior under several types of failures:
– Metadata failures
• Client retry, active-standby namenode
– Datanode failures:
• During write operations, during read operations, in case of data
corruption, mis-replicated blocks, under- and over-replicated
blocks (see the fsck sketch after this slide)
• The expected behavior was always fulfilled and
no (unexpected) data loss was registered
8
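One standard way to look for corrupt, missing, or mis-replicated blocks from the command line is the HDFS fsck tool; a minimal sketch (the path is illustrative, and this is not necessarily how the tests above were driven):

# report block placement, rack distribution, and under/over-replicated blocks
hdfs fsck /data -blocks -locations -racks
# list only the files that currently have missing or corrupt blocks
hdfs fsck /data -list-corruptfileblocks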
HadoopFS
[Figure: handling of data corruption and mis-replicated blocks across Rack 1 and Rack 2]
HadoopFS: our development
• One Replica Policy
– 1 replica per rack
• Increases the reliability
• Increases the available bandwidth for read
operations

10
HadoopFS: our development
• Hierarchical Policy
– It is able to exploit a geographically distributed
infrastructure
– 2 replicas in the source site in 2 different racks
• The data will also survive the loss of a
complete site

11
HadoopFS pros&cons
• MapReduce
• Dynamic self-healing
• Great Scalability
– Already tested in a few big LHC Tier2 sites and many
companies
• Web monitoring interface
• Support for SRM (Bestman) and gridftp/xrootd
(Nebraska)
• Not strictly POSIX compliant
– FUSE based (a mount sketch follows this slide)
• No support for new cloud storage technologies
12
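A minimal sketch of mounting HDFS through the FUSE module as packaged in CDH (the namenode host and the mount point are illustrative):

# mount the HDFS namespace on a local directory via FUSE
mkdir -p /mnt/hdfs
hadoop-fuse-dfs dfs://namenode.example.org:8020 /mnt/hdfs
# the namespace can then be browsed with ordinary POSIX tools
ls /mnt/hdfs/user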
GlusterFS
• Open-source solution acquired by RedHat
• Can be used both with disks in the WNs and with standard
infrastructures based on disk servers (SAN/DAS)
• Written in C under GPLv3
• POSIX compliant
• Exploits the NFS protocol
• Available on many platforms (RedHat, Debian, MacOS,
NetBSD, OpenSolaris)
• Also supports new cloud storage technologies (Block
Storage, Object Storage, etc.)
– The object storage is based on Swift (the open-source Object Storage
developed within the OpenStack framework)
13
GlusterFS
• Working behavior:
– The client exploits a FUSE module to access files and implements
advanced policies (Distribute/Stripe/Replica, etc.)
– The client and the server can use both TCP/IP and InfiniBand
connections
– The server hosts data on standard file-systems (ext4, xfs, etc.)

14
GlusterFS

• Working behavior:
– Striped Volume
– Replicated Volume
– Distributed Volume
– Striped Replicated Volume
– Distributed Replicated Volume (a creation sketch follows this slide)
15
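As a sketch of how such layouts are created and mounted with the standard gluster CLI (server names, brick paths and the volume name are illustrative):

# two-way replicated volume distributed over four bricks
gluster volume create datavol replica 2 transport tcp \
    server1:/export/brick1 server2:/export/brick1 \
    server3:/export/brick1 server4:/export/brick1
gluster volume start datavol
# native FUSE mount on a client
mount -t glusterfs server1:/datavol /mnt/gluster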
GlusterFS
• POSIX ACL support over NFSv3
• Virtual Machine Image Storage
• qemu – libgfapi integration
• improvements in performance for VM image hosting
• Synchronous replication improvements
• Distributed-replicated and Striped-replicated volumes are
very important in contexts where both performance and
data availability matter
16
GlusterFS
• GlusterFS provides a geographical replication solution
• It can be useful as a disaster recovery solution
• It is based on an active-backup paradigm
• It is based on rsync
• It is possible to replicate the whole file-system or only a
part of it
• It can also be used from one site towards multiple
instances of GlusterFS on different sites (a command
sketch follows this slide)
17
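A sketch of starting geo-replication for a volume with the GlusterFS 3.3-style CLI (the volume name, remote host and remote path are illustrative):

# replicate the volume "datavol" to a directory on a remote site
gluster volume geo-replication datavol backup.example.org:/data/datavol-backup start
# check the state of the replication session
gluster volume geo-replication datavol backup.example.org:/data/datavol-backup status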
Glusterfs pros&cons
• Easy to install and configure
• Fully POSIX compliant
• Many available configurations
• Great performance
• Provides interesting cloud storage solutions
• Some instabilities and data loss in some
specific situations
• There are not many scalability reports beyond the
petabyte scale
18
CEPH file-system
• Development started in 2009
• It is now backed by a company (Inktank), although it is
still a completely open-source project
• It has been integrated by default in the Linux kernel since
the 2.6.34 release (May 2010)
• It can use, although not yet at "production
level", BTRFS (B-tree file system) as backend
– Several interesting features (Raid0/1, and soon Raid5/6,
data deduplication, etc) implemented at software level

19
CEPH file-system
• Designed to be scalable and fault-tolerant
– In order to support >10’000 disk servers
– Up to 128 metadata servers (able to serve up to 250k ops/s aggregate)
• CEPH can provide three different storage interfaces: Posix
(both at kernel level and using fuse), Block and Object
storage
• Several IaaS cloud platforms (e.g. OpenStack, CloudStack)
officially support CEPH to provide a Block Storage solution
• The suggested configuration does not require/suggest the use of
any hardware RAID: data availability is implemented at the
software level

20
CEPH file-system
• The data distribution is based on a hash function
– No query is needed to know the location of a given file
• This means that the mapping is "unstable":
– Adding a disk server means that the whole cluster needs to
reshuffle the location of the data
• It is possible to define a "failure domain" at the level of: disk,
server, rack
• Data placement rules can be customized:
– e.g. "three different copies of the same file in three different racks"
• All the datanodes know the exact location of all the files in
the cluster (a quick lookup sketch follows this slide)
[Figure: Ceph data placement — files are striped over objects (4 MB by default),
objects are mapped to placement groups (pgid = hash(object) & mask), and PGs
are mapped to sets of OSDs grouped by failure domain via CRUSH:
crush(cluster, rule, pgid) = [osd2, osd3]; roughly 100 PGs per node]
21
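Since placement is computed rather than looked up, any client can ask the cluster where a given object would land; a minimal sketch with the standard ceph CLI (pool and object names are illustrative):

# print the placement group and the set of OSDs a given object maps to
ceph osd map data myobject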
CEPH file-system
• Monitor: manages the heartbeats among the nodes
• MDS: manages the I/O on metadata
• OSD: contains the objects
• The client will interact with all the three services
(an inspection sketch follows this slide)
• A 10 node cluster will be composed of:
– 3 monitor nodes
– 3 MDS nodes
– 7 OSD nodes
[Figure: a simple example — fd=open("/foo/bar", O_RDONLY): the client requests
the open from the MDS, the MDS reads directory /foo from the object store and
issues a capability for the file content; read(fd, buf, 1024): the client reads the
data directly from the object store; close(fd): the client relinquishes the capability
to the MDS. The MDS stays out of the I/O path, since object locations are well
known and calculated from the object name]
22
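A sketch of standard ceph CLI calls that show the state of the three services on a running cluster:

ceph -s          # overall health, monitor quorum, OSD and PG counts
ceph mon stat    # the monitors and the current quorum
ceph mds stat    # the state of the metadata servers
ceph osd tree    # the OSDs arranged by their CRUSH failure domains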
CEPH file-system
• The three storage interfaces (posix, block and object) are
different gateways on top of the same object APIs
• Objects can also be stored "striped" in order to increase
the performance
– Object Size, Stripe Width, Stripe Count
• Data Scrubbing: it is possible to periodically check the data
consistency (to avoid inconsistencies between data and
metadata, and/or data corruption)
[Figure: failure recovery and metadata scaling — an unresponsive node is
declared dead after 15 seconds and recovery starts after 5 seconds; subtree
partitioning limits the effect of individual failures on the rest of the cluster;
8 MDS nodes sustain about 250,000 metadata ops/second]
23
CEPH functionalities test
• The "quorum" concept is used for each critical
service (there should be an odd number of instances):
– If 2 out of 3 service instances are active the client can read and write; if only
one is active the client can only read
• We verified the behaviour in case of failure of each
service:
– The High Availability always worked as expected
– We tested failures both in data and metadata services
– Both using the posix and RBD interfaces
• We also tested the possibility of exporting the storage using
the standard NFS protocol (a sketch follows this slide)
– It works quite well both with the RBD and the POSIX interface
• It was very unstable using the kernel interface
24
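A sketch of checking the monitor quorum and of one way to re-export an RBD image over standard NFS (pool, image, device and export path are illustrative, and this is not necessarily the exact setup used in the tests):

ceph quorum_status                           # which monitors currently form the quorum
rbd map data/nfsimage                        # map the RBD image to a local /dev/rbd* device
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /export/ceph
echo "/export/ceph *(rw,no_root_squash)" >> /etc/exports
exportfs -ra                                 # publish the NFS export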
CEPH RBD
• CEPH RBD features:
– Thinly provisioned
– Resizable images
– Image import/export
– Image copy or rename
– Read-only snapshots
– Revert to snapshots
– Ability to mount with Linux or QEMU KVM clients (an rbd sketch follows this slide)
• In OpenStack it is possible to use CEPH both as a backend in
Cinder (Block Storage service) and for hosting virtual machine images in
Glance (Image Service)
• CEPH provides an Object Storage solution that has interfaces
compatible with both S3 (Amazon) and Swift (OpenStack)
25
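A sketch of the corresponding rbd CLI operations (pool name, image name and sizes are illustrative):

rbd create data/vmdisk --size 10240      # thinly provisioned 10 GB image
rbd resize data/vmdisk --size 20480      # grow it to 20 GB
rbd snap create data/vmdisk@clean        # read-only snapshot
rbd snap rollback data/vmdisk@clean      # revert the image to the snapshot
rbd export data/vmdisk vmdisk.img        # export the image to a local file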
CEPH Performance test
[Chart: read performance in MB/s versus block size (4K, 128K, 4M) for CephFS,
CephFS with NFS, CephFuse, CephFuse with NFS, CephRBD and CephRBD with
NFS; y-axis up to about 140 MB/s]
[Chart: write performance in MB/s versus block size (4K, 128K, 4M) for the same
configurations, with an additional "Virtual Machine" annotation; y-axis up to
about 70 MB/s]
26
CEPH pros&cons
• Complete storage solution (supports all the
storage interfaces: posix, object, block)
• Great scalability
• Fault-tolerant solution
• Difficult to install and configure
• Performance issues
• Some instabilities while under heavy load

27
HDFS v2 CDH 4.1.1 (by USCMS
Nebraska)
● 20 datanodes, 1 namenode
● Chunk size: 128MB, Rdbuffer: 128MB, Big_writes active
● # iozone -r 128k -i 0 -i 1 -i 2 -t 24/36 -s 10G
MB/s            24 Threads   36 Threads
Initial Write   239.72
Re-write
Random Write
Initial Read    155.18       193.65
Re-read         151.33       207.43
Random Read     29.06        39.98
HDFS – 24 threads
[Chart: aggregate throughput, 155*20 = 3.1 GByte/s]
HDFS – 36 threads
[Chart: aggregate throughput, 193*20 = 3.8 GByte/s]
Ceph Cuttlefish (0.61)
● 3 Mon, 1 MDS, 120 OSDs (6 OSDs * 20 nodes)
● On all the nodes (SLC6)
○ # iozone -r 128k -i 0 -i 1 -i 2 -t 24 -s 10G

MB/s            24 Threads
Initial Write   52.49
Re-write        54.05
Random Write    ERROR
Read            95.38      (95*20 = 1.9 GByte/s aggregate)
Re-read         102.04
Random Read     ERROR
31
Ceph Cuttlefish (0.61)
[Chart: aggregate throughput, 95*20 = 1.9 GByte/s]
Ceph Dumpling (0.67.3)
● 3 Mon, 1 MDS, 95 OSDs (5 OSDs * 19 nodes)
● On all the nodes (SLC6)
○ # iozone -r 128k -i 0 -i 1 -i 2 -t 24 -s 10G

MB/s            24 Threads
Initial Write   18.93
Re-write        19.31
Random Write    13.96
Read            53.40      (53*19 = 1.0 GByte/s aggregate)
Re-read         57.29
Random Read     5.13
33
Ceph-Dev (0.70)
● 3 Mon, 1 MDS, 15 OSDs (5 OSDs * 3 nodes)
● On all the nodes (Ubuntu 12.04)
○ # iozone -r 128k -i 0 -i 1 -i 2 -t 24 -s 10G

MB/s            24 Threads
Initial Write   51.06
Re-write        60.05
Random Write    7.00
Read            101.58
Re-read         133.61
Random Read     12.05
Gluster v3.3
● 21 nodes, 6 bricks per node
● On all the nodes (SLC6)
○ # iozone -r 128k -i 0 -i 1 -i 2 -t 24 -s 10G

MB/s            24 Threads
Initial Write   234.06
Re-write        311.75
Random Write    326.89
Initial Read    621.08     (621*21 = 13 GByte/s aggregate)
Re-read         662.92
Random Read     242.75
Gluster v3.3
[Chart: aggregate throughput, about 10 GByte/s]
Gluster v3.4
● 20 nodes, 6 bricks per node
● On all the nodes (SLC6)
○ # iozone -r 128k -i 0 -i 1 -i 2 -t 24 -s 10G

MB/s            24 Threads
Initial Write   306.34
Re-write        406.90
Random Write    406.33
Read            688.06     (688*20 = 13 GByte/s aggregate)
Re-read         711.46
Random Read     284.00
Gluster v3.4
[Chart: aggregate throughput, about 10 GByte/s]
Using dd for comparing them all
24 dd in parallel – 10 GB file – bs 4M (the invocation is sketched below)

MB/s     HDFS      CEPH CF    GLUSTER
read     220.05    126.91     427.3
write    275.27    64.71      268.57

Average per single host (the cluster is made of 20 hosts)
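A sketch of the dd commands behind these figures (mount point and file name are illustrative; 2560 blocks of 4 MB give the 10 GB file, and whether the original runs used conv=fdatasync is an assumption):

# write: stream 10 GB of zeroes in 4 MB blocks, flushing data at the end
dd if=/dev/zero of=/mnt/testfs/ddtest.$$ bs=4M count=2560 conv=fdatasync
# read: stream the same file back, discarding the output
dd if=/mnt/testfs/ddtest.$$ of=/dev/null bs=4M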
Conclusions …

• We have tested, from the point of view of
performance and functionality, three of the
best known and most widely used storage solutions …
• … trying to focus on the possibility of not using
a hardware RAID solution
• taking into account the new cloud storage
solutions that are becoming more and more
interesting
40
Conclusions …
• Hadoop
– Looks like a very stable, mature and scalable solution
– Not fully POSIX compliant and not the fastest
• GlusterFS:
– Very fast, POSIX compliant, and easy to manage
– Maybe not as scalable as the others, and it still has a few
reliability problems
• CEPH:
– Looks very scalable, complete and technologically
advanced
– Still not very mature and stable, with performance issues
41
… and future works

• We will continue this activity of testing storage
solutions in order to follow the quite fast evolution
in this field
• In particular, CEPH looks quite promising if/when the
stability and performance issues are solved
• The increasing interest in cloud storage solutions
is forcing the developers to put effort into
providing both block and object storage solutions
together with the standard posix interface
42
People Involved

• Domenico DIACONO (INFN-Bari)

• Giacinto DONVITO (INFN-Bari)

• Giovanni MARZULLI (GARR/INFN)

43
