Zenko Live: QuadIron
A Library for Number Theoretic Transform Erasure Codes
Agenda
• Introduction to Zenko
• The problem space of data resiliency (Giorgio Regni, CTO)
• QuadIron demo (Vianney Rancurel, R&D)
• Questions and Answers
What’s Coming
QuadIron
Why?
Planetary scale decentralized storage:
• Distributing data over hundreds of drives, servers and locations with minimum overhead
• Guaranteeing that every parity fragment is useful and can help reconstruct the original data
• Keeping the data secure even though fragments are present in many different places
Pro Tip: When you get stuck, change vector space
What happens when you need more parities?
Demo
• Video file of ~90MB
• Using a 90+160 code
• Split into 90 fragments of 1MB: x.00 … x.89
• Generate 160 parities; because the code is non-systematic, 250 coded fragments are actually written (overhead of 250/90 ≈ 2.78): x.c00 … x.c249
• Delete the data fragments
• Delete 100 parities (tolerate 100 drive failures!)
• Repair
• Play the video
QuadIron
Lib QuadIron is online
Forum: https://forum.zenko.io
Website: www.zenko.io/blog
Code: https://github.com/scality/quadiron
https://www.zenko.io/blog/free-library-erasure-codes/
QuadIron An open source library for number theoretic transform-based erasure codes
Backup
Properties of Erasure Codes: Definition
A C(n,k) erasure code is defined by n=k+m
❒ k being the number of data fragments.
❒ m being the number of desired erasure fragments.
Example: C(9, 6)
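The definition above can be turned into a few lines of arithmetic. The sketch below is only an illustration: the `CodeParams` struct and its member names are hypothetical and are not part of the QuadIron API. It derives m, the rate k/n and the storage overhead n/k from n and k.

```cpp
// Parameters of a C(n, k) erasure code. Names are illustrative only.
struct CodeParams {
    int n;  // total number of fragments
    int k;  // number of data fragments

    // m = n - k erasure (parity) fragments
    constexpr int parities() const { return n - k; }
    // rate r = k / n
    constexpr double rate() const { return static_cast<double>(k) / n; }
    // storage overhead n / k: fragments stored per data fragment
    constexpr double overhead() const { return static_cast<double>(n) / k; }
};
```

For the C(9, 6) example, `CodeParams{9, 6}` gives 3 erasure fragments, a rate of 2/3 and an overhead of 1.5.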
Properties of Erasure Codes
❒ Optimality: e.g. an MDS (Maximum Distance Separable) erasure code guarantees that any k fragments can be used to decode a file
❒ Systematicity: systematic codes generate n-k erasure fragments and keep the k data fragments unchanged. Non-systematic codes generate n erasure fragments
❒ Speed: erasure codes are characterized by their encode/decode speed. Speed may vary according to the rate (the k and m parameters), and may be more or less predictable depending on the code
❒ Rate sensitivity: erasure codes can also be compared by their sensitivity to the rate r=k/n, which may or may not impact the encoding and decoding speed
❒ Rate adaptivity: changing k and m without having to regenerate all the erasure fragments
❒ Confidentiality: whether an attacker can partially decode the data after obtaining fewer than k fragments. Non-systematic codes are confidential (different from threshold schemes)
❒ Repair bandwidth: the number of fragments required to repair a fragment
(Main) Types of Erasure Codes
❒ Traditional RS Codes (e.g. Vandermonde or Cauchy matrices)
❒ LDPC Codes
❒ Locally-Repairable-Codes (LRC)
❒ FFT Based RS Codes
❒ Multiplicative FFTs (prime fields)
❒ Additive FFTs (binary extension fields)
Types of Codes: Traditional RS Codes
The good:
❒ Simple
❒ Support systematic and adaptive rates.
The bad:
❒ Matrix multiplication: O(k x n)
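To make the O(k x n) cost concrete, here is a minimal sketch of non-systematic encoding with a Vandermonde matrix. It assumes the small prime field GF(257) and the evaluation points 1..n, both chosen only for illustration; the function names are hypothetical and this is not how QuadIron or any particular library lays out its matrices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kPrime = 257;  // illustrative prime field GF(257)

// Modular exponentiation: base^exp mod kPrime.
std::uint32_t pow_mod257(std::uint32_t base, std::uint32_t exp) {
    std::uint32_t result = 1;
    base %= kPrime;
    while (exp > 0) {
        if (exp & 1) result = result * base % kPrime;
        base = base * base % kPrime;
        exp >>= 1;
    }
    return result;
}

// out[i] = sum_j V[i][j] * data[j] with V[i][j] = (i + 1)^j, i.e. the data
// polynomial evaluated at x = i + 1. The two nested loops visit n rows and
// k columns, hence O(k x n) field multiplications.
std::vector<std::uint32_t> vandermonde_encode(const std::vector<std::uint32_t>& data,
                                              std::size_t n) {
    assert(n < kPrime);  // evaluation points 1..n must be distinct and non-zero
    std::vector<std::uint32_t> out(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < data.size(); ++j) {
            const std::uint32_t v = pow_mod257(static_cast<std::uint32_t>(i + 1),
                                               static_cast<std::uint32_t>(j));
            out[i] = (out[i] + v * (data[j] % kPrime)) % kPrime;
        }
    }
    return out;
}
```

With data {1, 2, 3} and n = 5, the output is the polynomial 1 + 2x + 3x^2 evaluated at x = 1..5, that is {6, 17, 34, 57, 86}.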
Types of Codes: LDPC Codes
❏ H is a matrix for a C(8,4) code
❏ wc is the number of 1s in a column
❏ wr is the number of 1s in a row
❏ To be called low density: wc << n and wr << m
❏ Regular if wc is constant and wr = wc.(n/m)
❏ Matrix can be generated pseudo-randomly
❏ Presence of short cycles (f1, f2) is bad
Source: Bernhard M.J. Leiner
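The weight definitions above can be checked mechanically. The sketch below uses an illustrative regular 4 x 8 matrix chosen for this example (it is not claimed to be the matrix shown on the slide) and hypothetical helper names. It computes wc and wr and tests the regularity condition wr = wc.(n/m).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kRows = 4;  // m parity checks
constexpr std::size_t kCols = 8;  // n code symbols
using Matrix = std::array<std::array<int, kCols>, kRows>;

// Illustrative parity-check matrix (an assumption made for this sketch).
constexpr Matrix kH = {{
    {{0, 1, 0, 1, 1, 0, 0, 1}},
    {{1, 1, 1, 0, 0, 1, 0, 0}},
    {{0, 0, 1, 0, 0, 1, 1, 1}},
    {{1, 0, 0, 1, 1, 0, 1, 0}},
}};

// wr: number of 1s in row r.
int row_weight(const Matrix& h, std::size_t r) {
    assert(r < kRows);
    int w = 0;
    for (std::size_t c = 0; c < kCols; ++c) w += h[r][c];
    return w;
}

// wc: number of 1s in column c.
int col_weight(const Matrix& h, std::size_t c) {
    assert(c < kCols);
    int w = 0;
    for (std::size_t r = 0; r < kRows; ++r) w += h[r][c];
    return w;
}

// True when every column has the same weight wc, every row has the same
// weight wr, and wr = wc * (n / m).
bool is_regular(const Matrix& h) {
    const int wc = col_weight(h, 0);
    const int wr = row_weight(h, 0);
    for (std::size_t c = 1; c < kCols; ++c)
        if (col_weight(h, c) != wc) return false;
    for (std::size_t r = 1; r < kRows; ++r)
        if (row_weight(h, r) != wr) return false;
    // Cross-multiplied form of wr = wc * (n / m), to stay in integers.
    return wr * static_cast<int>(kRows) == wc * static_cast<int>(kCols);
}
```

For this matrix wc = 2 and wr = 4, so the condition 4 = 2 * (8 / 4) holds.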
Types of Codes: LDPC Codes
Low-Density-Parity-Check (LDPC) codes are also an important class of erasure codes and are
constructed over sparse parity-check matrices.
The good:
❒ In theory, an LDPC code that is optimal for all the properties of interest in a given use case exists.
The bad:
❒ LDPC codes are not MDS: it is always possible to find a loss pattern that cannot be decoded (e.g. when only k fragments out of n are available). The overhead is k*f or k+f for a small f, but it is not deterministic.
❒ You can always find or design an LDPC code optimized for a few properties (i.e. tailored for a specific use case), but it will be sub-optimal for the other properties.
❒ Designing a good LDPC code is something of a black art that requires a lot of fine-tuning and experimentation.
Types of Codes: LRC Codes
❏ P1, P2, P3 and P4 are constructed over a standard RS code
❏ S1 + S2 + S3 = 0
❏ No need to store S3
Source: XORing Elephants: Novel Erasure Codes for Big Data
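The idea of local repair can be sketched with plain XOR parities. This is a simplification: it assumes unit coefficients and a binary field, where the relation S1 + S2 + S3 = 0 means S3 = S1 XOR S2. The function names are hypothetical, and the actual code in the cited paper uses carefully chosen coefficients.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::vector<std::uint8_t>;

// XOR of equally sized blocks. With unit coefficients this is the local
// parity of a group.
Block xor_blocks(const std::vector<Block>& blocks) {
    Block out(blocks.empty() ? 0 : blocks[0].size(), 0);
    for (const Block& b : blocks) {
        assert(b.size() == out.size());
        for (std::size_t i = 0; i < out.size(); ++i) out[i] ^= b[i];
    }
    return out;
}

// Rebuilds the single missing member of a local group from the surviving
// members and the group's local parity: missing = parity XOR survivors.
Block repair_in_group(const std::vector<Block>& survivors, const Block& local_parity) {
    std::vector<Block> all = survivors;
    all.push_back(local_parity);
    return xor_blocks(all);
}

// If S1 + S2 + S3 = 0 in a binary field, then S3 = S1 XOR S2, so S3 can be
// recomputed on demand instead of being stored.
Block implied_parity(const Block& s1, const Block& s2) {
    return xor_blocks({s1, s2});
}
```

With a group of five blocks and its local parity, a lost block is rebuilt from the four survivors plus the parity, so the repair reads five blocks instead of k.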
Types of Codes: LRC Codes
Locally-Repairable-Codes (LRC) tackle the repair bandwidth issue of RS codes. They combine multiple layers of RS codes: the local codes and the global codes.
The good:
❒ Better repair bandwidth than RS codes, which need to read k fragments to decode.
The bad:
❒ These codes are not MDS and they require a higher storage overhead than MDS codes.
Types of Codes: Multiplicative FFT
Types of Codes: Additive FFT
Types of Codes: FFT Based RS Codes
FFT-based RS codes have a good set of desirable properties.
The good:
❒ Relatively simple
❒ O(N log N) (because the FFT is used to speed up the matrix multiplication)
❒ MDS
❒ Fast for large n
The bad:
❒ Repair bandwidth: when a fragment is missing, k fragments are needed to recover the data. Even with systematic codes, a repair always has to download k fragments.
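The O(N log N) claim comes from evaluating the data polynomial at the N-th roots of unity with a divide-and-conquer transform instead of a full matrix product. Below is a minimal sketch of a recursive radix-2 number theoretic transform. It assumes the Fermat prime 65537 with primitive root 3 and uses hypothetical function names; it is a textbook formulation, not QuadIron's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kF4 = 65537;  // Fermat prime 2^16 + 1 (assumed field)
constexpr std::uint64_t kGen = 3;     // a primitive root modulo 65537

std::uint64_t pow_mod_f4(std::uint64_t b, std::uint64_t e) {
    std::uint64_t r = 1;
    b %= kF4;
    while (e > 0) {
        if (e & 1) r = r * b % kF4;
        b = b * b % kF4;
        e >>= 1;
    }
    return r;
}

// Primitive n-th root of unity, for n a power of two dividing 65536.
std::uint64_t root_of_unity_f4(std::size_t n) { return pow_mod_f4(kGen, (kF4 - 1) / n); }

// Evaluates the polynomial with coefficients `a` at w^0 .. w^(N-1), where w
// is a primitive N-th root of unity and N = a.size() is a power of two.
std::vector<std::uint64_t> ntt(const std::vector<std::uint64_t>& a, std::uint64_t w) {
    const std::size_t n = a.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) return a;
    std::vector<std::uint64_t> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    const std::uint64_t w2 = w * w % kF4;
    const std::vector<std::uint64_t> e = ntt(even, w2);
    const std::vector<std::uint64_t> o = ntt(odd, w2);
    std::vector<std::uint64_t> out(n);
    std::uint64_t wi = 1;
    for (std::size_t i = 0; i < n / 2; ++i) {
        const std::uint64_t t = wi * o[i] % kF4;
        out[i] = (e[i] + t) % kF4;                // A(w^i)
        out[i + n / 2] = (e[i] + kF4 - t) % kF4;  // A(w^(i+N/2)) = A(-w^i)
        wi = wi * w % kF4;
    }
    return out;
}

// Naive O(N^2) evaluation at the same points, for cross-checking.
std::vector<std::uint64_t> naive_eval(const std::vector<std::uint64_t>& a, std::uint64_t w) {
    std::vector<std::uint64_t> out(a.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        const std::uint64_t x = pow_mod_f4(w, i);
        std::uint64_t xj = 1;
        for (std::size_t j = 0; j < a.size(); ++j) {
            out[i] = (out[i] + a[j] % kF4 * xj) % kF4;
            xj = xj * x % kF4;
        }
    }
    return out;
}
```

Padding k data symbols with zeros up to N and calling `ntt(a, root_of_unity_f4(N))` yields N evaluations of the data polynomial, which is the same result as a Vandermonde product over the roots of unity.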
Multiplicative FFT: Vectorization
Multiplicative FFT: Horizontal Vectorization
Multiplicative FFT: Vertical Vectorization
Speed Comparison
❏ ISA-L: Intel Intelligent Storage Acceleration Library. Matrix-based RS, hardware accelerated: http://01.org/intel-storage-acceleration-library-open-source-version
❏ Wirehair: Fast and Portable Fountain Codes in C. Hybrid LDPC. https://github.com/catid/wirehair
❏ Leopard: MDS Reed-Solomon Erasure Correction Codes for Large Data in C. Additive FFT based. https://github.com/catid/leopard
Thanks, Catid!
Types of Codes: Speed Comparison
Application
❒ Decentralized Storage
Application: Decentralized Storage
Requirements for an erasure code for a decentralized storage archive:
❒ Simple (e.g. may compile on WASM)
❒ Fast, e.g. for > 24 fragments
❒ MDS: A rock solid contract
❒ Work with all rates, and all combinations of n and k
❒ Systematic for smaller fragments
❒ Non-systematic for larger fragments -> Confidentiality ensured if fragments
not stored on same servers (not a threshold scheme though, must be
combined with encryption)
❒ Repair-Bandwidth not critical
Application: Decentralized Storage
❒ Multiple locations, multiple servers per location
❒ Each server is a “QuadIron Provider”
❒ E.g. 10 locations on the globe with 5 servers/location: C(50,35) => can lose 3 locations or 15 servers for an overhead of 50/35 ≈ 1.43
❒ A server is just a bunch of disks, e.g. 45 drives
❒ Can have local parities on servers to avoid repairing too often over the network, e.g. C(45,40) => overhead of 45/40 = 1.125
❒ Total overhead 1.43 * 1.125 ≈ 1.61
❒ E.g. w/ 10TB drives, 22.5PB raw => 14PB useful
❒ Use blockchain transactions to store the location of
blocks
❒ E.g. using Parity, proof-of-work (non-trusted
env) or proof-of-authority (trusted env =>
millions tx/s)
❒ Index the ledgers by block-ids
❒ Use the indexes to locate the blocks
❒ Consolidate indexes
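The capacity figures in the layout above follow from multiplying the overhead of the global code by the overhead of the local code. The sketch below reproduces that arithmetic; the struct and function names are hypothetical. Computed without intermediate rounding, the combined overhead is about 1.61 and the useful capacity comes out at 14PB.

```cpp
// Storage overhead of a C(n, k) code: raw bytes stored per useful byte.
constexpr double overhead(int n, int k) { return static_cast<double>(n) / k; }

// Illustrative description of the deployment (names are assumptions).
struct Layout {
    int locations;
    int servers_per_location;
    int drives_per_server;
    double drive_tb;  // capacity of one drive in TB
};

// Raw capacity in PB (1 PB = 1000 TB).
constexpr double raw_capacity_pb(const Layout& l) {
    return l.locations * l.servers_per_location * l.drives_per_server * l.drive_tb / 1000.0;
}

// Global code across servers combined with a local code inside each server.
constexpr double total_overhead(int global_n, int global_k, int local_n, int local_k) {
    return overhead(global_n, global_k) * overhead(local_n, local_k);
}

// Useful capacity in PB once both layers of parities are paid for.
constexpr double useful_capacity_pb(const Layout& l, double total) {
    return raw_capacity_pb(l) / total;
}
```

For `Layout{10, 5, 45, 10.0}` with C(50,35) across servers and C(45,40) inside each server, the raw capacity is 22.5PB and the useful capacity is 22.5 / (50/35 * 45/40) = 14PB.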
Decentralized Storage: Zenko
QuadIron
❏ Multi-cloud data controller
❏ 1 API endpoint, S3 compatible
❏ Native cloud storage
❏ Metadata search across clouds
❏ 100% open source
❏ github.com/scality/zenko
❏ zenko.io
❏ forum.zenko.io
❏ Give us feedback!
❏ Try the sandbox on Orbit!
[Figure labels: S3 API; Wasabi, Digital Ocean, etc.]
Applications: Use Case
Using the Library
C++ Library is available at: https://github.com/scality/quadiron
LICENSE: BSD 3-clause
Compiling:
$ mkdir build
$ cd build
$ cmake -G 'Unix Makefiles' ..
$ make
Using the Library: Code
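#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <vector>
#include <quadiron.h>
const int word_size = 8;
const int n_data = 16;      // number of data fragments (k)
const int n_parities = 64;  // number of parity fragments (m)
const size_t pkt_size = 1024;
// Reed-Solomon code over a Fermat number transform, in non-systematic mode
quadiron::fec::RsFnt<uint64_t>* fec = new quadiron::fec::RsFnt<uint64_t>(
quadiron::fec::FecType::NON_SYSTEMATIC, word_size, n_data, n_parities, pkt_size
);
// encode: the nullptr entries are placeholders for open input/output streams
std::vector<std::istream*> d_files(fec->n_data, nullptr);
std::vector<std::ostream*> c_files(fec->n_outputs, nullptr);
std::vector<quadiron::Properties> c_props(fec->n_outputs);
fec->encode_packet(d_files, c_files, c_props);
// decode: r_files receive the recovered data fragments
std::vector<std::ostream*> r_files(fec->n_data, nullptr);
fec->decode_bufs(d_files, c_files, c_props, r_files);
delete fec;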
Using the Library: Next Steps
❒ Optimize multiplicative FFT decoding: for now it uses a relatively slow Lagrange interpolation
❒ We know how to do it for special values of k and m (k mod m = 0, m mod k = 0)
❒ Optimize additive FFTs
❒ Implement systematic additive FFTs
❒ Implement NTT adaptive codes for both multiplicative and additive FFTs
❒ Other optimizations
❒ Frobenius FFTs for both multiplicative and additive FFTs
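For context on why Lagrange interpolation is the slow path, here is a minimal sketch of decoding by interpolation. It assumes the small prime field GF(257) and a non-systematic code whose fragments are evaluations of the data polynomial; the names are hypothetical and the code is a textbook formulation, not QuadIron's decoder. This simple version costs O(k^3) field operations, and even careful implementations of direct interpolation remain quadratic in k.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint32_t kQ = 257;  // illustrative prime field GF(257)

std::uint32_t pow_mod_q(std::uint32_t b, std::uint32_t e) {
    std::uint32_t r = 1;
    b %= kQ;
    while (e > 0) {
        if (e & 1) r = r * b % kQ;
        b = b * b % kQ;
        e >>= 1;
    }
    return r;
}

// Multiplicative inverse via Fermat's little theorem: a^(q-2) mod q.
std::uint32_t inv_mod_q(std::uint32_t a) { return pow_mod_q(a, kQ - 2); }

// Recovers the k coefficients (lowest degree first) of the polynomial of
// degree < k passing through the k points (xs[i], ys[i]). The xs must be
// distinct and smaller than kQ.
std::vector<std::uint32_t> lagrange_decode(const std::vector<std::uint32_t>& xs,
                                           const std::vector<std::uint32_t>& ys) {
    assert(xs.size() == ys.size());
    const std::size_t k = xs.size();
    std::vector<std::uint32_t> coeffs(k, 0);
    for (std::size_t i = 0; i < k; ++i) {
        std::vector<std::uint32_t> basis(1, 1);  // numerator of the i-th basis polynomial
        std::uint32_t denom = 1;                 // product of (x_i - x_j) for j != i
        for (std::size_t j = 0; j < k; ++j) {
            if (j == i) continue;
            // Multiply the numerator by (x - x_j).
            std::vector<std::uint32_t> next(basis.size() + 1, 0);
            const std::uint32_t neg_xj = (kQ - xs[j] % kQ) % kQ;
            for (std::size_t d = 0; d < basis.size(); ++d) {
                next[d] = (next[d] + basis[d] * neg_xj) % kQ;
                next[d + 1] = (next[d + 1] + basis[d]) % kQ;
            }
            basis.swap(next);
            denom = denom * ((xs[i] % kQ + kQ - xs[j] % kQ) % kQ) % kQ;
        }
        // Add y_i / denom times the basis numerator to the result.
        const std::uint32_t scale = ys[i] % kQ * inv_mod_q(denom) % kQ;
        for (std::size_t d = 0; d < k; ++d) coeffs[d] = (coeffs[d] + scale * basis[d]) % kQ;
    }
    return coeffs;
}
```

Any k surviving fragments work: the points (2, 17), (4, 57), (5, 86) and the points (1, 6), (3, 34), (5, 86) both recover the coefficients {1, 2, 3} of 1 + 2x + 3x^2, which is the MDS property in miniature.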
Developers
Lam Pham-Sy: Lam Pham-Sy is a research engineer working on information theory and computer science. His main research focuses on different families of forward erasure correcting codes such as Reed-Solomon codes, Low-Density Parity-Check codes, Locally Repairable codes etc. Their applications range from digital communication to data storage. He completed his PhD in a collaboration between CEA-Leti and Eutelsat S.A. on the subject of forward erasure codes for satellite communications. Afterwards, he continued his research at ETIS laboratory and at Orange Labs. He currently works at Scality S.A. as a research engineer whose research topics include the application of erasure codes in distributed storage systems and finite field arithmetic.
Sylvain Laperche: Sylvain Laperche is a code craftsman. With a background in biotech engineering, he learnt how to hack bacteria before learning how to hack a computer. That changed when he studied bioinformatics, and since then he has honed and applied his skills on a wide set of problems: genome sequencing, complex embedded systems, climate modelling at European scale, mass-scale geolocation for telco industries. His path led him to work on distributed storage systems and he currently works as an R&D engineer at Scality. Sylvain Laperche has an Engineer’s degree in Hardware, Circuit Design and Embedded Systems from ISIMA.
Zenko/QuadIron Community - Questions?
1,000+ registered Zenko Orbit users
Forum: https://forum.zenko.io
Website: www.zenko.io/blog
QuadIron github: https://github.com/scality/quadiron
https://www.zenko.io/blog/free-library-erasure-codes/
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 
How analogue intelligence complements AI
How analogue intelligence complements AIHow analogue intelligence complements AI
How analogue intelligence complements AI
Paul Rowe
 
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager APIUiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPath Community Berlin: Orchestrator API, Swagger, and Test Manager API
UiPathCommunity
 
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptxSpecial Meetup Edition - TDX Bengaluru Meetup #52.pptx
Special Meetup Edition - TDX Bengaluru Meetup #52.pptx
shyamraj55
 
Quantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur MorganQuantum Computing Quick Research Guide by Arthur Morgan
Quantum Computing Quick Research Guide by Arthur Morgan
Arthur Morgan
 
Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)Into The Box Conference Keynote Day 1 (ITB2025)
Into The Box Conference Keynote Day 1 (ITB2025)
Ortus Solutions, Corp
 
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In FranceManifest Pre-Seed Update | A Humanoid OEM Deeptech In France
Manifest Pre-Seed Update | A Humanoid OEM Deeptech In France
chb3
 
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
AI Changes Everything – Talk at Cardiff Metropolitan University, 29th April 2...
Alan Dix
 
HCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser EnvironmentsHCL Nomad Web – Best Practices and Managing Multiuser Environments
HCL Nomad Web – Best Practices and Managing Multiuser Environments
panagenda
 
Role of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered ManufacturingRole of Data Annotation Services in AI-Powered Manufacturing
Role of Data Annotation Services in AI-Powered Manufacturing
Andrew Leo
 
Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025Splunk Security Update | Public Sector Summit Germany 2025
Splunk Security Update | Public Sector Summit Germany 2025
Splunk
 
Cybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure ADCybersecurity Identity and Access Solutions using Azure AD
Cybersecurity Identity and Access Solutions using Azure AD
VICTOR MAESTRE RAMIREZ
 
Cyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of securityCyber Awareness overview for 2025 month of security
Cyber Awareness overview for 2025 month of security
riccardosl1
 
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Massive Power Outage Hits Spain, Portugal, and France: Causes, Impact, and On...
Aqusag Technologies
 
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
IEDM 2024 Tutorial2_Advances in CMOS Technologies and Future Directions for C...
organizerofv
 
Build Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For DevsBuild Your Own Copilot & Agents For Devs
Build Your Own Copilot & Agents For Devs
Brian McKeiver
 
Technology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data AnalyticsTechnology Trends in 2025: AI and Big Data Analytics
Technology Trends in 2025: AI and Big Data Analytics
InData Labs
 
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven InsightsAndrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell: Transforming Business Strategy Through Data-Driven Insights
Andrew Marnell
 
Electronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploitElectronic_Mail_Attacks-1-35.pdf by xploit
Electronic_Mail_Attacks-1-35.pdf by xploit
niftliyevhuseyn
 
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc Webinar: Consumer Expectations vs Corporate Realities on Data Broker...
TrustArc
 
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdfSAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
SAP Modernization: Maximizing the Value of Your SAP S/4HANA Migration.pdf
Precisely
 

QuadIron An open source library for number theoretic transform-based erasure codes

  • 1. A Library for Number Theoretic Transforms Erasure Codes Zenko Live: QuadIron
  • 2. Agenda
      • Introduction to Zenko
      • The problem space of data resiliency
      • Giorgio Regni, CTO
      • QuadIron demo
      • Vianney Rancurel, R&D
      • Questions and Answers
      What’s Coming
  • 4. Why?
      Planetary-scale decentralized storage:
      • Distributing data over hundreds of drives, servers, and locations with minimum overhead
      • Guaranteeing that each parity is useful and can help reconstruct the original data
      • Keeping the data secure even though fragments are present in many different places
  • 5. Pro Tip: When you get stuck, change vector space
      To be updated by Giorgio
  • 6. What happens when you need more parities?
  • 7. Demo
      • Video file of ~90 MB
      • Using a 90+160 coding
      • Split into 90 fragments of 1 MB: x.00 … x.89
      • Generate 160 parities (in fact 250 coded fragments, since the code is non-systematic; overhead of 2.77): x.c00 … x.c249
      • Delete the data fragments
      • Delete 100 parities (tolerate 100 drive failures!)
      • Repair
      • Play the video
  • 9. Lib QuadIron is online
      Forum: https://forum.zenko.io
      Website: www.zenko.io/blog
      Code: https://github.com/scality/quadiron
      https://www.zenko.io/blog/free-library-erasure-codes/
  • 12. Properties of Erasure Codes: Definition
      A C(n,k) erasure code is defined by n = k + m
      ❒ k being the number of data fragments.
      ❒ m being the number of desired erasure fragments.
      Example: C(9, 6)
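The C(n,k) definition can be turned into a few lines of arithmetic. The following is a minimal sketch, not QuadIron code: the `CodeParams` type and its member names are hypothetical and exist only to illustrate how m, the rate and the storage overhead follow from n and k.

```cpp
#include <cassert>

// Hypothetical helper type (not part of QuadIron): parameters of a C(n,k) code.
struct CodeParams {
    int n;  // total number of fragments, n = k + m
    int k;  // number of data fragments

    // Number of erasure fragments.
    int m() const {
        assert(k > 0 && n >= k);
        return n - k;
    }
    // Code rate r = k / n.
    double rate() const { return static_cast<double>(k) / n; }
    // Storage overhead: fragments stored per data fragment.
    double overhead() const { return static_cast<double>(n) / k; }
    // With an MDS code any k fragments suffice, so up to m losses are tolerated.
    bool mds_recoverable(int lost) const { return lost >= 0 && lost <= m(); }
};
```

For the slide's example, `CodeParams{9, 6}` gives m = 3, a rate of 2/3 and an overhead of 1.5. For the demo's 90+160 coding, `CodeParams{250, 90}` tolerates up to 160 lost fragments when the code is MDS.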
  • 13. Properties of Erasure Codes
      ❒ Optimality: e.g. an MDS (Maximum Distance Separable) erasure code guarantees that any k fragments can be used to decode a file
      ❒ Systematicity: Systematic codes generate n-k erasure fragments and therefore keep the k data fragments unchanged. Non-systematic codes generate n erasure fragments
      ❒ Speed: Erasure codes are characterized by their encode/decode speed. Speed may vary according to the rate (k and m parameters), and may be more or less predictable depending on the code.
      ❒ Rate sensitivity: Erasure codes can also be compared by their sensitivity to the rate r = k/n, which may or may not impact the encoding and decoding speed
      ❒ Rate adaptivity: Changing k and m without having to regenerate all the erasure fragments
      ❒ Confidentiality: whether an attacker can partially decode the data after obtaining fewer than k fragments. Non-systematic codes are confidential in this sense (different from threshold schemes)
      ❒ Repair bandwidth: the number of fragments required to repair a fragment.
  • 14. (Main) Types of Erasure Codes
      ❒ Traditional RS Codes (e.g. Vandermonde or Cauchy matrices)
      ❒ LDPC Codes
      ❒ Locally-Repairable-Codes (LRC)
      ❒ FFT Based RS Codes
          ❒ Multiplicative FFTs (prime fields)
          ❒ Additive FFTs (binary extension fields)
  • 15. Types of Codes: Traditional RS Codes
  • 16. Types of Codes: Traditional RS Codes
  • 17. Types of Codes: Traditional RS Codes
      The good:
      ❒ Simple
      ❒ Support systematic and adaptive rates.
      The bad:
      ❒ Matrix multiplication: O(k × n)
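To make the O(k × n) cost concrete, here is a minimal sketch of non-systematic RS encoding written as a Vandermonde matrix-vector product. It assumes a tiny prime field GF(257) and evaluation points x_i = i, both chosen purely for illustration; `rs_encode_matrix` is a hypothetical name and none of this is QuadIron's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative only: a tiny prime field GF(257), not QuadIron's field or API.
constexpr uint32_t kPrime = 257;

// Non-systematic RS encoding as a Vandermonde matrix-vector product:
// coded[i] = sum_j data[j] * x_i^j (mod p), with distinct points x_i = i.
// The two nested loops make the O(k x n) cost visible.
std::vector<uint32_t> rs_encode_matrix(const std::vector<uint32_t>& data, uint32_t n) {
    assert(n <= kPrime);  // evaluation points must be distinct field elements
    std::vector<uint32_t> coded(n, 0);
    for (uint32_t i = 0; i < n; ++i) {
        uint32_t power = 1;  // x_i^j, starting at j = 0
        for (uint32_t d : data) {
            assert(d < kPrime);  // symbols must be field elements
            coded[i] = (coded[i] + d * power) % kPrime;
            power = (power * i) % kPrime;
        }
    }
    return coded;
}
```

Each coded symbol is the value of the data polynomial at one point, so any k of the n symbols determine the polynomial and hence the data, which is the MDS property. Decoding (interpolation) is left out of this sketch.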
  • 18. Types of Codes: LDPC Codes
      ❏ H is a parity-check matrix for a C(8,4) code
      ❏ wc is the number of 1s in a column
      ❏ wr is the number of 1s in a row
      ❏ To be called low density: wc << n and wr << m
      ❏ Regular if wc is constant and wr = wc · (n/m)
      ❏ The matrix can be generated pseudo-randomly
      ❏ The presence of short cycles (f1, f2) is bad
      Source: Bernhard M.J. Leiner
  • 19. Types of Codes: LDPC Codes
      Low-Density Parity-Check (LDPC) codes are also an important class of erasure codes and are constructed over sparse parity-check matrices.
      The good:
      ❒ In theory, an LDPC code that is optimal for all the interesting properties of a given use case exists.
      The bad:
      ❒ LDPC codes are not MDS: it is always possible to find an erasure pattern that cannot be decoded (e.g. when only k fragments out of n are available). The overhead is k*f or k+f with a small f, but it is not deterministic.
      ❒ You can always find or design an LDPC code optimized for a few properties (i.e. tailored to a specific use case), but it will be sub-optimal for the other properties
      ❒ Designing a good LDPC code is something of a black art that requires a lot of fine-tuning and experimentation.
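The column weight wc, row weight wr and the regularity condition wr = wc · (n/m) from the LDPC matrix slide can be checked mechanically. The sketch below assumes an m × n matrix of 0/1 entries; the 4 × 8 matrix in `example_h` was picked for illustration (wc = 2, wr = 4) and is not claimed to be the exact figure on the slide. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

struct Weights {
    int wc;        // weight of the first column
    int wr;        // weight of the first row
    bool regular;  // true when all columns share wc and all rows share wr
};

// Counts the 1s per row and per column of a 0/1 parity-check matrix.
Weights ldpc_weights(const Matrix& h) {
    assert(!h.empty() && !h[0].empty());
    const std::size_t m = h.size(), n = h[0].size();
    std::vector<int> col(n, 0), row(m, 0);
    for (std::size_t i = 0; i < m; ++i) {
        assert(h[i].size() == n);
        for (std::size_t j = 0; j < n; ++j) {
            row[i] += h[i][j];
            col[j] += h[i][j];
        }
    }
    bool constant = true;
    for (int w : col) constant = constant && (w == col[0]);
    for (int w : row) constant = constant && (w == row[0]);
    return {col[0], row[0], constant};
}

// Illustrative 4 x 8 parity-check matrix (m = 4 rows, n = 8 columns).
Matrix example_h() {
    return {{0, 1, 0, 1, 1, 0, 0, 1},
            {1, 1, 1, 0, 0, 1, 0, 0},
            {0, 0, 1, 0, 0, 1, 1, 1},
            {1, 0, 0, 1, 1, 0, 1, 0}};
}
```

For this matrix, wr = wc · (n/m) = 2 · (8/4) = 4, so it is regular. A matrix this small cannot really be called low density; the condition only becomes meaningful for much larger n and m.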
  • 20. Types of Codes: LRC Codes
      ❏ P1, P2, P3 and P4 are constructed over a standard RS
      ❏ S1 + S2 + S3 = 0
      ❏ No need to store S3
      Source: XORing Elephants: Novel Erasure Codes for Big Data
  • 21. Types of Codes: LRC Codes
      Locally-Repairable-Codes (LRC) tackle the repair bandwidth issue of RS codes. They combine multiple layers of RS: the local codes and the global codes.
      The good:
      ❒ Better repair bandwidth than RS codes, because an RS code needs to read k fragments to decode.
      The bad:
      ❒ These codes are not MDS and they require a higher storage overhead than MDS codes.
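The repair-bandwidth benefit of a local code can be sketched with a plain XOR parity over a small group of fragments. This is a simplification made for illustration: actual LRC constructions combine local parities with RS coefficients, and the names `xor_all` and `repair_local` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Fragment = std::vector<uint8_t>;

// XOR of all fragments in a group (all must have the same size).
// Applied to the data fragments of a group, this yields the local parity.
Fragment xor_all(const std::vector<Fragment>& group) {
    assert(!group.empty());
    Fragment out(group[0].size(), 0);
    for (const Fragment& f : group) {
        assert(f.size() == out.size());
        for (std::size_t i = 0; i < out.size(); ++i) out[i] ^= f[i];
    }
    return out;
}

// Rebuilds the single missing fragment of a group from the surviving
// group members and the local parity, without touching other groups.
Fragment repair_local(const std::vector<Fragment>& survivors, const Fragment& local_parity) {
    std::vector<Fragment> inputs = survivors;
    inputs.push_back(local_parity);
    return xor_all(inputs);
}
```

With a group of three data fragments, one lost fragment is rebuilt by reading three fragments (two survivors plus the local parity) instead of the k fragments a plain RS decode would read. The price is the extra storage for the local parities.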
  • 22. Types of Codes: Multiplicative FFT
  • 23. Types of Codes: Multiplicative FFT
  • 24. Types of Codes: Additive FFT
  • 25. Types of Codes: FFT Based RS Codes
      Fast Fourier transform (FFT) based RS codes have a good set of desirable properties.
      The good:
      ❒ Relatively simple
      ❒ O(N.log(N)) (because we use the FFT to speed up the matrix multiplication)
      ❒ MDS
      ❒ Fast for large n
      The bad:
      ❒ Repair bandwidth: If a fragment is missing, we need k fragments to recover the data fragments. For systematic codes, in any case we need to download k fragments.
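The O(N.log(N)) claim comes from replacing the matrix product with a transform. Below is a minimal sketch of a multiplicative FFT over a prime field, i.e. a number theoretic transform (NTT), assuming the Fermat prime 257 with generator 3 and a power-of-two n. The field, the recursion and the function names are illustrative assumptions; QuadIron's own fields and algorithms are not shown on the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr uint32_t kP = 257;  // Fermat prime 2^8 + 1; 3 generates its multiplicative group

// Modular exponentiation by squaring.
uint32_t pow_mod(uint32_t base, uint32_t exp) {
    uint32_t result = 1;
    base %= kP;
    while (exp > 0) {
        if (exp & 1) result = result * base % kP;
        base = base * base % kP;
        exp >>= 1;
    }
    return result;
}

// Recursive radix-2 NTT: returns a(w^0), a(w^1), ..., a(w^(n-1)) mod p,
// where n = a.size() is a power of two and w is a primitive n-th root of unity.
// Input symbols must be smaller than kP.
std::vector<uint32_t> ntt(const std::vector<uint32_t>& a, uint32_t w) {
    const std::size_t n = a.size();
    if (n == 1) return a;
    std::vector<uint32_t> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    const uint32_t w2 = w * w % kP;
    const std::vector<uint32_t> e = ntt(even, w2), o = ntt(odd, w2);
    std::vector<uint32_t> out(n);
    uint32_t wi = 1;  // w^i
    for (std::size_t i = 0; i < n / 2; ++i) {
        const uint32_t t = wi * o[i] % kP;
        out[i] = (e[i] + t) % kP;
        out[i + n / 2] = (e[i] + kP - t) % kP;  // uses w^(i + n/2) = -w^i
        wi = wi * w % kP;
    }
    return out;
}

// Non-systematic encoding: zero-pad k data symbols to length n (a power of
// two dividing 256) and evaluate the data polynomial at the n-th roots of unity.
std::vector<uint32_t> rs_encode_ntt(std::vector<uint32_t> data, std::size_t n) {
    assert(n > 0 && n >= data.size() && 256 % n == 0);
    data.resize(n, 0);
    const uint32_t w = pow_mod(3, static_cast<uint32_t>(256 / n));
    return ntt(data, w);
}
```

The output is the same set of polynomial evaluations a Vandermonde matrix built on the n-th roots of unity would produce, obtained with n·log(n) rather than k·n multiplications. Decoding is not covered by this sketch.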
  • 32. Speed Comparison
      ❏ ISA-L: Intel Intelligent Storage Acceleration Library. Matrix-based RS, hardware accelerated: http://01.org/intel-storage-acceleration-library-open-source-version
      ❏ Wirehair: Fast and Portable Fountain Codes in C. Hybrid LDPC. https://github.com/catid/wirehair
      ❏ Leopard: MDS Reed-Solomon Erasure Correction Codes for Large Data in C. Additive FFT based. https://github.com/catid/leopard
      Thanks, Catid!
  • 33. Types of Codes: Speed Comparison
  • 34. Types of Codes: Speed Comparison
  • 35. Types of Codes: Speed Comparison
  • 36. Types of Codes: Speed Comparison
  • 37. Types of Codes: Speed Comparison
  • 38. Types of Codes: Speed Comparison
  • 39. Types of Codes: Speed Comparison
  • 40. Types of Codes: Speed Comparison
  • 42. Application: Decentralized Storage
      Requirements for an erasure code for a decentralized storage archive:
      ❒ Simple (e.g. may compile on WASM)
      ❒ Fast, e.g. for > 24 fragments
      ❒ MDS: A rock solid contract
      ❒ Work with all rates, and all combinations of n and k
      ❒ Systematic for smaller fragments
      ❒ Non-systematic for larger fragments -> Confidentiality ensured if fragments are not stored on the same servers (not a threshold scheme though, must be combined with encryption)
      ❒ Repair bandwidth not critical
  • 44. Application: Decentralized Storage
      ❒ Multiple locations, multiple servers per location
      ❒ Each server is a “Quadiron Provider”
      ❒ E.g. 10 locations on the globe with 5 servers/location: C(50,35) => can lose 3 locations or 15 servers for an overhead of about 1.4
      ❒ A server is just a bunch of disks, e.g. 45 drives
      ❒ Can have local parities on servers to avoid repairing too often over the network, e.g. C(45, 40) = 1.125
      ❒ Total overhead ≈ 1.4 × 1.125 ≈ 1.57
      ❒ E.g. w/ 10 TB drives, 22 PB => 14 PB useful
      ❒ Use blockchain transactions to store the location of blocks
      ❒ E.g. using Parity, proof-of-work (non-trusted env) or proof-of-authority (trusted env => millions tx/s)
      ❒ Index the ledgers by block-ids
      ❒ Use the indexes to locate the blocks
      ❒ Consolidate indexes
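The capacity figures on this slide can be reproduced with a short calculation. The sketch below assumes decimal units (1 PB = 1000 TB); `two_level_capacity` and the `Capacity` struct are hypothetical names introduced only for this example.

```cpp
#include <cassert>

// Storage overhead of a C(n,k) code: fragments stored per data fragment.
double overhead(int n, int k) {
    assert(k > 0 && n >= k);
    return static_cast<double>(n) / k;
}

struct Capacity {
    double raw_pb;          // raw capacity of all drives, in PB
    double total_overhead;  // global overhead times local overhead
    double useful_pb;       // capacity left for user data, in PB
};

// Hypothetical helper: a global C(servers, global_k) code across servers
// combined with a local C(drives_per_server, local_k) code inside each server.
Capacity two_level_capacity(int locations, int servers_per_location,
                            int drives_per_server, double drive_tb,
                            int global_k, int local_k) {
    const int servers = locations * servers_per_location;
    const double raw_pb = servers * drives_per_server * drive_tb / 1000.0;
    const double total = overhead(servers, global_k) * overhead(drives_per_server, local_k);
    return {raw_pb, total, raw_pb / total};
}
```

Calling `two_level_capacity(10, 5, 45, 10.0, 35, 40)` gives 22.5 PB raw and 14 PB useful. With unrounded values, 50/35 ≈ 1.43, so the combined overhead comes out near 1.61 rather than 1.57; the 14 PB figure agrees with the unrounded arithmetic (35 × 40 drives × 10 TB).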
  • 45. Decentralized Storage: Zenko QuadIron
      ❏ Multi-cloud data controller
      ❏ 1 API endpoint, S3 compatible
      ❏ Native cloud storage
      ❏ Metadata search across clouds
      ❏ 100% open source
      ❏ github.com/scality/zenko
      ❏ zenko.io
      ❏ forum.zenko.io
      ❏ Give us feedback!
      ❏ Try the sandbox on Orbit!
      Diagram labels: S3 API; Wasabi, Digital Ocean, etc.
  • 47. Using the Library
      C++ Library is available at: https://github.com/scality/quadiron
      LICENSE: BSD 3-clause
      Compiling:
        $ mkdir build
        $ cd build
        $ cmake -G 'Unix Makefiles' ..
        $ make
  • 48. Using the Library: Code
        // #include <quadiron.h>

        const int word_size = 8;
        const int n_data = 16;
        const int n_parities = 64;
        const size_t pkt_size = 1024;

        // The pointer type must name the same instantiation as the object created.
        quadiron::fec::RsFnt<uint64_t>* fec = new quadiron::fec::RsFnt<uint64_t>(
            quadiron::fec::FecType::NON_SYSTEMATIC,
            word_size, n_data, n_parities, pkt_size
        );

        // encode (the stream pointers are placeholders to be set before use)
        std::vector<std::istream*> d_files(fec->n_data, nullptr);
        std::vector<std::ostream*> c_files(fec->n_outputs, nullptr);
        std::vector<quadiron::Properties> c_props(fec->n_outputs);
        fec->encode_packet(d_files, c_files, c_props);

        // decode
        std::vector<std::ostream*> r_files(fec->n_data, nullptr);
        fec->decode_bufs(d_files, c_files, c_props, r_files);
  • 49. Using the Library: Next Steps
      ❒ Optimize Multiplicative FFT decoding: for now it uses a relatively slow Lagrange interpolation
      ❒ We know how to do it for special values of k and m (k mod m = 0, m mod k = 0)
      ❒ Optimize Additive FFTs
      ❒ Implement Systematic Additive FFTs
      ❒ Implement NTT adaptive codes for both multiplicative and additive FFTs
      ❒ Other optimizations
      ❒ Frobenius FFTs for both multiplicative and additive FFTs
  • 50. Developers
      Lam Pham-Sy: Lam Pham-Sy is a research engineer working on information theory and computer science. His main research focuses on different families of forward erasure correcting codes such as Reed-Solomon codes, Low-Density Parity-Check codes, Locally Repairable codes, etc. Their applications range from digital communication to data storage. He did his PhD in a collaboration between CEA-Leti and Eutelsat S.A. on the subject of forward erasure codes for satellite communications. Afterwards, he continued his research at the ETIS laboratory and at Orange Labs. He currently works at Scality S.A. as a research engineer whose research topics include the application of erasure codes in distributed storage systems and finite field arithmetic.
      Sylvain Laperche: Sylvain Laperche is a code craftsman. With a background in biotech engineering, he learnt how to hack bacteria before learning how to hack a computer. That changed when he studied bioinformatics, and since then he has honed and applied his skills on a wide set of problems: genome sequencing, complex embedded systems, climate modelling at European scale, mass-scale geolocation for the telco industry. His path led him to work on distributed storage systems and he currently works as an R&D engineer at Scality. Sylvain Laperche has an Engineer’s degree in Hardware, Circuit Design and Embedded Systems from ISIMA.
  • 51. Zenko/QuadIron Community - Questions?
      1,000+ registered Zenko Orbit users
      Forum: https://forum.zenko.io
      Website: www.zenko.io/blog
      QuadIron github: https://github.com/scality/quadiron
      https://www.zenko.io/blog/free-library-erasure-codes/