The Google Chubby lock service presented in 2006 is the inspiration for Apache ZooKeeper: let's take a deep dive into Chubby to better understand ZooKeeper and distributed consensus.
The Google Chubby lock service for loosely-coupled distributed systems
1. Lecture: Google Chubby lock service
http://research.google.com/archive/chubby.html
10/15/2014
Romain Jacotin
[email protected]
"ZooKeeper, I am your father!" – Chubby MASTER
3. Introduction
Abstract
• The Chubby lock service is intended for use within a loosely-coupled distributed system consisting of a large number of machines (10,000) connected by a high-speed network
– Provides coarse-grained locking
– And reliable (low-volume) storage
• Chubby provides an interface much like a distributed file system with advisory locks
– Whole-file read and write operations (no seek)
– Advisory locks
– Notification of various events such as file modification
• Design emphasis
– Availability
– Reliability
– But not high performance / high throughput
• Chubby uses asynchronous consensus: PAXOS with lease timers to ensure liveness
4. Agenda
• Introduction
• Design
• Mechanisms for scaling
• Experience
• Summary
(Image: "Death Star tea infuser")
5. Design
Google's rationale (2006)
Design choice: Lock Service or Client PAXOS Library?
• Client PAXOS Library?
– Depends on NO other servers (besides the name service…)
– Provides a standard framework for programmers
• Lock Service?
– Makes it easier to add availability to a prototype, and to maintain existing program structure and communication patterns
– Reduces the number of servers on which a client depends, by offering both a name service with consistent client caching and by allowing clients to store and fetch small quantities of data
– The lock-based interface is more familiar to programmers
– The lock service uses several replicas to achieve high availability (= quorums), but even a single client can obtain a lock and make progress safely
→ A lock service reduces the number of servers needed for a reliable client system to make progress
(Image: "Initial Death Star design!")
6. Design
Google's rationale (2006)
• Two key design decisions
– Google chooses a lock service (but also provides a PAXOS client library, independent of Chubby, for specific projects)
– Serve small files to permit elected primaries (= client application masters) to advertise themselves and their parameters
• Decisions based on Google's expected usage and environment
– Allow thousands of clients to observe Chubby files → an event notification mechanism avoids polling by clients that wish to know about changes
– Consistent caching semantics are preferred by developers, and caching of files protects the lock service from intense polling
– Security mechanisms (access control)
– Provide only coarse-grained locks (long lock duration = low lock-acquisition rate = less load on the lock server)
7. Design
System structure
• Two main components that communicate via RPC
– A replica server
– A library linked against client applications
• A Chubby cell consists of a small set of servers (typically 5) known as replicas
– Replicas use a distributed consensus protocol (PAXOS) to elect a master and replicate logs
– Read and write requests are satisfied by the master alone
– If the master fails, the other replicas run the election protocol when their master lease expires (a new master is elected in a few seconds)
• Clients find the master by sending master-location requests to the replicas listed in DNS (see the sketch after the diagram)
– Non-master replicas respond by returning the identity of the master
– Once a client has located the master, the client directs all requests to it, either until it ceases to respond or until it indicates that it is no longer the master
Diagram: a client application linked with the Chubby library locates the master of the Chubby cell (replicas 1.1.1.1, 2.2.2.2, 3.3.3.3, 4.4.4.4, 5.5.5.5):
1 – DNS request: chubby.deathstar.sith? → the DNS servers (8.8.4.4, 8.8.8.8) return 1.1.1.1, 2.2.2.2, 3.3.3.3, 4.4.4.4, 5.5.5.5
2 – Master location? → a replica answers: Master = 5.5.5.5
3 – The client initiates a Chubby session with the master
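A minimal client-side sketch of this lookup, in Go. Only the DNS step uses a real library call; askMasterLocation stands in for the "master location" RPC and is purely illustrative, not Chubby's real API.

package main

import (
	"fmt"
	"net"
)

// findMaster resolves the Chubby cell in DNS, then asks the replicas who the master is.
func findMaster(cell string) (string, error) {
	// 1 - DNS lookup returns the replica addresses for the cell.
	replicas, err := net.LookupHost(cell)
	if err != nil {
		return "", err
	}
	// 2 - Ask a replica for the master's identity: non-master replicas answer
	//     with the master's address, the master answers with its own.
	for _, r := range replicas {
		if master, ok := askMasterLocation(r); ok {
			return master, nil
		}
	}
	return "", fmt.Errorf("no replica answered for cell %s", cell)
}

// askMasterLocation stands in for the real "master location" RPC (illustrative only).
func askMasterLocation(replica string) (string, bool) {
	return replica, true // placeholder: pretend the replica we asked is the master
}

func main() {
	master, err := findMaster("chubby.deathstar.sith")
	if err != nil {
		panic(err)
	}
	// 3 - Initiate a Chubby session with the master; all requests go to it until it
	//     stops responding or reports that it is no longer the master.
	fmt.Println("open Chubby session with master", master)
}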
8. Design
PAXOS distributed consensus
• Chubby cell with N = 3 replicas
– Replicas use a distributed consensus protocol (PAXOS) to elect a master. Quorum = 2 for N = 3
– The master must obtain votes from a majority of replicas that promise not to elect a different master for an interval of a few seconds (= master lease); see the sketch after this slide
– The master lease is periodically renewed by the replicas, provided the master continues to win a majority of the votes
– During its master lease, the master maintains copies of a simple database with the replicas (ordered replicated logs)
– Write requests are propagated via the consensus protocol (PAXOS) to all replicas
– Read requests are satisfied by the master alone
– If the master fails, the other replicas run the election protocol when their master lease expires (a new master is elected in a few seconds)
Diagram annotations (message flow between the proposer and the replicas; each step requires a quorum):
– Prepare = "please vote for me and promise not to vote for someone else during the next 12 seconds"
– Promise = "OK, I vote for you and promise not to vote for someone else during the next 12 seconds"
– If I received a quorum of Promises, then I am the Master and I can send many Accepts during my lease (the Proposer votes for itself)
– Accept = "update your replicated logs with this Write client request"
– Accepted = "I have written your Write client request in my log"
– If a replica received a quorum of Accepted, then the Write is committed (the replica sends an Accepted to itself)
– Re-Prepare = "I would love to stay the Master; please re-vote for me before the end of the lease so I can extend it"
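The election round can be illustrated with a small Go sketch. This is a deliberately simplified model of "PAXOS with lease timers" (one Prepare/Promise round, no proposal numbers, no log replication), not Chubby's actual implementation; the 12-second lease matches the annotation above.

package main

import (
	"fmt"
	"time"
)

const lease = 12 * time.Second // replicas promise not to vote for anyone else this long

type replica struct {
	promisedUntil time.Time // until when this replica is bound to its current promise
}

// vote grants a Promise if the replica is not already bound to another candidate.
func (r *replica) vote(now time.Time) bool {
	if now.Before(r.promisedUntil) {
		return false
	}
	r.promisedUntil = now.Add(lease)
	return true
}

// elect runs one Prepare/Promise round: the candidate (which votes for itself)
// becomes master if it gathers promises from a majority of the cell.
func elect(replicas []*replica, now time.Time) (master bool, leaseEnd time.Time) {
	votes := 1 // the proposer votes for itself
	for _, r := range replicas {
		if r.vote(now) {
			votes++
		}
	}
	quorum := (len(replicas)+1)/2 + 1 // majority of the whole cell (2 for N = 3)
	return votes >= quorum, now.Add(lease)
}

func main() {
	cell := []*replica{{}, {}} // two other replicas, so N = 3 including the candidate
	now := time.Now()
	if ok, until := elect(cell, now); ok {
		fmt.Println("master until", until.Format(time.RFC3339))
		// During the lease the master alone serves reads and drives Accept/Accepted
		// rounds for writes; before the lease ends it re-runs Prepare to extend it.
	}
}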
9. Design
Files & directories
• Chubby exports a file system interface simpler than Unix
– Tree of files and directories with name components separated by slashes
– Each directory contains a list of child files and directories (collectively called nodes)
– Each file contains a sequence of un-interpreted bytes
– No symbolic or hard links
– No directory modified times, no last-access times (to make it easier to cache file meta-data)
– No path-dependent permission semantics: a file is controlled by the permissions on the file itself
Example name (its components are broken down in the sketch below):
/ls/dc-tatooine/bigtable/root-tablet
– The ls prefix is common to all Chubby names: it stands for "lock service"
– The second component, dc-tatooine, is the name of the Chubby cell; it is resolved to one or more Chubby servers via a DNS lookup
– The remainder of the name is interpreted within the named Chubby cell
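A tiny Go sketch showing how such a name splits into the three parts described above; splitName is an illustrative helper, not part of the Chubby API.

package main

import (
	"fmt"
	"strings"
)

// splitName breaks a well-formed "/ls/<cell>/<path inside the cell>" name apart.
func splitName(name string) (prefix, cell, rest string) {
	parts := strings.SplitN(strings.TrimPrefix(name, "/"), "/", 3)
	return parts[0], parts[1], parts[2]
}

func main() {
	prefix, cell, rest := splitName("/ls/dc-tatooine/bigtable/root-tablet")
	fmt.Println(prefix) // "ls": common to all Chubby names (lock service)
	fmt.Println(cell)   // "dc-tatooine": the Chubby cell, resolved via DNS
	fmt.Println(rest)   // "bigtable/root-tablet": interpreted inside the cell
}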
10. Design
Files & directories: Nodes
• Nodes (= files or directories) may be either permanent or ephemeral
• Ephemeral files are used as temporary files, and act as indicators to others that a client is alive
• Any node may be deleted explicitly
– Ephemeral files are also deleted if no client has them open
– Ephemeral directories are also deleted if they are empty
• Any node can act as an advisory reader/writer lock
11. Design
Files & directories: Metadata
• 3 ACLs
– Three names of access control lists (ACLs), used to control reading, writing and changing the ACL names for the node
– A node inherits the ACL names of its parent directory on creation
– ACLs are themselves files located in "/ls/dc-tatooine/acl" (an ACL file consists of a simple list of names of principals)
– Users are authenticated by a mechanism built into the Chubby RPC system
• 4 monotonically increasing 64-bit numbers (see the struct sketch after this slide)
1. Instance number: greater than the instance number of any previous node with the same name
2. Content generation number (files only): increases when the file's contents are written
3. Lock generation number: increases when the node's lock transitions from free to held
4. ACL generation number: increases when the node's ACL names are written
• 64-bit checksum
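The per-node metadata above can be summarized in an illustrative Go struct (the struct and field names are mine, not Chubby's):

package main

import "fmt"

type NodeMetadata struct {
	ReadACL, WriteACL, ChangeACL string // names of ACL files under /ls/<cell>/acl

	InstanceNumber    uint64 // greater than that of any previous node with the same name
	ContentGeneration uint64 // files only: bumped on every content write
	LockGeneration    uint64 // bumped when the lock goes from free to held
	ACLGeneration     uint64 // bumped when the node's ACL names are written

	Checksum uint64 // 64-bit checksum of the node contents
}

func main() {
	m := NodeMetadata{ReadACL: "read-acl", WriteACL: "write-acl", ChangeACL: "change-acl"}
	m.ContentGeneration++ // a write to the file increases the content generation number
	fmt.Printf("%+v\n", m)
}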
12. Design
Files & directories: Handles
• Clients open nodes to obtain Handles (analogous to UNIX file descriptors)
• Handles include (see the sketch after this slide):
– Check digits: prevent clients from creating or guessing handles → full access control checks are performed only when handles are created
– A sequence number: lets the master know whether a handle was generated by it or by a previous master
– Mode information (provided at open time): allows the master to recreate its state if an old handle is presented to a newly restarted master
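An illustrative Go struct for what a handle carries, per this slide (names and field types are assumptions for illustration):

package main

import "fmt"

type Handle struct {
	CheckDigits uint32 // lets the master reject forged or guessed handles
	Sequence    uint64 // tells the master whether it, or a predecessor, issued the handle
	Mode        int    // open mode, so a restarted master can recreate its state
	NodeName    string
}

func main() {
	h := Handle{CheckDigits: 0xBEEF, Sequence: 42, Mode: 1, NodeName: "/ls/dc-tatooine/bigtable/root-tablet"}
	fmt.Printf("%+v\n", h)
	// Full access-control checks happen only when the handle is created (Open);
	// later calls just validate the check digits and sequence number.
}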
13. Design – Locks, Sequencers and Lock-delay
• Each Chubby file and directory can act as a reader-writer lock (locks are advisory)
• Acquiring a lock in either mode requires write permission
– Exclusive mode (writer): one client may hold the lock
– Shared mode (reader): any number of client handles may hold the lock
• A lock holder can request a Sequencer: an opaque byte string describing the state of the lock immediately after acquisition
– Name of the lock + lock mode (exclusive or shared) + lock generation number
• Sequencer usage
– The application's master can generate a sequencer and send it with any internal order it sends to other servers
– The application's servers that receive orders from a master can check with Chubby whether the sequencer is still good (= not a stale master)
• Lock-delay: if a lock becomes free because its holder failed, the lock server prevents other clients from claiming the lock during the lock-delay period (see the sketch after this slide)
– A client may specify any lock-delay up to 60 seconds
– This limit prevents a faulty client from making a lock unavailable for an arbitrarily long time
– Lock-delay protects unmodified servers and clients from everyday problems caused by message delays and restarts
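The lock-delay rule is easy to express in code. Below is a minimal, self-contained sketch (all names are illustrative) of a lock server refusing to re-grant a lock that became free without an explicit release until the delay has elapsed:

package main

import (
	"fmt"
	"time"
)

type lockState struct {
	held       bool
	unlockedAt time.Time     // when the lock last became free without an explicit release
	lockDelay  time.Duration // chosen by the previous holder, capped at 60 s
}

// tryAcquire grants the lock unless it is held or still inside its lock-delay window.
func (l *lockState) tryAcquire(now time.Time) bool {
	if l.held || now.Before(l.unlockedAt.Add(l.lockDelay)) {
		return false
	}
	l.held = true
	return true
}

func main() {
	l := &lockState{unlockedAt: time.Now(), lockDelay: 60 * time.Second}
	fmt.Println(l.tryAcquire(time.Now()))                       // false: inside the lock-delay
	fmt.Println(l.tryAcquire(time.Now().Add(61 * time.Second))) // true: the delay has elapsed
}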
14. Design – Events
• Session events can be received by the application
– Jeopardy: when the session lease times out and the grace period begins (see Fail-over later ;-)
– Safe: when the session is known to have survived a communication problem
– Expired: if the session times out
• Handle events: clients may subscribe to a range of events when they create a Handle (= Open phase); see the up-call sketch after this slide
– File contents modified
– Child node added/removed/modified
– Master failed over
– A Handle (and its lock) has become invalid
– Lock acquired
– Conflicting lock request from another client
• These events are delivered to the clients asynchronously via an up-call from the Chubby library
• Mike Burrows: "The last two events mentioned are rarely used, and with hindsight could have been omitted."
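The "up-call" delivery model can be sketched as a callback invoked from a library-owned goroutine. The Handle type and subscription mechanism below are hypothetical stand-ins, not the real (C++, internal) client library:

package main

import (
	"fmt"
	"time"
)

type Event struct {
	Type string // e.g. "file contents modified", "master failed over"
	Node string
}

// Handle stands in for a Chubby handle whose events were requested at Open time.
type Handle struct{ events chan Event }

// onEvent registers the application callback; the library delivers events to
// it asynchronously on a separate goroutine (the "up-call").
func (h *Handle) onEvent(cb func(Event)) {
	go func() {
		for e := range h.events {
			cb(e)
		}
	}()
}

func main() {
	h := &Handle{events: make(chan Event, 4)}
	h.onEvent(func(e Event) { fmt.Printf("up-call: %s on %s\n", e.Type, e.Node) })

	// Simulate the library pushing an event it received from the cell.
	h.events <- Event{Type: "file contents modified", Node: "/ls/empire/deathstar/master"}
	time.Sleep(50 * time.Millisecond) // let the toy up-call run before exiting
}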
15. Design – API
• Open/Close node name
– func Open( )    Handles are created by Open() and destroyed by Close()
– func Close( )    This call never fails
• Poison
– func Poison( )    Allows a client to cancel Chubby calls made by other threads without fear of de-allocating the memory being accessed by them
• Read/Write full contents
– func GetContentsAndStat( )    Atomic reading of the entire content and metadata
– func GetStat( )    Reading of the metadata only
– func ReadDir( )    Reading of the names and metadata of the directory's children
– func SetContents( )    Atomic writing of the entire content
• ACL
– func SetACL( )    Changes the ACL names for a node
• Delete node
– func Delete( )    Deletes the node if it has no children
• Lock
– func Acquire( )    Acquires a lock
– func TryAcquire( )    Tries to acquire a potentially conflicting lock by sending a "conflicting lock request" event to the holder
– func Release( )    Releases a lock
• Sequencer (a combined usage sketch follows this slide)
– func GetSequencer( )    Returns a sequencer that describes any lock held by this Handle
– func SetSequencer( )    Associates a sequencer with a Handle; subsequent operations on the Handle fail if the sequencer is no longer valid
– func CheckSequencer( )    Checks whether a sequencer is valid
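Putting a few of these calls together, a primary election might look roughly like the following. The Go binding is hypothetical (the real client library is C++ and internal to Google), and the file contents "scheduler-1" are an invented example; only the call sequence matters here:

package main

import "fmt"

// Hypothetical binding: just enough surface to show the call sequence.
type Handle struct{ name string }

func Open(name string) (*Handle, error)         { return &Handle{name: name}, nil }
func (h *Handle) Acquire() error                { return nil } // blocks until the exclusive lock is held
func (h *Handle) SetContents(data []byte) error { return nil }
func (h *Handle) GetSequencer() (string, error) { return "lock-name:exclusive:42", nil }
func (h *Handle) Close()                        {}

func main() {
	h, _ := Open("/ls/empire/deathstar/master")
	defer h.Close()

	// The first server whose Acquire() succeeds is the master.
	if err := h.Acquire(); err != nil {
		fmt.Println("could not acquire the lock:", err)
		return
	}
	_ = h.SetContents([]byte("scheduler-1")) // advertise which server is the master
	seq, _ := h.GetSequencer()               // attach this to every order sent to workers
	fmt.Println("elected master, sequencer:", seq)
}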
16. Design – Design n°1
• Primary election example without sequencer usage, but with lock-delay (worst design)
[Diagram: an application scheduler (Master + backups), application workers, and a Chubby cell with its PAXOS leader]
– The first server to get the exclusive lock on file "/ls/empire/deathstar/master" is the Master; it writes its name into the file and sets the lock-delay to 1 minute. The two other servers will only receive event notifications about this file.
– The Master sends an order to a worker: "Execute order 66!"
– The worker executes the request.
– If the lock on "/ls/empire/deathstar/master" becomes free because the holder has failed or become inaccessible, the lock server will prevent the other backup servers from claiming the lock during the lock-delay of 1 minute.
17. Design – Design n°2
• Primary election example with sequencer usage (best design)
[Diagram: an application scheduler (Master), application workers, and a Chubby cell with its PAXOS leader]
– The first server to get the exclusive lock on file "/ls/empire/deathstar/master" is the Master; it writes its name into the file and gets a sequencer. The two other servers will only receive event notifications about this file.
– The Master sends an order to a worker ("Execute order 66!") by adding the sequencer to the request.
– The worker must check the sequencer before executing the request; the check is made against the worker's Chubby cache.
18. Design – Design n°3
• Primary election example with sequencer usage (optimized design); see the guard sketch after this slide
[Diagram: an application scheduler (Master), an army of workers, and a Chubby cell with its PAXOS leader]
– The first server to get the exclusive lock on file "/ls/empire/deathstar/master" is the Master; it writes its name into the file and gets a sequencer. The two other servers will only receive event notifications about this file.
– The Master sends an order to a worker ("Destroy Alderaan!") by adding the sequencer to the request.
– The worker must check the sequencer before executing the request; if the worker does not wish to maintain a session with Chubby, the check is made against the most recent sequencer that the worker has observed.
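The "most recent sequencer observed" check from design n°3 can be sketched as follows; the struct fields are illustrative, since a real sequencer is an opaque byte string that a worker would either hand back to Chubby or compare wholesale:

package main

import "fmt"

type Sequencer struct {
	LockName   string
	Exclusive  bool
	Generation uint64 // lock generation number
}

type Worker struct {
	latest map[string]uint64 // most recent generation observed per lock name
}

// accept returns true if the order's sequencer is at least as new as anything
// this worker has already seen for that lock, and remembers it.
func (w *Worker) accept(s Sequencer) bool {
	if s.Generation < w.latest[s.LockName] {
		return false // order signed by a stale master
	}
	w.latest[s.LockName] = s.Generation
	return true
}

func main() {
	w := &Worker{latest: map[string]uint64{}}
	fmt.Println(w.accept(Sequencer{"/ls/empire/deathstar/master", true, 7})) // true
	fmt.Println(w.accept(Sequencer{"/ls/empire/deathstar/master", true, 6})) // false: stale
}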
19. Design – Caching
• Clients cache file data and node metadata in a consistent, write-through, in-memory cache
– The cache is maintained by a lease mechanism (a client that allows its cache lease to expire then informs the lock server)
– The cache is kept consistent by invalidations sent by the master, which keeps a list of what each client may be caching
• When file data or metadata is to be changed (see the sketch after this slide)
– The master blocks the modification while sending invalidations for the data to every client that may have cached it
– A client that receives an invalidation flushes the invalidated state and acknowledges by making its next KeepAlive call
– The modification proceeds only after the server knows that each client has invalidated its cache, either by acknowledging the invalidation, or because the client allowed its cache lease to expire
[Diagram: three clients and a Chubby cell whose PAXOS leader is the master]
– A writer: "I want to change the file content of /ls/empire/tux"; two other clients hold /ls/empire/tux in their caches
– The writer requests a change to the content of file "X"; the master sends cache invalidations to the clients that cache "X"
– The clients flush their caches for file "X" and acknowledge the master
– The master writes the change to the Chubby replicated database and acknowledges the writer that the change is done
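The invalidation protocol above, reduced to its shape, might look like this toy in-process sketch (in real Chubby the acknowledgement rides on the client's next KeepAlive, and an expired cache lease also counts as an acknowledgement; all names here are mine):

package main

import "fmt"

type client struct {
	id     int
	cached map[string]bool
}

func (c *client) invalidate(node string) bool {
	delete(c.cached, node) // flush the invalidated state
	return true            // acknowledge (in reality: on the next KeepAlive)
}

type master struct {
	caching map[string][]*client // which clients may be caching which node
}

func (m *master) write(node string, apply func()) {
	// The modification is blocked while invalidations are outstanding.
	for _, c := range m.caching[node] {
		if !c.invalidate(node) {
			// a real master would wait here until the client acks or its cache lease expires
		}
	}
	m.caching[node] = nil
	apply() // only now does the modification proceed
}

func main() {
	c1 := &client{id: 1, cached: map[string]bool{"/ls/empire/tux": true}}
	c2 := &client{id: 2, cached: map[string]bool{"/ls/empire/tux": true}}
	m := &master{caching: map[string][]*client{"/ls/empire/tux": {c1, c2}}}
	m.write("/ls/empire/tux", func() { fmt.Println("content changed") })
}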
20. Design – Sessions and KeepAlives
Handles, locks, and cached data all remain valid while the session remains valid
• A client requests a new session on first contacting the master of a Chubby cell
• A client ends the session explicitly either when it terminates, or if the session has been idle
– Session idle = no open handles and no calls for a minute
• Each session has an associated lease interval
– Lease interval = the master guarantees not to terminate the session unilaterally until the lease timeout
– Lease timeout = the end of the lease interval
[Diagram: lease timeline showing the jeopardy and grace period]
21. Design – Sessions and KeepAlives
• The master advances the lease timeout
– On session creation
– When a master fail-over occurs
– When it responds to a KeepAlive RPC from the client:
• On receiving a KeepAlive (1), the master blocks the RPC until the client's previous lease interval is close to expiring
• The master later allows the RPC to return to the client (2) and informs it of the new lease timeout (= lease M2)
• The master uses a default extension of 12 seconds; an overloaded master may use higher values
• The client initiates a new KeepAlive immediately after receiving the previous reply (see the sketch after this slide)
Protocol optimization: the KeepAlive reply is used to transmit events and cache invalidations back to the client
[Diagram: lease timeline showing the blocked KeepAlive RPC, jeopardy and grace period]
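The client side of this exchange is essentially a loop that re-issues the KeepAlive as soon as the previous one returns. A sketch, with a stub standing in for the blocking RPC:

package main

import (
	"fmt"
	"time"
)

type keepAliveReply struct {
	leaseTimeout time.Time
	events       []string // piggybacked events and cache invalidations
}

// keepAlive stands in for the blocking RPC: the master holds it until the
// previous lease is close to expiring, then answers with an extension.
func keepAlive() keepAliveReply {
	time.Sleep(10 * time.Millisecond) // pretend the master blocked the RPC
	return keepAliveReply{leaseTimeout: time.Now().Add(12 * time.Second)}
}

func main() {
	for i := 0; i < 3; i++ {
		reply := keepAlive() // blocks at the master until near lease expiry
		fmt.Println("new lease timeout:", reply.leaseTimeout.Format(time.RFC3339))
		for _, e := range reply.events {
			fmt.Println("piggybacked:", e)
		}
		// immediately initiate the next KeepAlive (no sleep here)
	}
}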
22. Design – Fail-overs
• When a master fails or otherwise loses mastership
– It discards its in-memory state about sessions, handles, and locks
– The session lease timer is stopped
– If a master election occurs quickly, clients can contact the new master before their local lease timers expire
– If the election takes a long time, clients flush their caches (= JEOPARDY) and wait for the GRACE PERIOD (45 seconds) while trying to find the new master; see the state sketch after this slide
[Diagram: lease timeline showing the jeopardy and grace period]
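The jeopardy/grace-period behavior can be summarized as a small state function of the clock against the known lease timeout; the 45-second constant comes from the slide, everything else is illustrative:

package main

import (
	"fmt"
	"time"
)

const gracePeriod = 45 * time.Second

type sessionState int

const (
	valid    sessionState = iota // lease still good, or the "safe" event arrived
	jeopardy                     // local lease expired: cache disabled, application warned
	expired                      // grace period over without finding a new master
)

// nextState decides the client-side session state from the clock, the last
// known lease timeout, and whether a new master has been reached.
func nextState(now, leaseTimeout time.Time, masterFound bool) sessionState {
	if now.Before(leaseTimeout) || masterFound {
		return valid
	}
	if now.Before(leaseTimeout.Add(gracePeriod)) {
		return jeopardy
	}
	return expired
}

func main() {
	lease := time.Now()
	fmt.Println(nextState(lease.Add(-time.Second), lease, false))   // 0 = valid
	fmt.Println(nextState(lease.Add(10*time.Second), lease, false)) // 1 = jeopardy
	fmt.Println(nextState(lease.Add(50*time.Second), lease, false)) // 2 = expired
}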
23. Design – Fail-overs: Newly elected master's tasks (initial design)
1. Picks a new client epoch number (clients are required to present it on every call)
2. Responds to master-location requests, but does not at first process incoming session-related operations
3. Builds in-memory data structures for the sessions and locks recorded in the database. Session leases are extended to the maximum that the previous master may have been using
4. Lets clients perform KeepAlives, but no other session-related operations
5. Emits a fail-over event to each session: clients flush their caches and warn applications that other events may have been lost
6. Waits until each session expires or acknowledges the fail-over event
7. Now allows all operations to proceed
8. If a client uses a handle created prior to the fail-over, the master recreates the in-memory representation of the handle and then honors the call
9. After some interval (1 minute), the master deletes ephemeral files that have no open file handles: clients should refresh handles on ephemeral files during this interval after a fail-over
24. Design – Database implementation
• A simple key/value database using write-ahead logging and snapshotting
• Chubby needs atomic operations only (no general transactions)
• The database log is distributed among the replicas using a distributed consensus protocol (PAXOS)
Backup
• Every few hours, the master writes a snapshot of its database to a GFS file server in a different building = no cyclic dependencies (because the local GFS uses the local Chubby cell …)
• Backups are used for disaster recovery
• Backups are used for initializing the database of a newly replaced replica = no load on the other replicas
25. Design – Mirroring
• Chubby allows a collection of files to be mirrored from one cell to another
– Mirroring is fast: files are small, and the event mechanism informs the mirroring code immediately if a file is added, deleted, or modified
– An unreachable mirror remains unchanged until connectivity is restored: updated files are then identified by comparing checksums
– Used to copy configuration files to various computing clusters distributed around the world
• A special "global" Chubby cell
– The subtree "/ls/global/master" is mirrored to the subtree "/ls/cell/slave" in every other Chubby cell
– The "global" cell's replicas are located in widely-separated parts of the world
– Usage:
• Chubby's own ACLs
• Various files in which Chubby cells and other systems advertise their presence to monitoring services
• Pointers allowing clients to locate large data sets (Bigtable cells), and many configuration files
26. Agenda
• Introduction
• Design
• Mechanisms for scaling
• Experience
• Summary
"Judge me by my size, do you?" – Yoda
27. Mechanisms for scaling
More than 90,000 clients communicating with a single Chubby master!
• Chubby's clients are individual processes → Chubby handles more clients than expected
• Effective scaling techniques = reduce communication with the master
– Minimize RTT: an arbitrary number of Chubby cells, so clients almost always use a nearby cell (found with DNS) to avoid reliance on remote machines (= 1 Chubby cell in a datacenter for several thousand machines)
– Minimize KeepAlive load: increase lease times from the default of 12 s up to around 60 s under heavy load (= fewer KeepAlive RPCs to process)
– Optimized caching: Chubby clients cache file data, metadata, the absence of files, and open handles
– Protocol-conversion servers: servers that translate the Chubby protocol into less-complex protocols (DNS, …)
[Diagram: Chubby clients sending READ & WRITE traffic to the master of a Chubby cell]
28. Mechanisms for scaling – Proxies
• Chubby's protocol can be proxied
– Same protocol on both sides
– A proxy can reduce server load by handling both KeepAlive and read requests
– A proxy cannot reduce write traffic
• Proxies allow a significant increase in the number of clients
– If an external proxy handles N clients, KeepAlive traffic is reduced by a factor of N (which could be 10,000 or more!)
– A proxy cache can reduce read traffic by at most the mean amount of read-sharing
[Diagram: Chubby clients sending READ traffic through replicas acting as internal proxies, and WRITE traffic to the master of the Chubby cell]
29. Mechanisms for scaling – Partitioning
(Intended to enable large Chubby cells with little communication between the partitions)
• The code can partition the namespace by directory (see the sketch after this slide)
– Chubby cell = N partitions
– Each partition has a set of replicas and a master
– Every node D/C in directory D would be stored on the partition P(D/C) = hash(D) mod N
– Metadata for D may be stored on a different partition P(D) = hash(D') mod N, where D' is the parent of D
– A few operations still require cross-partition communication
• ACLs: one partition may use another for permission checks (only Open() and Delete() calls require ACL checks)
• When a directory is deleted: a cross-partition call may be needed to ensure that the directory is empty
• Unless the number of partitions N is large, each client would still contact the majority of the partitions
– Partitioning reduces read and write traffic on any given partition by a factor of N :-)
– Partitioning does not necessarily reduce KeepAlive traffic … :-(
• Partitioning was implemented in the code but not activated, because Google did not need it (2006)
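The partitioning function is simple enough to sketch directly: the partition of a node depends only on the hash of its directory, so a directory and its children land together, while the directory's own metadata lives on the partition of its parent. The hash choice and helper names below are illustrative:

package main

import (
	"fmt"
	"hash/fnv"
	"path"
)

func hashString(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// partitionOf returns the partition storing node D/C: it depends only on the
// node's directory D, i.e. P(D/C) = hash(D) mod N.
func partitionOf(node string, n uint64) uint64 {
	return hashString(path.Dir(node)) % n
}

func main() {
	const n = 8 // number of partitions in the cell
	node := "/ls/empire/deathstar/master"
	fmt.Println(partitionOf(node, n))           // partition holding the node itself
	fmt.Println(partitionOf(path.Dir(node), n)) // partition holding its directory's metadata (hash of the parent)
}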
30. Agenda
• Introduction
• Design
• Mechanisms for scaling
• Experience
• Summary
"Do. Or do not. There is no try." – Yoda
31. Experience (2006) – Use and behavior
Statistics taken as a snapshot of a Chubby cell (RPC rate over a 10-minute period)
32. Experience (2006) – Use and behavior
Typical causes of outages
• 61 outages over a period of a few weeks, amounting to 700 cell-days of data in total
– 52 outages lasted under 30 seconds → most applications are not affected significantly by Chubby outages under 30 seconds
– 4 caused by network maintenance
– 2 caused by suspected network connectivity problems
– 2 caused by software errors
– 1 caused by overload
A few dozen cell-years of operation
• Data was lost on 6 occasions
– 4 database errors
– 2 operator errors
Overload
• Typically occurs when more than 90,000 sessions are active, or when millions of reads arrive simultaneously
33. Experience (2006) – Java clients
• Chubby is written in C++, like most of Google's infrastructure
• Problem: a growing number of systems are being written in Java
– Java programmers dislike the Java Native Interface (slow and cumbersome) for accessing non-native libraries
– Chubby's C++ client library is 7,000 lines: maintaining a Java version would be delicate and too expensive
• Solution: Java users run copies of a protocol-conversion server that exports a simple RPC protocol corresponding closely to Chubby's client API
• Mike Burrows (2006): "Even with hindsight, it is not obvious how we might have avoided the cost of writing, running and maintaining this additional server"
[Diagram: a Java application in a JVM talking to a local Chubby protocol-conversion daemon (127.0.0.1:42), which talks to the master of the Chubby cell]
34. Experience (2006) – Use as a name service
• Chubby was designed as a lock service, but its most popular use was as a name server
• DNS caching is based on time
– Caching is inconsistent even with a small TTL: DNS data is discarded when not refreshed within the TTL period
– A low TTL overloads the DNS servers
• Chubby caching uses explicit invalidations
– Consistent caching
– No polling
• Chubby DNS server
– Another protocol-conversion server, which makes the naming data stored within Chubby available to DNS clients: it eases the transition from DNS names to Chubby names, and accommodates existing applications that cannot be converted easily, such as browsers
[Diagram: DNS requests reach a Chubby protocol-conversion server, which issues Chubby requests to the master of the Chubby cell; labels: NEW server, OLD server]
35. Experience (2006) – Problems with fail-over
• The original design requires the master to write new sessions to the database as they are created
– This adds overhead on the Berkeley DB version of the lock server!
• The new design avoids recording sessions in the database
– Sessions are recreated in the same way the master currently recreates Handles → newly elected master's task n°8
– A new master must now wait a full worst-case lease-timeout before allowing operations to proceed, because it cannot know whether all sessions have checked in → newly elected master's task n°6
– Proxy fail-over is made possible because proxy servers can now manage sessions that the master is not aware of
• An extra operation, available only on trusted proxy servers, lets one proxy take over a client from another when a proxy fails
[Diagram: example of external proxy server fail-over – two Chubby external proxy servers in front of the master of the Chubby cell]
36. Design – Problems with fail-over: Newly elected master's tasks (new design)
1. Picks a new client epoch number (clients are required to present it on every call)
2. Responds to master-location requests, but does not at first process incoming session-related operations
3. Builds in-memory data structures for the sessions and locks recorded in the database. Session leases are extended to the maximum that the previous master may have been using
4. Lets clients perform KeepAlives, but no other session-related operations
5. Emits a fail-over event to each session: clients flush their caches and warn applications that other events may have been lost
6. Waits until each session expires or acknowledges the fail-over event
7. Now allows all operations to proceed
8. If a client uses a handle created prior to the fail-over, the master recreates the in-memory representation of the session and the handle and then honors the call
9. After some interval (1 minute), the master deletes ephemeral files that have no open file handles: clients should refresh handles on ephemeral files during this interval after a fail-over
37. Experience (2006) – Abusive clients
• Many services use shared Chubby cells: clients need to be isolated from the misbehavior of others
Problems encountered:
1. Lack of aggressive caching
– Developers regularly write loops that retry indefinitely when a file is not present, or poll a file by opening it and closing it repeatedly
– Need to cache the absence of a file and to reuse open file handles
– This requires spending more time on DEVOPS education, but in the end it was easier to make repeated Open() calls cheap …
2. Lack of quotas
– Chubby was never intended to be used as a storage system for large amounts of data!
– A file size limit was introduced: 256 KB
3. Publish/subscribe
– Chubby's design is not made for using its event mechanism as a publish/subscribe system!
– Project reviews of Chubby usage and growth predictions (RPC rate, disk space, number of files) → a need to track the bad usages …
38. Experience (2006) – Lessons learned
• Developers rarely consider availability
– They are inclined to treat a service like Chubby as though it were always available
– They fail to appreciate the difference between a service being up and that service being available to their applications
– API choices can affect the way developers choose to handle Chubby outages: many DEVs chose to crash their apps when a master fail-over takes place, but the original intent was for clients to check for possible changes …
• 3 mechanisms to prevent DEVs from being over-optimistic about Chubby availability
1. Project reviews
2. Supplying libraries that perform high-level tasks, so that DEVs are automatically isolated from Chubby outages
3. A post-mortem of each Chubby outage: this eliminates bugs in Chubby and in operational procedures, and reduces applications' sensitivity to Chubby's availability
39. Experience (2006) – Opportunities for design changes
• Fine-grained locking could be ignored
– DEVs must remove unnecessary communication to optimize their apps anyway → which means finding a way to use coarse-grained locking
• Poor API choices have unexpected effects
– One mistake: the means for cancelling long-running calls are the Close() and Poison() RPCs, which also discard the server state for the handle … → a Cancel() RPC may be added to allow more sharing of open handles
• RPC use affects transport protocols
– KeepAlives are used both for refreshing the client's lease and for passing events and cache invalidations from the master to the client. TCP's back-off policies pay no attention to higher-level timeouts such as Chubby leases, so TCP-based KeepAlives led to many lost sessions at times of high network congestion → Chubby was forced to send KeepAlive RPCs via UDP rather than TCP …
– The protocol may be augmented with an additional TCP-based GetEvent() RPC, used to communicate events and invalidations in the normal case, in the same way as KeepAlives. The KeepAlive reply would still contain a list of unacknowledged events, so that events must eventually be acknowledged
40. Agenda
• Introduction
• Design
• Mechanisms for scaling
• Experience
• Summary
"May the lock service be with you."
41. Summary – Chubby lock service
• Chubby is a distributed lock service for coarse-grained synchronization of distributed systems
– Distributed consensus among a few replicas for fault-tolerance
– Consistent client-side caching
– Timely notification of updates
– A familiar file system interface
• It has become the primary internal name service at Google
– A common rendez-vous mechanism for systems such as MapReduce
– Used to elect a primary from redundant replicas (GFS and Bigtable)
– A standard repository for files that require high availability (ACLs)
– A well-known and available location to store a small amount of meta-data (= the root of the distributed data structures)
• Bigtable usage
– To elect a master
– To allow the master to discover the servers it controls
– To permit clients to find the master