Alluxio, Inc.

274 SlideShares 124 Followers 1 Following

We are the creators and top committers of Alluxio, formerly Tachyon, and we are revolutionizing the way you store, access, and manage data. Alluxio is a memory-speed virtual distributed file system. Open source software is critical to the modern enterprise software landscape. Alluxio is open source under the Apache license, and we are committed to maintaining this model. Alluxio is one of the fastest-growing open source projects in the data ecosystem. With five years of open source history, Alluxio has attracted more than 1,000 contributors from over 300 companies. For more info: https://www.alluxio.org/docs/master/en/Overview.html

Activity

Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x performance improvement to Qunar's streaming processing

8 years ago • 2081 Views

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon): An Open Source Memory Speed Virtual Distributed Storage - Gene Pang, Software Engineer, Alluxio

Data Con LA • 8 years ago

Presentations

Alluxio Presentation at AMPLab Summer Retreat 2016

Open Source Memory Speed Virtual Distributed Storage

Accessing Data Anywhere with Unified Namespace

Getting Started with Alluxio + Spark + S3

Alluxio Keynote at Strata+Hadoop World Beijing 2016

Alluxio Use Cases at Strata+Hadoop World Beijing 2016

Rise of Intermediate APIs - Beam and Alluxio at Alluxio Meetup 2016

Accelerating Machine Learning Pipelines with Alluxio at Alluxio Meetup 2016

Alluxio: The missing piece of on-demand clusters at Alluxio Meetup 2016

Alluxio (formerly Tachyon): The Journey thus far and the Road Ahead

Alluxio (formerly Tachyon): Open Source Memory Speed Virtual Distributed Storage

The Missing Piece of On-Demand Clusters

Spark Summit EU talk by Jiri Simsa

Alluxio: Unify Data at Memory Speed; 2016-11-18

ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spark with Alluxio at Spark Summit Boston 2017

Alluxio: Unify Data at Memory Speed at Strata and Hadoop World San Jose 2017

Effective Spark with Alluxio at Strata+Hadoop World San Jose 2017

Enable Fast Big Data Analytics on Ceph with Alluxio at Ceph Days 2017

Unify Data at Memory Speed by Haoyuan Li - VAULT Conference 2017

Alluxio (Formerly Tachyon): Unify Data At Memory Speed at Global Big Data Conference San Jose 2017

Introduction to Alluxio (formerly Tachyon) and how it brings up to 300x performance improvement to Qunar's streaming processing

Best Practices for Using Alluxio with Spark

Alluxio Mesos Meetup - SMACK to SMAACK

Best Practices for Using Alluxio with Spark

Accelerating Spark Workloads in an Apache Mesos Environment with Alluxio

Accelerating Spark Workloads in a Mesos Environment with Alluxio

Using Alluxio in Tencent's News and Personalized Push Services

Kyligence Leverages Alluxio to Accelerate OLAP in the Cloud

Using Alluxio to Accelerate Compute Frameworks in JD

Alluxio's Use and Practice in Didi

Likes

Alluxio - Virtual Unified File System

Alluxio in MOMO

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon): An Open Source Memory Speed Virtual Distributed Storage - Gene Pang, Software Engineer, Alluxio