Cloudera - The Modern Platform for Analytics

Mar 16, 2018Download as PPTX, PDF4 likes2,548 views

This presentation provides an overview of Cloudera and how a modern platform for Machine Learning and Analytics better enables a data-driven enterprise.

1© Cloudera, Inc. All rights reserved.
The Modern Platform for Analytics

2© Cloudera, Inc. All rights reserved.
We believe
data can make what is impossible
today, possible tomorrow

3© Cloudera, Inc. All rights reserved.
We empower
people to transform complex data
into clear and actionable insights
DRIVE
CUSTOMER INSIGHTS
CONNECT
PRODUCTS & SERVICES (IoT)
PROTECT
BUSINESS

4© Cloudera, Inc. All rights reserved.
We deliver
the modern platform for
machine learning and advanced analytics
RUNS ANYWHERE
Cloud
Multi-cloud
On-prem
SCALABLE
Elastic
Cost-effective
Lower TCO
ENTERPRISE GRADE
Secure
Performant
Compliant

6© Cloudera, Inc. All rights reserved.
The data-driven enterprise
IoT explosion of new data
30B
connected
devices
440x
more data
Enterprises re-architect to
modernize IT infrastructure
open source
cloud
machine
learning

7© Cloudera, Inc. All rights reserved.
A complete, integrated enterprise platform
OPERATIONS
DATA
MANAGEMENT
STRUCTURED UNSTRUCTURED
PROCESS, ANALYZE, SERVE
UNIFIED SERVICES
RESOURCE MANAGEMENT SECURITY
NoSQL
STORE
INTEGRATE
BATCH STREAM SQL SEARCH OTHER
OTHERFILESYSTEM RELATIONAL

8© Cloudera, Inc. All rights reserved.
Enterprise grade customer support
8
1st
and only Hadoop
vendor to have
support certified
100+
service factors audited
12
identified criteria
areas
1yr
renewal cycle for
certification

9© Cloudera, Inc. All rights reserved.
Administration Made Easy
Cloudera Manager
Focus on the solution, not the
cluster, with the only complete
administration tool for Apache
Hadoop.
• Unified configuration,
management and monitoring
across all services
• Online installation and upgrades
• Direct connection to Cloudera
Support
• 3rd Party Extensibility

10© Cloudera, Inc. All rights reserved.
Cloudera Navigator
Trusted for Production
Mature: 3+ years in production
Built into Cloudera Enterprise’s core
Used in production across 100s of
customers and industries
Compliance-Ready
Only distribution to pass PCI audit
Interoperable
Integrates with leading third-party
tools
The only integrated data management and governance platform for Hadoop

11© Cloudera, Inc. All rights reserved.
• Easier SQL workload migration with Migration
Page
○ Gauge migration effort at a glance and
highlight query patterns requiring more
work to migrate
○ Provide migration artifacts as kick-off
point for migration projects
■ Rewrite queries to fix common
syntax errors
■ DDL and Sqoop scripts to migrate
data
Cloudera Navigator Optimizer

12© Cloudera, Inc. All rights reserved.
Cloudera Director for cluster lifecycle management
Easy
• Single pane of glass for all cloud
infrastructure
• Create templates to run applications in a pre-
optimized manner
Flexible
• Multi-cloud: AWS, Azure, GCP
• Hourly pricing with auto billing & metering
• Spot instance/block support
Enterprise-grade
• Integration across Cloudera Enterprise
• Management of CDH deployments at scale
• Deeply integrated with Cloudera Manager

13© Cloudera, Inc. All rights reserved.
Elastic
Portable
Enterprise-grade

15© Cloudera, Inc. All rights reserved.
Cloudera Data Science Workbench

16© Cloudera, Inc. All rights reserved.
Cloudera Enterprise
16
The modern platform for machine learning and analytics, optimized for the
cloud
EXTENSIBLE
SERVICES
CORE
SERVICES DATA
ENGINEERIN
G
OPERATIONA
L DATABASE
ANALYTIC
DATABASE
DATA CATALOG
INGEST &
REPLICATION
SECURITY GOVERNANCE
WORKLOAD
MANAGEMENT
DATA
SCIENCE
S
3
ADL
S
HDF
S
KUD
U
STORAGE
SERVICES

17© Cloudera, Inc. All rights reserved.
Customer Successes for EDH & SDX
Couldn’t solve predictive maintenance goals
EDH delivers:
● Ingest telematics in real-time
● Machine learning to predict failures
● Analytics to minimize service downtime
● Protect sensitive and regulated data
● Consistent security and governance
● “SDX is the key to making that happen” - CIO
Drug R&D too slow and expensive
EDH delivers:
● Self-service analytics
● Meet HIPAA regulations
● >5 petabytes from 2100 silos
● Using Spark, Impala, & Search side-by-side
● With Anaconda, AtScale, Cloudwick, Kinetica,
StreamSets, Tamr, Trifacta, & Zoomdata

18© Cloudera, Inc. All rights reserved. 18© Cloudera, Inc. All rights reserved.
SI and
reseller
Platform
and cloud
Data
systems
Software
and OEM
2500+ partner network

19© Cloudera, Inc. All rights reserved.
Thank you

Discuss the impact and opportunity of using Generative AI to support your development and creative teams * Explore business challenges in content creation * Cost-per-unit of different types of content * Use AI to reduce cost-per-unit * New partnerships being formed that will have a material impact on the way we search and engage with content Part 4 of a 9 Part Research Series named "What matters in AI" published on www.andremuscat.com

Intelligent Agent PPT ON SLIDESHARE IN ARTIFICIAL INTELLIGENCEKhushboo Pal

n artificial intelligence, an intelligent agent (IA) is an autonomous entity which acts, directing its activity towards achieving goals (i.e. it is an agent), upon an environment using observation through sensors and consequent actuators (i.e. it is intelligent).An intelligent agent is a program that can make decisions or perform a service based on its environment, user input and experiences. These programs can be used to autonomously gather information on a regular, programmed schedule or when prompted by the user in real time. Intelligent agents may also be referred to as a bot, which is short for robot.Examples of intelligent agents AI assistants, like Alexa and Siri, are examples of intelligent agents as they use sensors to perceive a request made by the user and the automatically collect data from the internet without the user's help. They can be used to gather information about its perceived environment such as weather and time. Infogate is another example of an intelligent agent, which alerts users about news based on specified topics of interest. Autonomous vehicles could also be considered intelligent agents as they use sensors, GPS and cameras to make reactive decisions based on the environment to maneuver through traffic. Examples of intelligent agents AI assistants, like Alexa and Siri, are examples of intelligent agents as they use sensors to perceive a request made by the user and the automatically collect data from the internet without the user's help. They can be used to gather information about its perceived environment such as weather and time. Infogate is another example of an intelligent agent, which alerts users about news based on specified topics of interest. Autonomous vehicles could also be considered intelligent agents as they use sensors, GPS and cameras to make reactive decisions based on the environment to maneuver through traffic.

Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra

Explore Microsoft Power Platform Center of ExcellenceNanddeep Nachan

The document discusses the Microsoft Power Platform Center of Excellence (CoE) Starter Kit. It provides an overview of the CoE Starter Kit and its modules, including core components, governance components, nurture components, and theming components. It describes how to set up the CoE Starter Kit and its modules as well as some limitations. References for more information on the CoE, CoE Starter Kit, and core components are also provided.

The Basics of MongoDBvaluebound

In this presentation, Raghavendra BM of Valuebound has discussed the basics of MongoDB - an open-source document database and leading NoSQL database. ---------------------------------------------------------- Get Socialistic Our website: http://valuebound.com/ LinkedIn: http://bit.ly/2eKgdux Facebook: https://www.facebook.com/valuebound/ Twitter: http://bit.ly/2gFPTi8

Lecture 9 Markov decision processVARUN KUMAR

NumPyAbhijeetAnand88

This document contains a presentation by Abhijeet Anand on NumPy. It introduces NumPy as a Python library for working with arrays, which aims to provide array objects that are faster than traditional Python lists. NumPy arrays benefit from being stored continuously in memory, unlike lists. The presentation covers 1D, 2D and 3D arrays in NumPy and basic array properties and operations like shape, size, dtype, copying, sorting, addition, subtraction and more.

Apache Iceberg: An Architectural Look Under the CoversScyllaDB

Data Lakes have been built with a desire to democratize data - to allow more and more people, tools, and applications to make use of data. A key capability needed to achieve it is hiding the complexity of underlying data structures and physical data storage from users. The de-facto standard has been the Hive table format addresses some of these problems but falls short at data, user, and application scale. So what is the answer? Apache Iceberg. Apache Iceberg table format is now in use and contributed to by many leading tech companies like Netflix, Apple, Airbnb, LinkedIn, Dremio, Expedia, and AWS. Watch Alex Merced, Developer Advocate at Dremio, as he describes the open architecture and performance-oriented capabilities of Apache Iceberg. You will learn: • The issues that arise when using the Hive table format at scale, and why we need a new table format • How a straightforward, elegant change in table format structure has enormous positive effects • The underlying architecture of an Apache Iceberg table, how a query against an Iceberg table works, and how the table’s underlying structure changes as CRUD operations are done on it • The resulting benefits of this architectural design

Stl meetup cloudera platform - january 2020Adam Doyle

Snowflake OverviewSnowflake Computing

Data Lake OverviewJames Serra

The data lake has become extremely popular, but there is still confusion on how it should be used. In this presentation I will cover common big data architectures that use the data lake, the characteristics and benefits of a data lake, and how it works in conjunction with a relational data warehouse. Then I’ll go into details on using Azure Data Lake Store Gen2 as your data lake, and various typical use cases of the data lake. As a bonus I’ll talk about how to organize a data lake and discuss the various products that can be used in a modern data warehouse.

Introducing Databricks DeltaDatabricks

The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...Databricks

Many had dubbed 2020 as the decade of data. This is indeed an era of data zeitgeist. From code-centric software development 1.0, we are entering software development 2.0, a data-centric and data-driven approach, where data plays a central theme in our everyday lives. As the volume and variety of data garnered from myriad data sources continue to grow at an astronomical scale and as cloud computing offers cheap computing and data storage resources at scale, the data platforms have to match in their abilities to process, analyze, and visualize at scale and speed and with ease — this involves data paradigm shifts in processing and storing and in providing programming frameworks to developers to access and work with these data platforms. In this talk, we will survey some emerging technologies that address the challenges of data at scale, how these tools help data scientists and machine learning developers with their data tasks, why they scale, and how they facilitate the future data scientists to start quickly. In particular, we will examine in detail two open-source tools MLflow (for machine learning life cycle development) and Delta Lake (for reliable storage for structured and unstructured data). Other emerging tools such as Koalas help data scientists to do exploratory data analysis at scale in a language and framework they are familiar with as well as emerging data + AI trends in 2021. You will understand the challenges of machine learning model development at scale, why you need reliable and scalable storage, and what other open source tools are at your disposal to do data science and machine learning at scale.

Apache Atlas: Governance for your DataDataWorks Summit/Hadoop Summit

As organizations pursue Big Data initiatives to capture new opportunities for data-driven insights, data governance has become table stakes both from the perspective of external regulatory compliance as well as business value extraction internally within an enterprise. This session will introduce Apache Atlas, a project that was incubated by Hortonworks along with a group of industry leaders across several verticals including financial services, healthcare, pharma, oil and gas, retail and insurance to help address data governance and metadata needs with an open extensible platform governed under the aegis of Apache Software Foundation. Apache Atlas empowers organizations to harvest metadata across the data ecosystem, govern and curate data lakes by applying consistent data classification with a centralized metadata catalog. In this talk, we will present the underpinnings of the architecture of Apache Atlas and conclude with a tour of governance capabilities within Apache Atlas as we showcase various features for open metadata modeling, data classification, visualizing cross-component lineage and impact. We will also demo how Apache Atlas delivers a complete view of data movement across several analytic engines such as Apache Hive, Apache Storm, Apache Kafka and capabilities to effectively classify, discover datasets.

Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit

Apache Atlas provides centralized metadata services and cross-component dataset lineage tracking for Hadoop components. It aims to enable transparent, reproducible, auditable and consistent data governance across structured, unstructured, and traditional database systems. The near term roadmap includes dynamic access policy driven by metadata and enhanced Hive integration. Apache Atlas also pursues metadata exchange with non-Hadoop systems and third party vendors through REST APIs and custom reporters.

Streaming Real-time Data to Azure Data Lake Storage Gen 2Carole Gunst

Data Sharing with SnowflakeSnowflake Computing

Every day, businesses across a wide variety of industries share data to support insights that drive efficiency and new business opportunities. However, existing methods for sharing data involve great effort on the part of data providers to share data, and involve great effort on the part of data customers to make use of that data. However, existing approaches to data sharing (such as e-mail, FTP, EDI, and APIs) have significant overhead and friction. For one, legacy approaches such as e-mail and FTP were never intended to support the big data volumes of today. Other data sharing methods also involve enormous effort. All of these methods require not only that the data be extracted, copied, transformed, and loaded, but also that related schemas and metadata must be transported as well. This creates a burden on data providers to deconstruct and stage data sets. This burden and effort is mirrored for the data recipient, who must reconstruct the data. As a result, companies are handicapped in their ability to fully realize the value in their data assets. Snowflake Data Sharing allows companies to grant instant access to ready-to-use data to any number of partners or data customers without any data movement, copying, or complex pipelines. Using Snowflake Data Sharing, companies can derive new insights and value from data much more quickly and with significantly less effort than current data sharing methods. As a result, companies now have a new approach and a powerful new tool to get the full value out of their data assets.

Introduction to Azure DatabricksJames Serra

Databricks is a Software-as-a-Service-like experience (or Spark-as-a-service) that is a tool for curating and processing massive amounts of data and developing, training and deploying models on that data, and managing the whole workflow process throughout the project. It is for those who are comfortable with Apache Spark as it is 100% based on Spark and is extensible with support for Scala, Java, R, and Python alongside Spark SQL, GraphX, Streaming and Machine Learning Library (Mllib). It has built-in integration with many data sources, has a workflow scheduler, allows for real-time workspace collaboration, and has performance improvements over traditional Apache Spark.

Building End-to-End Delta Pipelines on GCPDatabricks

Delta has been powering many production pipelines at scale in the Data and AI space since it has been introduced for the past few years. Built on open standards, Delta provides data reliability, enhances storage and query performance to support big data use cases (both batch and streaming), fast interactive queries for BI and enabling machine learning. Delta has matured over the past couple of years in both AWS and AZURE and has become the de-facto standard for organizations building their Data and AI pipelines. In today’s talk, we will explore building end-to-end pipelines on the Google Cloud Platform (GCP). Through presentation, code examples and notebooks, we will build the Delta Pipeline from ingest to consumption using our Delta Bronze-Silver-Gold architecture pattern and show examples of Consuming the delta files using the Big Query Connector.

Zero to Snowflake Presentation Brett VanderPlaats

This document outlines an agenda for a 90-minute workshop on Snowflake. The agenda includes introductions, an overview of Snowflake and data warehousing, demonstrations of how users utilize Snowflake, hands-on exercises loading sample data and running queries, and discussions of Snowflake architecture and capabilities. Real-world customer examples are also presented, such as a pharmacy building new applications on Snowflake and an education company using it to unify their data sources and achieve a 16x performance improvement.

Modernizing to a Cloud Data ArchitectureDatabricks

Organizations with on-premises Hadoop infrastructure are bogged down by system complexity, unscalable infrastructure, and the increasing burden on DevOps to manage legacy architectures. Costs and resource utilization continue to go up while innovation has flatlined. In this session, you will learn why, now more than ever, enterprises are looking for cloud alternatives to Hadoop and are migrating off of the architecture in large numbers. You will also learn how elastic compute models’ benefits help one customer scale their analytics and AI workloads and best practices from their experience on a successful migration of their data and workloads to the cloud.

Moving to Databricks & DeltaDatabricks

At wetter.com we build analytical B2B data products and heavily use Spark and AWS technologies for data processing and analytics. I explain why we moved from AWS EMR to Databricks and Delta and share our experiences from different angles like architecture, application logic and user experience. We will look how security, cluster configuration, resource consumption and workflow changed by using Databricks clusters as well as how using Delta tables simplified our application logic and data operations.

Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY

With the aid of any number of data management and processing tools, data flows through multiple on-prem and cloud storage locations before it’s delivered to business users. As a result, IT teams — including IT Ops, DataOps, and DevOps — are often overwhelmed by the complexity of creating a reliable data pipeline that includes the automation and observability they require. The answer to this widespread problem is a centralized data pipeline orchestration solution. Join Stonebranch’s Scott Davis, Global Vice President and Ravi Murugesan, Sr. Solution Engineer to learn how DataOps teams orchestrate their end-to-end data pipelines with a platform approach to managing automation. Key Learnings: - Discover how to orchestrate data pipelines across a hybrid IT environment (on-prem and cloud) - Find out how DataOps teams are empowered with event-based triggers for real-time data flow - See examples of reports, dashboards, and proactive alerts designed to help you reliably keep data flowing through your business — with the observability you require - Discover how to replace clunky legacy approaches to streaming data in a multi-cloud environment - See what’s possible with the Stonebranch Universal Automation Center (UAC)

Unified MLOps: Feature Stores & Model DeploymentDatabricks

If you’ve brought two or more ML models into production, you know the struggle that comes from managing multiple data sets, feature engineering pipelines, and models. This talk will propose a whole new approach to MLOps that allows you to successfully scale your models, without increasing latency, by merging a database, a feature store, and machine learning. Splice Machine is a hybrid (HTAP) database built upon HBase and Spark. The database powers a one of a kind single-engine feature store, as well as the deployment of ML models as tables inside the database. A simple JDBC connection means Splice Machine can be used with any model ops environment, such as Databricks. The HBase side allows us to serve features to deployed ML models, and generate ML predictions, in milliseconds. Our unique Spark engine allows us to generate complex training sets, as well as ML predictions on petabytes of data. In this talk, Monte will discuss how his experience running the AI lab at NASA, and as CEO of Red Pepper, Blue Martini Software and Rocket Fuel, led him to create Splice Machine. Jack will give a quick demonstration of how it all works.

Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Kai Wähner

Don’t underestimate the Hidden Technical Debt in Machine Learning Systems. Leverage Apache Kafka’s open ecosystem as a scalable and flexible Event Streaming Platform to build one pipeline for real-time and batch use cases. Use Streaming Machine Learning with Apache Kafka, Tiered Storage, and TensorFlow IO to simplify your big data architecture. Tiered Storage for Kafka provides: - one platform for all data processing - an event-based source of truth for materialized views - no need for a pipeline between Kafka and a Data Lake like Hadoop Benefits: - cost reduction - long-term backup - performance isolation (real-time and historical analysis in the same cluster) Use Cases for Reprocessing Historical Events: - New consumer application - Error-handling - Compliance / regulatory processing - Query and analyze existing events - Model training

[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1

Intro to Delta LakeDatabricks

Delta Lake brings reliability, performance, and security to data lakes. It provides ACID transactions, schema enforcement, and unified handling of batch and streaming data to make data lakes more reliable. Delta Lake also features lightning fast query performance through its optimized Delta Engine. It enables security and compliance at scale through access controls and versioning of data. Delta Lake further offers an open approach and avoids vendor lock-in by using open formats like Parquet that can integrate with various ecosystems.

The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...Databricks

A traditional data team has roles including data engineer, data scientist, and data analyst. However, many organizations are finding success by integrating a new role – the analytics engineer. The analytics engineer develops a code-based data infrastructure that can serve both analytics and data science teams. He or she develops re-usable data models using the software engineering practices of version control and unit testing, and provides the critical domain expertise that ensures that data products are relevant and insightful. In this talk we’ll talk about the role and skill set of the analytics engineer, and discuss how dbt, an open source programming environment, empowers anyone with a SQL skillset to fulfill this new role on the data team. We’ll demonstrate how to use dbt to build version-controlled data models on top of Delta Lake, test both the code and our assumptions about the underlying data, and orchestrate complete data pipelines on Apache Spark™.

A deep dive into running data analytic workloads in the cloudCloudera, Inc.

This document discusses running data analytic workloads in the cloud using Cloudera Altus. It introduces Altus, which provides a platform-as-a-service for analyzing and processing data at scale in public clouds. The document outlines Altus features like low cost per-hour pricing, end-user focus, and cloud-native deployment. It then describes hands-on examples using Altus Data Engineering for ETL and the Altus Analytic Database for exploration and analytics. Workload analytics capabilities are also introduced for troubleshooting and optimizing jobs.

Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.

More Related Content

What's hot (20)

Stl meetup cloudera platform - january 2020Adam Doyle

Snowflake OverviewSnowflake Computing

Data Lake OverviewJames Serra

Introducing Databricks DeltaDatabricks

The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...Databricks

Apache Atlas: Governance for your DataDataWorks Summit/Hadoop Summit

Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit

Streaming Real-time Data to Azure Data Lake Storage Gen 2Carole Gunst

Data Sharing with SnowflakeSnowflake Computing

Introduction to Azure DatabricksJames Serra

Building End-to-End Delta Pipelines on GCPDatabricks

Zero to Snowflake Presentation Brett VanderPlaats

Modernizing to a Cloud Data ArchitectureDatabricks

Moving to Databricks & DeltaDatabricks

Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY

Unified MLOps: Feature Stores & Model DeploymentDatabricks

Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Kai Wähner

[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1

Intro to Delta LakeDatabricks

The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...Databricks

Stl meetup cloudera platform - january 2020Adam Doyle

Snowflake OverviewSnowflake Computing

Data Lake OverviewJames Serra

Introducing Databricks DeltaDatabricks

The Future of Data Science and Machine Learning at Scale: A Look at MLflow, D...Databricks

Apache Atlas: Governance for your DataDataWorks Summit/Hadoop Summit

Apache Atlas: Tracking dataset lineage across Hadoop componentsDataWorks Summit/Hadoop Summit

Streaming Real-time Data to Azure Data Lake Storage Gen 2Carole Gunst

Data Sharing with SnowflakeSnowflake Computing

Introduction to Azure DatabricksJames Serra

Building End-to-End Delta Pipelines on GCPDatabricks

Zero to Snowflake Presentation Brett VanderPlaats

Modernizing to a Cloud Data ArchitectureDatabricks

Moving to Databricks & DeltaDatabricks

Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY

Unified MLOps: Feature Stores & Model DeploymentDatabricks

Apache Kafka, Tiered Storage and TensorFlow for Streaming Machine Learning wi...Kai Wähner

[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1

Intro to Delta LakeDatabricks

The Modern Data Team for the Modern Data Stack: dbt and the Role of the Analy...Databricks

Similar to Cloudera - The Modern Platform for Analytics (20)

A deep dive into running data analytic workloads in the cloudCloudera, Inc.

Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.

Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.

Cloudera SDXCloudera, Inc.

Cloudera Altus: Big Data in the Cloud Made EasyCloudera, Inc.

Cloudera Altus makes it easier for data engineers, ETL developers, and anyone who regularly works with raw data to process that data in the cloud efficiently and cost effectively. In this webinar we introduce our new platform-as-a-service offering and explore challenges associated with data processing in the cloud today, how Altus abstracts cluster overhead to deliver easy, efficient data processing, and unique features and benefits of Cloudera Altus.

Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.

Cloudera Altus: Big Data in der Cloud einfach gemachtCloudera, Inc.

Neueste Studien zeigen, dass Data Scientisten und Analysten bis zu 80% ihrer Zeit dafür nutzen, Daten zu reinigen und vorzubereiten. Eine ohnehin schon zeitaufwändige Aufgabe kann in der Cloud noch weiter erschwert werden, da das Cluster Management und Operations die Komplexität noch erhöhen. Nutzer wünschen sich daher, diese komplexen Workflows zu vereinheitlichen und zu vereinfachen. Um Big Data und Machine Learning Initiativen voranzutreiben, benötigen Unternehmen eine skalierbare und überall verfügbare Plattform. Diese muss Self-Service ermöglichen und Datensilos eliminieren.

The 6th Wave of Automation: Automation of Decisions | Cloudera Analytics & Ma...Cloudera, Inc.

How to Build Multi-disciplinary Analytics Applications on a Shared Data PlatformCloudera, Inc.

The document discusses building multi-disciplinary analytics applications on a shared data platform. It describes challenges with traditional fragmented approaches using multiple data silos and tools. A shared data platform with Cloudera SDX provides a common data experience across workloads through shared metadata, security, and governance services. This approach optimizes key design goals and provides business benefits like increased insights, agility, and decreased costs compared to siloed environments. An example application of predictive maintenance is given to improve fleet performance.

Cloud-Native Machine Learning: Emerging Trends and the Road AheadDataWorks Summit

Big data platforms are being asked to support an ever increasing range of workloads and compute environments, including large-scale machine learning and public and private clouds. In this talk, we will discuss some emerging capabilities around cloud-native machine learning and data engineering, including running machine learning and Spark workloads directly on Kubernetes, and share our vision of the road ahead for ML and AI in the cloud.

Webinar | From Zero to 1 Million with Google Cloud Platform and DataStaxDataStax

Google Cloud Platform delivers the industry’s leading cloud-based services to create anything from simple websites to complex applications. DataStax delivers Apache Cassandra™, the leading distributed database technology, to the enterprise. Together, DataStax Enterprise on Google Cloud Platform delivers the performance, agility, infinite elasticity and innovation organizations need to build high-performance, highly-available online applications. Join Allan Naim, Global Product Lead at Google Cloud Platform and Darshan Rawal, Sr. Director of Product Management at DataStax as they share their expertise on why DataStax and Google Cloud Platform deliver the industry’s most robust Infrastructure-as-a Service (IaaS) platform and how your organization find success with NoSQL and Cloud services. View to learn how to: - Handle more than 1 Million requests per second for data-intensive online applications with Apache Cassandra on Google Cloud Platform - Leverage the technology infrastructure and global network powering Google’s search engine with DataStax to deploy blazing-fast and always-on applications - Transform your business into a data-driven company, a change that is critical as future success and discoveries hinge on the ability to quickly take action on data

Multidisziplinäre Analyseanwendungen auf einer gemeinsamen Datenplattform ers...Cloudera, Inc.

Hybrid is the New NormalDataWorks Summit

It’s becoming clear that enterprises need more than one cloud. Hybrid enables enterprises to optimize how their business works – public cloud for elasticity and scale, multi-cloud for redundancy and choice, and on-premises for performance and privacy. Cloudera delivers a hybrid cloud solution that works where enterprises work, with the agility, security and governance enterprise IT needs, and the self-service analytics business people and enterprise data professionals demand. In this session, we will talk about how Cloudera helps deliver hybrid solutions for enterprises and will run a hands-on Cloudera PaaS demo to exhibit: - Altus Environment Setup - Configure Altus SDX - Spin-up transient clusters with Altus - Execute workload on Altus Data Engineering clusters - Run interactive queries on object store with Altus Data Warehouse - Job Analytics with Workload Experience Manager (WXM) Speaker: Junaid Rao, Senior Cloud Sales Engineer, Cloudera

Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.

Comment développer une stratégie Big Data dans le cloud public avec l'offre P...Cloudera, Inc.

Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Stefan Lipp

Enterprise Hadoop in the Cloud. In Minutes. | How to Run Cloudera Enterprise ...Cloudera, Inc.

Turning Data into Business Value with a Modern Data PlatformCloudera, Inc.

The document discusses how data has become a strategic asset for businesses and how a modern data platform can help organizations drive customer insights, improve products and services, lower business risks, and modernize IT. It provides examples of companies using analytics to personalize customer solutions, detect sepsis early to save lives, and protect the global finance system. The document also outlines the evolution of Hadoop platforms and how Cloudera Enterprise provides a common workload pattern to store, process, and analyze data across different workloads and databases in a fast, easy, and secure manner.

Big data journey to the cloud 5.30.18 asher bartchCloudera, Inc.

Optimize your cloud strategy for machine learning and analyticsCloudera, Inc.

Join industry superstars Mike Olson (Cloudera CSO and co-founder) and Jim Curtis (451 Research senior analyst) as they outline the best practices for cloud-based machine learning and analytics in this “can’t miss” webinar. Hot topics include: Why enterprises are moving their analytics to the public cloud How to select the best cloud deployment model Design tricks that make cloud economics work Success stories, cautionary tales, and lessons learned James will share 451 Research findings and offer insights learned from surveying both the vendor landscape and enterprise practitioners. . Mike will regale you with his vision for the future of multi-disciplinary machine learning and analytics in hybrid- and multi-cloud environments 3 things to learn: Why enterprises are moving their analytics to the public cloud How to select the best cloud deployment model Design tricks that make cloud economics work