Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di... - Mark Rittman
The document discusses using Hadoop and NoSQL technologies like Apache HBase to perform social network analysis on Twitter data related to a company's website and blog. It describes ingesting tweet and website log data into Hadoop HDFS and processing it with tools like Hive. Graph algorithms from Oracle Big Data Spatial & Graph were then used on the property graph stored in HBase to identify influential Twitter users and communities. This approach provided real-time insights at scale compared to using a traditional relational database.
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the... - Mark Rittman
Hadoop and NoSQL platforms initially focused on Java developers and slow but massively-scalable MapReduce jobs as an alternative to high-end but limited-scale analytics RDBMS engines. Apache Hive opened-up Hadoop to non-programmers by adding a SQL query engine and relational-style metadata layered over raw HDFS storage, and since then open-source initiatives such as Hive Stinger, Cloudera Impala and Apache Drill along with proprietary solutions from closed-source vendors have extended SQL-on-Hadoop’s capabilities into areas such as low-latency ad-hoc queries, ACID-compliant transactions and schema-less data discovery – at massive scale and with compelling economics.
In this session we’ll focus on technical foundations around SQL-on-Hadoop, first reviewing the basic platform Apache Hive provides and then looking in more detail at how ad-hoc querying, ACID-compliant transactions and data discovery engines work along with more specialised underlying storage that each now work best with – and we’ll take a look to the future to see how SQL querying, data integration and analytics are likely to come together in the next five years to make Hadoop the default platform running mixed old-world/new-world analytics workloads.
Using Oracle Big Data Discovery as a Data Scientist's Toolkit - Mark Rittman
As delivered at Trivadis Tech Event 2016 - how Big Data Discovery along with Python and pySpark was used to build predictive analytics models against wearables and smart home data
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh... - Mark Rittman
As presented at OGh SQL Celebration Day in June 2016, NL. Covers new features in Big Data SQL including storage indexes, storage handlers and ability to install + license on commodity hardware
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics - Mark Rittman
This is a session for Oracle DBAs and devs that looks at cutting-edge big data technologies like Spark and Kafka, and through demos shows how Hadoop is now a real-time platform for fast analytics, data integration and predictive modeling.
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future? - Mark Rittman
There are many options for providing SQL access over data in a Hadoop cluster, including proprietary vendor products along with open-source technologies such as Apache Hive, Cloudera Impala and Apache Drill; customers are using those to provide reporting over their Hadoop and relational data platforms, and looking to add capabilities such as calculation engines, data integration and federation along with in-memory caching to create complete analytic platforms. In this session we’ll look at the options that are available, compare database vendor solutions with their open-source alternative, and see how emerging vendors are going beyond simple SQL-on-Hadoop products to offer complete “data fabric” solutions that bring together old-world and new-world technologies and allow seamless offloading of archive data and compute work to lower-cost Hadoop platforms.
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora... - Mark Rittman
This talk focuses on what a data reservoir is, how it relates to the RDBMS DW, and how Big Data Discovery provides access to it for business and BI users.
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics - Mark Rittman
Mark Rittman, founder of Rittman Mead, discusses Oracle's approach to hybrid BI deployments and how it aligns with Gartner's vision of a modern BI platform. He explains how Oracle BI 12c supports both traditional top-down modeling and bottom-up data discovery. It also enables deploying components on-premises or in the cloud for flexibility. Rittman believes the future is bi-modal, with IT enabling self-service analytics alongside centralized governance.
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Mark Rittman
Mark Rittman gave a presentation on the future of analytics on Oracle Big Data Appliance. He discussed how Hadoop has enabled highly scalable and affordable cluster computing using technologies like MapReduce, Hive, Impala, and Parquet. Rittman also talked about how these technologies have improved query performance and made Hadoop suitable for both batch and interactive/ad-hoc querying of large datasets.
Unlock the value in your big data reservoir using oracle big data discovery a... - Mark Rittman
The document discusses Oracle Big Data Discovery and how it can be used to analyze and gain insights from data stored in a Hadoop data reservoir. It provides an example scenario where Big Data Discovery is used to analyze website logs, tweets, and website posts and comments to understand popular content and influencers for a company. The data is ingested into the Big Data Discovery tool, which automatically enriches the data. Users can then explore the data, apply additional transformations, and visualize relationships to gain insights.
The Future of Analytics, Data Integration and BI on Big Data Platforms - Mark Rittman
The document discusses the future of analytics, data integration, and business intelligence (BI) on big data platforms like Hadoop. It covers how BI has evolved from old-school data warehousing to enterprise BI tools to utilizing big data platforms. New technologies like Impala, Kudu, and dataflow pipelines have made Hadoop fast and suitable for analytics. Machine learning can be used for automatic schema discovery. Emerging open-source BI tools and platforms, along with notebooks, bring new approaches to BI. Hadoop has become the default platform and future for analytics.
Mark Rittman presented on how a tweet about a smart kettle went viral. He analyzed the tweet data using Oracle Big Data Spatial and Graph on a Hadoop cluster. Over 3,000 tweets were captured from over 30 countries in 48 hours. Key influencers were identified using PageRank and by their large number of followers. Visualization tools like Cytoscape and Tom Sawyer Perspectives showed how the tweet spread over time and geography. The analysis revealed that the tweet went viral after being shared by the influential user @erinscafe on the first day.
New World Hadoop Architectures (& What Problems They Really Solve) for Oracle... - Rittman Analytics
Most DBAs are aware something interesting is going on with big data and the Hadoop product ecosystem that underpins it, but aren't so clear about what each component in the stack does, what problem each part solves, and why those problems couldn't be solved using the old approach. We'll look at where it's all going with the advent of Spark and machine learning, what's happening with ETL, metadata and analytics on this platform ... why IaaS and data-warehousing-as-a-service will have such a big impact, sooner than you think.
Build a simple data lake on AWS using a combination of services, including AWS Glue Data Catalog, AWS Glue Crawlers, AWS Glue Jobs, AWS Glue Studio, Amazon Athena, Amazon Relational Database Service (Amazon RDS), and Amazon S3.
Link to the blog post and video: https://garystafford.medium.com/building-a-simple-data-lake-on-aws-df21ca092e32
The document discusses the evolution of big data architectures from Hadoop and MapReduce to Lambda architecture and stream processing frameworks. It notes the limitations of early frameworks in terms of latency, scalability, and fault tolerance. Modern architectures aim to unify batch and stream processing for low latency queries over both historical and new data.
There is a fundamental shift underway in IT to include open, software defined, distributed systems like Hadoop. As a result, every Oracle professional should strive to learn these new technologies or risk being left behind. This session is designed specifically for Oracle database professionals so they can better understand SQL on Hadoop and the benefits it brings to the enterprise. Attendees will see how SQL on Hadoop compares to Oracle in areas such as data storage, data ingestion, and SQL processing. Various live demos will provide attendees with a first-hand look at these new world technologies. Presented at Collaborate 18.
Big Data 2.0: ETL & Analytics: Implementing a next generation platform - Caserta
In our most recent Big Data Warehousing Meetup, we learned about transitioning from Big Data 1.0 on Hadoop 1.x and its nascent technologies to Hadoop 2.x with YARN, which enables distributed ETL, SQL and analytics solutions. Caserta Concepts Chief Architect Elliott Cordo and an Actian engineer covered the complete data value chain of an enterprise-ready platform, including data connectivity, collection, preparation, optimization and analytics with end-user access.
Access additional slides from this meetup here:
http://www.slideshare.net/CasertaConcepts/big-data-warehousing-meetup-january-20
For more information on our services or upcoming events, please visit http://www.actian.com/ or http://www.casertaconcepts.com/.
Many organizations focus on the licensing cost of Hadoop when considering migrating to a cloud platform. But other costs should be considered, as well as the biggest impact, which is the benefit of having a modern analytics platform that can handle all of your use cases. This session will cover lessons learned in assisting hundreds of companies to migrate from Hadoop to Databricks.
Doug Bateman, a principal data engineering instructor at Databricks, presented on how to build a Lakehouse architecture. He began by introducing himself and his background. He then discussed the goals of describing key Lakehouse features, explaining how Delta Lake enables it, and developing a sample Lakehouse using Databricks. The key aspects of a Lakehouse are that it supports diverse data types and workloads while enabling using BI tools directly on source data. Delta Lake provides reliability, consistency, and performance through its ACID transactions, automatic file consolidation, and integration with Spark. Bateman concluded with a demo of creating a Lakehouse.
Hadoop is dead - long live Hadoop | BiDaTA 2013 Genoa - larsgeorge
Keynote during BiDaTA 2013 in Genoa, a special track of the ADBIS 2013 conference. URL: http://dbdmg.polito.it/bidata2013/index.php/keynote-presentation
RDX Insights Presentation - Microsoft Business Intelligence - Christopher Foot
May's RDX Insights Series Presentation focuses on Microsoft's BI products. We begin with an overview of Power BI, SSIS, SSAS and SSRS and how the products integrate with each other. The webinar continues with a detailed discussion on how to use Power BI to capture, model, transform, analyze and visualize key business metrics. We’ll finish with a Power BI demo highlighting some of its most beneficial and interesting features.
Lambda architecture for real time big data - Trieu Nguyen
- The document discusses the Lambda Architecture, a system designed by Nathan Marz for building real-time big data applications. It is based on three principles: human fault-tolerance, data immutability, and recomputation.
- The document provides two case studies of applying Lambda Architecture - at Greengar Studios for API monitoring and statistics, and at eClick for real-time data analytics on streaming user event data.
- Key lessons discussed are keeping solutions simple, asking the right questions to enable deep analytics and profit, using reactive and functional approaches, and turning data into useful insights.
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017 - AWS Chicago
"Strategies for supporting near real time analytics, OLAP, and interactive data exploration" - Dr. Jeremy Engle, Engineering Manager Data Team at Jellyvision
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W... - StampedeCon
This session will be a detailed recount of the design, implementation, and launch of the next-generation Shutterstock Data Platform, with strong emphasis on conveying clear, understandable learnings that can be transferred to your own organizations and projects. This platform was architected around the prevailing use of Kafka as a highly-scalable central data hub for shipping data across your organization in batch or streaming fashion. It also relies heavily on Avro as a serialization format and a global schema registry to provide structure that greatly improves quality and usability of our data sets, while also allowing the flexibility to evolve schemas and maintain backwards compatibility.
As a company, Shutterstock has always focused heavily on leveraging open source technologies in developing its products and infrastructure, and open source has been a driving force in big data more so than almost any other software sub-sector. With this plethora of constantly evolving data technologies, it can be a daunting task to select the right tool for your problem. We will discuss our approach for choosing specific existing technologies and when we made decisions to invest time in home-grown components and solutions.
We will cover advantages and the engineering process of developing language-agnostic APIs for publishing to and consuming from the data platform. These APIs can power some very interesting streaming analytics solutions that are easily accessible to teams across our engineering organization.
We will also discuss some of the massive advantages a global schema for your data provides for downstream ETL and data analytics. ETL into Hadoop and creation and maintenance of Hive databases and tables becomes much more reliable and easily automated with historically compatible schemas. To complement this schema-based approach, we will cover results of performance testing various file formats and compression schemes in Hadoop and Hive, the massive performance benefits you can gain in analytical workloads by leveraging highly optimized columnar file formats such as ORC and Parquet, and how you can use good old fashioned Hive as a tool for easily and efficiently converting existing datasets into these formats.
Finally, we will cover lessons learned in launching this platform across our organization, future improvements and further design, and the need for data engineers to understand and speak the languages of data scientists and web, infrastructure, and network engineers.
Data Con LA 2020
Description
In this session, I introduce the Amazon Redshift lake house architecture which enables you to query data across your data warehouse, data lake, and operational databases to gain faster and deeper insights. With a lake house architecture, you can store data in open file formats in your Amazon S3 data lake.
Speaker
Antje Barth, Amazon Web Services, Sr. Developer Advocate, AI and Machine Learning
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015 - Mark Rittman
- Mark Rittman presented on deploying full OBIEE systems to Oracle Cloud. This involves migrating the data warehouse to Oracle Database Cloud Service, updating the RPD to connect to the cloud database, and uploading the RPD to Oracle BI Cloud Service. Using the wider Oracle PaaS ecosystem allows hosting a full BI platform in the cloud.
Lessons from Building Large-Scale, Multi-Cloud, SaaS Software at Databricks - Databricks
The cloud has become one of the most attractive ways for enterprises to purchase software, but it requires building products in a very different way from traditional software
Getting Into the Business Intelligence Game: Migrating OBIA to the Cloud - Datavail
This presentation discusses best-practice architecture for migrating the Oracle BI Applications to the cloud. It focuses on the Oracle cloud platform and database services, with a nod to infrastructure services, to lay out the idea of the hybrid cloud and variations of the new-age cloud BI/DW architecture, so your analytics environment can succeed while operating at the same reliability or better, all the while benefiting from what the cloud offers best.
Practical Tips for Oracle Business Intelligence Applications 11g Implementations - Michael Rainey
The document provides practical tips for Oracle Business Intelligence Applications 11g implementations. It discusses scripting installations and configurations, LDAP integration challenges, implementing high availability, different methods for data extracts, and simplifying disaster recovery. Specific tips include scripting all processes, configuring the ODI agent JVM and connection pools for performance, understanding external LDAP authentication in ODI, implementing active-active high availability for ODI agents, choosing the right data extract method based on latency and volume, and using DataGuard and CNAMEs to simplify failover for disaster recovery.
Today, data lakes are widely used and have become extremely affordable as data volumes have grown. However, they are only meant for storage and by themselves provide no direct value. With up to 80% of data stored in the data lake today, how do you unlock the value of the data lake? The value lies in the compute engine that runs on top of a data lake.
Join us for this webinar where Ahana co-founder and Chief Product Officer Dipti Borkar will discuss how to unlock the value of your data lake with the emerging Open Data Lake analytics architecture.
Dipti will cover:
- Open Data Lake analytics - what it is and what use cases it supports
- Why companies are moving to an open data lake analytics approach
- Why the open source data lake query engine Presto is critical to this approach
Building and Deploying Large Scale SSRS using Lessons Learned from Customer D... - Denny Lee
This document discusses lessons learned from deploying large scale SQL Server Reporting Services (SSRS) environments based on customer scenarios. It covers the key aspects of success, scaling out the architecture, performance optimization, and troubleshooting. Scaling out involves moving report catalogs to dedicated servers and using a scale out deployment architecture. Performance is optimized through configurations like disabling report history and tuning memory settings. Troubleshooting utilizes logs, monitoring, and diagnosing issues like out of memory errors.
Presentation by Mark Rittman, Technical Director, Rittman Mead, on ODI 11g features that support enterprise deployment and usage. Delivered at BIWA Summit 2013, January 2013.
Modernisation of BI Business Intelligence and Data Warehouse Solutions at Tra... - David Pui
Modernisation of BI Business Intelligence and Data Warehouse Solutions at Tranz Rail.
David Pui led the modernisation of the BI and Data Warehouse solutions whilst he was the Senior Technologist at Tranz Rail.
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture - DATAVERSITY
Whether to take data ingestion cycles off the ETL tool and the data warehouse or to facilitate competitive Data Science and building algorithms in the organization, the data lake – a place for unmodeled and vast data – will be provisioned widely in 2020.
Though it doesn’t have to be complicated, the data lake has a few key design points that are critical, and it does need to follow some principles for success. Avoid building the data swamp, but not the data lake! The tool ecosystem is building up around the data lake and soon many will have a robust lake and data warehouse. We will discuss policy to keep them straight, send data to its best platform, and keep users’ confidence up in their data platforms.
Data lakes will be built in cloud object storage. We’ll discuss the options there as well.
Get this data point for your data lake journey.
With Power BI you can bring your BI architecture to the next level.
Architecture is a very important topic in a business intelligence project; let's discover the right questions to ask and the possible scenarios for integrating Power BI into an existing environment or building a new one from scratch.
We'll talk about how to choose the right Storage Modes, how to design a refresh policy, how to use dataflows to decouple and lift the transformation process to the Cloud, and more.
Holistics - The Perfect Companion for Amazon Redshift
Amazon Redshift is great for the storing and processing of your data. However, there is still a significant effort involved in managing your data pipeline and analytics infrastructure.
Holistics provides Amazon Redshift customers with an end-to-end user interface across their data imports, data transformation and data presentation to address the complexities and challenges of managing their data analytics infrastructure pipeline.
Oracle Database 12c includes over 500 new features. Some key new features include:
- Oracle Database 12c Express (EM Express), which replaces Database Control; it has fewer features than Database Control but does not require Java or an app server.
- New online capabilities like online DDL operations with no DDL locking, online move of partitions with no impact to queries, and online statistics gathering for bulk loads.
- Adaptive SQL Plan Management which allows the optimizer to select a more optimal plan at execution time based on current statistics.
- Multitenant architecture which allows consolidation of multiple databases into one container database with pluggable databases.
Exploring All options to move your Oracle Databases to the Oracle Cloud - Alex Zaballa
This document discusses various options for migrating Oracle databases to the Oracle Cloud. It begins with an introduction to Alex Zaballa and his background and experience. It then discusses Accenture Enkitec Group's capabilities in Oracle Engineered Systems implementations and Oracle technologies. The remainder of the document discusses specific methods for migrating databases to the Oracle Cloud, including using Oracle Database Cloud Service, choosing appropriate migration methods based on factors like database version and downtime tolerance, and techniques like using Oracle Database Cloud Backup Module or Data Pump to perform the migration.
The Summer 2016 release of Informatica Cloud is packed with many new platform features, including:
- Cloud Data Integration Hub, which supports publish and subscribe integration patterns that automate and streamline integration across cloud and on-premise sources
- Innovative features like stateful time-sensitive variables, and advanced data transformations like unions and sequences
- Intelligent and dynamic data masking of sensitive data to save development and QA time
- Cloud B2B Gateway, the leading data exchange platform for enterprises and their partners and customers, providing end-to-end data monitoring capabilities and support for the highest levels of data quality
- Enhancements to native connectors for popular cloud applications like Workday, SAP SuccessFactors, Oracle, SugarCRM, MongoDB, Teradata Cloud, SAP Concur, Salesforce Financial Services Cloud
And much more!
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B... - Mark Rittman
Mark Rittman from Rittman Mead presented on Oracle Big Data Discovery. He discussed how many organizations are running big data initiatives involving loading large amounts of raw data into data lakes for analysis. Oracle Big Data Discovery provides a visual interface for exploring, analyzing, and transforming this raw data. It allows users to understand relationships in the data, perform enrichments, and prepare the data for use in tools like Oracle Business Intelligence.
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T... - Mark Rittman
Mark Rittman, CTO of Rittman Mead, gave a keynote presentation on big data for Oracle developers and DBAs with a focus on Apache Spark, real-time analytics, and predictive analytics. He discussed how Hadoop can provide flexible, cheap storage for logs, feeds, and social data. He also explained several Hadoop processing frameworks like Apache Spark, Apache Tez, Cloudera Impala, and Apache Drill that provide faster alternatives to traditional MapReduce processing.
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use... - Mark Rittman
OBIEE12c comes with an updated version of Essbase that focuses entirely in this release on the query acceleration use-case. This presentation looks at this new release and explains how the new BI Accelerator Wizard manages the creation of Essbase cubes to accelerate OBIEE query performance
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree... - Mark Rittman
This document summarizes a presentation about adding a Hadoop-based data reservoir to an Oracle data warehouse. The presentation discusses using a data reservoir to store large amounts of raw customer data from various sources to enable 360-degree customer analysis. It describes loading and integrating the data reservoir with the data warehouse using Oracle tools and how organizations can use it for more personalized customer marketing through advanced analytics and machine learning.
What is Big Data Discovery, and how it complements traditional business anal... - Mark Rittman
Data Discovery is an analysis technique that complements traditional business analytics, and enables users to combine, explore and analyse disparate datasets to spot opportunities and patterns that lie hidden within your data. Oracle Big Data Discovery takes this idea and applies it to your unstructured and big data datasets, giving users a way to catalogue, join and then analyse all types of data across your organization.
In this session we'll look at Oracle Big Data Discovery and how it provides a "visual face" to your big data initiatives, and how it complements and extends the work that you currently do using business analytics tools.
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar... - Mark Rittman
Presentation from the Rittman Mead BI Forum 2015 masterclass, pt.2 of a two-part session that also covered creating the Discovery Lab. Goes through setting up Flume log + twitter feeds into CDH5 Hadoop using ODI12c Advanced Big Data Option, then looks at the use of OBIEE11g with Hive, Impala and Big Data SQL before finally using Oracle Big Data Discovery for faceted search and data mashup on top of Hadoop.
End-to-end Hadoop development using OBIEE, ODI, Oracle Big Data SQL and Oracl... - Mark Rittman
This document discusses an end-to-end example of using Hadoop, OBIEE, ODI and Oracle Big Data Discovery to analyze big data from various sources. It describes ingesting website log data and Twitter data into a Hadoop cluster, processing and transforming the data using tools like Hive and Spark, and using the results for reporting in OBIEE and data discovery in Oracle Big Data Discovery. ODI is used to automate the data integration process.
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015 - Mark Rittman
Slides from a two-day OBIEE11g seminar in Dubai, February 2015, at the Oracle University Expert Summit. Covers the following topics:
1. OBIEE 11g Overview & New Features
2. Adding Exalytics and In-Memory Analytics to OBIEE 11g
3. Source Control and Concurrent Development for OBIEE
4. No Silver Bullets - OBIEE 11g Performance in the Real World
5. Oracle BI Cloud Service Overview, Tips and Techniques
6. Moving to Oracle BI Applications 11g + ODI
7. Oracle Essbase and Oracle BI EE 11g Integration Tips and Techniques
8. OBIEE 11g and Predictive Analytics, Hadoop & Big Data
BIWA2015 - Bringing Oracle Big Data SQL to OBIEE and ODI - Mark Rittman
The document discusses Oracle's Big Data SQL, which brings Oracle SQL capabilities to Hadoop data stored in Hive tables. It allows querying Hive data using standard SQL from Oracle Database and viewing Hive metadata in Oracle data dictionary tables. Big Data SQL leverages the Hive metastore and uses direct reads and SmartScan to optimize queries against HDFS and Hive data. This provides a unified SQL interface and optimized query processing for both Oracle and Hadoop data.
UKOUG Tech'14 Super Sunday : Deep-Dive into Big Data ETL with ODI12c - Mark Rittman
This document discusses using Hadoop and Hive for ETL work. It provides an overview of using Hadoop for distributed processing and storage of large datasets. It describes how Hive provides a SQL interface for querying data stored in Hadoop and how various Apache tools can be used to load, transform and store data in Hadoop. Examples of using Hive to view table metadata and run queries are also presented.
Part 1 - Introduction to Hadoop and Big Data Technologies for Oracle BI & DW ... - Mark Rittman
Delivered as a one-day seminar at the SIOUG and HROUG Oracle User Group Conferences, October 2014
In this presentation we cover some key Hadoop concepts including HDFS, MapReduce, Hive and NoSQL/HBase, with the focus on Oracle Big Data Appliance and Cloudera Distribution including Hadoop. We explain how data is stored on a Hadoop system and the high-level ways it is accessed and analysed, and outline Oracle’s products in this area including the Big Data Connectors, Oracle Big Data SQL, and Oracle Business Intelligence (OBI) and Oracle Data Integrator (ODI).
Part 4 - Hadoop Data Output and Reporting using OBIEE11g - Mark Rittman
Delivered as a one-day seminar at the SIOUG and HROUG Oracle User Group Conferences, October 2014.
Once insights and analysis have been produced within your Hadoop cluster by analysts and technical staff, it’s usually the case that you want to share the output with a wider audience in the organisation. Oracle Business Intelligence has connectivity to Hadoop through Apache Hive compatibility, and other Oracle tools such as Oracle Big Data Discovery and Big Data SQL can be used to visualise and publish Hadoop data. In this final session we’ll look at what’s involved in connecting these tools to your Hadoop environment, and also consider where data is optimally located when large amounts of Hadoop data need to be analysed alongside more traditional data warehouse datasets
Part 2 - Hadoop Data Loading using Hadoop Tools and ODI12c - Mark Rittman
Delivered as a one-day seminar at the SIOUG and HROUG Oracle User Group Conferences, October 2014.
There are many ways to ingest (load) data into a Hadoop cluster, from file copying using the Hadoop Filesystem (FS) shell through to real-time streaming using technologies such as Flume and Hadoop streaming. In this session we’ll take a high-level look at the data ingestion options for Hadoop, and then show how Oracle Data Integrator and Oracle GoldenGate leverage these technologies to load and process data within your Hadoop cluster. We’ll also consider the updated Oracle Information Management Reference Architecture and look at the best places to land and process your enterprise data, using Hadoop’s schema-on-read approach to hold low-value, low-density raw data, and then use the concept of a “data factory” to load and process your data into more traditional Oracle relational storage, where we hold high-density, high-value data.
3. [email protected] www.rittmanmead.com @rittmanmead
About the Speaker
• Mark Rittman, Oracle ACE Director, Oracle BI, DW & Big Data
• 14 Years Experience with Oracle Technology
• Regular columnist for Oracle Magazine
• Author of two Oracle Press Oracle BI books
- Oracle Business Intelligence Developers Guide
- Oracle Exalytics Revealed
• Writer for Rittman Mead Blog : http://www.rittmanmead.com/blog
• Past Editor of Oracle Scene Magazine, BIRT SIG Chair, ODTUG Board Member
• Co-founder and CTO for Rittman Mead
12. [email protected] www.rittmanmead.com @rittmanmead
Oracle BI Cloud Service - What Is It?
• Oracle Business Intelligence, re-imagined for the cloud
• Runs as part of Oracle Public Cloud, part of wider Oracle Platform-as-a-Service
• Pay monthly, min 10 users, rolling upgrades and new features
• Entirely thin-client, simplified administration
• Comes with single DB schema, 50GB storage
• Aimed at departmental use-cases
- Sharing data from a spreadsheet
- Team reporting
- Development sandboxes
13. [email protected] www.rittmanmead.com @rittmanmead
Oracle Database Schema Provided with BICS
• Each instance of BICS comes with Oracle Schema Cloud Service
• ApEx environment with 10GB storage attached
• Able to run PL/SQL packages (with 5m timeout)
• Either create tables, views etc from ApEx, or use data uploader in BICS
[Screenshot: click to launch the ApEx home page]
14. [email protected] www.rittmanmead.com @rittmanmead
Built-in Application Express Database Developer Tool
• Full ApEx environment for application building, table creation, SQL queries
• Can be used to build supporting applications, administration screens for the OBIEE application
• Make use of PL/SQL functionality
- Data cleansing
- Call Web Service REST APIs
- More complex processing and calculations
• DB Cloud Service dashboard has tools for data and application exports
15. [email protected] www.rittmanmead.com @rittmanmead
Options for Uploading Data to BI Cloud Service (DB Cloud)
• Use ApEx front-end and tools within it (data upload, data modeller etc)
• Use SQL*Developer and SFTP data upload facility (see the sketch after this list)
• Use BI Cloud Service Data Uploader
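Of the three routes, the file-transfer leg of the SFTP option is the easiest to script. Below is a minimal sketch in Python using paramiko; the hostname, credentials and paths are hypothetical placeholders, and the exact endpoint and directory layout depend on how your BICS / Schema Service instance is provisioned.

```python
# Minimal sketch: script the SFTP leg of a BICS data upload.
# Host, credentials and paths are hypothetical placeholders.
import paramiko

SFTP_HOST = "sftp.example.oraclecloud.com"   # assumption: your service's SFTP endpoint
SFTP_USER = "cloud_user"
SFTP_KEY  = "/home/etl/.ssh/id_rsa"

def upload_csv(local_path: str, remote_path: str) -> None:
    """Upload one extract file over SFTP, ready for import via ApEx/SQL*Developer."""
    key = paramiko.RSAKey.from_private_key_file(SFTP_KEY)
    transport = paramiko.Transport((SFTP_HOST, 22))
    try:
        transport.connect(username=SFTP_USER, pkey=key)
        sftp = paramiko.SFTPClient.from_transport(transport)
        sftp.put(local_path, remote_path)   # transfer the file
        sftp.close()
    finally:
        transport.close()

if __name__ == "__main__":
    upload_csv("exports/sales_daily.csv", "uploads/sales_daily.csv")
```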
16. [email protected] www.rittmanmead.com @rittmanmead
Oracle BICS Initial Use-Case : Departmental Reporting
• Oracle BI Cloud Service in initial, standalone form aimed at departmental reporting
- 50GB storage in single schema, limited ETL access
- Single subject area and limited RPD features
• Other common use-cases include
- Development environments
- Spin-off data discovery sandboxes
- Dedicated SaaS reporting applications
• But … part of wider Oracle Cloud platform
- Can we use that to host a whole BI system?
17. But can I run my whole BI+DW platform on Oracle Cloud?
18. [email protected] www.rittmanmead.com @rittmanmead
Where Do I Put my Data Warehouse?
• I’d like to migrate to cloud, but my BI System uses a full Oracle DW database
- How do I host that in the cloud? And replicate it there in the first place?
19. [email protected] www.rittmanmead.com @rittmanmead
I Need Access to Full Admin Tool + RPD…
• We need access to a full Admin Tool, multiple data sources, federation
- BICS Thin Client Data Modeller isn’t enough…
23. [email protected] www.rittmanmead.com @rittmanmead
Oracle Cloud Can Host Full BI Platforms … Here’s How
• Host Full Oracle Data Warehouse in Oracle DBaaS
• Upload and run on-premise RPD including use of multiple databases
- Including on-premise through new WebLogic connector
• Run full ETL using ODI
• Use VA without 12c upgrade
• Still have partner support for key stages + new development …
- But take away IT hassles
24. [email protected] www.rittmanmead.com @rittmanmead
Wider Oracle Public Cloud PaaS Ecosystem
• Similar to other public cloud services (AWS, Azure etc) at IaaS layer
• But with platform (PaaS) and software (SaaS) layers as differentiator
• Can we use Database, Storage and other layers to host full OBIEE platform?
- And what about ETL, identity management, file storage and so on?
25. [email protected] www.rittmanmead.com @rittmanmead
Hosting Full OBIEE Platforms in Oracle Public Cloud
• Leverages BICS's new "Upload RPD Data Models to the Cloud" feature
- Migrate supporting DW to full Oracle DBaaS
- Update on-prem RPD to connect to DBaaS
- Upload RPD (+ now catalog!) to BICS
• Create new Oracle Cloud users for BICS
• ETL can connect via SQL*Net, JDBC etc
• Use for wider use-cases than BICS incl.
- Host full production platform (or test, dev)
- Create development branches, etc
[Diagram: BI Administration and ETL tools on-premise upload the RPD to Oracle BICS, with data uploads from the on-premise source DB to Oracle DBaaS via SQL*Net]
26. [email protected] www.rittmanmead.com @rittmanmead
BICS Upload RPD to Cloud Key Steps
1. Make sure on-premise data model is consistent, no errors etc
2. Create DBaaS instance, upload on-premise database to cloud
3. Update RPD Connection Pool settings to point to DBaaS instance
4. Recreate any user accounts, application roles etc
5. Create new dashboards and reports
27. [email protected] www.rittmanmead.com @rittmanmead
Example Migration : Full OBIEE11g SampleApp v406
• OBIEE11g SampleApp v506 - based on 11.1.1.9 release and delivered on VM
• Comes with Oracle 12c Database with multiple schemas, MVs, partitioning etc
- Also uses Essbase, Hadoop etc but out of scope for this exercise
• Objective is to migrate OBIEE, DW database and security elements
- Verify that ODI can still connect and load DW in Oracle Cloud
- Look at options for additional DB sources, Essbase, Hadoop sources etc
28. [email protected] www.rittmanmead.com @rittmanmead
Creating a Full Database Instance in Oracle DBaaS
• Initial step is to create an Oracle Database Cloud Service Instance
• Options for 11g or 12c Release Database, High / Extreme Performance Options
- Determines scope of DB features available (OLAP, partitioning etc)
29. [email protected] www.rittmanmead.com @rittmanmead
Oracle Database Cloud Service Connectivity / Configuration
• Database runs in a VM, has Listener and other processes running
• Database parameters can be altered, but DB Options set by DBaaS version
- OLAP and Partitioning options not being present may affect DB uploads
• Ports blocked by default, need to be opened for SQL*Net etc access (see the reachability check below)
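Because access rules block these ports by default, it's worth confirming the listener is reachable before pointing SQL*Net or JDBC at it. A minimal sketch, assuming the default listener port of 1521 and a hypothetical hostname:

```python
# Minimal reachability check for the DBaaS listener port.
# Hostname is a hypothetical placeholder; 1521 is the default listener port.
import socket

DB_HOST = "mydb.example.oraclecloud.com"  # assumption: your DBaaS VM's public address
DB_PORT = 1521

def listener_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the listener port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    status = "open" if listener_reachable(DB_HOST, DB_PORT) else "blocked"
    print(f"{DB_HOST}:{DB_PORT} is {status}")
```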
30. [email protected] www.rittmanmead.com @rittmanmead
Managing and Monitoring Database Cloud Service
• Oracle Enterprise Manager Database Express 12c for DBA tasks
- Features determined by DBaaS edition (regular, High or Extreme Performance)
• Virtualised cloud hosting monitored by Oracle DBaaS Monitor
- RDBMS storage, alerts, processes etc
- OS-level monitoring
- Listener
- Backups etc
32. [email protected] www.rittmanmead.com @rittmanmead
Configuring On-Premise RPD to Connect to DBaaS
• Update Connection Pool settings in on-prem RPD before upload to BICS
- Note : DBaaS instance must be in same datacenter as BICS
• Note - No need to change overall DB connection setting in BICS Console
33. [email protected] www.rittmanmead.com @rittmanmead
Steps to Upload RPD to BI Cloud Service
• Backup (snapshot) current BICS environment if required
• Select Snapshots > Replace Data Model
• Use Browse button to select RPD from desktop, then upload to BICS
34. [email protected] www.rittmanmead.com @rittmanmead
Additional BICS Capabilities when using Uploaded RPD
• Allows multiple subject areas vs. single one in standard BICS
• Multiple DBaaS instances can be mapped into RPD for federated queries
• Full access to RPD features - vertical/horizontal federation etc
• Data Mashups also available (but not VA)
35. [email protected] www.rittmanmead.com @rittmanmead
Additional Post Upload Configuration : Mapping
• Mapping is now available in BICS, requires further Administrator configuration
• Single map provider (OracleMaps) with set of associated layers
• As with on-premise, map layers then need to be linked to subject area columns
36. [email protected] www.rittmanmead.com @rittmanmead
Configuring Security and Recreating Users, Roles
• On-premise users need to have corresponding new account created in BICS
- Also creates an Oracle Cloud login for BICS identity domain - separate to OTN login
• BICS licensed per user, min is 10 per pod with test and prod instances
• Application roles also need to be recreated and users added
37. [email protected] www.rittmanmead.com @rittmanmead
SSO between Oracle SaaS and BICS
37
• Allows an Oracle Cloud customer the ability to have SSO between different Cloud Apps, e.g. BICS and PBCS
• Requirements:
- Services need to be ordered in the same data center
- Customer must activate services under the same identity domain
• When in the same IDM, a single user can be authorized for one or more available services.
38. [email protected] www.rittmanmead.com @rittmanmead
Row-level Security Within RPD
• Row-level security works properly within BICS and uploaded RPDs
• Best practice is to base on application roles, recreate matching ones to proceed
• Regular and row-wise session variables then work as expected
39. [email protected] www.rittmanmead.com @rittmanmead
Limitations on BICS with RPD Upload vs On-Premise
• Can only access other DBaaS sources at present - see next slide
- Presume integration with PBCS, Oracle Big Data Cloud Service, SaaS apps to come
- No ability at present to access Essbase, TT or other acceleration layers
• No further editing of the uploaded RPD - but TCM enhancements soon
• Upload the catalog to go with RPD - coming soon
• Further limitations that may possibly be lifted in the future
- Adding of HTML or Javascript to Analyses is disabled
- Some limitations around alternate sorts and other small issues
- Visual Analyser not available yet (multiple subject area issue?)
- No usage tracking (but see MOS Doc.ID 1965207.1)
• No BI Publisher - though can install through JCS + DBaaS
40. [email protected] www.rittmanmead.com @rittmanmead
Accessing On-Premise Sources - Now Possible
• On-Premise Data Access using WebLogic Connector
• Enables direct query to on-premises accessible databases
- Secure & Optimized for WAN transport
- Free download from OTN, requires WebLogic
- Phase 1: Requires Admin tool & RPD upload
- Phase 2: Thin Client Modeler support
41. [email protected] www.rittmanmead.com @rittmanmead
What About ETL and Data Integration?
• ODI12c (and other tools using SQL*Net, JDBC etc) can connect to DBaaS as normal
• Either through SSH tunnel and SQL*Net, or direct from internet if ports enabled
• Current data integration and ETL can therefore continue as normal
• Makes it possible to create a hybrid on-prem/cloud solution (see the connection sketch below the diagram)
[Diagram: on-premise ODI and SaaS apps feeding Oracle Data Integrator / EDQ running on Java Cloud Service, loading Big Data, Database, Exadata and Storage Cloud Services, which in turn feed Business Intelligence Cloud Service]
42. [email protected] www.rittmanmead.com @rittmanmead
Future Option : Oracle Big Data Preparation Cloud Service
• Oracle Cloud-based Data Preparation Service, aimed at data domain experts
• Takes files and other datasets from Oracle Cloud and prepares for analysis
• Split, transform and obfuscate data before loading into Hadoop data platform
43. [email protected] www.rittmanmead.com @rittmanmead
Uses Machine-Learning to Automate Data Recommendations
• Uses Spark MLlib machine learning to profile data and recognise patterns
• Automates much of the routine data preparation and profiling work
- Spots credit card, SSN + other sensitive data, recommends masking
- Suggests appropriate names, datatypes for columns based on format and data patterns
• Allows analyst to focus on key tasks
- Example of cloud app consumerization
44. [email protected] www.rittmanmead.com @rittmanmead
Cloud Datasource Integration & File Upload/Download
• Primary datasource and target is Oracle Storage Cloud Service (like Amazon S3); see the upload sketch below
- Oracle Big Data Cloud Service, Oracle DBaaS and others
• User can also upload / download files directly into Big Data Prep Service
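Oracle Storage Cloud Service exposes a Swift-style REST API: authenticate once for a token, then PUT objects into a container. The sketch below illustrates that flow with Python's requests library; the identity domain, credentials and container name are hypothetical placeholders, so check the exact endpoint format against your own service details.

```python
# Minimal sketch: upload a file to Oracle Storage Cloud Service (Swift-style API).
# Identity domain, credentials and container name are hypothetical placeholders.
import requests

AUTH_URL  = "https://mydomain.storage.oraclecloud.com/auth/v1.0"  # assumption
USER      = "Storage-mydomain:[email protected]"
PASSWORD  = "secret"
CONTAINER = "bdp_input"

def authenticate():
    """Exchange credentials for an auth token and the account's storage URL."""
    resp = requests.get(AUTH_URL, headers={
        "X-Storage-User": USER,
        "X-Storage-Pass": PASSWORD,
    })
    resp.raise_for_status()
    return resp.headers["X-Auth-Token"], resp.headers["X-Storage-Url"]

def upload(local_path: str, object_name: str) -> None:
    """PUT one file into the container, ready for Big Data Prep to pick up."""
    token, storage_url = authenticate()
    with open(local_path, "rb") as f:
        resp = requests.put(
            f"{storage_url}/{CONTAINER}/{object_name}",
            headers={"X-Auth-Token": token},
            data=f,
        )
    resp.raise_for_status()

if __name__ == "__main__":
    upload("exports/customers.csv", "customers.csv")
```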
45. [email protected] www.rittmanmead.com @rittmanmead
How Can We use Big Data Prep Service with BICS?
• User uploads or downloads files (CSV, TSV, Excel, Word, PDF, etc.) into Oracle Storage Cloud Service
• Big Data Prep Service loads files from Storage Cloud for preparation
• Prepared files are written back into Storage Cloud Service
• User uploads prepared files into BICS as datasource for ApEx / Schema Service
- On roadmap : direct integration with VA + Answers for data-mashup data files
46. [email protected] www.rittmanmead.com @rittmanmead
Proposed Direct BDP Integration for VA / Data-Mashup Files
• Roadmap proposal from Oracle is to make BDP available from BICS homepage
• Load and prepare data to add as Visual Analyser & Answers Mashup data files
47. [email protected] www.rittmanmead.com @rittmanmead
Direct BICS / BDP Integration using Oracle Data Integrator
• Possible now : use of ODI to prepare individual files through BDP
• ODI on-premise used to automate file extracts and upload into Storage CS using its REST APIs
• BDP picks up files available in Storage CS and prepares them
• Prepared files are written back into Storage CS
• Prepped files downloaded by ODI from Storage CS, loaded into BICS (see the orchestration sketch below)
- On roadmap: BDP will be able to load prepared data directly into BICS
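The ODI-driven round trip (extract, upload to Storage CS, wait for BDP to write its output, download, load into BICS) reduces to a simple polling loop. A minimal sketch under the same Swift-style API assumptions as the upload example; the endpoint, container name and polling interval are all hypothetical:

```python
# Minimal sketch of the BDP round trip: poll Storage CS until the prepared
# file appears, then download it for loading into BICS.
# Endpoints, credentials and container names are hypothetical placeholders.
import time
import requests

AUTH_URL      = "https://mydomain.storage.oraclecloud.com/auth/v1.0"  # assumption
USER          = "Storage-mydomain:[email protected]"
PASSWORD      = "secret"
OUT_CONTAINER = "bdp_output"   # assumption: container BDP writes prepared files to
POLL_SECONDS  = 60

def authenticate():
    """Swift-style auth: exchange credentials for a token and storage URL."""
    resp = requests.get(AUTH_URL, headers={"X-Storage-User": USER,
                                           "X-Storage-Pass": PASSWORD})
    resp.raise_for_status()
    return resp.headers["X-Auth-Token"], resp.headers["X-Storage-Url"]

def wait_and_download(object_name, local_path, timeout_s=3600):
    """Poll the output container until BDP has written the prepared file."""
    token, storage_url = authenticate()
    url = f"{storage_url}/{OUT_CONTAINER}/{object_name}"
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        resp = requests.get(url, headers={"X-Auth-Token": token})
        if resp.status_code == 200:
            with open(local_path, "wb") as f:
                f.write(resp.content)   # prepared file, ready to load into BICS
            return
        time.sleep(POLL_SECONDS)        # not there yet; BDP still preparing
    raise TimeoutError(f"{object_name} not prepared within {timeout_s}s")

if __name__ == "__main__":
    wait_and_download("customers_prepared.csv", "staging/customers_prepared.csv")
```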
48. [email protected] www.rittmanmead.com @rittmanmead
Summary and Next Steps
• It’s now (mostly…) possible to migrate full OBIEE systems to cloud
• On-premise RPDs + catalog uploaded into BI Cloud Service
• Linked to DBaaS for DW, can access other DBaaS source + on-prem data
• ODI can still upload data from on-premise sources into DBaaS, or consider BDP