Big Data and How BI Got Its Groove Back
Greg McDowell, [email protected], (415) 835-3934
Patrick Walravens, [email protected], (415) 835-8943
Peter Lowry, [email protected], (415) 869-4418
FOR DISCLOSURE AND FOOTNOTE INFORMATION, REFER TO THE JMP FACTS AND DISCLOSURES SECTION
TABLE OF CONTENTS
Executive Summary
Part I: An Introduction to Big Data
  A Big Data Primer
  Big Data Market Opportunity
  The Data Management Landscape
  The Resurgence of Business Intelligence
  Big Data Stock Performance
Part II: Initiation Summaries
Part III: Privately Held Companies in Big Data Space (hard copy only)
JMP Securities Software Team
JMP Facts and Disclosures
EXECUTIVE SUMMARY
We believe the proliferation of data is one of the most disruptive forces in technology today. Although it accounts for only a small portion of industry revenues, we believe "Big Data" is poised for rapid growth. The purpose of this report is to help investors better understand Big Data and the market opportunity ahead.

Part I: An Introduction to Big Data and the Resurgence of Business Intelligence

In Part I of this report, we define and size the market opportunities created by Big Data. We define Big Data as data sets of extreme volume and extreme variety. In 2011, we estimate that Big Data is a $9.1 billion market opportunity, representing only 2% of the $407 billion spent on software, storage, and servers, which we refer to collectively as enterprise IT spending. Ten years ago, spending on Big Data was minimal because data sets were much smaller, data had less variety, and the velocity of data flowing into organizations was much slower. Over the next ten years, we expect Big Data-related computing to increase to $86.4 billion, representing 11% of all enterprise IT spending and a 10-year CAGR of 25%. The key growth driver of Big Data is the proliferation of data, which has forced enterprises to adopt new tools and processes to collect data (both structured and unstructured) and to store, manage, manipulate, analyze, aggregate, combine, and integrate it.

In Part I we also discuss the resurgence of the Business Intelligence ("BI") market. We believe the business intelligence landscape is about to go through a major sea change that will radically transform the way the industry thinks about analytics. In our view, the two primary drivers of this sea change are Big Data and the consumerization of enterprise BI, driven by trends such as mobile BI. With respect to Big Data, it has become very easy to collect data but difficult to make sense of that data using traditional BI tools. In other words, as the useful life of information has decreased, so has the utility of traditional BI tools, which have historically been very backward-looking.

Part II: Key Publicly Traded Companies in the Big Data Space

In Part II of this report, we initiate coverage of the infrastructure software group with a relatively constructive viewpoint. In the current volatile environment for stocks, we believe long-term investors should focus on the positive implications of emerging secular trends such as Big Data that could create significant profit opportunities over the next few years. We recommend software companies with solid but flexible operating strategies that, in our opinion, will be primary beneficiaries of the Big Data trend. We are initiating coverage on six infrastructure software companies as follows:

MicroStrategy Inc. (MSTR) with a Market Outperform rating and a $140 price target.
Progress Software Corp. (PRGS) with a Market Perform rating.
Qlik Technologies ("QlikTech") (QLIK) with a Market Outperform rating and a $35 price target.
Quest Software (QSFT) with a Market Perform rating.
Teradata Corporation (TDC) with a Market Outperform rating and a $63 price target.
TIBCO Software Inc. (TIBX) with a Market Outperform rating and a $33 price target.

We also discuss the Big Data strategies of eight other publicly traded companies.
Part III: Privately Held Companies in the Big Data Space (Available in Hard Copy Only)

In Part III of this report, we provide profiles of 100 leading private software companies positioned to benefit from the Big Data trend. Many of these companies approach the Big Data market from different angles, including the NoSQL movement, in-memory databases, columnar databases, Hadoop-related technologies, data grid/data cache solutions, solutions related to open source R, data visualization, predictive analytics, and real-time dashboards. Our favorite private companies include Cloudera, Splunk, Tableau Software, and Talend.
IDC "Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis." Forrester "Big data: techniques and technologies that make handling data at extreme scale economical." 451 Group "Big data is a term applied to data sets that are large, complex or dynamic (or a combination thereof) and for which there is a requirement to capture, manage and process the data set in its entirety, such that it is not possible to process the data using traditional software tools and analytic techniques within tolerable time frames." McKinsey Global Institute "Big data" refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. This definition is intentionally subjective and incorporates a moving definition of how big a dataset needs to be in order to be considered big data (i.e., we don't define big data in terms of being larger than a certain number of terabytes (thousands of gigabytes). We assume that, as technology advances over time, the size of datasets that qualify as big data will also increase." Gartner "When business leaders or data management professionals talk about big data, they often emphasize volume, with minimal consideration of velocity, variety and complexity the other aspects of quantification: Velocity involves streams of data, structured record creation, and availability for access and delivery. Velocity means both how fast data is being produced and how fast the data must be processed to meet demand. Variety includes tabular data (databases), hierarchical data, documents, e-mail, metering data, video, image, audio, stock ticker data, financial transactions and more. Complexity means that different standards, domain rules and even storage formats can exist with each asset type. An information management system for media cannot have only one video solution."
Source: IDC, Forrester, 451 Group, McKinsey and Gartner
We think the third-party analyst firms have done a commendable job in their attempts to define Big Data. We note, however, that both vendors and industry analysts have latched onto the concept of the three V's: Volume, Velocity, and Variety. Some firms are also adding V's such as Variability and Value.
Many of these firms have also provided useful illustrations of Big Data, as shown in Figure 2 below. FIGURE 2: Forrester's Four V's of Extreme Scale
Source: http://blogs.forrester.com/brian_hopkins/11-08-29-big_data_brewer_and_a_couple_of_webinars
Gartner takes a similar approach with the "V's" but also adds Complexity, as shown in Figure 3 below. FIGURE 3: Gartner's Big Data Graph
Source: Gartner
Just how popular is the term Big Data becoming? A quick look at Google Trends Search Volume Index reveals the popularity of the term, as shown in Figure 4 below: FIGURE 4: Google Trends of Term "Big Data"
We compared the term "Big Data" to "Cloud Computing" and, interestingly, the trajectory of "Big Data" today closely resembles that of "Cloud Computing" in 2008. Investors who bought publicly-traded companies leveraged to the cloud-computing trend in 2008 have done well, as evidenced by the price performance of stocks such as RightNow Technologies (RNOW, MP, $37 PT, Walravens), salesforce.com (CRM, MO, $170 PT), and VMware (VMW, MO, $123 PT). A look at the job trend graph on Indeed.com illustrates a similar trend, as shown in Figure 5 below: FIGURE 5: Job Trends from Indeed.com for Term "Big Data"
Source: Indeed.com
One reason the concept of Big Data even exists is that the world's technological installed capacity to store information increased by a factor of 113 over the 20-year period from 1986 to 2007, as shown in Figure 7 below. In an excellent article published in Science magazine, Martin Hilbert and Priscilla Lopez estimated that the total amount of information grew from 2.6 optimally-compressed exabytes in 1986 to 295 optimally-compressed exabytes in 2007. The authors note that "piling up the imagined 404 billion CD-ROM from 2007 would create a stack from the earth to the moon and a quarter of this distance beyond (with 1.2 mm thickness per CD)." In a short span of 20 years, we have moved from an almost 100% analog world (books, newsprint, x-rays, etc.) in 1986 to a primarily digital world in 2007. FIGURE 7: World's Technological Installed Capacity to Store Information
Source: Martin Hilbert and Priscilla Lopez, "The World's Technological Capacity to Store, Communicate, and Compute Information," Science, Vol. 332, No. 6025, pp. 60-65 (published online 10 February 2011; print 1 April 2011), DOI: 10.1126/science.1200970
We like Figure 8 below because it shows, in MIPS (millions of instructions per second), the world's technological installed capacity to compute information on general-purpose computers. As shown, we have gone from a world in 1986 where 41% of installed capacity resided in pocket calculators to 2007, when pocket calculators accounted for less than 1%. FIGURE 8: World's Technological Installed Capacity to Compute Information on General-purpose Computers, in MIPS
Source: Martin Hilbert and Priscilla Lopez, "The World's Technological Capacity to Store, Communicate, and Compute Information," Science, Vol. 332, No. 6025, pp. 60-65 (published online 10 February 2011; print 1 April 2011), DOI: 10.1126/science.1200970
Just how fast is this "digital universe" expected to grow? According to IDC, as shown in Figure 9 below, in 2009 the digital universe held nearly 800,000 petabytes (a petabyte is a million gigabytes). In 2011, the amount of information created and replicated will surpass 1.8 zettabytes (1.8 trillion gigabytes), growth by a factor of nine in just five years. By 2020, IDC expects the total to reach 35 zettabytes, a 44-fold increase over 2009 that implies a CAGR of roughly 40%, as the sketch below confirms.
Source: IDC
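These growth factors are easy to sanity-check. The short sketch below is our own arithmetic, using only the figures cited above, and verifies both the 113x storage expansion and IDC's roughly 40% CAGR.

    # Sanity-checking the growth figures cited above (our arithmetic, not IDC's model).
    storage_1986, storage_2007 = 2.6, 295.0   # optimally-compressed exabytes (Hilbert & Lopez)
    print(storage_2007 / storage_1986)        # ~113x, the storage growth factor cited earlier

    digital_2009, digital_2020 = 0.8, 35.0    # zettabytes (IDC); 800,000 PB = 0.8 ZB
    years = 2020 - 2009
    cagr = (digital_2020 / digital_2009) ** (1 / years) - 1
    print(round(cagr, 3))                     # ~0.41, in line with IDC's ~40% CAGR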
The explosion of data is causing new firms and technologies to emerge. Our favorite private company example is Splunk, which describes itself as the engine for machine data. Its software collects, indexes, and harnesses any machine data generated by an organization's IT systems and infrastructure, whether physical, virtual, or in the cloud. According to Splunk, machine data is unstructured, massive in scale, and contains a categorical record of all transactions, systems, applications, user activities, security threats, and fraudulent activity. Splunk supports a variety of use cases, including application management, security and compliance, infrastructure and IT operations management, and business and web analytics. Almost half of the Fortune 100 and over 2,900 licensed customers in 70 countries use Splunk. Interestingly, beginning with Version 4, Splunk uses MapReduce to retrieve and analyze massive datasets.
Based on our estimates and IDC estimates, we project that the total enterprise IT market will grow around 5% annually over the next 10 years, reaching $676 billion by 2021. FIGURE 11: Total Enterprise IT Spending: 2011-2021
Because Big Data is becoming a larger share of enterprise IT spending, it is growing much faster than the overall enterprise IT market. As shown in Figure 12 below, we expect Big Data to grow from $9.1 billion in 2011 to $86.4 billion in 2021, a compound annual growth rate of 25%. FIGURE 12: Big Data Estimates: 2011-2021
We arrive at these estimates by making certain assumptions about the different components of the Big Data market. In the next section, we break down those components.
We believe the Big Data market is comprised primarily of three sub-segments: Business Analytics, Storage, and Servers. In this section we define the total size of these markets and discuss Big Data's penetration of each. Figure 13 below highlights the total size of these markets, based on IDC and JMP estimates. As shown, the total market is around $131.4 billion in 2011, growing to $238.4 billion in 2021. FIGURE 13: Total Market Size of Business Analytics, Storage, and Servers
(in $ billions; year-over-year growth in parentheses)

Year   Business Analytics   Storage       Servers       Total Market Size
2011   32.0                 45.2          54.1          131.4
2012   35.3 (+10%)          47.5 (+5%)    54.6 (+1%)    137.4 (+5%)
2013   39.0 (+10%)          49.9 (+5%)    55.0 (+1%)    143.9 (+5%)
2014   43.0 (+10%)          50.8 (+2%)    55.0 (0%)     148.8 (+3%)
2015   47.2 (+10%)          53.3 (+5%)    55.7 (+1%)    156.3 (+5%)
2016   52.2 (+10%)          56.0 (+5%)    58.5 (+5%)    166.7 (+7%)
2017   57.8 (+11%)          58.8 (+5%)    61.4 (+5%)    178.1 (+7%)
2018   64.4 (+11%)          61.7 (+5%)    64.5 (+5%)    190.7 (+7%)
2019   72.2 (+12%)          64.8 (+5%)    67.7 (+5%)    204.7 (+7%)
2020   81.3 (+13%)          68.1 (+5%)    71.1 (+5%)    220.5 (+8%)
2021   92.2 (+13%)          71.5 (+5%)    74.7 (+5%)    238.4 (+8%)
The segment that requires the most explanation, in our opinion, is the Business Analytics market. IDC defines the Business Analytics market as the "combination of the data warehouse (DW) platform software with performance management and analytic applications and business intelligence (BI) and analytic tools." Figure 14 below provides a taxonomy of the Business Analytics market. As shown, there are three overall categories: BI and analytic tools, data warehousing platform software, and analytic applications. IDC expects these three markets to grow at 2010-2015 CAGRs of 9.2%, 9.8%, and 7.9%, respectively, with the total Business Analytics market representing a CAGR of 8.9%. The Business Analytics market is expected to grow from $30.7 billion in 2011 to $43.1 billion in 2015. FIGURE 14: IDC's Business Analytics Taxonomy, 2011
Source: IDC
The key question we had to ask ourselves in trying to size the Big Data market was "What could Big Data's penetration be within each of the three main sub-segments of the market: Business Analytics, Storage, and Servers?" In other words, what percentage of the total market is comprised of projects that can fall under the Big Data definition? As a baseline, we have assumed that around 7% of the size of the Business Analytics, Storage and Servers market in 2011 meets the definition of Big Data. We assume that by 2021, 36% of the Business Analytics, Storage, and Servers market will meet the definition of Big Data. This leads to the breakdown of the $9.1 billion estimate in 2011 and the $86.4 billion estimate in 2021, as shown in Figure 15 below.
FIGURE 15: Big Data Spending Within Business Analytics, Storage, and Servers (in billions)
(year-over-year growth in parentheses)

Year   Business Analytics   Storage        Servers        Big Data Total
2011   $2.4                 $3.1           $3.6           $9.1
2012   $3.2 (+33%)          $3.9 (+28%)    $4.5 (+24%)    $11.6 (+28%)
2013   $4.3 (+33%)          $5.1 (+28%)    $5.6 (+24%)    $14.9 (+28%)
2014   $5.6 (+32%)          $6.3 (+24%)    $6.8 (+22%)    $18.7 (+26%)
2015   $7.4 (+31%)          $8.0 (+28%)    $8.4 (+24%)    $23.9 (+27%)
2016   $9.7 (+32%)          $10.3 (+28%)   $10.8 (+28%)   $30.8 (+29%)
2017   $12.8 (+32%)         $13.2 (+28%)   $13.8 (+28%)   $39.8 (+29%)
2018   $16.8 (+32%)         $15.2 (+16%)   $16.0 (+16%)   $48.0 (+21%)
2019   $22.1 (+32%)         $17.6 (+16%)   $18.5 (+16%)   $58.2 (+21%)
2020   $29.1 (+32%)         $20.3 (+16%)   $21.3 (+16%)   $70.7 (+22%)
2021   $38.3 (+32%)         $23.5 (+16%)   $24.6 (+16%)   $86.4 (+22%)
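The sizing logic above reduces to a few lines of arithmetic. The sketch below is ours, using the totals from Figure 13 and the stated penetration assumptions; small gaps versus Figure 15 reflect segment-level rounding.

    # Reproducing the headline Big Data estimates from the stated assumptions.
    total_2011, total_2021 = 131.4, 238.4   # $B: Business Analytics + Storage + Servers (Figure 13)
    pen_2011, pen_2021 = 0.07, 0.36         # assumed Big Data penetration rates
    print(total_2011 * pen_2011)            # ~9.2, consistent with the ~$9.1B 2011 estimate
    print(total_2021 * pen_2021)            # ~85.8, consistent with the ~$86.4B 2021 estimate
    print((86.4 / 9.1) ** (1 / 10) - 1)     # ~0.25, the stated 10-year CAGR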
Source: http://blogs.the451group.com/information_management/2011/04/15/nosql-newsql-and-beyond/
One technology in the above graph that we would like to highlight is Hadoop. Hadoop and Big Data are often mentioned in the same breath. While we contend that Big Data is much more than just Hadoop, it is useful to understand what Hadoop is in order to better appreciate the Big Data movement.
The Apache Hadoop website describes Hadoop as follows: "The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures." Like the term "Big Data," Hadoop is increasingly popular, as shown in Figure 17 below. FIGURE 17: Google Trends of Term "Hadoop"
A look at the job trend graph on Indeed.com illustrates a similar trend as shown in Figure 18 below: FIGURE 18: Job Trends from Indeed.com for Term "Hadoop"
FIGURE 19: Hadoop Subprojects and Related Projects at Apache

The project includes these subprojects:
- Hadoop Common: The common utilities that support the other Hadoop subprojects.
- Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
- Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.

Other Hadoop-related projects at Apache include:
- Avro: A data serialization system.
- Cassandra: A scalable multi-master database with no single points of failure.
- Chukwa: A data collection system for managing large distributed systems.
- HBase: A scalable, distributed database that supports structured data storage for large tables.
- Hive: A data warehouse infrastructure that provides data summarization and ad hoc querying.
- Mahout: A scalable machine learning and data mining library.
- Pig: A high-level data-flow language and execution framework for parallel computation.
- ZooKeeper: A high-performance coordination service for distributed applications.
Source: http://hadoop.apache.org/
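To make the MapReduce model concrete, here is a minimal word-count sketch in Python. This is our illustration, not Apache code: Hadoop Streaming runs a mapper over input splits, sorts the emitted pairs by key, and feeds them to a reducer; the local sort below simulates that shuffle step.

    import sys
    from itertools import groupby

    def mapper(lines):
        # map step: emit a (word, 1) pair for every word in the input
        for line in lines:
            for word in line.split():
                yield word, 1

    def reducer(pairs):
        # reduce step: pairs arrive grouped by key; sum the counts per word
        for word, group in groupby(pairs, key=lambda kv: kv[0]):
            yield word, sum(count for _, count in group)

    if __name__ == "__main__":
        shuffled = sorted(mapper(sys.stdin))  # simulates Hadoop's shuffle/sort phase
        for word, total in reducer(shuffled):
            print(word, total)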
A number of private companies are producing commercial offerings around the Hadoop community. Private companies such as Appistry, Cloudera, DataStax, Hortonworks, and MapR Technologies are all worth watching and are highlighted in the private company section of this report. In the next section, we provide real-world case studies of how enterprises are using Hadoop and Hadoop-related technologies.
Wal-Mart (WMT, Not Covered)
Hadoop is part of Wal-Mart's strategy to analyze large amounts of data to better compete against online retailers, including Amazon.com. With the increasing role that social networking sites such as Facebook and Twitter play in online shopping, Wal-Mart is also looking to glean insights into what consumers want. Wal-Mart uses Hadoop in its keyword campaigns to drive traffic from search engines to Walmart.com. The software collects information about millions of keywords and then comes up with optimal bids for each word. It also allows Wal-Mart to create language models so the site can return more relevant product results when a user searches for a specific product or an item based on that user's Tweets or Facebook posts.

Tennessee Valley Authority (TVA)
The Tennessee Valley Authority ("TVA") is a federally-owned corporation in the United States that provides flood control, electricity generation, and economic development in the Tennessee Valley. The TVA was selected to collect data from phasor measurement unit ("PMU") devices on behalf of the North American Electric Reliability Corporation ("NERC") to help ensure the reliability of the bulk power system in North America. PMU data includes voltage, current, frequency, and location data, and is considered part of the measurement data for the generation and transmission portion of the so-called smart grid. The TVA uses smart-grid field devices to collect data on its power-transmission lines and facilities across the country. These sensors send in data 30 times per second, and the volume of incoming PMU data was growing very quickly as more and more PMU devices came online. The TVA was faced with the problem of how to reliably store this data and make it available for use. Hadoop was selected because it solved the TVA's storage issues and provided a robust computing platform to analyze the data. It also allowed the TVA to employ commodity hardware and open source software at a fraction of the price of proprietary systems, achieving a much more manageable expenditure curve as its repository grows.

Rapleaf
Rapleaf helps businesses create more personalized experiences for their customers by providing useful information about each customer, such as age, gender, location, and interests, via its Personalization API. Businesses leverage this insight to better understand their customers in order to personalize deals and offers, show them more relevant content, and give them a better experience online and off. Rapleaf has a vast amount of consumer data, including over a billion email addresses and terabytes of data. Hadoop has allowed Rapleaf to manage and work with this data at scale much more easily than its previous RDBMS systems. Rapleaf has implemented a batch-oriented process that ingests and normalizes raw data from numerous sources, analyzes it, and then packages the data into easily-served objects.

Crossbow
Crossbow is an open-source, Hadoop-enabled software pipeline for quickly, accurately, and cheaply analyzing human genomes in the cloud. While human genomes are about 99.9% identical, discovering differences between genomes is the key to understanding many diseases, including how to treat them. While sequencing has undoubtedly become an important and ubiquitous tool, the rapid improvements in sequencing technology have created a firehose problem: how to store and analyze the huge volume of DNA sequence data being generated in a short period of time.
Presently, the process of scanning and mapping generates about 100GB of compressed data (read sequences and associated quality scores) for one human genome. Crossbow combines one of the fastest sequence alignment algorithms, Bowtie, with a very accurate genotyping algorithm, SoapSNP, within Hadoop to distribute and accelerate the computation. The pipeline can accurately analyze an entire genome in one day on a 10-node local cluster, or in about three hours for less than $100 using a 40-node, 320-core cluster rented from Amazon's (AMZN, NC) EC2 utility computing service. The Crossbow team's evaluation against a gold standard of known differences within an individual shows Crossbow is better than 99% accurate at identifying differences between human genomes. Crossbow enables this computational analysis without requiring researchers to own or maintain their own computing infrastructure.

Bank of America (BAC, Market Perform, Covered by David Trone)
With Hadoop, Bank of America has been able to analyze billions of records to gain a better understanding of the impact of new and existing financial products. The bank can now examine things like the credit and operational risk of products across different lines of business, including home loans, insurance, and online banking.
Disney (DIS, Not Covered)
Disney was faced with the challenge of what to do with the increasing amount of data collected from business operations and customer transactions, along with unstructured data created by social media and its various web properties (e.g., ESPN and ABC). Disney's Technology Shared Service Group uses Hadoop as a cost-effective way to analyze and correlate information from all of its different businesses, including theme-park attendance, reservations at resort hotels, purchases from Disney stores, and viewership of Disney's cable TV programming.

General Electric (GE, Not Covered)
GE is running several use cases on its Hadoop cluster, which has given it deeper analytic capabilities and insights into its business. The marketing and communications teams can assess how the public perceives the company through sentiment analysis. GE uses Hadoop to mine text such as updates on Facebook and Twitter, along with news reports and other information on the Internet, to understand, with 80% accuracy, how consumers feel about GE and its various divisions. GE has also built a recommendation engine for its intranet, allowing it to display targeted press releases to each user based on job function, user profile, and prior visits to the site. Finally, Hadoop enables GE to work with several types of remote monitoring and diagnostic data from its energy and wind business.
NoSQL MOVEMENT
Besides Hadoop, one of the most interesting areas of the data management landscape is the NoSQL movement. The NoSQL (sometimes read as "not only SQL") movement refers to database management systems that tend to be non-relational. The movement consists of four primary categories: key-value stores, BigTable clones, document databases, and graph databases. The figure below highlights the four NoSQL categories along "data size" and "data complexity" dimensions.

The first NoSQL category is key-value stores, based on Amazon's Dynamo paper published in 2007. The data model of a key-value store is a collection of key-value pairs. Examples include Dynomite, Voldemort, Membrain, and Berkeley DB, among others. The second category is BigTable clones, based on Google's BigTable paper published in 2006. The data model is a big table with column families. Examples include HBase, Hypertable, and Cassandra. The third category is document databases. People often think of Lotus Notes when they think of document databases. Examples include CouchDB, MongoDB, and RavenDB. The fourth category is graph databases. A graph database "uses graph structures with nodes, edges, and properties to represent and store information." Examples include AllegroGraph, Sones, Neo4J, InfiniteGraph, and GraphDB.
Source: http://www.slideshare.net/emileifrem/nosql-overview-neo4j-intro-and-production-example-qcon-london2010?src=related_normal&rel=8600029
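The differences between these data models are easy to see in miniature. The sketch below is purely illustrative, using plain Python structures as stand-ins; real systems (e.g., Redis, Cassandra, MongoDB, Neo4j) each expose their own APIs, and the record shown is invented.

    # The same user record expressed in the four NoSQL data models, as plain Python stand-ins.

    # 1) Key-value store: an opaque value addressed by a single key
    kv_store = {"user:42": '{"name": "Ada", "city": "London"}'}

    # 2) BigTable-style column families: row key -> column family -> columns
    column_store = {"user:42": {"profile": {"name": "Ada", "city": "London"},
                                "activity": {"last_login": "2011-11-15"}}}

    # 3) Document database: a self-describing, nestable document
    document = {"_id": 42, "name": "Ada", "city": "London",
                "orders": [{"sku": "bk-101", "qty": 2}]}

    # 4) Graph database: nodes, edges, and properties
    nodes = {42: {"name": "Ada"}, 43: {"name": "Grace"}}
    edges = [(42, "FRIEND_OF", 43, {"since": 2009})]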
There is controversy around the use of NoSQL solutions. Proponents cite their flexibility, scalability, low cost, and appropriateness for specific use cases. Critics often cite the immaturity of NoSQL solutions, the lack of commercial support, and the inability of NoSQL databases to work with traditional BI tools. Stepping back, we believe new data management technologies will continue to emerge to handle the Big Data trend. While traditional RDBMSs will likely continue to play a significant role well into the future, the days of one database standard or technology within an organization are quickly coming to an end, in our opinion.
What are the hottest areas of business intelligence and analytics? TDWI Research analyst Philip Russom recently put together a thoughtful piece on Big Data analytics. TDWI surveyed 360 companies across a broad set of industries and found that advanced analytics, advanced data visualization, and predictive analytics had the highest commitment levels and the most potential for growth, as shown in Figure 22 below. Gartner's research supports TDWI's survey: Gartner notes that lighter-footprint, visualization-based analytics and data discovery products are the fastest growing areas in the business intelligence space, growing at 3x the overall BI market. Gartner expects the data discovery market alone to grow from $591 million in 2011 to $1.6 billion in 2015.
FIGURE 22: Options for Big Data Analytics Plotted by Potential Growth and Commitment
We believe the business intelligence landscape is about to go through a major sea change that will radically transform the way the industry thinks about analytics. We have identified three primary reasons for the sea change:
1) Big Data and the Explosion of Data
2) The Consumerization of Enterprise BI
3) Industry Consolidation
We discuss each below:

1) Big Data and the Explosion of Data. Earlier in this report we covered some of the reasons behind the explosion in data, including the precipitous drop in memory prices over the last 10 years and the sheer number of "devices" now collecting information for enterprises. The problem is simply that it has become very easy to collect data but difficult to make sense of that data using traditional BI tools. In other words, as the useful life of information has decreased, so has the utility of traditional BI tools, which have historically been very backward-looking.

2) The Consumerization of Enterprise BI. We see users within enterprises increasingly demanding easier-to-use and more intuitive business intelligence solutions. This is driven by the consumerization of all enterprise IT. The consumerization of IT has been covered extensively by other sources, but in our view it simply means that the direction of influence has reversed: circa 1995, enterprise behavior influenced an individual's behavior at home ("I have email at work and I now want email at home"), whereas circa 2011, behavior in the home influences enterprise IT ("What do you mean I can't get my corporate email on my iPhone?"). We believe the consumerization of BI is being driven by individuals having access to amazing analytics on their internet and mobile devices; these individuals increasingly insist on having the same access to analytics in their day-to-day jobs.

3) Industry Consolidation. The final reason a major sea change is occurring in the business intelligence space is the change in the vendor landscape. As investors will recall, there has been massive consolidation as the heavyweights in the tech industry invested heavily in the space. The most prominent examples include IBM's acquisitions of Cognos and SPSS, SAP's acquisition of Business Objects, and Oracle's acquisition of Hyperion.
In the next section, we discuss five areas within business intelligence that we believe investors should understand:
1) Business Analytics Versus Business Intelligence
2) Agile BI Versus Traditional BI
3) Business Intelligence in the Cloud
4) The R Open Source Programming Language
5) Data Visualization
Business Intelligence
- Identify business trends
- Understand the timeframe of change in the business trend
- Understand the different elements involved in the business trend
- Quantify the change in the business trend
Key aspects include: (a) reporting (KPIs, metrics); (b) automated monitoring/alerting; (c) dashboards; (d) scorecards; (e) OLAP (cubes, slice and dice, drill-down); (f) ad hoc query generation; (g) deployment of the solution to multiple devices.

Business Analytics
- Understand and act on business trends
- Forecast the possibility of the trend occurring again
- Understand the implications of the business trend
- Understand other possible explanations and scenarios associated with the change in the business trend
Key aspects include: (a) statistical/quantitative analysis; (b) data mining; (c) predictive modeling; (d) multi-variate testing.
We make the distinction between Business Intelligence and Business Analytics to highlight that the BI industry so far has done a good job with the "Business Intelligence" side of the chart but in many ways we are still in the early innings of the "Business Analytics" side of the chart.
R OVERVIEW
R is an open-source programming language and software environment for statistical computing and graphics. R provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, and clustering, among others. R is easily extensible through functions and extensions, and the R community is noted for its active contribution of packages. According to Rexer's Annual Data Miner Survey in 2010, R has become the most widely used data mining tool, used by 43% of data miners.

R is an implementation of the S programming language, which was developed by Bell Laboratories in 1976 to provide a more interactive alternative to the statistical analysis tools then available. R is a free, open-source dialect of S (a commercial version, S-PLUS, is also available) and is part of the GNU project. R was created by Ross Ihaka and Robert Gentleman of the University of Auckland, New Zealand. It was initially conceived because both men wanted technology better suited for their statistics students, who needed to analyze data and produce graphical models of the information and found existing software difficult to use.

What is R and Why is it Used? Recently, the business intelligence sector began taking notice of the many benefits of the R programming language, which is particularly well adapted to predictive analytics. It can be used to identify patterns or trends in massive data sets, making it ideal for researching retail, financial, and medical trends. Predictive analytics is an area of statistical analysis that deals with extracting information from data and using it to predict future trends and behavior patterns, as well as identifying risks and opportunities. Models capture relationships among many factors to allow the assessment of the risk or potential associated with a particular set of conditions, guiding decision making for candidate transactions. As data mining and predictive analytics continue to accelerate, R provides the tools to support these activities across industries including actuarial science, financial services, insurance, telecommunications, retail, travel, healthcare, and pharmaceuticals. It supports drug companies doing pharmaceutical development, insurance companies performing underwriting and risk assessment, and students performing academic research.

R has become one of the most popular programming languages used by statisticians, scientists, and data analysts, both commercially and within academia. R's popularity seems to be a result of its usability, extensibility, and open-source roots. R is an integrated programming environment for data manipulation, calculation, and graphical display of data sets. It helps people perform a wide variety of computing tasks by giving them access to various commands and pre-supplied packages. It also allows users to script their own functions (or modify existing ones) to do custom tasks. This provides much of the flexibility of languages such as C, but with the advantage of building upon R's robust numerical routines, data management functions, and graphing tools. Its ease of use has made it especially appealing to people without deep computer programming skills and is making it the de facto standard. According to an article from the New York Times, "It allows statisticians to do very intricate and complicated analyses without knowing the blood and guts of computing systems."
Another strength of R is static graphics and the ease with which well-designed, publication-quality graphs and mathematical symbols can be produced. Speed is also one of the biggest draws for the R programming language, which can process up to 12 gigabytes of data in seconds. Because R has stronger object-oriented programming facilities than most statistical computing languages, it can be more easily customized and extended through user-submitted packages for specific functions or specific areas of study. Advanced users can write C code to manipulate R objects directly or link code written in C, C++, or Fortran to R at run-time. Over 1,500 packages exist today. Some examples include:
- BiodiversityR: offers a graphical interface aimed at simplifying the calculation of environmental trends
- Emu: analyzes speech patterns
Finally, because R is open-source, users can freely modify existing functions and packages or create entirely new ones, in contrast with commercial software packages that use proprietary functions to perform the analysis.
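For readers who want a concrete picture of the modeling workflow described above, here is a minimal sketch. We use Python with NumPy to keep this report's examples in one language; in R itself, the equivalent fit is a one-line lm(y ~ x) call. The data is invented for illustration.

    import numpy as np

    # Fit y = slope*x + intercept by least squares, the simplest predictive model.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # e.g., quarters since product launch
    y = np.array([2.1, 4.3, 6.2, 7.9, 10.1])   # e.g., revenue in $M (illustrative data)
    slope, intercept = np.polyfit(x, y, deg=1)

    forecast = slope * 6.0 + intercept          # predict the next period
    print(round(slope, 2), round(intercept, 2), round(forecast, 1))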
Who Uses R? R is used by both corporations and universities, with an estimated two million users, particularly scientists, programmers, and academics who routinely do research. While software from SAS Institute has been the preferred tool, R is gaining popularity, especially within academia. Its ability to perform high-end analytics, combined with its open-source, free-distribution model, seems to be key to this shift. Corporate users include Google (Not Covered), Pfizer (NC), Merck (NC), Bank of America (BAC, MP, Trone), Shell (NC), and the InterContinental Group (NC). Google uses R to help it understand trends in ad pricing and illuminate trends in the search data it collects. Pfizer has created customized packages that allow its scientists to manipulate their own data during non-clinical drug studies immediately rather than sending that information off to a statistician. A number of financial institutions have used R to create packages to perform derivatives analysis. Wal-Mart (NC) is also a high-profile user of R, using it to interpret the needs and spending habits of customers.

R Disadvantages: While there appear to be many advantages to R, there are also some disadvantages. In the eyes of some, SAS Institute is better suited to handle "big data." R is limited by RAM because data sets must fit in memory. R also appears to lack documentation and has limited commercial support.

Commercial Releases: In October 2011, Oracle (ORCL, MO, $36 PT, Walravens) announced the Big Data Appliance, which integrates R, Apache Hadoop, Oracle Enterprise Linux, and a NoSQL database with Exadata hardware. This is an engineered system optimized for acquiring, organizing, and loading unstructured data into Oracle Database 11g. In 2007, Revolution Analytics was founded to provide commercial support for Revolution R, its distribution of R, which also includes components developed by the company, such as a web services framework and the ability to read and write data in the SAS file format.
DATA VISUALIZATION
The final area of business intelligence is the data visualization, or business discovery, market. This is the most exciting component of the business intelligence market, in our opinion, and includes vendors such as QlikTech, Tableau Software, and TIBCO Software's Spotfire. Perhaps the best way to understand the data visualization market is simply to look at the types of graphs and charts these tools can produce. Some of our favorite examples of what enterprises now expect are shown in Figures 24 and 25 below.
Source: http://globaleconomicanalysis.blogspot.com/search?updated-max=2011-09-02T10%3A27%3A00-05%3A00&max-results=3
Figure 25 below highlights an interactive visualization of Average Draft Position (ADP) data from the CBS Sports Fantasy Football league, powered by Tableau Software. FIGURE 25: Another Data Visualization Example
Source: http://fantasynews.cbssports.com/fantasyfootball
As shown in Figure 27 below, the best performing stock YTD in our Big Data index is TIBCO Software (TIBX), up 44%, followed by MicroStrategy (MSTR), up 38%, compared to a 0.4% decline in the NASDAQ. The worst performing stock has been Progress Software (PRGS), down 26% YTD. FIGURE 27: YTD Individual Stock Performance of Big Data Index Companies
Source: FactSet
While Big Data stocks have outperformed the NASDAQ YTD, they have also pulled back harder than the NASDAQ since the market turned south on July 22nd. Figure 28 below shows the median performance of the "Big Data Index" compared to the NASDAQ from July 22nd to November 10th. As shown, the NASDAQ is down 8% while the "Big Data Index" is down 12%. FIGURE 28: Big Data Stock Index Versus NASDAQ Since July 22nd Pullback
Source: FactSet
The Big Data stocks that have pulled back the most from their 52-week highs include Pegasystems, Progress Software, and MicroStrategy, as shown in Figure 29 below. We believe the recent pullback may represent a compelling opportunity for investors to build or add to positions in selected Big Data stocks. FIGURE 29: Percentage Change from 52-Week High
Source: FactSet
Source: FactSet
The fastest growing company in the Big Data Index using consensus estimates is Qlik Technologies, followed by NetApp, Pegasystems, and Informatica. FIGURE 31: Fastest Growing Companies in the "Big Data Index" (CY12 Estimates)
Source: FactSet
Interestingly, on a PEG basis (2012 P/E divided by 2012-2013 EPS growth), the Big Data stocks trade at 1.0x, the same multiple as our NASDAQ technology index. FIGURE 32: PEG Ratio of Big Data Index Versus HW/SW Technology Index
Source: FactSet
One metric we like to focus on is free cash flow. On a trailing-twelve-month (TTM) basis, our Big Data index trades at 21x EV/TTM FCF versus 18x for our NASDAQ technology index. FIGURE 33: EV/TTM FCF
Source: FactSet
While this multiple seems steep, when we compare it to the expected 2011 revenue growth, the stocks appear inexpensive compared to the NASDAQ technology index. As shown in Figure 34 below, our Big Data Index trades at an EV/TTM FCF divided by CY11 expected growth rate of 1.5x compared to 2.1x for the NASDAQ technology index. FIGURE 34: EV/TTM FCF Divided by CY11 Expected Growth Rate
Source: FactSet
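The growth adjustment in Figures 32-34 is simple arithmetic; the sketch below reproduces it with the index-level figures cited above (our back-of-the-envelope, not FactSet output).

    # Growth-adjusted valuation math behind Figures 32-34.
    big_data_ev_fcf, nasdaq_ev_fcf = 21.0, 18.0    # EV / TTM free cash flow
    big_data_adj, nasdaq_adj = 1.5, 2.1            # EV/TTM FCF divided by CY11 expected growth

    # Backing out the implied CY11 expected growth rates (in percentage points):
    print(big_data_ev_fcf / big_data_adj)          # ~14% for the Big Data index
    print(nasdaq_ev_fcf / nasdaq_adj)              # ~8.6% for the NASDAQ technology index

    # The PEG ratio in Figure 32 is built the same way: 2012 P/E over 2012-2013 EPS growth.
    def peg(pe, eps_growth_pct):
        return pe / eps_growth_pct                 # e.g., a 20x P/E on 20% growth gives 1.0x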
INITIATIONS SUMMARY
We are initiating coverage on six companies in the infrastructure software universe:

MicroStrategy (MSTR)
We are initiating coverage on MicroStrategy with a Market Outperform rating and a $140 price target. MicroStrategy is the largest publicly-traded independent BI vendor. We like MicroStrategy because of its powerful value proposition of an end-to-end BI architecture and analytics platform, its large market presence with some of the leading companies in the world, its well-built developer ecosystem, and its four consecutive quarters of double-digit license growth, which we expect to continue. While MicroStrategy has invested heavily in 2011 (with operating margins expected to be down 800 basis points) to better compete with emerging players like QlikTech and Tableau Software, we believe the investments will start to bear fruit toward the end of this year, leading to significant operating margin expansion next year and revenue and EPS coming in above consensus estimates. We look for 2011, 2012, and 2013 EPS of $1.81, $3.67, and $5.14 versus consensus of $1.79, $3.63, and $4.59, respectively. Our $140 price target implies a very reasonable 2013 EV/revenue multiple of 1.9x, a discount to the peer group, and a 2013 P/E of 27x, slightly above MicroStrategy's average five-year forward P/E multiple and roughly in line with its TTM revenue growth rate of 28%.

Progress Software (PRGS)
We are initiating coverage on Progress Software with a Market Perform rating. Progress Software provides enterprise software products that enable organizations to be operationally responsive to a changing business environment. We like the steps Progress is taking to transition the business to the fast-growing category of Enterprise Business Solutions; however, we remain on the sidelines until we see a permanent CEO in place, more consistent sales execution, and possibly a divestiture of non-strategic assets. Progress Software trades at a 2013 P/E multiple of 10x versus the peer group median of 11x. We look for 2011 non-GAAP EPS of $1.43, versus consensus of $1.45; 2012 non-GAAP EPS of $1.66, versus consensus of $1.63; and 2013 non-GAAP EPS of $1.76, versus consensus of $1.72.

Qlik Technologies (QLIK)
We are initiating coverage on Qlik Technologies ("QlikTech") with a Market Outperform rating and a $35 price target. QlikTech is the fastest growing company in our Big Data/business intelligence coverage universe and one of the fastest growing publicly-traded software companies, with expected 2011 revenue growth of 41%. We like QlikTech because we believe it has a wide-open market opportunity and a strong value proposition, and, based on our survey of 16 customers, we believe the company will be able to exceed growth expectations. We look for 2011, 2012, and 2013 non-GAAP EPS of $0.30, $0.47, and $0.68 (versus consensus of $0.29, $0.44, and $0.63) on revenue growth of 42%, 28%, and 25%, respectively. Our $35 price target implies an EV/2013 revenue multiple of 5.6x, a modest premium to the high-growth software peer group.

Quest Software (QSFT)
We are initiating coverage on Quest Software with a Market Perform rating. Quest Software is a provider of enterprise systems management software products that has grown primarily via acquisition. We like Quest's ability to generate cash (with a TTM FCF yield of 10%) and its deep product portfolio. However, Quest's recent performance has been inconsistent, with EPS misses in three and revenue misses in two of the last six quarters.
While we believe the downside risk in this name is limited by its valuation, we remain on the sidelines until we see more consistent execution. Quest Software trades at a 2013 P/E of 10x, versus the comp group at 11x. We look for 2011 non-GAAP EPS of $1.33, in line with consensus; 2012 non-GAAP EPS of $1.64, versus consensus of $1.65; and 2013 non-GAAP EPS of $1.90, versus consensus of $1.93.
Teradata (TDC)
We are initiating coverage on Teradata with a Market Outperform rating and a $63 price target. We like Teradata because it is the leading data warehousing vendor, we believe it stands to benefit from the Big Data trend more than any other software vendor, the competitive environment for Teradata is more benign than conventional wisdom believes, and we believe the company is well positioned to beat consensus expectations for 2012 and 2013. We look for 2011, 2012, and 2013 non-GAAP EPS of $2.29, $2.70, and $3.15, versus consensus of $2.25, $2.58, and $2.95. Our $63 price target represents an FY13 P/E multiple of 20x, in line with Teradata's 10-year average.

TIBCO Software (TIBX)
We are initiating coverage on TIBCO Software with a Market Outperform rating and a $33 price target. We like TIBCO because we believe it is a well-managed company, growing the top line 21% and committed to growing the bottom line 15-20% per year; it is a cash flow machine, with a 10-year FCF CAGR of 24%; it is tapping a large market opportunity that we believe is getting even bigger as a result of the Big Data trend; it is well diversified across verticals and product areas; it has strong partnerships; and we believe it represents a good acquisition target. We look for 2011 non-GAAP EPS of $0.94 (consensus $0.94), 2012 non-GAAP EPS of $1.12 (consensus $1.11), and 2013 non-GAAP EPS of $1.32 (consensus $1.27) on revenue growth of 21%, 13%, and 11%, respectively, well above its comp group. Our $33 price target implies a 2013 P/E of 25x, in line with TIBCO's expected 2011 license growth rate and a premium to the peer group median of 15x.
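As a quick check, the forward multiples implied by these price targets follow directly from the EPS estimates above (our arithmetic).

    # Implied forward P/E = price target / JMP 2013E non-GAAP EPS, for the three PT-bearing ratings.
    targets = {"MSTR": (140.0, 5.14), "TDC": (63.0, 3.15), "TIBX": (33.0, 1.32)}
    for ticker, (price_target, eps_2013) in targets.items():
        print(ticker, round(price_target / eps_2013, 1))  # ~27x, 20x, and 25x, matching the text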
Peter Lowry, Associate, [email protected], 415-869-4418. Peter Lowry joined JMP Securities in June 2011 and serves as an Associate in Equity Research covering software. Prior to joining JMP, Peter had 15 years' experience as an investment banker, private banker, and CPA at top-tier firms including PWC, Schroder, Lehman Brothers, UBS, Bank of America, Deutsche Bank, and Ion Partners. Peter worked with both corporate and private clients on finance issues across public accounting, corporate finance, capital markets, and private wealth management. Peter has an MBA from Columbia University, an MS in Public Accounting from the University of Hartford, and a BA from Hamilton College.
We would also like to acknowledge the following gentlemen for their help with the Big Data project: Praveen Chandran Rishi Sharma Alec Short Vincent Song Naga Surendran Vijay Tennety Julian Terkaly
Publicly Traded Companies Covered by JMP and Mentioned in This Report (as of November 15, 2011):
Company Actuate Corporation Adobe Systems, Inc. Bank of America Corp. Cisco Systems, Inc. Citrix Systems, Inc. CommVault Systems, Inc. Cornerstone OnDemand, Inc. Demand Media, Inc. DemandTec, Inc. EMC Corporation Hewlett-Packard Company Informatica Corporation JDA Software Group Inc. MicroStrategy, Inc. Oracle Corporation Progress Software Corporation Qlik Technologies Inc. Rackspace Hosting, Inc. RealPage, Inc. Responsys, Inc. RightNow Technologies, Inc. SAP AG Symantec Corporation TIBCO Software Inc. Teradata Corporation Ultimate Software Group VMware, Inc. salesforce.com Disclosures (1) (1) (1) (1) (1) (1,3) (1,3,5) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1) (1,3) (1,3) (1,3) (1) (1) (1) (1) (1) (1) (1)
JMP Securities Research Ratings and Investment Banking Services: (as of October 3, 2011)
Rating (Regulatory Equivalent / Regulatory Rating)   # Co's Under Coverage   % of Total   # Co's Receiving IB Services in Past 12 Months   % of Co's With This Rating
Buy      207    66%    58    28%
Hold     105    33%     7     7%
Sell       3     1%     0     0%
Total    315   100%    65    21%
JMP Disclaimer:
JMP Securities LLC ("the Firm") compensates research analysts, like other Firm employees, based on the Firm's profitability, which includes revenues from the Firm's institutional sales, trading, and investment banking departments as well as on the quality of the services and activities performed that are intended to benefit the Firm's institutional clients. These data have been prepared by JMP Securities LLC for informational purposes only and are based on information available to the public from sources that we believe to be reliable, but we do not guarantee their accuracy or completeness. Any opinions and projections expressed herein reflect our judgment at this date and are subject to change without notice. These data are neither intended nor should be considered as an offer to sell or a solicitation or a basis for any contract for the purchase of any security or other financial product. JMP Securities LLC, its affiliates, JMP Group LLC, Harvest Capital Strategies LLC, and their respective partners, directors, officers, and associates may have a long or short position in, may act as a market maker for, or may purchase or sell a position in the securities mentioned herein. JMP Securities LLC or its affiliates may be performing, have performed, or seek to perform investment banking, advisory, or other services and may have acted as manager or co-manager for a public offering of securities for any company mentioned herein. The reader should assume that JMP Securities LLC will solicit business from the company covered in this report. Copyright 2011. All rights reserved by JMP Securities LLC. JMP Securities LLC is a member of FINRA, NYSE Arca, NASDAQ, and SIPC.
Financial Services
Capital Markets: David Trone; Steven Fu, CFA; Chris Ross, CFA
Consumer & Specialty Finance, Commercial Banks: John Hecht (415) 835-3912; Kyle M. Joseph (415) 835-3940
Financial Processing & Outsourcing: David M. Scharf (415) 835-8942; Kevane A. Wong (415) 835-8976
Insurance: Matthew J. Carletti (312) 768-1784; Christine Worley (312) 768-1786
Market Structure: David M. Scharf (415) 835-8942; Kevane A. Wong (415) 835-8976
Residential & Commercial Real Estate Finance: Steven C. DeLaney (404) 848-7773; Trevor Cranston, CFA (415) 869-4431

Real Estate
Hotels & Resorts: William C. Marks (415) 835-8944
Housing & Housing Supply Chain: Michael G. Smith (415) 835-8965
Land Development: Michael G. Smith (415) 835-8965
Real Estate & Property Services: William C. Marks (415) 835-8944
Real Estate Technology: Michael G. Smith (415) 835-8965
REITs: Healthcare: Peter L. Martin, CFA (415) 835-8904; Aaron Hecht (415) 835-3963
REITs: Office & Industrial: Mitch Germain (212) 906-3546

Technology
Clean Technology: Alex Gauna
Communications Equipment: Erik Suppiger
Semiconductors: Alex Gauna
Software: Patrick Walravens (415) 835-8943; Greg McDowell (415) 835-3934; Peter Lowry (415) 869-4418

Healthcare
Biotechnology: Charles C. Duncan, PhD (212) 906-3510; Roy Buchanan, PhD (212) 906-3514; Jason N. Butler, PhD (212) 906-3505; Gena H. Wang, PhD (212) 906-3528; Liisa A. Bayko (312) 768-1785; Heather Behanna, PhD (312) 768-1795
Healthcare Facilities & Services: Peter L. Martin, CFA (415) 835-8904; Aaron Hecht (415) 835-3963
Healthcare Services: Constantine Davides, CFA (617) 235-8502; Tim McDonough (617) 235-8504
Medical Devices: J. T. Haresco, III, PhD (415) 869-4477
For Additional Information:
Mark Lehmann, President, JMP Securities, (415) 835-3908
Erin Seidemann, Vice President, Publishing, (415) 835-3970