
An EMC Proven Professional Publication

KNOWLEDGE SHARING
2014 BOOK OF ABSTRACTS
TABLE OF CONTENTS
FIRST-PLACE KNOWLEDGE SHARING ARTICLE
VDI and The Fast Access to Patient Data Challenge
by Justin Beardsmore
SECOND-PLACE KNOWLEDGE SHARING ARTICLE
Anatomy of Business Impact Management using SMAC
by Mohammed Hashim
THIRD-PLACE KNOWLEDGE SHARING ARTICLE
Service-Oriented Storage Tiering: A 2014 Approach to a 2004 Problem
by Daniel Stafford
BEST OF BIG DATA
Implications of CAP Theorem on NoSQL Databases
by Ravi Sharda and Bharath Krishnappa
BEST OF TROUBLESHOOTING
Troubleshooting Java on Linux
by Sumit Nigam
ELITE KNOWLEDGE SHARING ARTICLE
A Whole New Game: Leveraging Big Data in Baseball
by Bruce Yellin
ELITE KNOWLEDGE SHARING ARTICLE
Driving Tomorrow's Information Technology Platform
by Paul Brant
BACKUP RECOVERY/ARCHIVE
Protecting Your Data Lake: Strategic or Business as Usual?
by Russell Easter
What Is RecoverPoint's Power?
by Roy Mikes
NetWorker 8.1 Enterprise Backup Protection for Virtualized Data Center
by Gururaj Kulkarni, Anupam Sharma, and Naveen Rao
NetWorker 8.1 Integration with Boost over Fibre Channel and Virtual Synthetics Feature of Data Domain
by Gururaj Kulkarni and Soumya Gupta
NetWorker 8.1: The Next Big Thing
by Anuj Sharma
Configure NetWorker NMDA Backup for Oracle RAC Using Oracle SCAN
by (Arthur) Zhuang-Song Jiang
Keep the needed. Archive the rest.
by Kobi Shamama
How Analytics Can Help Backup Administrators
by Balaji Panchanathan
Avamar Integration with VMware
by Nadine Francis
DDboost Implementation with NetWorker in Complex Networking Scenario
by Crescenzo Oliviero
NetWorker Delegation Model for ROBO Environment
by Puneet Goyal
Backup Optimization DD Boost - NetWorker inside
by Mohamed Sohail
Managing Data Protection for a Better Night's Sleep
by Gareth Eagar
BIG DATA/PREDICTIVE ANALYTICS
How He and She Use CMC
by B. Nicole Parfnovics
Big Data: Deciphering the Data Center
by Andrew Bartley and Rich Elkins
Intelligent QoS for EMC Storage Through Big Data Analytics
by Yangbo Jiang
Does Big Data Mean Big Storage?
by Mikhail Gloukhovtsev
The IT Department Becomes a Data Broker
by Nick Bakker and Paul Wu
New Opportunities @ the Crossroads of M2M & Big Data
by Prakash Hiremath
Big Data Analytics Modeled Customer Feedback Portal
by Shoukathali Chandrakandi, Shelesh Chopra, and Gururaj Kulkarni
Predictive and Self-healing Data Center Management
by Gnanasoundari Soundarajan, Hemanth Patil, and Radhakrishna Vijayan
Conceptual Data Mining Framework for Mortgage Default Propensity
by Wei Lin and Pedro Desouza
Data Structures of Big Data: How They Scale
by Dibyendu Bhattacharya and Manidipa Mitra
Envisioning Precise Risk Models Through Big Data Analytics
by Pranali Humne
BUSINESS PROCESS
TCE @ iCOE GPS
by PradeeMathew, Ranajoy Dutta, Jayasurya Tadiparthi, Gayathri Shenoy, and Athul Jayakumar
Smart Talent Acquisition
by Trupti Kalakeri
CLOUD
Impact of Cloud Computing on IT Governance
by Declan McGrath
Incorporating Storage with an Open Source Cloud Computing Platform
by Ganpat Agarwal
Patterns of Multi-Tenant SaaS Applications
by Ravi Sharda, Manzar Chaudhary, Rajesh Pillai, and Srinivasa Gururao
VM Granular Storage - Rethinking Storage for Virtualized Infrastructures
by Alexey Polkovnikov
A New Era for Virtualization and Storage Team Collaboration
by Moshe Karabelnik
NAS Storage Optimization Using Cloud Tiering Appliance
by Brijesh Das Mangadan Kamnat
Cloud-Optimized REST API Automation Framework
by Shoukathali C K and Shelesh Chopra
DOCUMENT MANAGEMENT
Cloud-Based Dashboard Provides Analytics for Service Providers
by Shalini Sharma and Priyali Patra
Finding Similar Documents In Documentum Without Using a Full Text Index
by M. Scott Roth
INFRASTRUCTURE MANAGEMENT
Eliminating Level 1 Human Support from IT Management
by Amit Athawale
Guide to Installing EMC ControlCenter on a Windows 2008 Cluster
by Kevin Atkin
SECURITY
Storage Security Design - A Method to Help Embrace Cloud Computing
by Ahmed Abdel-Aziz
Secured Cloud Computing
by Gnanendra Reddy
Securely Enabling Mobility and BYOD in Organizations
by Sakthivel Rajendran
An Approach Toward Security Learning
by Aditya Lad
STORAGE
2014: An SDS Odyssey
by Jody Goncalves, Rodrigo Alves, and Raphael Soeiro
Review Storage Performance from the Practical Lens
by Anil C. Sedha and Tommy Trogden
Enabling Symmetrix for FAST Feature Used with FTS for 3rd Party Storage
by Gaurav Roy
How I Survived Escalations: Best Practices for a SAN Environment!
by Mumshad Mannambeth and Nasia Ullas
Virtual Provisioning and FAST VP Illustration Book
by Joanna Liu
Simplified Storage Optimization Techniques for VMware-Symmetrix Environment
by Janarthanan Palanichamy and Akilesh Nagarajan
The Need for VPLEX REST API Integration
by Vijay Gadwal and Terence Johny
How Implementing VNX File System Features Can Save Your Life
by Piotr Bizior
Overcoming Challenges in Migrating to a Converged Infrastructure Solution
by Suhas D. Joshi
Disk Usage and Trend Analysis of Multiple Symmetrix Arrays Configured with FAST/VP
by Mumshad Mannambeth and Nasia Ullas
Connecting the Scattered Pieces for Storage Reporting
by Hok Pui Chan
Benefits, Design Philosophy, and Proposed Architecture of an Oracle RAC/ASM Clustered Solution
by Armando Rodriguez
RELEVANT INDUSTRY TOPICS
Storage Reporting and Chargeback in the Software Defined Data Center
by Brian Dehn
Understanding IO Workload Profiles and Relation to Performance
by Stan Lanning
System Tap for Linux Platforms
by Mohamed Sohail
Embedded QA in Scrum
by Jitesh Anand, Chaithra Thimmappa, Vishwanath Channaveerappa, Varun AdemaneMutuguppe, and
Neil O'Toole
Detailed Application Migration Process
by Randeep Singh and Dr. Narendra Singh Bhati
A Journey to Power Intelligent IT
by Mohamed Sohail
Health Check and Capacity Reporting for Heterogeneous SAN Environments
by Mumshad Mannambeth and Salem Sampath
(L to R) Tom Clancy, Dibyendu Bhattacharya, Manidipa Mitra, Moshe Karabelnik, Shareef Bassiouny, Anand
Subramanian, Aditya Lad, Paul Brant, Bruce Yellin, Alok Shrivastava
THANK YOU
For the eighth consecutive year, we are pleased to recognize our EMC Proven Professional
Knowledge Sharing authors. The 2014 Book of Abstracts demonstrates how the Knowledge
Sharing program has grown into a powerful platform for sharing ideas, expertise, unique
deployments, and best practices among IT infrastructure professionals.
Articles from our contributing Knowledge Sharing authors have been downloaded more than
1,000,000 times, underscoring the power of the knowledge sharing concept. The past year alone
saw more than 400,000 downloads! View our full library of Knowledge Sharing articles, and
articles from the 2014 competition, published monthly, at:
http://education.EMC.com/KnowledgeSharing
Our continuing success is built on the foundation of committed professionals who participate,
contribute, and share. Through the Knowledge Sharing program, your industry-leading expertise
can provide thought leadership to a wide audience of IT professionals who must redefine their
skills.
We thank all those who participated in the 2014 Knowledge Sharing competition.
Tom Clancy
Vice President
EMC Education Services
Alok Shrivastava
Senior Director
EMC Education Services
FIRST-PLACE KNOWLEDGE SHARING ARTICLE
VDI and The Fast Access to Patient Data Challenge
by Justin Beardsmore
Converging drivers have created a "perfect storm" within healthcare, creating extreme pressure on
IT and clinicians. Technology-literate clinicians have become accustomed to simple, ubiquitous
access to and sharing of information, just as they are used to in their personal lives. On the other
hand, there are still many technology-averse clinicians and patients navigating the digital world.
In the UK alone, the Nicholson Challenge is driving through efficiency savings. In short, everyone
is being asked to do more with less, improve patient outcomes and service integration and, of
course, make information available anytime, anyplace, and anywhere for clinicians and patients
alike.
Both Server Based Computing (SBC) and Virtual Desktop Infrastructure (VDI) deployments are
commonplace in U.S. and EMEA healthcare. Thus far, the primary drivers for this have generally
been CapEx and OpEx reduction, with lower desktop management infrastructure costs
and simplified desktop support being obvious examples, along with solving immediate
operational issues like Windows 7 migration, security breaches, and the staff tension introduced by
new initiatives like BYOD. This trend is expected to continue in both markets, with EMEA
anticipating a more rapid increase than the U.S. over the next 24 months. This suggests that
many CIOs see this type of technology as a potential solution to some of the problems outlined in
the perfect storm.
Throughout the next 24 months, IT healthcare professionals will leverage the rich vein of industry
material to make informed procurement and implementation decisions around the choice of
technology, along with decisions on operational process re-alignment and the functional organization
restructuring needed to support the new environment. Meanwhile, they need to
understand the sector and respect the organization's starting position, which is probably still
heavily reliant on paper-based processes, while maintaining a firm focus on the future, which
will bring more agency integration and shared working, whether social care, community, private
practice, acute, or medical research, both locally and across geographical borders.
This article is aimed at CIOs and IT professionals who are looking at SBC or VDI as a viable
technology within the healthcare sector. The article will discuss why it should be viewed as a
clinical change process firmly grounded within the IT vision of providing fast and secure access
to patient data from anywhere. At the same time, one must not lose sight that it is also a
technology implementation, whose work stream needs to be aligned and coupled with Electronic
Medical Record (EMR), Electronic Document Management (EDM), and any paper-light programs
of work, along with the infrastructure strategy, especially if it is cloud-focused. Additionally, the
desktop, enterprise application, authentication process, and backend infrastructure all have a
relationship, which needs to be honored. All this must be considered within the context of the
downward pressures from the storm described above, and the maturing desktop-as-a-service model.
SECOND-PLACE KNOWLEDGE SHARING ARTICLE
Anatomy of Business Impact Management using SMAC
by Mohammed Hashim
The convergence of Social Computing, Mobility, Analytics and Cloud Computing (SMAC) is leading
a new service revolution in managing business processes. The strategic services primarily revolve
around Business Impact Management (BIM) encompassing all spheres of operations.
The erstwhile relationship between businesses and consumers has taken on a new dimension
where the influence of SMAC-based decisions is directly related to an agile business model,
unlike conventional market dynamics and process automation. This volatility in B2C (Business-
to-Consumer) is setting a new norm that will shape the way people and businesses relate to
technology and product innovations. This is also opening up newer avenues in business process
management and alternative opportunities in IT delivery management.
So is SMAC actually adding more business value or complicating the current state of affairs?
It is often dubbed a stack, with the different components operating as an integrated solution
to deliver maximum business results and transformational value adds. Here, social computing
and mobility are seen as effective drivers of productivity, accessibility, and engagement,
while analytics and cloud computing enable real-time decision making at optimal cost.
SMAC is definitely accelerating the pace of business and will function as a key enabler of
transformations cutting across technology and process lines. This is a compelling reason for
organizations to incorporate these technology stacks in various ways to create meaningful
business impact, such as improving total customer experience (TCE), robust innovation and
competitive marketing, and product life cycle management.
This Knowledge Sharing article discusses the various elements of SMAC architecture that enable
organizations to realize an agile business model, and the methodology for assessment
and adoption. The article also examines the implication of these solutions on existing business
models while reshaping for the future. Also covered is the role of SMAC in achieving the goal of
realizing strategic benefits of sustainable IT services in terms of the creation of value to customers,
enterprises, and businesses as a whole.
This article touches on the technological and operational aspects of adopting SMAC: integrating
the framework with business processes and delivery, salient efficiency drivers, and a comparative
study of SMAC solution-based services delivered as a unified stack versus discrete components.
The idea of striking a balance between the various entities of the SMAC stack will be helpful
for Solution Architects, IT Consultants, and Technology Specialists involved or interested in SMAC-
based systems concept, design, implementation, and management.
By improving real-time access to actionable knowledge on BIM choices through various channels,
next-generation organizations will be better placed to drive services and to develop and adopt a more
agile business model. In the same manner that steam power, steel, and electricity provided the
platform for the industrial revolution, the SMAC stack is setting the foundation for the knowledge-
based digital revolution.
THIRD-PLACE KNOWLEDGE SHARING ARTICLE
Service-Oriented Storage Tiering: A 2014 Approach to a
2004 Problem
by Daniel Stafford
While EMC and VMware have been developing technologies to ensure service levels across the
storage infrastructure, the actual deployment of these technologies often runs into difficult
practical concerns that must balance the needs of multiple stakeholders in the IT organization.
Technology architects need a set of best practices for how these technologies can be deployed
across a very wide set of workloads in a standardized, automatable fashion while maintaining the
cost and flexibility benefits that come with large-scale storage consolidation.
This has immediate relevance to architects and IT managers struggling to build a private cloud
that supports a wide range of workloads without creating a sprawl of different solutions or
increasing costs to the business.
This Knowledge Sharing article will explore best practices for combining VMAX Host I/O Limits,
VMware Storage QoS technologies, SRM reporting, and automation/orchestration technologies
to build a Storage Service Catalog which provides consistent performance for multiple tiers for
applications living in a shared infrastructure.
We will consider an approach to building a multi-tiered storage service catalog which:
Provides tiers with different costs and performance levels which allow IT to be competitive
both internally and externally
Enforces limits to ensure fair utilization of paid-for resources and reduce the impact of "noisy
neighbors"
Allows for automation to ensure speed and consistency in provisioning
Allows for flexibility and speed in re-tiering workloads
In the process, we will tackle many of the practical problems that come with such a system.
These include:
Automating the analysis of new and existing workloads to recommend the most effective tier
(a minimal tier-selection sketch follows this list)
Using an SRM solution to proactively detect changing workloads to better partner with
application owners
Providing a consistent, integrated tiering strategy across both virtual and physical hosts
Algorithms for balancing diverse workloads across available resources at cloud scale, aka
the "Tetris Problem"
Array and SAN best practices to avoid limits, ensure flexibility, and maximize uptime
Array best practices to ensure consistent performance over time, aka the "First Day is the
Best Day" problem
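To make the tier-selection idea concrete, here is a minimal, hypothetical Python sketch of how an automation layer might map an observed workload profile onto a tier from a service catalog. The tier names, per-TB I/O limits, latency targets, and chargeback rates are invented for illustration and are not published EMC or VMAX service levels.

    # Minimal sketch of automated tier recommendation for a storage service catalog.
    # Tier names, IOPS ceilings, latency targets, and rates below are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Tier:
        name: str
        max_iops_per_tb: int      # the Host I/O Limit that would be enforced for this tier
        target_latency_ms: float  # expected response time under that limit
        cost_per_gb_month: float  # showback/chargeback rate

    CATALOG = [
        Tier("Bronze", max_iops_per_tb=250,  target_latency_ms=20.0, cost_per_gb_month=0.05),
        Tier("Silver", max_iops_per_tb=1000, target_latency_ms=10.0, cost_per_gb_month=0.12),
        Tier("Gold",   max_iops_per_tb=4000, target_latency_ms=3.0,  cost_per_gb_month=0.30),
    ]

    def recommend_tier(observed_iops: float, capacity_tb: float, required_latency_ms: float) -> Tier:
        """Pick the cheapest tier whose per-TB I/O limit and latency target satisfy the workload."""
        iops_per_tb = observed_iops / max(capacity_tb, 0.001)
        for tier in sorted(CATALOG, key=lambda t: t.cost_per_gb_month):
            if tier.max_iops_per_tb >= iops_per_tb and tier.target_latency_ms <= required_latency_ms:
                return tier
        return CATALOG[-1]  # nothing fits cheaply; fall back to the highest tier

    print(recommend_tier(observed_iops=1500, capacity_tb=2, required_latency_ms=10).name)  # Silver

In a real deployment the observed IOPS and latency requirement would come from SRM reporting, and the chosen tier would translate into a Host I/O Limit setting and a line item in the service catalog.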
BEST OF BIG DATA
Implications of CAP Theorem on NoSQL Databases
by Ravi Sharda and Bharath Krishnappa
In 2000, Eric Brewer, a professor at UC Berkeley and now VP of infrastructure at Google,
introduced the idea that one can fully achieve at most two of three desirable properties of a
networked shared-data system: consistency (C), availability (A), and partition tolerance (P). What
started as a conjecture in 2000 became known as the "CAP Theorem" after it was formally
proved by Gilbert and Lynch in 2002.
Consistency, in the CAP theorem, means that a data item has the same, absolutely latest
value across the different nodes of a distributed system. This is closer to atomicity in ACID than to
ACID consistency: the ACID "C" refers to database consistency, ensuring all integrity constraints are
honored. Availability means that a non-failing node always returns a response. This meaning of
availability is subtly different from its usual meaning of readiness of service. Partition tolerance
means that nodes of a distributed system continue to function in the presence of network
partitions. Since only two of the three properties can be achieved simultaneously, a distributed
system can be characterized as either an AP system, a CP system, or a CA system.
For systems that must continue to function under network partitions, as many cloud and
distributed databases such as NoSQL databases do, there is an inevitable tradeoff to be made
between strong consistency and high availability (AP or CP). Ensuring strong consistency often
requires sacrificing not only availability (in the CAP sense) but also performance and scalability.
Indeed, many distributed NoSQL databases, such as Amazon Dynamo, Apache Cassandra,
Voldemort, and Apache CouchDB, emphasize availability (as well as scalability and
performance) and relax consistency. The weaker consistency model they adopt is called eventual
consistency. Some, like Apache Cassandra, allow users to select from a range of consistency
options (referred to as "tunable consistency").
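As a rough illustration of how tunable consistency trades consistency against availability, the Python sketch below implements a toy quorum scheme: a value is written to W of N replicas and read from R replicas, and when R + W > N the read and write sets overlap, so a read is guaranteed to see the latest acknowledged write. This is a simplification for illustration only, not Cassandra's or Dynamo's actual implementation.

    # Toy quorum-style "tunable consistency": replica values carry a timestamp,
    # a read consults R replicas and returns the newest value it finds.
    # R + W > N guarantees overlap with the last write; smaller R/W favor
    # availability and latency at the cost of possibly stale reads.

    import time

    N = 3  # replication factor
    replicas = [dict() for _ in range(N)]  # each replica: key -> (timestamp, value)

    def write(key, value, W):
        ts = time.time()
        acked = 0
        for rep in replicas:            # in a real system some replicas may be unreachable
            rep[key] = (ts, value)
            acked += 1
            if acked >= W:              # acknowledge the client once W replicas accepted
                break
        return acked >= W

    def read(key, R):
        answers = [rep[key] for rep in replicas[:R] if key in rep]
        if not answers:
            return None
        return max(answers)[1]          # newest timestamp wins (last-write-wins)

    write("cart:42", ["book"], W=2)
    print(read("cart:42", R=2))         # R + W = 4 > N = 3 -> always sees the latest write
    print(read("cart:42", R=1))         # R + W = 3 = N -> may return stale data under failures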
In 2012, Brewer noted that the CAP Theorem was widely misunderstood and that CAP
"prohibits only a tiny part of the design space: perfect availability and consistency in the presence
of partitions, which are rare." The design space can be separated into two distinct parts: behavior in
the presence of partitions and behavior in the absence of partitions. What is AP under a partition does not
have to remain AP in the absence of a partition. Also, many engineers, based on their experience
with large-scale distributed systems, have come to realize the costs associated with weaker
consistency models. Moreover, the existence of stale data is not always acceptable: many business
use cases warrant maintaining both consistency and availability. As a result, a new breed of CP
databases has sprung up, such as Google Spanner and FoundationDB.
Despite its age, there are different interpretations of the CAP theorem, leading to a degree of
confusion among practitioners about CAP properties, and misunderstanding of the claims made
by NoSQL database vendors. This Knowledge Sharing article aims to demystify the CAP Theorem
(yet again!), help practitioners understand the mechanisms supported/implemented by NoSQL
databases with respect to CAP properties, and evaluate the associated tradeoffs.
BEST OF TROUBLESHOOTING
Troubleshooting Java on Linux
by Sumit Nigam
Seasoned programmers will agree with Murphy's Law, which states that "left to themselves, things
have a tendency to go from bad to worse."
Have you been spending a lot of late nights troubleshooting production problems? Has your
company been troubled with SLA violations because your team's best effort at troubleshooting is
simply not enough? What are others doing to perform quick root cause analysis of such problems?
Many useful Linux commands and tips are available which every developer must learn to
effectively diagnose a problem quickly. To keep this Knowledge Sharing article within scope, it
discusses Java applications deployed on Linux only.
A problem may manifest in many forms. It could be a suddenly unresponsive system or a total
application crash. It could be a system running in a degraded manner or a system which simply
errors out on every flow. What are the effective means to troubleshoot a Java application on Linux?
What lessons can be shared with developers to make them effective troubleshooters?
We will discuss at length commands and tips which can help developers appreciate the science
behind troubleshooting. A lot of these commands may already be known at the surface level by
developers, but that is not enough to use them effectively.
Knowing a command output is one thing. However, the ability to correlate the output to pinpoint
the problem in code is another. Armed with knowledge of these correlation mechanisms, we then
look at some interesting examples and use command outputs to indicate how the root cause
analysis could have been done immediately.
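One classic correlation of this kind is matching a CPU-hungry Linux thread, reported in decimal by top -H or ps -L, to its Java stack trace in a jstack dump, where the same thread id appears in hexadecimal as nid=0x.... The minimal Python sketch below automates that lookup; the dump file name and thread id are placeholders.

    # Map a hot Linux thread id (from `top -H -p <pid>` or `ps -L`) to the matching
    # Java thread in a jstack dump, where the id appears in hex as nid=0x....
    # Capture the dump first with `jstack <pid> > dump.txt`; file name is a placeholder.

    import re
    import sys

    def find_thread(dump_path: str, linux_tid: int) -> str:
        """Return the jstack header line whose nid matches the given Linux thread id."""
        with open(dump_path) as f:
            for line in f:
                m = re.search(r'nid=(0x[0-9a-fA-F]+)', line)
                if m and int(m.group(1), 16) == linux_tid:
                    return line.strip()
        return f"no thread with nid={hex(linux_tid)} found in {dump_path}"

    if __name__ == "__main__":
        # usage: python find_hot_thread.py dump.txt 21937
        print(find_thread(sys.argv[1], int(sys.argv[2])))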
No troubleshooting is adequate if it cannot indicate ways to address the problem and ways
to effectively monitor for such occurrences in the future. Another age-old adage fits well here:
prevention is better than cure. Hence, we conclude the article with discussions of
good remedial measures to address the problems and good application monitoring strategies to
help troubleshoot better and faster.
ELITE KNOWLEDGE SHARING ARTICLE
A Whole New Game: Leveraging Big Data in Baseball
by Bruce Yellin
The modern game of baseball dates back to the late 1840s; it is a sport in which statistics have
thrived from the Industrial Age to the Information Age. This Knowledge Sharing article examines
the influence Big Data has on America's favorite pastime. It provides new and insightful
information on how baseball players, coaches, managers, and management use it to gain a
competitive edge over their opponents. Big Data is also helping to keep athletes healthy, drive
bottom-line revenue, bring along fresh talent, automate aspects of game journalism, and greatly
enhance the fan experience. Big Data topics such as analytics, clustering, stratification, and
others are illustrated as they apply to:
pitchers like Justin Verlander and Mariano Rivera
hitters like Miguel Cabrera and Kevin Youkilis
base stealers like Rickey Henderson
coaches and managers
team ownership
the media
The reader is also brought into a blow-by-blow, yet fictional, game and shown how big data
helps teams craft management decisions, create strategic game plans, and make tactical player
adjustments. We also see ballplayers leverage these analytics and how umpires' calls threaten
carefully laid battle plans. Through this, the reader will come to understand and appreciate big
data charts, diagrams, and photos of how the game is really played, and just what the next few
years will bring. Along the way, you will learn about the role Hadoop is playing, see big data-
driven smart apps in action, and even get your hands dirty examining individual pitcher data
records to see how it is done.
Before you even arrive at the ballpark, a world of big data analytics has been hard at work behind
the scenes trying to make it a fun experience. For example, online ticket purchasing leveraged your
frequent-fan card points, gave you great previews of your seat, and made parking recommendations
based on either convenience or cost. When you get to the park, you may be greeted by name as your
electronic ticket is scanned, and your smart GPS app directs you to your seat, the shortest-line rest
rooms, or the nearest concession.
The team has also studied big data results from advanced military-grade Doppler radar and
PITCHf/x and FIELDf/x multi-camera video systems. Players and coaches can view, query, and
draw conclusions from the complete trajectory of the ball from the instant the pitcher releases it
until it crosses the plate or the batter hits it, the path and speed of runners on the diamond, and
the movement of the fielders at 30 frames per second for every game and every opponent.
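As a small taste of what "getting your hands dirty" with pitch-level records might look like, here is a hypothetical Python sketch using pandas. The file name and column names are invented stand-ins for a PITCHf/x-style export, not a real data feed.

    # Hypothetical exploration of per-pitch records. pitches.csv and its columns
    # (pitcher, pitch_type, start_speed, px, pz) are placeholder names only.

    import pandas as pd

    pitches = pd.read_csv("pitches.csv")

    # Usage count and average release speed per pitch type for one pitcher
    profile = (pitches[pitches["pitcher"] == "Justin Verlander"]
               .groupby("pitch_type")["start_speed"]
               .agg(["count", "mean"])
               .sort_values("mean", ascending=False))
    print(profile)

    # Fraction of pitches located low and away (coordinates are illustrative)
    low_away = pitches[(pitches["px"] > 0.4) & (pitches["pz"] < 2.0)]
    print(len(low_away) / len(pitches))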
From your seat, you see a giant scoreboard displaying the batter's picture, his batting average,
home runs, runs batted in, and much more. You watch the curve and dip of a pitch, the batter's
swing speed, and the fielder's actual versus optimized path to a ball, all from your smartphone
or tablet in near real-time. Vendors are stocking and re-stocking food and concessions to match
your favorite items with an eye to factors such as today's opponent, the score, and weather
conditions. Should your team fall behind during the game, don't fret; you'll probably get
promotional text messages for 2-for-1 hot dogs with a beer purchase, or a personalized special
half-off promotion for a souvenir team logo cap for the child your team knows you brought
to the game. Afterwards, the team may solicit your feedback on the venue as they do their best to
entice you to come again, soon.
Fans at home are also included in the big data-driven experience as the broadcasters mix their
usual perspectives and conversation with specific matchup details between pitcher and hitter,
base runners, infielders and outfielders, and the data the managers and their coaches have likely
mined to make each pitch and play pay off, all with exciting graphics initiated by each pitch.
The 2003 book Moneyball marked a statistical revolution in the game, yet accounted for just
10 terabytes of data in a year's worth of games, with most of that coming from detailed data of
every pitch and swing. A dozen years later, everything is tracked and almost the same amount of
real-time data is generated in a single game, with an astounding 1.5 petabytes of structured and
unstructured data amassed every year. No wonder big data has produced a second, even greater
baseball renaissance. Play ball!
ELITE KNOWLEDGE SHARING ARTICLE
Driving Tomorrow's Information Technology Platform
by Paul Brant
This Knowledge Sharing article will focus on what the next IT platform of the future will be. Even
though many believe the cloud is that platform, there is more to it.
An interconnection of converging forces is happening, building upon and transforming user
behavior while creating new business opportunities. These forces are Social, Mobile, Cloud, Big
Data, and the underlying fabric of security.
In the interconnection of these forces, information is the context for delivering enhanced social
and mobile experiences. Mobile devices are becoming the de facto platform for effective social
networking and new ways of doing our work every day. Social links people to their work and each
other in new and unexpected ways.
Cloud computing and security represent the superglue for all the interconnected forces. Without
cloud computing, social interactions would have no place to happen at scale, mobile access
would fail to connect to a wide variety of data and functions, and information would still
be stuck inside internal systems.
And what about all of this data, or what is termed "big data"? The trend today is that information is
stored everywhere. Social, mobile, cloud, and the security fabric make information accessible,
shareable, and consumable by anyone, anywhere, at any time.
To take advantage of the interconnection of forces, organizations must embrace these disruptions
and develop the appropriate skills and mind-sets.
BACKUP RECOVERY/ARCHIVE
Protecting Your Data Lake: Strategic or Business as Usual?
by Russell Easter
According to the latest IT spending survey from the Enterprise Storage Group, backup and data
protection are considered high priorities for most IT storage professionals. Furthermore, an IDC
study highlights annualized benefits of $1.8m and 55% productivity savings from a consolidated
backup and archiving strategy.
While backup remains Job 1 for many administrators, 39% still see data protection as a significant
challenge. Those surveyed highlighted that around 15% of recoveries were unsuccessful and
that only 8 of 10 completed within pre-arranged service levels. Many of these concerns have
their roots in the organic complexity inherent in current data protection deployments, as there
are typically between 5 and 9 different technologies such as backup, journaling, snapshots,
replication, and archiving in use at any one time. This diversity of approaches continues to grow
as organizations look to reduce costs and improve agility and recoverability.
A decade of disk- and data deduplication-based backup technologies has eased the backup
administrator's task considerably, although 66 percent of companies still use tape for some or all
of their backup data. Looking to the future, the maturity of cloud-based backup services, adoption
of Bring Your Own Device (BYOD), and Software as a Service (SaaS)-based delivery is putting more
pressure on data protection strategies as the risk of future data leakages and losses grows.
While post-recession IT budgets are starting to increase, the data protection strategist has to
fight for storage funds, as capacity requests soak up the majority of the budget while companies
try to keep pace with growth demands. Furthermore, while this is a top issue for storage
administrators, it is only seen as a subset of priority #5 (legacy modernization) or #6 (IT
management) initiatives at the executive level. It has been my experience that this has resulted in
backup refresh projects being postponed from year to year in favor of higher priorities.
Can storage professionals build and articulate a transformational business case for backup and
data protection projects that provides positive and measurable business outcomes?
Harnessing data protection capabilities to deliver business impact
The first tape-based backups were developed and implemented in the late 1960s and focused
on supporting two main use cases: disaster recovery, and restore of a subset of data in the
event of corruption and/or deletion. More recently, the additional use cases of governance and
compliance have been added into the data protection environment. However, backup and restore
capabilities are typically viewed as an operational inconvenience and insurance policy by many
organizations. Consequently, many data protection business cases focus on cost-saving benefits.
While important, storage professionals need to also highlight how the lack of effective data
protection and compliance is now becoming a constraint on the business.
For example, McKinsey highlights that the number one IT-enabled business trend for the
next decade will be the business uptake of social-based applications and technologies.
In their research, they highlight that today's knowledge workers spend 60% of their time
searching, reading, and collaborating and that social technologies could make these workers
25% more productive. Unfortunately, the second highest (29%) inhibitor for the adoption of
such technologies is reported as regulatory and compliance concerns by many IT and storage
professionals.
Articulating the Business Benefits from your Data Protection Investments
One of my clients previously had backup infrastructure refresh projects on their portfolio wish list
for over four years. Only minor incremental changes (software upgrades, additional tape drives,
libraries, and media servers) have been implemented during this period. Over the last 12 months,
I have helped them understand and articulate the real depth of their data protection challenges.
As Professor Joe Peppard of Cranfield School of Management highlights in his research, the
failure to realize benefits is primarily due to methods and tools that emphasize the supply
side of IT delivery.
In applying the Cranfield Cassandra Benefits Dependency Network (BDN) highlighted in Joe's
research to the context of data protection, I was able to help my client articulate the value that
can be attributed to a backup transformation project. In addition, by establishing an approach
that incorporated stabilization improvements to meet short-term growth and service level targets,
and longer-term optimizations that focused on reducing complexity and meeting compliance and
governance needs, the customer has been able to secure funding for the project for the first time.
This article demonstrates how a combination of EMC quick-script analysis, the creation of a
Data Protection BDN, and adoption of a "stabilize and optimize" approach can help get your
data lake protection transformation under way.
What Is RecoverPoint's Power?
by Roy Mikes
Data replication is an increasingly important topic these days. My Knowledge Sharing article will
help you understand the need for replication performed by EMC RecoverPoint.
It is certainly not a new product, but I recently finished a RecoverPoint configuration. After
installation, configuration, and a recovery test, I was amazed at the great potential of this
appliance/software. RecoverPoint makes it easier to protect applications that grow, with a wizard
that allows you to modify the application's protection configuration to add new storage volumes.
According to EMC, you'll never have to worry about data protection again. As far as I can judge, this
is almost 100% true.
RecoverPoint systems enable the reliable replication of data over any distance within the same
site (CDP), to another distant site (CRR), or both concurrently (CLR). Specifically, RecoverPoint
systems support replication of data that applications are writing over Fibre Channel to local SAN-
attached storage. The systems use existing Fibre Channel infrastructure to integrate seamlessly
with existing host applications and data storage subsystems. For remote replication, the
systems use existing IP connections to send the replicated data over a WAN, or use Fibre Channel
infrastructure to replicate data asynchronously or synchronously. The systems provide failover of
operations to a secondary site in the event of a disaster at the primary site.
Similar to other continuous data protection products, and unlike backup products,
RecoverPoint needs to obtain a copy of every write in order to track data changes. RecoverPoint
supports three methods of write splitting: host-based, fabric-based, and in the storage array. EMC
advertises RecoverPoint as heterogeneous due to its support of multi-vendor server, network, and
storage environments.
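The write-splitting idea can be illustrated with a deliberately simplified Python sketch: every write is applied to production storage and also appended to a timestamped journal, which is what makes recovery to an earlier point in time possible. This shows the concept only; it is not RecoverPoint's actual implementation.

    # Conceptual write splitter for continuous data protection: each write goes to
    # production and to a timestamped journal, so any point-in-time image can be
    # rebuilt by replaying the journal up to the chosen moment.

    import time

    production = {}    # block address -> data on the production volume
    journal = []       # ordered (timestamp, address, data) records

    def split_write(address: int, data: bytes) -> None:
        """Apply the write to production and also record it in the journal."""
        ts = time.perf_counter()
        production[address] = data
        journal.append((ts, address, data))

    def image_at(point_in_time: float) -> dict:
        """Replay journaled writes up to the chosen timestamp to rebuild that image."""
        image = {}
        for ts, address, data in journal:
            if ts > point_in_time:
                break
            image[address] = data
        return image

    split_write(0, b"boot record v1")
    checkpoint = time.perf_counter()
    split_write(0, b"boot record v2 (corrupted)")
    print(image_at(checkpoint)[0])   # rolls back to b"boot record v1"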
Each site requires installation of a cluster that holds a minimum of two RecoverPoint appliances
for redundancy. Each appliance is connected via Fibre Channel to the SAN, and must be zoned
together with both the server and the storage. Each appliance must also be connected to an
IP network for management. All replication takes place over standard IP for asynchronous
replication and Fibre Channel for synchronous replication.
Beyond integration with EMC products such as CLARiiON or VNX storage arrays, Replication
Manager, and EMC ControlCenter, RecoverPoint integrates with VMware vCenter and Microsoft
Hyper-V, which allows protection to be specified per virtual machine instead of per available volume.
Integration does not stop at vCenter and Hyper-V: RecoverPoint also integrates with Microsoft Shadow
Copy, Exchange, SQL Server, and Oracle Database Server, which allows it to temporarily stop
writes by the host in order to take consistent application-specific snapshots.
RecoverPoint's concurrent local and remote (CLR) data protection technology eliminates the need
for separate solutions, as it provides CDP and CRR of the same data. The solution now provides
more flexibility to replicate and protect data in many local and remote-site combinations with a smaller
storage footprint, whether for production applications or for test and development.
Despite the simple looks and the many features that sound good, you really have to know what
you are doing. The many possibilities can be confusing, and you should therefore be careful
about what you do.
NetWorker 8.1 Enterprise Backup Protection for Virtualized
Data Center
by Gururaj Kulkarni, Anupam Sharma, and Naveen Rao
EMC has empowered its backup solution to handle virtualization. Backup applications
are integrated with the virtualization solution, which enables them to back up large data center
environments with effective performance.
The latest integration of NetWorker with VMware using the Avamar Virtual Backup Appliance (VBA)
is a great leap in protecting virtualized data centers with greater flexibility and ease of use.
NetWorker version 8.1 launched a new way to protect virtual data centers using this integrated
solution. It also adds a scalable model and provides a greater competitive edge in the market.
The policy-driven approach of protecting VMs with a centralized user interface provides a value-
add solution to end users. The solution is implemented using Avamar technology to protect the
VMs using user-configurable policies. This helps backup administrators to define a policy
with less effort and assign an action to either protect or clone the virtual machine backup. This
integration also leverages EMC Data Domain as the target device.
Changed block tracking (CBT) helps to protect the VM by copying only changed blocks. The
solution has a centralized proxy which facilitates data transfer between VMware and the NetWorker
target device, such as Data Domain. It can co-exist with the existing legacy workflow and addresses a
few of the challenges from the earlier integration by reducing configuration complexity and leveraging
true CBT. The VBA leverages advanced transport modes such as hotadd for SAN-based
protection. It also provides NBD capability for protecting VMs over the network.
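A simplified sketch of the CBT principle follows: once a full image exists, later backups copy only the blocks reported as changed, and a restore replays the incrementals over the full. The changed-block set here is simulated rather than obtained from the vSphere APIs, which this sketch does not call.

    # Toy illustration of changed block tracking: only changed blocks are copied
    # after the initial full backup. Block size and the changed-block set are simulated.

    BLOCK_SIZE = 4096

    def full_backup(disk: bytes) -> dict:
        """Copy every block (the initial full backup)."""
        return {i: disk[i:i + BLOCK_SIZE] for i in range(0, len(disk), BLOCK_SIZE)}

    def incremental_backup(disk: bytes, changed_blocks: set) -> dict:
        """Copy only the blocks reported as changed by CBT."""
        return {i: disk[i:i + BLOCK_SIZE] for i in changed_blocks}

    def restore(full: dict, incrementals: list) -> bytes:
        """Apply incrementals on top of the full image, newest last."""
        image = dict(full)
        for inc in incrementals:
            image.update(inc)
        return b"".join(image[i] for i in sorted(image))

    disk = bytearray(b"A" * BLOCK_SIZE * 4)
    base = full_backup(bytes(disk))
    disk[BLOCK_SIZE:2 * BLOCK_SIZE] = b"B" * BLOCK_SIZE        # one block changes
    inc = incremental_backup(bytes(disk), {BLOCK_SIZE})
    assert restore(base, [inc]) == bytes(disk)                  # only 1 of 4 blocks was copied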
The solution easily scales to protect up to 10,000 VMs in a data center managed by single
or multiple virtual centers. Each VBA can easily scale to protect up to 500 VMs with one or two
external proxies. Each VBA has eight internal embedded proxies, and users can also configure an
external proxy managed by the VBA, which in turn has eight proxies built in.
This Knowledge Sharing article provides a detailed overview of the performance and scalability of
this solution and also lists some best practices for adopting it. The article also
provides end users with troubleshooting tips to address challenges that may come up after
implementing this solution.
NetWorker 8.1 Integration with Boost over Fibre Channel and
Virtual Synthetics Feature of Data Domain
by Gururaj Kulkarni and Soumya Gupta
This Knowledge Sharing article discusses the performance benefit gained from two important
new features that were introduced in NetWorker 8.1 with Data Domain Integration.
1. Data Domain Boost Over Fibre Channel
Customers using NetWorker with Virtual Tape Libraries (including EDL, Data Domain as VTL, and
others), autochangers, or tape devices as their backup solution cannot transition to NetWorker
backup to disk with Data Domain, since they have a dedicated Fibre Channel environment and
Data Domain devices support data transfer only over TCP/IP.
This article describes the new feature introduced in NW 8.1 where NetWorker clients and storage
nodes support Fibre Channel connectivity (for backup and recovery operations) to Data Domain
devices by leveraging the Fibre Channel capability available with the DD Boost 2.6 library.
This support not only optimizes the customer's existing investment in their Fibre Channel
infrastructure but also offers client-side deduplication and support of the Fibre Channel protocol
in a backup-to-disk workflow.
2. Virtual Synthetics
In the current Synthetic Full feature, the consolidation of a full and one or more incrementals is
done on the NetWorker storage node. If the backup data resides on Data Domain, it has to be pulled
from it, processed on the NetWorker storage node, and the consolidated backup sent
back to Data Domain. This requires additional network bandwidth and unnecessary overhead
on the NetWorker storage node, so both the time taken to synthesize the saveset and the network
bandwidth used were higher.
This article describes the new feature introduced in NW 8.1 where Virtual Synthetic Full backups
are an out-of-the-box integration with NetWorker, making it self-aware. Therefore, if a customer
is using a Data Domain system as their backup target, NetWorker will use Virtual Synthetic Full
backups as the backup workflow by default when a synthetic full backup is scheduled, thus
optimizing incremental backups for file systems.
Virtual synthetics reduce the processing overhead associated with traditional synthetic full
backups by using metadata on the Data Domain system to synthesize a full backup without
moving data across the network.
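Conceptually, a virtual synthetic full is new metadata rather than new data: the target composes a fresh "recipe" that points at segments it already stores. The Python sketch below illustrates that principle with invented segment identifiers; it is not Data Domain's internal format.

    # Toy synthesis of a full backup from metadata only: the result references
    # segments already stored on the target, so no backup data is re-read or re-sent.

    def synthesize_full(previous_full: dict, incrementals: list) -> dict:
        """previous_full / incrementals map file paths to stored-segment ids.
        Later incrementals win; the result references existing segments only."""
        synthetic = dict(previous_full)
        for inc in incrementals:
            for path, segment_id in inc.items():
                if segment_id is None:       # convention here: None marks a deleted file
                    synthetic.pop(path, None)
                else:
                    synthetic[path] = segment_id
        return synthetic

    full_sunday = {"/etc/hosts": "seg-001", "/var/app/db.dat": "seg-002"}
    inc_monday = {"/var/app/db.dat": "seg-007"}                        # file changed
    inc_tuesday = {"/etc/hosts": None, "/home/u/new.txt": "seg-011"}   # delete + create

    print(synthesize_full(full_sunday, [inc_monday, inc_tuesday]))
    # {'/var/app/db.dat': 'seg-007', '/home/u/new.txt': 'seg-011'}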
NetWorker 8.1: The Next Big Thing
by Anuj Sharma
NetWorker 8 was launched with major underlying architectural changes which made the product
more stable, flexible, efficient, robust, and interoperable with cloud infrastructures.
Enhancements introduced in NetWorker 8.0 made it a first choice for enterprises. Continuing
the success of NetWorker 8.0, EMC recently introduced 8.1, which is being seen as the next big
thing in the backup industry.
In this Knowledge Sharing article, I will discuss the new features of NetWorker 8.1, best practices
to implement the features, and how organizations can take maximum advantage of NetWorker
8.1.
Topics discussed in detail in this article include:
NetWorker 8.1 Upgrade Considerations
Improved Data Domain Integration
Improved Snapshot Management
Windows 2012 Backup Enhancements
Block-based Backups
DDBoost over Fibre Channel (DFC)
VHD Backups
NMM 3.0 Enhancements
Virtual Synthetic Full Backups
Configure NetWorker NMDA Backup for Oracle RAC Using
Oracle SCAN
by (Arthur) Zhuang-Song Jiang
Single Client Access Name (SCAN) is a feature used in Oracle Real Application Clusters (RAC)
environments that provides a single name for clients to access any Oracle Database running
in a cluster. You can think of SCAN as a cluster alias for databases in the cluster. The benefit is
that the client's connect information does not need to change if you add or remove nodes or
databases in the cluster.
SCAN was first introduced with RAC 11g Release 2. This feature can be utilized when we configure
NetWorker NMDA backup to back up multi-node Oracle RAC active-active databases, and it gives
us some great benefits. For example, having a single name to access the cluster to connect to
databases in this cluster enables NetWorker scheduled backup jobs to access any database
running in the cluster in a load-balanced mode managed automatically by Oracle RAC. This is
done independently of the number of databases or servers running in the cluster and regardless
of which server(s) in the cluster the requested database is actually active on.
It will store the backup data under the index of a NetWorker client named after the SCAN name
of the Oracle RAC. This greatly facilitates recovery because we do not need to remember which node
backed up what data.
So far, the EMC NetWorker Module for Databases and Applications Release 1.2 or Release 1.5
Administration Guide has not mentioned the Oracle SCAN feature. In my Knowledge Sharing
article, I will use an actual working example to showcase how to configure NetWorker NMDA 1.2 to
back up a multi-node Oracle RAC using the Oracle SCAN feature.
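To show why SCAN simplifies client-side configuration, here is a hedged Python sketch using the cx_Oracle driver: the connect descriptor names only the cluster-wide SCAN address and a service, never an individual node. The host, port, service, and credentials are placeholders, and this is not NMDA's own configuration syntax.

    # Connecting through a SCAN name: the client configuration never references an
    # individual RAC node, so adding or removing nodes needs no client-side change.
    # All names and credentials below are hypothetical placeholders.

    import cx_Oracle

    # One alias for the whole cluster; the SCAN listeners hand the session to a
    # least-loaded node automatically.
    dsn = cx_Oracle.makedsn("rac-scan.example.com", 1521, service_name="PRODDB")

    with cx_Oracle.connect(user="backup_user", password="secret", dsn=dsn) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT instance_name, host_name FROM v$instance")
            print(cur.fetchone())   # shows which node SCAN actually routed us to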
Keep the needed. Archive the rest.
by Kobi Shamama
Market research firm IDC projects a 61.7% compound annual growth rate (CAGR) for
unstructured data in traditional data centers from 2008 to 2012 vs. a CAGR of 21.8% for
transactional data.
IDC also projects that the digital universe will reach 40 zettabytes (ZB) by 2020, an amount that
exceeds previous forecasts by 5 ZB, resulting in 50-fold growth from the beginning of 2010.
"The backup didn't finish for two days, the DR site is full, I can't replicate anymore, please
delete all files that are not needed."
Unstructured data is getting bigger by the day. With the simplicity of creating files and folders and
the proliferation of smartphones and tablets, IT managers are challenged with keeping up with the
vast amount of new data being created.
This massive amount of data brings new challenges such as:
How to finish backing up all the data
How to manage all the data
And, most important, how to know which files are most relevant
The solution every organization should consider: Keep the needed. Archive the rest.
EMC is a pioneer in archiving solutions using Hierarchical Storage Management (HSM) technology
for policy-based tiering, an efficient method of storing data that keeps the relevant files on
more expensive, faster tiers and archives the rest on a more cost-effective storage platform.
This Knowledge Sharing article will demonstrate the added value of an HSM solution for archiving
deployments based on EMC NAS machines. I will discuss use cases and suggested solutions on
how to build an HSM platform to maximize value for money in the NAS environment (based
on real customer experience), minimize backup time, and reduce the number of physical storage
arrays to manage in the organization.
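As a flavor of what a "keep the needed, archive the rest" policy scan might look like, the Python sketch below walks a file system and flags files whose last access time is older than a configurable threshold as archive candidates. The 180-day policy and path are hypothetical, and the script does not drive any EMC HSM product.

    # Policy-based archiving scan: files untouched for longer than the policy window
    # are reported as candidates for a cheaper archive tier. Threshold and path are
    # illustrative only.

    import os
    import time

    ARCHIVE_AFTER_DAYS = 180

    def archive_candidates(root: str, max_age_days: int = ARCHIVE_AFTER_DAYS):
        """Yield (path, size_bytes) for files whose last access is older than the policy."""
        cutoff = time.time() - max_age_days * 86400
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue                      # file vanished or is unreadable
                if st.st_atime < cutoff:
                    yield path, st.st_size

    if __name__ == "__main__":
        total = 0
        for path, size in archive_candidates("/mnt/nas/projects"):
            total += size
            print(f"archive candidate: {path} ({size} bytes)")
        print(f"reclaimable from the primary tier: {total / 1e9:.1f} GB")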
This article is aimed toward CTOs and EMC Proven Professional IT managers who would like to
take their NAS environment to the next level of efficiency by reducing the cost and management of
the NAS infrastructure.
How Analytics Can Help Backup Administrators
by Balaji Panchanathan
Pretty much all backup products do two things well: data collection and data reporting. However,
they do not do the third part: data analysis. This Knowledge Sharing article will focus on how
analytics can be used on top of backup products (Avamar and NetWorker). I will explain how
to perform data analytics for backup products and the benefits that can be derived from that
analysis.
The article will focus broadly on:
1. How simple functions like min, max, and variance can be used to get great insights into backup
performance
2. What type of predictive analytics can be done based on the data available in the Avamar
server
3. How to do predictive analytics
This article will help backup administrators use their dormant data more effectively through the
use of simple tools. Focus will be on what type of data to use, tools to fetch meaningful
information from that data, and how they can be used daily to make the backup administrator's
life a lot easier. Although this article will feature Avamar and NetWorker, the frameworks and
tools described will be helpful to all types of administrators.
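To illustrate the "simple functions" point above, here is a minimal Python sketch that summarizes per-client backup durations with min, max, mean, and variance and flags runs that drift far from a client's norm. The CSV file and its columns are hypothetical exports, not an actual Avamar or NetWorker report format.

    # Summarize per-client backup durations and flag anomalies with basic statistics.
    # backup_jobs.csv with columns "client" and "duration_min" is a hypothetical export.

    import csv
    import statistics
    from collections import defaultdict

    durations = defaultdict(list)
    with open("backup_jobs.csv") as f:
        for row in csv.DictReader(f):
            durations[row["client"]].append(float(row["duration_min"]))

    for client, runs in durations.items():
        mean = statistics.mean(runs)
        stdev = statistics.pstdev(runs)
        print(f"{client}: min={min(runs):.0f} max={max(runs):.0f} "
              f"mean={mean:.0f} variance={statistics.pvariance(runs):.0f}")
        # Flag the most recent run (rows assumed chronological) if it sits
        # more than 2 standard deviations away from this client's mean.
        if stdev and abs(runs[-1] - mean) > 2 * stdev:
            print(f"  -> {client}: latest run ({runs[-1]:.0f} min) is anomalous")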
Avamar Integration with VMware
by Nadine Francis
This Knowledge Sharing article covers Avamar integration with VMware, Avamar integration with
Data Domain in virtual environments, and how to back up a virtual machine (Image-Level Backup
& Guest-Level Backup).
Topics will include:
A brief introduction to virtualization and its benefits
An explanation of IT transformation and its steps to gain a longer-term vision of IT
Avamar integration with Data Domain in virtual environments.
How to back up a virtual machine (Image-Level Backup & Guest-Level Backup).
A quick licensing guide and part number overview
Use cases
This article will be helpful for selling Avamar Virtual Edition (AVE); thus, it can be useful for Sales,
Pre-Sales, and Marketing teams.
DDboost Implementation with NetWorker in Complex
Networking Scenario
by Crescenzo Oliviero
Do you have different data zones managed by the same NetWorker Management Console?
Do you have different backup networks that cannot communicate with each other?
Do you have a single big Data Domain system that is used by different data zones?
Would you like to implement DD Boost with the direct client feature?
NetWorker provides a best-in-class enterprise backup solution. Data Domain's reputation as
a premier data deduplication system is well known. The combination of NetWorker and Data
Domain with Boost provides a great solution for performance, simplicity, manageability, and
availability.
A method to configure Data Domain Boost devices with NetWorker, where the customer network is
not flat, is detailed in my Knowledge Sharing article. I document a procedure where only the NetWorker
server is used to configure devices, without the requirement that the NetWorker Management
Console must see the Data Domain devices on the network.
NetWorker Delegation Model for ROBO Environment
by Puneet Goyal
The Delegation Model is a special type of model in which customers route all their data recovery
through their own local remote office/branch office (ROBO) IT staff instead of backup
administrators. Delegation model is a general term for delegating some part of the administrative
work to another group. However, customers often wish to customize the model according to their
needs.
With the use of some basic functionality of NetWorker, one can provide a ROBO backup solution
using the existing backup solution. The wide adaptability and acceptability of the solution for
operating systems, databases, applications, backup devices, and more makes it responsive to
the customer's needs.
This Knowledge Sharing article will take you on a journey to implement Delegation Model using
NetWorker as the main backup product. We will discuss the following topics to create a solution
for the ROBO environment.
1. Understanding the backup architecture and backup environment of the customer.
2. Collecting product features which can be used as a solution.
3. Drawing a prototype.
4. Implementing the model.
5. Delegating the part.
6. Performing the tuning.
7. Gaining the claps.
Benefits of this article include:
Helps customers include their branch and remote offices with the data center to provide
complete protection for their environment.
Helps Professional Services write a tailor-made solution for customer needs.
Managing the solution in the same backup console as the data center servers.
The document can be easily integrated into procedure generator tools and made available
via the web.
Solution architects will find a new way to use this added feature of NetWorker for their
customers and will remain competitive with other vendors.
The goal of this article is to provide a foundational understanding of Delegation Model, and assist
in developing a blueprint for the tailored solution which can be extracted by the NetWorker and
some backup device.
Backup Optimization DD Boost - NetWorker inside
by Mohamed Sohail
Do you need to speed up your backups by up to 50%? Do you need to reduce your bandwidth
usage by up to 99%? Do you want to reduce the backup server workload by up to 40%? Do you
want to increase your backup success rate?
Assuming that you said yes, the answer is DD Boost. It is a solution that will allow you to
finish backups within backup windows with breathing room for data growth. With performance
up to 31 TB/hr, it is three times faster than any other solution, enabling you to use your existing
network infrastructure more efficiently.
In this Knowledge Sharing article, we illustrate the benefits of the new DD Boost feature over
Fibre Channel, its integration with NetWorker, and how it will enhance your backup system
performance.
NetWorker is a cornerstone element in the backup solutions of large infrastructure customers.
This article targets backup administrators, support engineers, and stakeholders interested in
the importance of DD Boost over Fibre Channel and how to use it most effectively: speeding
up backups, avoiding congestion that slows down large critical backups through bandwidth
utilization reduction, and workload minimization on backup hosts (NetWorker Server and
Storage nodes), which should normally enhance the backup success rate. I'll also illustrate how to
better leverage the use of the current infrastructure without investing in a tech refresh for many
components in the backup environment.
Managing Data Protection for a Better Night's Sleep
by Gareth Eagar
In addition to well-known massive data growth, the infrastructure used to store and access
that data is also rapidly changing. We are in the midst of a trend of storage moving to the cloud
(whether public, private, or hybrid), and even big storage hardware vendors are embracing the
move to a more software-defined world (look no further than EMC's ViPR storage solution).
Traditional ways of protecting all this data that is being generated are also changing. Backup of
all bits and bytes to tape is being replaced by backup of unique bits and bytes to disk through
the use of innovative deduplication technology (such as Data Domain). To increase the speed of
access to data in the event of the primary source becoming unavailable, more and more storage
arrays are being replicated to remote sites so that application availability and resiliency in a 24 x
7 x 365 on-demand world are increased.
The good news is that the tools available to monitor and manage data protection have been
improving, and consolidation and integration of tools are leading the industry closer to the long-
dreamed-of single pane of glass for monitoring IT infrastructure.
In this Knowledge Sharing article we'll take a deeper look at data protection trends and best
practices and examine how current tools, such as EMC Data Protection Advisor, are helping IT
staff sleep easier. Some of the topics that we'll look at include:
The importance of monitoring data protection
How to monitor and track data protection results
Getting to the root of the problem
Being proactive: real-time alerting on the things that matter now
Paying for it all: chargeback models for data protection services
BIG DATA/PREDICTIVE ANALYTICS
How He and She Use CMC
by B. Nicole Parfnovics
Computer-Mediated Communications (e.g. text-messages, e-mails, etc.) has become an
unavoidable reality as a form of correspondence around the world. Whether you're on a WebEx
conference via your iPhone listening to a Brown Bag presentation or just sending a quick message
to let your significant other know you're working late at the office again, CMC has become the
norm for getting a message from point A to point B.
The past few years have been a roller coaster ride for CMC media and devices. Consider Facebook.
While nothing but an inkling of an idea just under 10 years ago, the social networking enterprise
giant has fast become a topic of interest to a plethora of people. Market research companies,
academic researchers, and advertising firms all have good reason to take note of social media,
as Facebook has recently become one of the hottest publicly traded companies on the market,
making its debut on the prestigious Fortune 500 list this year.
Mobile devices are also a large factor of the CMC craze. Apple certainly reigns supreme. The lines
outside Apple stores form year after year for the newest, latest, greatest iDevices. iPads, iPad-
Minis, and all of the tablets that quickly copied the Apple devices are now the latest fad and what
you see every other business traveler carrying on any given airplane, at least on par with laptops, if
not outnumbering them.
My Knowledge Sharing article will examine the predictive value of age and gender as they
relate to various forms of computer-mediated communications. Anonymous online surveys
were administered to students registered in three separate Internet-based courses to assess
participants' average number of Facebook friends, average amount of time spent on Facebook,
and average number of daily text messages sent.
Social media usage is one of today's hottest #trends and the research findings contained within
this article provide the reader with valuable comparative insight about social media, usage
habits, gender, and age of its users. Multiple regression analysis tests were run on various social
media communications predictor variables to examine whether age and gender were significant
predictors of any of those social media variables. Every test that was run on gender and age in
this study indicated significant relationships between the various usages of CMC and social
media.
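To give a flavour of the kind of test described above (the article's actual survey data and model are not reproduced here), the following minimal sketch runs an ordinary least squares regression on synthetic data with hypothetical variable names:

    # Illustrative only: synthetic survey data and hypothetical variable names.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 300
    age = rng.integers(18, 60, size=n)                    # respondent age
    gender = rng.integers(0, 2, size=n)                   # coded 0/1
    # Synthetic outcome loosely tied to the predictors: daily text messages sent.
    daily_texts = 80 - 0.9 * age + 12 * gender + rng.normal(0, 15, size=n)

    X = sm.add_constant(np.column_stack([age, gender]))   # intercept + predictors
    model = sm.OLS(daily_texts, X).fit()                  # multiple regression fit

    print(model.params)    # [intercept, age, gender] coefficients
    print(model.pvalues)   # small p-values indicate significant predictors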
The study gave only a small overview of computer-mediated communications, gender, and age
(how he and she use CMC) yet produced multiple significant results and raises many thought-
provoking questions. Additional theorizing and empirical investigation on intentional social
actions with regard to CMC should be conducted; however, due to the relatively short lifespan of
social networks, a longitudinal study design will be necessary for future cause-effect inferences.
Fortunately, EMC Proven Professional Data Science courses provide associates with specialized
skills to conduct and direct BI projects such as these.
Big Data: Deciphering the Data Center
by Andrew Bartley and Rich Elkins
Data center budgets are shrinking while ever-expanding services are expected by end users.
Virtualization and cloud services have already improved data center efficiencies, but business
units still demand more. Meanwhile, any proposals for investments in new services are met
with increasing scrutiny; simply purchasing more resources for your virtualized environment is
no longer an option. Using traditional forecasting and budgeting techniques can make it nearly
impossible to correctly determine where to invest your limited resources.
This Knowledge Sharing article will explain how using predictive analytics to improve utilization
of existing data centers will be particularly useful to any IT professional that has experienced
shrinking budgets and increased demands. It will provide generalized instructions along with
specific examples of how to analyze historical resource demands and predict future demand.
These tools will enable IT professionals to improve utilization of existing resources.
IT and data center management will need to monitor usage metrics as they transition from legacy
IT models to newer, more dynamic models. These new dynamic models must support on-demand
service and automation; on-demand services require that the right resources be available at
the right time for end user consumption. In order to correctly forecast and predict data center
resource demand, IT professionals must harness the power of Big Data Analytics to intelligently
design their services.
New dynamic IT models support this trend with architectures for dynamic, elastic
environments that maximize efficiency by allowing the same hardware to serve more use
cases. These innovations are not enough on their own to reduce operating and capital costs; a
deeper understanding of the environment is required to properly manage it and maintain service
levels.
Patterns of service consumption can be identified by organizing and adding intelligence to vast
amounts of raw data. This requires two inputs: raw data and long-term business plans. We
already collect many different forms of data in our data centers: service requests, equipment
management data, equipment logs, resource utilization, and so on. Analyzing and summarizing
this raw data can show trends in consumption. By mapping these consumption trends to long-
term business plans, we can more accurately predict future data center resource demands based
on similar historical records.
While the specific applications of Big Data analytics will vary drastically based on your particular
business, the general concepts are all similar. A large company developing a new software
product will require certain resources for each of their developers but not every developer will
be working on the new product at the same time. Analyzing historical records can help identify
resource utilization and resource demand magnitude over time; this makes it possible to ensure that
the right resources are available at the right times.
It is critical to data center management that metrics be measured and analyzed to understand
how resources are consumed, why they are consumed, and when demand will rise and fall. By
understanding how specific resources are being consumed, such as compute, memory, storage,
and network utilization, data center management can provide an improved and more reliable
service. Usage statistics can be examined to determine and mitigate spikes in demand, enforce
different QoS levels, determine what program the resources are being consumed for, and where
to make future resource investments. The collected metrics coupled with the ability to analyze
and plan around them enables elasticity, continuous delivery, and an improved customer
experience.
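As a trivial illustration of the general idea (not the authors' method), the sketch below fits a linear trend to synthetic historical utilization samples and projects when demand would cross a capacity threshold:

    # Illustrative only: synthetic weekly CPU utilization samples.
    import numpy as np

    weeks = np.arange(52)                                  # one year of history
    cpu_util = 40 + 0.5 * weeks + np.random.default_rng(1).normal(0, 3, 52)

    slope, intercept = np.polyfit(weeks, cpu_util, 1)      # least-squares trend

    horizon = np.arange(52, 104)                           # the next year
    forecast = intercept + slope * horizon

    threshold = 85.0                                       # capacity limit (percent)
    breach = horizon[forecast > threshold]
    if breach.size:
        print("Projected to exceed %.0f%% around week %d" % (threshold, breach[0]))
    else:
        print("No breach projected within the horizon")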
Intelligent QoS for EMC Storage Through Big Data Analytics
by Yangbo Jiang
In the storage industry, Quality of Service (QoS) is the ability to provide different priorities to
different applications and LUNs, or to guarantee a certain level of performance to an application. It's
becoming more critical for preventing workloads or tenants from adversely affecting one another and
for meeting service-level objectives for storage performance. Most storage vendors have
implemented this feature directly in the product, such as Navisphere Quality of Service Manager
(NQM) in the USD product. However, present QoS implementations are inflexible and mechanized. Since
we are in the Big Data era, why not take full advantage of it to make QoS intelligent?
Storage environments absorb large amounts of data every second, much of which can
contain very useful information hidden inside. By using Big Data analytic techniques, storage
administrators could predict customer behavior patterns and automatically adjust QoS values for
applications proactively in real-time.
In this Knowledge Sharing article, we describe how to build a QoS analysis framework by
analyzing massive data retrieved from existing data collection techniques in storage and
interacting with the existing QoS feature in storage. For better analysis of such massive data, we
adopt popular Big Data analysis methods, for instance, using a Naïve Bayesian Classifier based
on probability theory on these discrete data to determine user behavior patterns and predict
users' future behavior, then interact with storage to set QoS values proactively.
This article will be of interest to those who would like to enhance QoS capabilities in their storage
product. The analysis method in this article could be easily implemented to build
intelligent QoS into the storage environment.
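The article's framework is not reproduced here, but the core step, classifying a workload from discretized telemetry with a Naïve Bayesian Classifier and mapping the result to a QoS setting, can be sketched roughly as follows (feature buckets, labels, and the QoS mapping are all hypothetical):

    # Illustrative only: hypothetical feature buckets and QoS mapping.
    from sklearn.naive_bayes import CategoricalNB

    # Each row: [io_size_bucket, read_ratio_bucket, burstiness_bucket]
    X_train = [[0, 2, 1], [0, 2, 2], [2, 0, 0], [2, 1, 0], [1, 1, 1], [1, 2, 1]]
    y_train = ["oltp", "oltp", "backup", "backup", "analytics", "oltp"]

    clf = CategoricalNB().fit(X_train, y_train)             # probability-based classifier

    qos_priority = {"oltp": "high", "analytics": "medium", "backup": "low"}

    observed = [[0, 2, 1]]                                   # newly observed LUN behavior
    predicted = clf.predict(observed)[0]
    print(predicted, "->", qos_priority[predicted])          # set the QoS value proactively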
Does Big Data Mean Big Storage?
by Mikhail Gloukhovtsev
There are various definitions for Big Data. Sometimes Big Data is defined as the data that
cannot be managed using existing technologies: applications and infrastructure. Therefore,
new applications and new infrastructure for those new applications as well as new processes
and procedures are required to use Big Data. The storage infrastructure for Big Data applications
should be capable of managing large data sets and providing required performance.
Development of new storage solutions should address the following characteristics of Big Data:
Huge volume of data (for instance, billions of rows and millions of columns)
Speed or velocity of new data creation
The variety of data created (structured, semi-structured, unstructured)
Complexity of data types and structures, with an increasing volume of unstructured data
(80-90% of data in existence is unstructured)
As Big Data requires special storage solutions, what is Big Storage? Should it be principally new
storage architectures? Or can we modify existing storage platforms to use for Big Data? How can
Big Data Analytics with its shared-nothing storage architecture be integrated with traditional
enterprise BI data warehouse environments using shared storage? As the value of data changes
over time, how can Big Data value be aligned with the cost of data protection and do Information
Lifecycle Management (ILM) processes, including active and passive archiving, apply to Big
Data?
In this Knowledge Sharing article, I will try to answer these questions by providing a balanced
review of pros and cons of the possible storage architecture solutions and related storage
technologies for Big Data Analytics.
The IT Department Becomes a Data Broker
by Nick Bakker and Paul Wu
Big Data is here to stay. IT infrastructure is becoming a commodity. Data is a valuable asset and
the manner in which data is used becomes increasingly important to the success of the business.
Consequently, the traditional role of the IT department evolves to that of a data broker. As a data
broker, it is up to the IT department to also ensure data availability and reliability to support the
decision making process in the business.
To predict the future, one should understand the history as well. In our Knowledge Sharing
article, we will describe the development of IT departments. Their background and the path to
the future, as we expect it will happen, will be presented. This will explain why we predict the
scenario of an IT department as data broker. We will define the role of data broker in terms of
activities and responsibilities, along with the data broker's relations with its environment.
IT (Storage) professionals should be prepared for their new role, which will require different skills
and competencies. Understanding this scenario is required to be able to anticipate it in time. We
will explain the impact of this scenario on the most common IT management roles.
Data governance affects the whole organization. With the increase of available information, the
thirst for real-time knowledge becomes more apparent. IT departments have to act accordingly
to deliver the right services. The data broker will be held responsible for the quality of data,
whether it is created inside or outside their own organization.
The reader will learn about the new role of the IT department and develop an understanding of
the technological and organizational developments which will change the position and added
value of the IT professional. We will highlight the challenges that the IT department encounters
in its transition to data broker and how these challenges can be overcome from an IT service
management perspective.
We will use the ITIL framework to describe the roles in the IT department 2.0. We present the
argument that the roles according to ITIL remain the same, but we perceive a shift in emphasis
towards information and data. An overview is provided on what we believe is the impact of these
changes to the main ITIL roles and why it is important that the role of the data broker becomes
apparent for the IT department.
The power of Big Data is to retrieve information from different data collections. Data is stored in
almost all departments. Each department creates its own data silo. To maximize the value of
the decentralized data, an overall governance structure is required. We will address that
transparency and the willingness to share data are important to creating value from the available
bytes.
New Opportunities @ the Crossroads of M2M & Big Data
by Prakash Hiremath
M2M is commonly known as a technology that enables machine to machine communication. A
machine could be any device that can be connected to the network and can securely exchange
information with other devices over wired or wireless networks.
M2M has evolved over the years and is now seen as the next technology disruption by experts. There
are primarily four core steps involved in M2M, as follows.
1. Collection of data at source: In this step, data is collected from the source. The source could be
any device that is capturing machine data.
2. Transferring data securely over wired/wireless networks: In this step, data is encrypted at the
source and sent over wired or wireless networks. It could be WAN, LAN, Wi-Fi, or the
public internet. Data is received at the destination, most likely a centralized data center,
where it is decrypted and stored in a data store.
3. Data assessment: Decrypted data is then analyzed by experts to draw conclusions.
This data is monitored continuously.
4. Take action: You can then take appropriate action based on the conclusions. This means
you can even control the devices at the source from centralized locations and issue healing
commands.
Then there is Big Data, which has already created buzz across industries. What is Big Data?
Big Data is characterized by 3 Vs, that is, Volume, Velocity, and Variety of data. Big Data combines
structured, semi-structured, and unstructured data that is generally beyond the capacity of
traditional data warehousing software, processes it much faster, and creates valuable insights
for business. Big Data primarily uses Hadoop to process the large data.
M2M and Big Data have brought in technology disruption, much like cloud computing, mobility, and
social. A world of new opportunities is created for enterprises when two disruptive technologies
meet at the crossroads. M2M can help communicate with devices and gather large amounts of
data, both internal and external. Internal data is commonly known as the enterprise's own data and
external data is something external to the enterprise or beyond its control.
Using Big Data, enterprises can analyze the various collected data sets and create insights
that can create huge opportunities for the business.
This Knowledge Sharing article explores the new opportunities created by merging M2M and
Big Data technology. The reader will also be introduced to M2M and Big Data technologies. This
article will also cover a couple of use cases outlining the opportunities created at the crossroads
of M2M and Big Data.
Big Data Analytics Modeled Customer Feedback Portal
by Shoukathali Chandrakandi, Shelesh Chopra, and Gururaj Kulkarni
There are more than 40k+ BRS customers using various sets of products from BRS. Customers
are great assets for continuous product enhancement and their feedback is a valuable contributor
to total customer experience (TCE). There are various mechanisms we use to interact with these
customers for their feedback about product usage, enhancement plans, and reviews of our
product portfolios. There are multiple interfaces in the product business unit that interact with
these customers, among them pre-sales and sales, product marketing, executive management,
product management, engineering, and support. Most of the data gathered or reported from
customers is either spread across various portals, buried, or not being effectively used for
improving the product.
In this Knowledge Sharing article we will discuss how to leverage the EMC Greenplum analytical
platform to gather and interpret data which can mutually benefit customers and various BRS
product organization teams. This enhanced interaction model will help BRS accelerate sales
growth by timely positioning of its product portfolios for data center expansion at customer sites.
Predictive and Self-healing Data Center Management
by Gnanasoundari Soundarajan, Hemanth Patil, and Radhakrishna Vijayan
Existing data center monitoring and reporting tools can only raise an alert after the problem has
occurred in the data center. Root Cause Analysis is done only after the issue is raised. This leads to
reactive rather than proactive troubleshooting or customer service.
Examples of typical problems include:
Sudden IT Service outage
Slowing device performance
Disk space shortage
Reactive customer service
Extensive security breaches
Maintaining mission-critical application uptime
There are multiple sources from which we can get Network/Storage/Virtualized IT resource
utilization event details. This data can be used to generate reports on observed events which give
details of problems that occur in a production setup.
Extending this data usage to the next level can enable predictions to be made based on the
observed events and statistical parameters, analyzing the impact on the data center, and alerting
users before the user/service is impacted. With this new model, the problem can be identified
in advance by applying statistical analysis to network and storage parameters. The prediction
will be used to take corrective action before the service is impacted. It's similar to taking
precautionary action before a storm hits a city.
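A trivial sketch of this kind of prediction, using a hypothetical disk-usage trend to estimate how soon a filesystem will fill so that action can be taken before the service is impacted:

    # Illustrative only: synthetic daily usage samples and a hypothetical capacity.
    import numpy as np

    days = np.arange(30)
    used_gb = 400 + 6.5 * days + np.random.default_rng(7).normal(0, 4, 30)
    capacity_gb = 1000

    growth_per_day, _ = np.polyfit(days, used_gb, 1)        # recent growth trend
    days_to_full = (capacity_gb - used_gb[-1]) / max(growth_per_day, 1e-9)

    print("Filesystem predicted to fill in about %.0f days" % days_to_full)
    if days_to_full < 14:
        # In a real deployment this would raise a proactive alert or trigger expansion.
        print("ALERT: corrective action needed before the service is impacted")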
This Knowledge Sharing article will describe a solution to achieve high availability in the data
center through a predictive and self-healing technique. A few benefits of adopting this approach
include:
The new model will help avoid critical unexpected issues rather than performing root cause
analysis after the issue arises.
Prediction of problems and provision of proactive solutions rather than reactive fixes.
Quick corrective actions can be taken based on the prediction.
Historical trending data is utilized to predict patterns.
Reduced downtime cost per year and data center maintenance cost.
Increased reliability of data center components due to early prediction of issues and
correction.
Networks, Storage and Virtual infrastructure inside a data center will be better prepared to
handle a flood of events, latency issues, and breakdowns.
Better customer satisfaction drives increased product revenues.
Improved business-critical application availability.
Conceptual Data Mining Framework for Mortgage
Default Propensity
by Wei Lin and Pedro Desouza
The scope of this Knowledge Sharing article focuses on the Dodd-Frank Act Qualified Residential
Mortgages (QRM) exemption rules applied to ABSs that are collateralized exclusively by
residential mortgages. The approach intends to serve as a stepping stone in calculating and
evaluating total accumulated risk for asset-backed security mortgage pools to determine an
approach for mandated risk retention.
This article is organized into five sections.
Section I - a general introduction to the business challenges.
Section II - an overview of the framework flow.
Section III - discusses the process for QRM, profile data consolidation, and synthetic sample data
preparation. The qualification results of the samples are analyzed.
Section IV - the household lifecycle model and mortgage default/prepayment model are introduced
and applied to non-qualified residential mortgage default propensity forecasting. Three non-
qualified QRM samples are presented in this section.
Section V - Conclusion.
Data Structures of Big Data: How They Scale
by Dibyendu Bhattacharya and Manidipa Mitra
We are living in the data age. Big Data, which is information of extreme size, diversity, and
complexity, is everywhere. Every enterprise, organization, and institution is starting to realize
that this huge volume of data can potentially deliver high value to their business. The enormous
explosion of data has led to massive innovation in various technologies. All innovations revolve
around how this huge volume of data can be captured, stored, and eventually processed to
extract meaningful insights that will help make better decisions faster, perform predictions of
various outcomes, and more.
Big Data technology innovations can be broadly categorized into the following areas:
Technologies around Batch Processing of Big Data (Hadoop, Hive, Pig, etc.).
Technologies around Real Time Processing of Big Data (Storm, Spark, etc.)
Technologies around Big Data Messaging infrastructure (Kafka)
Big Data Database: NoSQL technologies (HBase, MongoDB, etc.).
Big Data Search technologies (ElasticSearch, SolrCloud, etc.).
Massively Parallel Processing (MPP) technologies (HAWQ, Impala, Drill, etc.).
Now for each of these categories of Big Data technologies, various products or solutions are either
already established or are evolving. In this Knowledge Sharing article we will pick some popular
Open Source solutions in each of these areas and will try to explain how efficiently each of these
solutions uses various Data Structures or fundamental Computer Science concepts to solve a very
complex problem. Most of these solutions make extensive use of Data Structures and Computer
Science concepts, but we will focus on only the most prominent one from each of these solutions.
For example, Hadoop is the most popular technology for Batch Processing of Big Data. We will
delve into Hadoop architecture to explain how Hadoop effectively uses the hard disk for efficient
data transfer from storage to compute. We will explain how Operating System concepts are used
here to minimize disk seeks and increase the sequential access of data blocks.
If you move to the real-time challenges of Big Data, Storm is the most promising Open Source
solution in this space. Storm has guaranteed message processing capabilities for every message
that comes into Storm. For a real-time streaming system to have this guarantee with millions
of messages per second may not be an easy task. But Storm makes very efficient use of a Graph
Data Structure (DAG - Directed Acyclic Graph) to keep track of the message tree. This article will
explain how Storm uses this Graph Data Structure.
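A simplified illustration of the idea, tracking a root message and the tuples derived from it so that the root counts as fully processed only when the whole tree has been acknowledged (a toy model, not Storm's actual implementation, which tracks this far more compactly):

    # Toy model of a "message tree": the root is done only when every derived
    # tuple has been acknowledged.
    class MessageTree:
        def __init__(self, root_id):
            self.pending = {root_id}           # tuple ids not yet acknowledged
            self.edges = {}                    # parent id -> derived tuple ids

        def emit(self, parent_id, child_id):
            # A new tuple anchored to an existing one extends the tree.
            self.edges.setdefault(parent_id, []).append(child_id)
            self.pending.add(child_id)

        def ack(self, tuple_id):
            self.pending.discard(tuple_id)
            return not self.pending            # True once the whole tree is processed

    tree = MessageTree("m1")
    tree.emit("m1", "m1-a")
    tree.emit("m1-a", "m1-a-1")
    tree.ack("m1")
    tree.ack("m1-a")
    print(tree.ack("m1-a-1"))                  # True: guaranteed processing complete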
In the NoSQL solution space, HBase is considered the most popular Columnar Data store. HBase
is highly optimized for high-volume writes and best suited for range queries of Big Data. The
B+ Tree index used by traditional database systems is not a very efficient design for this write pattern,
thus HBase uses something called a Log-Structured Merge (LSM) Tree to store the data in memory and
periodically compact it to disk. With the use of the LSM tree, HBase can achieve a very high volume of
writes and perform very fast range scans.
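A toy version of that write path, assuming an in-memory buffer that is flushed as an immutable sorted run once it grows past a threshold (a sketch of the LSM idea only, not HBase code):

    # Toy LSM-style store: writes land in an in-memory buffer (memtable) and are
    # flushed as immutable sorted runs, keeping disk writes sequential and making
    # range scans over sorted runs cheap.
    import bisect

    class TinyLSM:
        def __init__(self, flush_threshold=4):
            self.memtable = {}                 # in-memory key -> value
            self.sorted_runs = []              # immutable sorted (key, value) runs
            self.flush_threshold = flush_threshold

        def put(self, key, value):
            self.memtable[key] = value
            if len(self.memtable) >= self.flush_threshold:
                run = sorted(self.memtable.items())   # one sequential, sorted write
                self.sorted_runs.append(run)
                self.memtable = {}

        def range_scan(self, lo, hi):
            merged = {}                        # newest value wins for each key
            for run in self.sorted_runs:
                start = bisect.bisect_left(run, (lo,))
                for k, v in run[start:]:
                    if k >= hi:
                        break
                    merged[k] = v
            merged.update({k: v for k, v in self.memtable.items() if lo <= k < hi})
            return sorted(merged.items())

    store = TinyLSM()
    for i in range(10):
        store.put("row%02d" % i, "v%d" % i)
    print(store.range_scan("row03", "row07"))   # fast range query over sorted data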
Another popular NoSQL document store, MongoDB, uses an Operating System concept called memory-
mapped files to effectively keep the document indexes in memory for very fast retrieval.
In the Big Data messaging infrastructure, Kafka is a very promising solution. This article will
explain how Kafka message brokers effectively utilize the Operating System page cache and use
sequential disk scan capabilities to scale to hundreds of thousands of messages per second, a scale
that is otherwise impossible for a traditional messaging system.
Additionally, this article will also touch upon how a Big Data search infrastructure like
ElasticSearch uses RAM for very fast search.
By reading this paper, one can develop a deep understanding of the technology spectrum, the
challenges in the Big Data space, and how various solutions have tried to solve those challenges. This
article will hopefully motivate readers to explore these technology areas more deeply and to look into
other aspects of each of these solutions.
Envisioning Precise Risk Models Through Big Data Analytics
by Pranali Humne
Do you find it irritating that we don't know what a real risk is, or understand how to capture
a risk? Do you constantly fight the nag that you don't run a water-tight operation? Do you get
insecure when your Risks are moving targets? Are you investing in the right mitigation practices?
This Knowledge Sharing article opens a view into how all these
muddled undercurrents can become placid and predictable with the correct use of evolving yet
accurate statistical models and technologies.
Successful Service Delivery largely depends on how the end-to-end Risk Management practices
are embedded. Most risks assessed today are largely driven by one's perception of how an event
can affect the status quo. In other words, this means that deciphering risk is a function
of soft parameters that are practically immeasurable and intangible. The shadow of this
subjectivity then drives the mitigation and prevention approach as well. The involvement of
perception in such a critical process makes it very hard to gain a holistic view of the
business or the services, and unknown risk elements result.
These lead to the obvious and inevitable consequence of being unprepared at those crucial
unknown moments that can take a well-run service delivery into catastrophic consequences;
all this CAN and MUST be avoided. The ill effects can lead to loss of customer trust, equity,
reputation, revenue, investment, and life.
Using Big Data Analytics, these types of situations can potentially be avoided. Risks can be
analyzed as a combination of historical data and various environmental/cultural parameters.
The risks are forecasted statistically based on real-time analysis of variables such
as service model, delivery structure, operational efficiency, technological constraints, and
environmental or cultural factors. Multiple Risk assessment techniques with diversified data are
used to arrive at an accurate assessment of Risk. Enterprise-wide scalability and adaptability
assist in reliable evaluation of Potential Threats.
A correlation between these factors/scenarios is worked out based on any of the following
techniques:
Structured What-if Technique
Markov Analysis
Monte Carlo Simulation
Evaluation of Alternatives (EoA)
Details in this Knowledge Sharing article illustrate the sample simulation against each of these.
This is then extrapolated into occurrences from different scenarios with their predictability.
With statistical tools, the accuracy of potential or imminent risks can be ascertained and
learned. The identified risks, classified as quantitative or qualitative risks, are then prioritized by
understanding the nature, magnitude, and impact of the risks. The identified risks can also be
prioritized based on organizational risk appetite. A mathematical model is developed based on
the Probability of occurrence, Impact, and Exposure of risks. This model is subjected to rigorous
statistical analysis.
This gives us an almost unbiased, frequency-driven, and predictable output that can help us
associate failure modes with each output arrived at. A thorough output, bearing on personnel and the
operational life cycle, will greatly help in building an axiomatic strategy and reduce the very
expensive salvage from backdrafts caused by ineffective firefighting.
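To give a flavour of the Monte Carlo simulation listed among the techniques above (an illustration only, not the article's model), the sketch below simulates annual loss from a small hypothetical risk register and reads off the expected value and a high percentile of the exposure distribution:

    # Illustrative only: a hypothetical risk register with a probability of
    # occurrence and an impact range per risk, simulated over many trials.
    import numpy as np

    rng = np.random.default_rng(42)
    risks = [            # (annual probability, low impact, high impact) in k$
        (0.30, 10, 80),
        (0.05, 200, 600),
        (0.15, 50, 150),
    ]

    trials = 100_000
    total_loss = np.zeros(trials)
    for prob, low, high in risks:
        occurs = rng.random(trials) < prob                # does the risk materialize?
        impact = rng.uniform(low, high, trials)           # loss if it does
        total_loss += occurs * impact

    print("Expected annual loss: %.1f k$" % total_loss.mean())
    print("95th percentile exposure: %.1f k$" % np.percentile(total_loss, 95))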
We can then explore and present with collective intelligence the various opportunities we must
use to enable successful Risk mitigation. This will thereby enable a Service Delivery Organization
to earn trust, instill greater business sense and, in its course, make the business, its customers,
and its employees successful through efficient practices of prevention fueled by continuous learning.
BUSINESS PROCESS
TCE @ iCOE GPS
by PradeeMathew, Ranajoy Dutta, Jayasurya Tadiparthi, Gayathri Shenoy, and Athul Jayakumar
Total Customer Experience (TCE) is a term that has been in existence for many years. As EMC
personnel, we strive to achieve TCE wherein some succeed and others falter. What if there was a
surefire way to garner Six Sigma levels of TCE? A way to ensure that our Customer(s) receive the
very best service time and time again!
Project Managers in particular must realize that TCE not only affects the end customer but also
the different workgroups, entities, and personnel within an organization; the term Customer
encompasses all the aforementioned groups as each group is essentially a different customer. It
is imperative that all groups are dealt with exactly the same way to facilitate a coherent approach.
Case in point, the following personnel can be classified as Customers:
Pre-Sales
Partners
Resource Management
Delivery
Customer Support
Miscellaneous third party
First, how can Project Managers streamline and simplify the process within the EMC project
management community to attain high levels of customer satisfaction? What changes are needed
to alleviate low TCE levels? Should one be looking at a one-off solution or a sustainable model
that is capable of delivering enhanced TCE levels resulting in customer delight?
Second, is it better for Project Managers to work independently in their own style or is it
preferable to have a standardized uniform approach when dealing with multiple projects and a
wide range of customers and volatile temperaments?
Third, a good work/life balance is essential for personal and organizational growth. Empowering
Project Managers to take time off to recharge is a no-brainer, but what steps are taken to ensure
that TCE remains at an all-time high even in the absence of the primary Project Manager?
Fourth, self-realization is the first step toward attaining high quality deliverables; therefore, we
must ask ourselves the following questions:
Is the correct process being followed by all the Project Managers?
Are all Project Managers cognizant of the project management best practices within EMC?
Does the existing project management methodology have any significant gaps?
Are we working as a cohesive unit within the iCOE GPS community?
How can we ensure that only deliverables of the highest quality reach our Customer(s)? Do we
need to measure quality or do we leave it to chance?
Last, it is said that a foundation needs to be rock solid for a structure to prevail against the
elements. In the context of project management, how are we tracking project assignment,
workload, utilization, and revenue forecasts? Is there a necessity to track such criteria and how
would one go about reporting the findings? Manual and tedious methods to track measurements
are time-consuming and error-prone. How can we overcome such menial jobs?
At iCOE, we have devised a Five Point System that elevates TCE to Six Sigma levels within the
project management community by simplifying, streamlining, and automating the prevailing
process. This Knowledge Sharing article provides a deep dive into the project management
lifecycle to examine, diagnose, and correct the aforesaid issues to bring about unprecedented
levels of TCE that echo EMC's core values: INTEGRITY, EXCELLENCE and, above all,
CUSTOMERS FIRST!
Smart Talent Acquisition
by Trupti Kalakeri
One of the greatest challenges faced by organizations is finding the right talent. A bad hire can
cost up to 3.5 times the salary of the open position. Plus, its impact can be felt at many levels and
in many ways. It affects the immediate team and hiring managers, morale suffers, productivity
drains, and so on. Taking a Smart Hiring approach is the only solution for having the right people!
Some of the benets of adopting a Smart Hiring approach include:
Reduced recruitment cost due to less manual intervention
Identify quality candidates
Helping to find the right talent at the right time
My Knowledge Sharing article will explain how to leverage data analytic technologies to hire the
right talent in any organization.
CLOUD
Impact of Cloud Computing on IT Governance
by Declan McGrath
Cloud computing has caused a major shift in Information Technology (IT) architecture, altering
the way services are sourced and delivered. Cloud computing represents a growing evolution in
IT in which core IT services are becoming sliced and diced across many providers. Organizations
are under pressure to improve efciencies and cut costs by using collaborative solutions and real
time information exchange.
Constant evolution of IT has helped organisations automate and innovate, thus providing a
competitive advantage in the global marketplace. Early adoption of Cloud Services can provide
these organisations with an opportunity to transform their business model and gain competitive
advantage. While cost reduction is one of the benefits, there are a number of others, such as the
rapid deployment of services to allow the organisation to capitalise on opportunities that may
otherwise be lost. Organisations can then concentrate on their core competencies while cloud
providers focus on running their IT infrastructure.
A move to the cloud, however, requires a well-planned strategy as there are many business
and technical constraints that need to be mitigated. IT Governance and regulation are required
to clear any doubts with regard to security and management of the organisation's data. IT
Governance is part of a wider Corporate Governance activity but the pervasive use of technology
has created a critical dependency on IT that calls for a specic focus on IT Governance.
Management and security of data are common concerns for organisations and any organisation
that wants to move their services to the cloud will ask, "How secure is my data and service?"
The challenge for organisations when adopting cloud services is to understand the maturity and
robustness of their IT Governance framework.
The purpose of this research is to understand the impact of cloud computing on ITIL and to
answer the following: What is the Impact of cloud computing on IT Governance?
The literature review examined both the concept of cloud computing and IT Governance with a
specific focus on ITIL and was combined with a review of current thinking in this field.
The basis for this research was a single case study, where the Case Company is a global provider
of outsourced development services.
Three objectives were set out initially for this research.
1. What considerations are taken into account by companies when they move their data or a
client's data to a Cloud Service Provider (CSP)?
2. What types of Cloud Readiness Assessment, if any, are taken by companies?
3. How do you match your IT Governance framework to the IT Governance framework of a cloud
service provider?
Having met each of these objectives, it is envisioned that the research will:
1. Create a Cloud Readiness Assessment document which will prepare a business for a move to
a cloud service provider and assess its suitability.
2. Align Company A's IT Governance Framework to the cloud service solution which ensures
there are no gaps which could potentially pose a risk to the business.
A series of semi-structured interviews were conducted and, having met the three objectives
set out in the report, a Cloud Readiness Assessment was created. Subsequently, this Cloud
Readiness Assessment was then tested on a cloud service provider. The research found that when
the company was initially selecting a Cloud Solution, its focus was outwards at the cloud service
provider as opposed to inwards at the company and its IT Governance framework. As a result,
a number of gaps in the company's IT Governance Framework were identified, which presented
various levels of risk to the business.
Through this research and the use of the Cloud Readiness Assessment, the company is now in
a better position to align future cloud service offerings to their IT Governance framework and
mitigate against future risks to the business.
Moving forward, it is envisioned that this Cloud Readiness Assessment could be used by
other industries as they prepare to adopt cloud services and align them to their IT Governance
Framework (ITIL).
Incorporating Storage with an Open Source Cloud
Computing Platform
by Ganpat Agarwal
Have you ever worked on a cloud computing platform? Was it vendor-based or open source? If
vendor-based, you probably faced a number of compatibility problems. If you have been faced
with this problem, you might have given thought to having a cloud computing platform in which
endless features can be incorporated without compatibility issues and which can be managed as
you choose. If you are looking for such a solution, try OpenStack.
OpenStack is an open source cloud computing platform which uses object and block storage
features from the storage perspective.
EMC, the market leader in software-defined storage, is well-positioned to participate in the
OpenStack open source program. In addition to providing its own storage solutions, EMC can
leverage OpenStack's many outstanding features.
As an open source platform, OpenStack is free from the most common problems that customers
face with other cloud computing service providers; one of the biggest being vendor lock-in.
Choosing to be part of the OpenStack community allows participants to interact and discuss
problems with other counterparts to learn their high-level views.
This Knowledge Sharing article introduces the storage aspect in open source cloud computing.
It details features of the EMC products used in this work and the benefits obtained from the
OpenStack platform.
Patterns of Multi-Tenant SaaS Applications
by Ravi Sharda, Manzar Chaudhary, Rajesh Pillai, and Srinivasa Gururao
Multi-tenancy is a design concept in which a single shared instance of a system serves multiple
customers (or even multiple entities/organizations of a single customer). Software-as-a-Service
(SaaS) is a software delivery method in which a hosted software application services the
application's functions for multiple customers, eliminating the need for individual customers to
deploy and maintain the application on-premise. The relationship between multi-tenancy and
SaaS is one of an enabler: multi-tenancy enables SaaS.
While the notion of multi-tenancy pre-dates SaaS, techniques for implementing multi-tenancy
have become widely discussed only since the advent of SaaS. Despite the plethora of articles
available on the Web, many architects and designers find it difficult to gain an understanding
of the implementation nuances involved in implementing multi-tenant software. This is where
cataloging patterns for implementing multi-tenant SaaS applications would help.
This Knowledge Sharing article aims to identify and describe several architectural, design, and
implementation patterns of multi-tenant SaaS applications. For each pattern, we explain what
problem it solves, how it works, when to use it, and its pros and cons.
Examples of patterns covered in the article include:
Architectural: Database-per-Tenant, application-instance-per-tenant, metadata-driven
architecture, etc.
Design: Tenant Context, Extension Table Layout, Tenant Resolver, etc.
Implementation: Tenant Filter, Connection-Pool-per-Tenant, Sub-Domain-Per-Tenant, etc.
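To give a flavour of the design-level patterns listed above, here is a minimal sketch combining a Tenant Context, a Sub-Domain-Per-Tenant resolver, and a Tenant Filter over a shared table (the names and in-memory data are hypothetical; the article's pattern catalogue is more complete):

    # Illustrative only: hypothetical names and an in-memory "shared table".
    import contextvars

    current_tenant = contextvars.ContextVar("current_tenant")   # Tenant Context

    def resolve_tenant(host_header):
        # Sub-Domain-Per-Tenant: "acme.app.example.com" -> tenant "acme"
        return host_header.split(".")[0]

    ORDERS = [                                 # shared table with a tenant discriminator
        {"tenant": "acme", "id": 1, "total": 120},
        {"tenant": "globex", "id": 2, "total": 75},
    ]

    def list_orders():
        tenant = current_tenant.get()          # Tenant Filter: only this tenant's rows
        return [row for row in ORDERS if row["tenant"] == tenant]

    def handle_request(host_header):
        token = current_tenant.set(resolve_tenant(host_header))
        try:
            return list_orders()
        finally:
            current_tenant.reset(token)

    print(handle_request("acme.app.example.com"))   # -> acme's orders only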
VM Granular Storage - Rethinking Storage for Virtualized
Infrastructures
by Alexey Polkovnikov
When we deal with traditional storage systems we mainly use LUNs and File Systems as
provisioning and management units. We talk about LUNs for block-wise access, FS for the
file level, and both when it comes to unified storage. For iSCSI, FC, and NFS (CIFS), the level of
granularity is LUNs and files. What is wrong with this storage granularity? Why are good old
LUNs and files not enough now?
As we know, the infrastructure has changed dramatically; now it is increasingly virtualized. A virtual
server is the default deployment option (over a physical server) for many enterprise deployment
policies nowadays. Different industry reports and estimates indicate that virtual server
deployments have been surpassing physical server deployments over the last 4-5 years.
So what? Infrastructure virtualization is not a secret (and storage virtualization also is not). The
point is that traditional storage arrays, which are very effective when working with physical
servers, are not as effective when the servers are virtualized.
Storage arrays are smart enough at understanding single-host I/O patterns when a LUN is
dedicated to a single host, or when it is shared and the hosts can be tracked with
some identifier (like a World Wide Name). Caching techniques that help the array improve
its external performance can be used effectively in this case to pre-fetch the data from the
internal drives (and yes, FLASH-based drives are still too expensive to hold a significant part of
an array's capacity, so there are still a lot of mechanical hard drives inside).
These performance improvements fall short when it comes to virtualized hosts, due to the
fact that I/O from the different VMs arrives along the same path and the storage is unaware of this. Data
protection/copying/movement features are also greatly impacted by the fact that the same LUNs
are used by numerous virtualized hosts. LUN-level locking is a significant issue in such cases.
Here the emerging concept of VM granular storage comes into play. This is a rethinking of
storage arrays which takes into account the virtualized infrastructure reality and makes VMs first-
class citizens for the storage systems. VMware vVols (Virtual Volumes) is a solution of this type.
This article about VMware Virtual Volumes technology concepts, history, and context will be of
particular interest to those involved in storage infrastructure planning, decision-making and
implementation.
A New Era for Virtualization and Storage Team Collaboration
by Moshe Karabelnik
A few years ago, at the beginning of the server virtualization era, it was very simple: usually there
were two teams, a storage team and a server virtualization team. The virtualization team asked for a
specific storage capacity for a specific purpose/workload and that was all. The storage team created
the LUNs or NFS file systems and gave them to the virtualization team. Roles were clear to each of
those teams.
Today, the situation is much more complex. New technologies and features have emerged on both
the storage side and the virtualization side, such as Storage Distributed Resource Scheduling
(SDRS), vStorage API for Array Integration (VAAI), Storage I/O Control (SIOC), Thin Provisioning,
Snapshots, and Clones.
Those new technologies and features put into question the exact roles of the server virtualization
team and the storage team, not to mention that the interaction and collaboration between those
teams should be much tighter.
A few examples of questions and thoughts arise: Should we thin provision storage on the
virtualization level or on the storage level? Should we use SIOC? Should we implement VAAI?
Which of the teams should take care of clones and snapshots?
In this Knowledge Sharing article I will discuss the relevant technologies and features involved,
explain the collaboration between the storage team and the server virtualization team regarding
each technology or feature, and suggest what should be the tasks for each team regarding those
technologies while constructing the infrastructure for the first time as well as on a regular basis.
Following discussion of the above, I will suggest a methodology that will give a clear view for each
team of its relevant tasks.
This article will be of interest to any storage team member or server virtualization team member
that would like to fully utilize the systems under their responsibility by enhancing collaboration
with the other team.
While this article will focus on the technologies and features especially related to EMC VNX
storage systems and the VMware vSphere server virtualization environment, the experienced IT
professional could adopt it for any other technology as well.
NAS Storage Optimization Using Cloud Tiering Appliance
by Brijesh Das Mangadan Kamnat
Managing unstructured data growth is a key challenge that many organizations face today. To
catch up with this data growth, organizations must continue investing in their storage real estate
which increases CapEx and reduces profit margins. With shrinking IT budgets, organizations are
now exploring opportunities for optimizing their existing infrastructure and ensuring that they
derive maximum return on investment (ROI). However, data growth is only one part of the problem
as most of the data that is created is barely accessed over a period of time; add to that analytics,
along with regulatory and compliance requirements that force organizations to retain this inactive
but precious data for longer periods of time.
Optimization strategies such as Hierarchical Storage Management (HSM) can help organizations
optimize their primary storage and ensure that inactive data moves across different storage tiers
during its lifecycle to free up precious storage space. The HSM solution can be on-premise, cloud-
based (public), or a combination of both. An on-premise archiving solution might require upfront
investment but it allows organizations complete control of their data along with enabling use of
commodity hardware as a secondary tier. A cloud-based archiving solution provides a perfect
blend of pay-as-you-grow model with little upfront investment and the ability to retain inactive
data for longer duration of time. However, cloud adoption has been slow within organizations
due to concerns about data security and privacy; also a thorough ROI analysis would reveal that
public cloud as an archiving tier for the long term can be a costly affair. Besides the complexities
associated with the deployment model there is another set of challenges associated with manual
HSM or archiving techniques, the first being the identification of inactive data and the second
being the movement of data across tiers when it is being accessed.
The need of the hour is an automated HSM solution that can identify inactive data and move
it through different storage tiers completely online without any disruptions to end users. EMC
Cloud Tiering Appliance (CTA) provides a completely automated approach for policy-based file
tiering with minimum user intervention. In addition, CTA can also be used for migrating data from
multiple unified storage filers or file servers to EMC storage.
This Knowledge Sharing article provides an HSM solution using CTA which allows for moving
inactive NAS data based on user-defined policies to secondary-tier storage, including public cloud
service providers such as AWS. Key points covered in this article include:
Evaluate typical use cases for CTA deployment
Solution Architecture for deployment
Configuration and Execution workflow
Using CTA for NAS migration
Share key observations related to time taken for archiving and recall for different data sets
Cloud-Optimized REST API Automation Framework
by Shoukathali C K and Shelesh Chopra
In this Knowledge Sharing article, we describe a generic automation framework tool for REST-
based applications that we have developed. The tool is focused primarily on:
Portability to different products, as it is generic and can be extended across them
Extensibility - layers can be added to support future requests
Modularity - to allow plug-and-play of separate intelligent modules
Intelligence - leveraging a cloud-optimized solution as well as analytics to store and predict
various requirements
Migration - from legacy to a REST API-based framework
Abstraction and simplification - what is to be automated is the user's choice, with a high degree
of abstraction as to how it is automated
The framework is designed in such a way that it can be used across platforms and across the
organization for any application built on the REST protocol.
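The framework itself is not shown in the abstract; purely as an illustration of the kind of thin, product-agnostic REST layer such a tool could build on, here is a sketch using the common requests library with hypothetical endpoint names:

    # Illustrative only: base URL, token, and resource paths are hypothetical.
    import requests

    class RestClient:
        def __init__(self, base_url, token):
            self.base_url = base_url.rstrip("/")
            self.session = requests.Session()
            self.session.headers.update({"Authorization": "Bearer " + token})

        def call(self, method, path, **kwargs):
            # A single generic entry point keeps product-specific modules pluggable.
            response = self.session.request(method, self.base_url + path, **kwargs)
            response.raise_for_status()
            return response.json()

    # Example usage (would require a real endpoint):
    # client = RestClient("https://backup.example.com/api", token="...")
    # failed_jobs = client.call("GET", "/v1/jobs", params={"state": "failed"})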
DOCUMENT MANAGEMENT
Cloud-Based Dashboard Provides Analytics for Service Providers
by Shalini Sharma and Priyali Patra
Management of an individual's documents is very difficult as the number of documents, from
phone bills to e-tickets to insurance policies, bonds, and more, increases daily. The task
becomes even more difficult when it involves management of a group of people, such as family,
friends, etc. While there are a few private cloud services available where documents can be
placed for future reference, these services only store the documents, not organize them. Imagine
having a dashboard that would collect all the documents from personal laptops, mail boxes, and
other sources, organize them, and tag them intelligently. The trick is to do this with minimal manual
effort.
This Knowledge Sharing article discusses how to build a cloud-based solution which can acquire
documents from various configured sources and tag them intelligently, for example with owner name,
monthly bill, or service provider for the family. Such a smart dashboard manages
documents in the cloud for the entire group and is accessible from personal devices such as
mobile phones, personal laptops, tablets, etc.
This is one example of an intelligent approach to collecting various kinds of documents in large
volume and performing analytics on them. This analysis can predict and reveal information about
various preferences within the group and hence can be extrapolated to predict for a large number
of people. For example, this analysis can predict service provider popularity in respective areas
as per real data. Likewise, this is also helpful for the service provider to analyze their targeted
customers or get their view on newly launched services provided by the service provider. This
also provides a wealth of data for future analysis and can also correlate groups and the
services they prefer.
The advantages of this application can be extended to documents such as telecommunication-
related services, property document management, legal case document management, and so
forth. An example of such a service is this: suppose the application holder forgets to pay the
phone bill; this application can send an alert or a reminder to the application holder about the
impending bill. It can also manage SMS, mails, receipts, images related to payments, and so on,
in effect providing a non-physical folder to encase all the documents and manage them.
Finding Similar Documents In Documentum Without Using a Full
Text Index
by M. Scott Roth
This Knowledge Sharing article will discuss how to configure Documentum to enable
identification of syntactically similar content without the use of a full text indexing engine. The
technique described utilizes a Java Aspect to calculate SimHash values for content objects and
stores them in a database view. The database view can then be queried programmatically via an
Aspect or by using DQL to identify content similar to a selected object.
Many systems that identify similar content do so by storing a collection of fingerprints (sometimes
called a sketch) for each document in a database with other sketches. When similar content is
requested, these systems apply various algorithms to match the selected content's fingerprints
with those stored in the database. Full text indexing solutions also require databases and index
files to store word tokens, stems, synonyms, locations, etc. to facilitate identification of similar
content. Some full text search engines can be configured to select the most important words from
a document and build a query using those words to identify similar content in their indexes.
The solution I discuss in the article condenses the salient features of a document into a
single, 64-bit hash value that can be attached directly to the content object as metadata, thus
eliminating the need for additional databases, indexes, or advanced detection algorithms.
Similar content can be detected by simply comparing hash values.
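The core idea, comparing 64-bit SimHash values by Hamming distance, can be sketched as follows (a simplified SimHash over word tokens for illustration; the article's Java Aspect and Documentum integration are not shown):

    # Simplified 64-bit SimHash over word tokens: near-duplicate documents tend
    # to produce hashes that differ in only a few bits, so similarity reduces to
    # a small Hamming distance between two integers.
    import hashlib

    def simhash(text, bits=64):
        weights = [0] * bits
        for token in text.lower().split():
            h = int(hashlib.md5(token.encode()).hexdigest(), 16) & ((1 << bits) - 1)
            for i in range(bits):
                weights[i] += 1 if (h >> i) & 1 else -1
        return sum(1 << i for i in range(bits) if weights[i] > 0)

    def hamming(a, b):
        return bin(a ^ b).count("1")

    doc1 = "the quick brown fox jumps over the lazy dog"
    doc2 = "the quick brown fox jumped over the lazy dog"
    print(hamming(simhash(doc1), simhash(doc2)))   # a small distance suggests similar content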
INFRASTRUCTURE MANAGEMENT
Eliminating Level 1 Human Support from IT Management
by Amit Athawale
Cost effectiveness, value-add, and reduction in human resources are constant challenges of IT
infrastructure management. The pressure of service delivery often impacts employees as well
as management, not to mention the overhead due to employee turnover. Furthermore, there is
always a demand to move resources up the skill ladder so that the coveted employee pyramid is
achieved.
This Knowledge Sharing article discusses how to address these challenges and make
infrastructure management more effective by incorporating automation, starting with routine
Level 1 (L1) activities such as reporting, health checks, alerting, etc. Though mundane, these
activities are expected to be done without human error and at specific times of the day. These
constraints make them highly prone to error.
Therefore, it makes good business sense to automate L1 activities. Through automation, L1
workload is reduced, thereby freeing staff for next-level activities. This also helps keep the
employee skill pyramid balanced.
This article will help in understanding the different stages of automation in IT service and an
approach to integrated automation. Case studies used in this article will show how easy it can
be to automate activities with simple scripts and Excel spreadsheets. We will demonstrate how
a reduction in L1 workload of close to 50% was effectively transferred to performing Level 2
activities. This was more effective due to the integrated approach to automation.
This article will help IT professionals and management leaders recognize the value of automating
L1 activities to improve operational efficiency and maintain the employee skill pyramid.
Guide to Installing EMC ControlCenter on a Windows 2008
Cluster
by Kevin Atkin
In this Knowledge Sharing article, I will share an installation and configuration guide for new
installations of EMC ControlCenter installed in a Windows 2008 Clustered environment. It will
also include details of Unisphere for VMAX element manager installation on one of the EMC
ControlCenter nodes.
This Knowledge Sharing article is divided into three sections:
Section 1: Prerequisites and access requirements
Section 2: Hardware and software required to install EMC ControlCenter
Section 3: EMC ControlCenter configuration steps
This article assumes that the configuration will be done in conjunction with the referenced EMC
ControlCenter OAT process and that the reader is qualified to use EMC management tools and/or
Solutions Enabler CLI.
SECURITY
Storage Security Design - A Method to Help Embrace Cloud
Computing
by Ahmed Abdel-Aziz
With many organizations rushing to embrace cloud computing, security professionals seek tools
that enable them to guide organizations on their cloud journey. This Knowledge Sharing article
starts with an introduction to cloud computing: the tenets, service models, and deployment
models. It suggests a process that can help answer "Is cloud for me?" and explains the tight
relationship between cloud computing, virtualization, and shared storage. To address the
pressing need for data-centric security, the research introduces storage security, along with its
foundational element of data classification, as an important converged discipline. To overcome
the data classification challenge, the research proposes an automated approach for data
classification. This paves the way to adopt technology-neutral and technology-specific best
practices for storage security design. The article concludes with a real-world solution that helps
apply the suggested best practices, and which enables mobility and the strategy of bring your
own device (BYOD).
Secured Cloud Computing
by Gnanendra Reddy
Cloud computing is a model used to describe a variety of computing concepts that involve a large
number of computers that are connected and share resources over the Internet. While cloud
computing offers many advantages, it remains susceptible to security breaches.
Major threats to cloud security are Distributed Denial of Service (DDoS) attacks and
Man-in-the-Middle (MIM) attacks.
DDoS attacks attempt to make a resource unavailable to its end users. DDoS attacks typically
target services hosted on high-profile web servers, such as credit card payment gateways and
bank sites. In a DDoS attack, the incoming traffic floods the victim from many different sources,
potentially hundreds, thousands, or more.
This effectively makes it impossible to stop the attack simply by blocking a single IP
address, and it is very difficult to distinguish normal user traffic from attack traffic when it is
spread across so many points of origin.
Man-in-the-Middle attacks intrude into an existing communication and inject false information by
eavesdropping, intercepting, and selectively modifying the data.
MIM attacks are also referred to as session hijacking attacks, since the intruder gains access to
a legitimate user's session in order to tamper with it. The attack starts with sniffing and
eavesdropping on a network connection and then trying to alter or reroute the intercepted data.
It is essential to research and develop countermeasures for DDoS and MIM attacks. In my
Knowledge Sharing article, I discuss these and other security issues and challenges and offer
solutions to improve availability of resources and enhance the security of information in the
cloud.
Securely Enabling Mobility and BYOD in Organizations
by Sakthivel Rajendran
Increasingly, users are accessing enterprise information via multiple mobile devices including
smart phones and tablets. Everyone is talking about BYOD (Bring Your Own Device) for work, and
also accessing corporate information on mobile devices. Users are demanding access to corporate
information from any device, any location, and any application, at any time.
While BYOD and mobility offer flexibility to the workforce and agility in business operations,
mobile devices lack the robust security features that we would expect in a modern device.
Still, failure to embrace the emerging BYOD trend might force employees to circumvent
established organizational controls.
To mitigate security concerns, an Enterprise Mobility Management (EMM) approach provides a
balance between security and productivity.
This Knowledge Sharing article aims to educate readers on overcoming the security risks
associated with mobility and BYOD. Readers will be introduced to best practices to secure the five
components of BYOD and mobility:
1. Device
2. Applications used on the device
3. Data handled in the applications
4. Network used to connect the device to the organization
5. Users
Also discussed in this vendor-neutral article are the access control matrix, Mobile Device Management
(MDM), Mobile Application Management (MAM), Secure Development Lifecycle (SDL), Network
Access Control (NAC), Virtual Desktop Infrastructure (VDI), secure containers, and app wrapping.
These technical controls can be deployed as an additional layer to administrative controls, such
as IT security processes, to offer organizations a level of assurance that their business information
will be protected from security risks.
Highlighting the importance of having a strategy prior to embarking on the BYOD journey, this
article will be of help to anyone involved with mobility and BYOD programs in an enterprise,
including network security, enterprise architects, information security architects, IT security
managers, application architects and developers, audit and compliance, system administrators,
program managers, and security operations personnel.
An Approach Toward Security Learning
by Aditya Lad
Security is a mystery for those who do not know about it. Partial and incomplete knowledge often
leads to misplaced decisions and wrong prioritization. The vastness and depth of the security
field make it even harder. Using this Knowledge Sharing article as a learning tool, IT and
security professionals can organize their approach toward learning different security areas.
The first part identifies a learning approach, effective ways of practicing, and use of the right
tools. The second part categorizes various types of security areas that generally differ in
their security concepts and their ways and motives of exploitation, and which require specialized
knowledge of the respective fields.
This article focuses on professionals in senior positions who want to understand security better
in order to make conscious decisions. It serves as a help for those who are overwhelmed by
security incidents and want to build expertise but lack clarity or direction. It is also intended
as a learning guide for those at the beginner level.
Many times there are situations in which you do not understand the actual severity of a reported
issue. In the absence of a security expert, this misjudgment may lead to either losing focus on
the associated risk or wrongly prioritizing the effort. This Knowledge Sharing article tries to bridge
the gap between what we know about how computers work and what exactly happens on the wire. It
discusses the various forms of attacks we hear about while working in security and also tries to
formalize an approach for learning about computer security.
This article talks about the approach, hurdles, and dilemmas professionals face in learning about
the rapidly evolving security industry. The dynamic nature of security ensures that there is no
single textbook that can teach you everything; however, the basic principles and learning approach
stay universal. Every product's security is only a subset of the larger security domains. A
thorough and knowledgeable understanding helps in applying it to newer, evolving areas.
Computer security should never be ignored. The popular excuse that we hear from people is that
they don't have any idea about security. That's why they do not invest in their product's security.
That's why they follow a reactive approach rather than a preventive one. That's why they
wait for something to happen first rather than knowing the degree of the risk. "You
don't know about security": what an excuse.
STORAGE
2014: An SDS Odyssey
by Jody Goncalves, Rodrigo Alves, and Raphael Soeiro
Companies continually work to grow revenue while shrinking operating costs. This is not a new
phenomenon; therefore different parts of the business must constantly evolve, including IT.
Within IT organizations, Storage Administrators face challenges such as adapting to continuous
data growth, the increasing number of users, and shrinking IT budgets.
Additional pressures arise as companies adopt new IT and/or business models, such as web
and mobile applications, and Software-as-a-Service (SaaS). These new models call for an
unprecedented level of agility that traditional IT storage environments cannot support, because
these environments tend to have some or all of the following characteristics:
• inefficient array utilization
• reactive storage management
• human error
• on-demand storage requests/needs
• inability to address performance bursts
• failure to follow best practices
• lack of chargeback systems
This Knowledge Sharing article will discuss these challenges and explain why they exist in
traditional storage environments.
Software Defined Storage (SDS) infrastructure introduces a solution that addresses most of
the challenges that storage administrators face when attempting to meet today's storage
requirements while using traditional storage environments. SDS provides a mechanism to
abstract the storage layer across multiple heterogeneous arrays along with the ability to create
service catalogs that enable end-users to allocate their own storage, on-demand. This article will
explain why and how SDS addresses the challenges of traditional storage environments.
An organization that decides to adopt an SDS solution must complete three key design steps
before it can successfully implement such a solution:
1. Assess business requirements
2. Define storage tiers and resource pools
3. Define services that will be placed in the service catalog
This article will define and explain the importance of each of these areas and how they relate to
SDS.
With that in mind, the authors will focus on the importance of the interrelation between the storage
tiers and the service catalog. While storage tiering exists today for traditional storage environments,
the methodologies and considerations need to be redefined for SDS environments. With end users
allocating storage in a self-service manner via the service catalog, proper storage tiering becomes
even more critical. The authors will establish a methodology for implementing storage tiering and
classification for SDS.
This article will also discuss a hypothetical scenario that highlights the business requirements of
an organization along with a depiction of a traditional storage environment. The authors will walk
through the use case, assess the business needs, and utilize the established methodology to
tier the storage and build a sample service catalog, thus culminating in an SDS environment.
Review Storage Performance from the Practical Lens
by Anil C. Sedha and Tommy Trogden
Many organizations develop their infrastructure vision with a fresh approach based on current
technologies or by using vendor recommendations. As time goes by and the storage architecture
is stressed, these organizations find performance constraints which were not evident to them
earlier. Some begin to wonder if they selected the correct storage platform.
As more organizations move toward Big Data, virtualization, or cloud computing, pressure to
control budgets and project cost has started to affect the innovation that was originally promised.
Add in a new strategic project that just became a priority and all of a sudden you find yourself
stuck from all angles: the business thinks they gave you everything you needed and limit changes
to scope/budget, senior leaders think you didn't plan well, and procurement processes, as
usual, are time consuming. Meanwhile, the IT architecture deteriorates at a much faster pace,
cannot scale non-disruptively, and you are now struggling to catch up.
The words "faster, cheaper, and better" start coming back to mind, and you remember what
the storage vendor predicted: "You will never have a problem!" However, reality is sometimes
different, and the dwindling IT budget makes things more complicated. In that scenario, how
many of you have asked, or were asked, "Why did we not foresee this problem and how did our
predictions go wrong?"
Over the many years that we have focused on IT infrastructure design and architecture, there is
one thing to always remember: a theoretical architecture or design should accommodate practical
design principles. When new requirements arise, think of at least a few worst-case scenarios:
a change in workload type (random/sequential), data throughput, and data synchronization,
to name a few. Look much deeper to analyze hidden metrics and then think of how the use of
that data will change over time. Additionally, does the business operate a 24x7 mission-critical
environment that will directly affect your ability to implement large-scale changes? Is your vendor
of choice innovating fast enough?
These questions, along with the rate of data growth and change in the type of data, require
critical thinking and analytical responses that can make or break your infrastructure design. As
the focus shifts to the software-defined data center, some cloud providers have gone ahead and
declared compute and storage to be cheap commodity items. However, there is no one size that
fits all, and absolutely nothing in the enterprise comes for free. The true benefits are always
in the scale-out and shared-use mechanisms of any IT environment. As data compression
algorithms have improved, deduplication and caching have become more acceptable to
organizations and are quite efficient as well. However, considering the phenomenon of explosive
data growth and change in the type of data, storage engineers today must focus on innovative
data management techniques that ensure performance is not a barrier to the success of your
organization.
This Knowledge Sharing article explores many of the challenges faced by storage engineers and
offers practical tips for improved design and architecture that meets today's requirements. We
will discuss topics such as performance criteria, capacity vs. performance, cost impacts, future-
proof architecture, data compression, data replication/protection, cloud computing strategies,
and much more.
Enabling Symmetrix for FAST Feature Used with FTS for
3rd Party Storage
by Gaurav Roy
On occasion, customers perceive Symmetrix as a large capital investment. This sentiment can be
countered by extending the Symmetrix fully automated storage tiering (FAST) feature
to other storage arrays already existing at the customer site.
Doing so would enable the EMC footprint to grow in a well-defined way. Additionally, customers
can save the money of a full tech refresh being carried out in their data centers. This would
ensure that customers are able to leverage the power of Symmetrix by enhancing their existing
infrastructure with minimal change.
This Knowledge Sharing article explores the challenges facing IT storage managers and offers
insight into implementing a course of action that provides budget relief while delivering better
services to internal and external customers. We discuss in depth the extension of Symmetrix
features across other approved arrays, topped with the automated tiering offered by Symmetrix.
How I Survived Escalations: Best Practices for a SAN Environment!
by Mumshad Mannambeth and Nasia Ullas
During our tenure as Storage Administrators, we have been part of accounts supporting a
number of large and small customers. We have witnessed numerous incidents occurring in these
accounts on a daily basis. While a majority of these incidents have a low impact, at least a few
of them tend to seriously upset the customer. This Knowledge Sharing article analyzes the most
common causes of high severity incidents in a Storage Environment and what best practices can
be implemented to prevent them from occurring. We also assess the risks involved in the various
SAN maintenance processes and identify high-risk activities.
The article combines best practices followed in various major accounts across the globe.
These best practices can reduce the major causes contributing to the greatest number of high-
severity incidents. The solutions are simple, automated, and can be easily deployed in any
environment, as they do not require any additional software to be installed. By assessing the risk
involved in each stage of a storage maintenance task, we can identify high-risk activities and
implement checks on them. This enables the storage administrator to be extra cautious with
certain tasks.
Virtual Provisioning and FAST VP Illustration Book
by Joanna Liu
As a game-changing technology for improving capacity utilization and automatically optimizing
performance in a tiered environment, Virtual Provisioning is gradually being adopted by most EMC
customers. It takes quite some time, however, for beginners to grasp the concepts in detail
and become confident with management tasks by working through hundreds of pages of technical
guides. Therefore, I am endeavoring to explore a new way to deliver this technical knowledge.
Using the popular Infographic format as much as possible, the illustrations in this Knowledge
Sharing article explain almost everything about Virtual Provisioning, ranging from the basics
to technical details. It includes illustrations of complicated concepts, step-by-step creation of
Virtual Provisioning architecture, calculation of subscribed and allocated space, etc. The following
are some examples that this Knowledge Sharing article covers:
How to set up a Virtual Provisioning environment
The difference between subscribed and allocated space of a thin pool, and how to calculate
How automated pool rebalancing works
How to reclaim unused space in a thin pool
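As a taste of the subscribed-versus-allocated calculation, the following is my own simplified sketch with invented figures; it is not the author's illustration or Solutions Enabler output.

```python
# Hypothetical thin devices bound to one pool: (configured size GB, allocated GB)
thin_devices = [(500, 120), (1000, 640), (250, 250)]
pool_capacity_gb = 1200  # enabled data-device capacity in the pool (assumed)

subscribed_gb = sum(size for size, _ in thin_devices)   # what hosts could write
allocated_gb = sum(alloc for _, alloc in thin_devices)  # extents actually consumed

print(f"Subscribed: {subscribed_gb} GB "
      f"({subscribed_gb / pool_capacity_gb:.0%} of pool capacity)")
print(f"Allocated:  {allocated_gb} GB "
      f"({allocated_gb / pool_capacity_gb:.0%} of pool capacity)")
```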
The visual representations, together with the corresponding Solutions Enabler command lines, turn
sophisticated technical information into concise knowledge, allowing readers to master this
cutting-edge technology more efficiently. Not only are beginners able to grasp the knowledge more
quickly, but those more experienced can also get a comprehensive view of Virtual Provisioning
which they couldn't find in technical guides. In addition, the illustrations in the article can serve
as a handy reference guide whenever users need one.
Simplified Storage Optimization Techniques for VMware-Symmetrix Environment
by Janarthanan Palanichamy and Akilesh Nagarajan
Are you struggling to optimize storage for your VMware-Symmetrix environment?
Are you confused about choosing better methods or tools for storage optimization?
Do the third-party tools that you have invested in just add to the complexity and confusion without
providing any real optimization benefits?
It is easy for IT managers to get lost in the details, given the numerous tools and
articles that discuss storage optimization. What IT managers really need to know are the
simplified techniques for virtual workload modeling and device modeling that derive maximum
benefit.
As engineers who have worked with Symmetrix over the years, we have learned a few simple
techniques that can be employed to maximize performance in VMware-Symmetrix environments.
This Knowledge Sharing article explores some of those techniques to optimize storage
performance in VMware-Symmetrix environments. It explains simple techniques to:
• Collect performance metrics using native tools (FAST/SIOC) and provide insights for making
decisions that keep the environment balanced.
• Reduce I/O latencies between hosts and the storage arrays.
• Identify I/O contention issues and ways to address them.
• Make I/O load balancing decisions based on EMC FAST information.
• Perform predictive VMware Storage DRS to prevent I/O boot storm situations.
Though this article discusses storage optimization techniques for VMware-Symmetrix
environments, they can easily be adapted to other environments with different virtualization
software and arrays as well.
The Need for VPLEX REST API Integration
by Vijay Gadwal and Terence Johny
VPLEX features, like non-disruptive data mobility across heterogeneous arrays, data mobility
across sites, and highly advanced cache coherency algorithms, dissolve physical barriers in and
between data centers and open up a whole new approach to data center design.
While VPLEX as a product is designed to work with existing array-based replication tools,
with VPLEX acting as the virtualization layer, there are best practices and special considerations
that need to be accounted for when existing replication scripts are to be leveraged.
In this Knowledge Sharing article, we have devised a very simple and elegant solution that
leverages the REST API interfaces already supported on VPLEX and requires no modification
to the customer's existing array-based replication automation scripts. The article also
details how this approach can be extended easily to similar use cases on VPLEX involving snaps,
clones and provisioning operations.
We also describe the architecture of the solution and explain in detail the technical components
such as the operations supported using REST APIs, the steps necessary for the array-based
replication, various modules used, and the technologies used for the scripting.
The solution provided has been successfully designed and implemented at Fortune 500
companies and has saved several hours of work. The level of automation provided has an immense
impact on TCE, and there is little or no learning curve to adopt the solution. The simple and
innovative method we used can lead to wider use of this solution to rapidly integrate
array-based replication scripts in VPLEX environments. It has greatly assisted customers in making
a smooth transition to a VPLEX environment by retaining their existing workflows and
automation.
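As a rough illustration of the wrapper pattern described above (so that existing replication scripts can call a simple function instead of learning a new interface), here is a hedged Python sketch using the third-party requests library; the management-server address, credentials, and "/vplex/..." resource paths are placeholders, not documented VPLEX REST URIs.

```python
import requests  # third-party HTTP library

VPLEX_MGMT = "https://vplex-mgmt.example.com"  # placeholder management server
AUTH = ("service", "password")                 # placeholder credentials

def vplex_get(resource):
    """GET a resource from the VPLEX REST interface and return the JSON body.

    The URL prefix and resource names used here are illustrative assumptions;
    consult the VPLEX REST API documentation for the real URIs.
    """
    response = requests.get(
        f"{VPLEX_MGMT}/vplex/{resource}",
        auth=AUTH,
        verify=False,   # lab-only: skip certificate validation
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

# Thin wrapper an existing replication script could call without modification:
def list_clusters():
    return vplex_get("clusters")

if __name__ == "__main__":
    print(list_clusters())
```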
How Implementing VNX File System Features Can Save Your Life
by Piotr Bizior
If you're a storage administrator like me, you probably know that the industry predicts overall data
will grow by 50 times by the year 2020. On the other hand, the number of storage administrators
will only grow by 1.5 times by 2020. What that means to me is that I not only need to find more
efficient ways to store the data, back it up, and plan DR activities, but also improve the efficiency
of my day-to-day activities.
This is where EMC VNX and its features come into the picture. Implementing file storage features
like quotas, deduplication, replication, and checkpoints will not only save money and improve
RPO/RTO, but also make my life easier.
What if you could control the growth of data that users store on their home drives, monitor the
usage, and block certain files from being saved? What if I told you that you can deduplicate a file
system, which can save you up to 72% of the original data size? Sounds pretty good, right? Add to
the above list file system checkpoints for point-in-time backup capabilities, and file system
replication to address disaster recovery (DR) concerns.
My Knowledge Sharing article examines issues such as these, providing a solid and unified
solution for file storage that addresses every aspect of its lifecycle.
Overcoming Challenges in Migrating to a Converged
Infrastructure Solution
by Suhas D. Joshi
Convergence of data center infrastructure has risen to the top of the list of strategic initiatives
for IT organizations. A Converged Infrastructure Solution (CIS) integrates multiple data center
infrastructure components in a single package. There are significant benefits associated with the
use of a CIS. A CIS enables an IT organization to pool infrastructure resources, increase resource
utilization, and reduce costs. From a software application owner's point of view, a CIS offers rapid
elasticity, increased agility, and on-demand self-service capability.
On the surface, if a CIS is in the same data center as other servers, migrating the servers to
the CIS would appear to be a "walk in the park." But as the saying goes, "the devil is in the
details." In reality, when migrating servers to a CIS, numerous challenges come up and have to be
addressed.
My Knowledge Sharing article describes in detail the challenges that arise during migrations
to CIS and the strategies for overcoming the challenges. The article also includes a case study
in which nearly 2000 servers have been migrated into the Vblock-based private cloud for a
pharmaceutical company with a global footprint. The article will show how to avoid pitfalls and
implement strategies and best practices for successful migrations to a CIS. This article will be of
great interest to data center migration and virtualization team members, cloud architects, project
and program managers, planners, application owners as well as infrastructure, storage, and
networking specialists.
The author is a certified EMC Proven Professional Cloud Architect as well as a certified Project
Manager with several years of experience in Data Center Virtualization and Cloud enablement
with Vblock.
Disk Usage and Trend Analysis of Multiple Symmetrix Arrays Configured with FAST/VP
by Mumshad Mannambeth and Nasia Ullas
With the advent of multiple layers of virtualization, it has become nearly impossible to explain
to the customer exactly how the different types of disks are being utilized by their various
applications. Virtualization technologies such as Meta LUNs, Virtual Provisioning (VP), and Fully
Automated Storage Tiering (FAST) have made it difficult to create a direct disk capacity utilization
report.
As a result, storage administrators are often unable to satisfactorily answer the customer when
they pose any of the following questions:
• How much of my EFD disk is being used by the SAP application?
• If the usage is more than expected, which host in the SAP group is responsible for it?
• Which application is using most of my FC storage?
• If my applications are only using 2.5 TB of SATA disk, why is the total usage 3 TB? Where is the extra space being used?
• Why is my ECM application using EFD space when they are not billed for it?
• What is the trend of capacity utilization of ESX across multiple arrays?
In this Knowledge Sharing article, we have developed a procedure to collate, analyze, and
generate a report which can answer all of the above questions and more. This report is simple
enough to be understood by an end user or customer who does not know the complex architecture
of FAST VP/Thin Provisioning. Moreover, factoring the FAST usage trending report into the current
capacity planning method gives us a more accurate capacity management technique.
This report is vital, as no tool is currently available to analyze and report FAST VP movement
of data at a sub-LUN tier level, or to create a per-application trending report based on FAST
movement. Existing tools also do not help in reporting when an application is spread across
multiple physical arrays.
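To illustrate the kind of collation such a procedure performs, here is a small sketch of my own; the CSV layout, application names, and tier labels are assumptions, not the authors' actual data sources.

```python
import csv
from collections import defaultdict

# Assumed input: one row per thin device with its owning application,
# current tier (EFD/FC/SATA), and allocated capacity in GB, for example:
#   application,tier,allocated_gb
#   SAP,EFD,420
usage = defaultdict(float)  # (application, tier) -> allocated GB

with open("fast_vp_allocations.csv", newline="") as f:
    for row in csv.DictReader(f):
        usage[(row["application"], row["tier"])] += float(row["allocated_gb"])

for (app, tier), gb in sorted(usage.items()):
    print(f"{app:<10} {tier:<5} {gb:>10.1f} GB")
```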
Connecting the Scattered Pieces for Storage Reporting
by Hok Pui Chan
Every organization has its own needs, and there are many different types (and even different
brands) of storage equipment to be managed. When managing a sizable storage infrastructure,
an accurate, timely, and comprehensive storage reporting mechanism is important to measure
the performance and utilization of the storage infrastructure. Storage reporting is also crucial for
measuring the business value of investments in storage equipment by associating storage
costs with different applications and business units.
To produce a meaningful report for senior IT management, we need to correlate some company-
specific information into storage reports. Since most storage reporting tools do not provide the
functionality to include customized data inputs, storage administrators struggle to produce
these reports manually at regular intervals while ensuring their accuracy and timeliness.
Essential elements for a comprehensive storage report are available but scattered across
different places and tools. What we need to do is find a way to connect them and
provide different dimensions for storage reporting, e.g. from a project/business line perspective,
from a server/application perspective, and, of course, from a traditional storage array perspective.
In order to produce these correlated reports, a scalable and robust configuration repository is
required as the foundation for all of the above.
In this Knowledge Sharing article, an EMC Symmetrix VMAX SAN infrastructure will be used to
illustrate the technical details of extracting the source data, processing it, storing it
in the configuration repository, and producing the reports. These technical details
include: What VMAX information can be extracted, and how? How can that information be
transformed and imported into the configuration repository? What kinds of reports are useful
specifically for a VMAX environment? Other practical considerations and opportunities for
future extension will also be discussed.
After reading the article, the audience will have a concrete idea, with technical details, of how
they can build their own storage configuration repository and reporting facilities.
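As a minimal sketch of the repository idea, the following stores imported capacity figures in SQLite and produces a per-business-unit summary; the schema, sample rows, and report query are my own assumptions rather than the article's design.

```python
import sqlite3

conn = sqlite3.connect("storage_repo.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS capacity (
        array_id      TEXT,
        device_id     TEXT,
        business_unit TEXT,
        allocated_gb  REAL
    )
""")

# In practice these rows would come from an extraction step (parsed CLI or
# API output); the literals below are placeholders.
rows = [
    ("VMAX-001", "0A2F", "Finance", 850.0),
    ("VMAX-001", "0A30", "Finance", 420.0),
    ("VMAX-002", "1B11", "Retail", 1300.0),
]
conn.executemany("INSERT INTO capacity VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Chargeback-style report: allocated capacity per business unit
for unit, gb in conn.execute(
    "SELECT business_unit, SUM(allocated_gb) FROM capacity GROUP BY business_unit"
):
    print(f"{unit:<10} {gb:>8.1f} GB")
conn.close()
```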
Benefits, Design Philosophy, and Proposed Architecture of an Oracle RAC/ASM Clustered Solution
by Armando Rodriguez
The demand for high-performance, high-capacity, workload-optimized data warehouse and
analytics solutions has never been higher. The ability to extract information from today's large-
scale data warehouse environments requires a flexible, highly scalable storage solution that can
provide the demonstrable return on investment required in today's competitive environment.
The EMC VNX MCx multi-core platform provides a balance of performance, capacity, capital cost,
and operating cost that is unrivaled in the industry.
This Knowledge Sharing article will explore a scale-out, building-block approach for a Storage Grid
architecture utilizing VNX with an Oracle RAC data warehouse solution.
The article presents the challenges facing businesses running large-scale Oracle RAC data
warehouse environments. It provides a reference architecture based on the EMC VNX platform that
allows a business to scale out to multi-petabyte capacity and hundreds or even thousands of
gigabytes per second of storage bandwidth. Reference design, manageability, supportability, and
architecture details will be explored in depth.
RELEVANT INDUSTRY TOPICS
Storage Reporting and Chargeback in the Software Defined Data Center
by Brian Dehn
In the cloudy world of software-defined data centers, software-defined storage, and storage
automation, would you offer storage-as-a-service (STaaS) to your customers without a way to
report on or charge for usage? Probably not. However, reporting and chargeback are commonly
relegated to afterthoughts or last-minute, suboptimal additions.
Cloud environments are technology stores where customers can choose from a menu of IT
services based on their requirements. The infrastructure that enables cloud services and allows
the offering of IT-as-a-service (ITaaS) is the software-defined data center. Whereas software-
defined compute and networking have been around for a while, software-defined storage is a
relatively new concept. EMC has taken the lead in this arena by releasing the industry's first
software-defined storage solution, ViPR, as well as the industry's first software-defined storage
reporting management (SRM) solution, SRM. ViPR allows enterprises to quickly provide storage-
as-a-service (STaaS) through abstraction and control plane automation, and SRM is the only
reporting solution that provides discovery, visibility, and analytics for software-defined storage.
This Knowledge Sharing article introduces the concept of software-defined storage and explains
how EMC ViPR fills that gap in the software-defined data center. It then discusses basic reporting
requirements and tackles the advanced topic of charging back for STaaS. Finally, the article
explains how EMC SRM facilitates reporting and chargeback for software-defined storage.
Understanding IO Workload Profiles and Relation to Performance
by Stan Lanning
When implementing a new application, there is usually a requirements section or sizing guide
that indicates the minimum number of processors or cores, the amount of memory, and the storage
capacity required. For more complex application systems such as Microsoft Exchange, SAP, or
Oracle E-Business Suite, there may be sizing calculators to help determine the number of servers,
network bandwidth, database requirements, and so on. Ironically, there are relatively few tools
for storage performance sizing, and they tend to be specific to individual products or vendors.
Inputs to these tools are often not well understood or simply not known.
As a storage solutions specialist involved in sizing IT infrastructure, one of the first questions
I ask customers and sales teams is, "What are the IO workload profiles that the infrastructure
needs to support?" Often I receive a response of "100 terabytes", "2700 IOPS", "350 megabytes
per second", or simply "we don't know." Then I ask a series of questions to understand
what applications the customer is running or planning to run, how data flows in and out of the
system, and, hopefully, to work back into the several factors that make up the IO workload profiles.
Without this insight, there simply is not enough information to determine an appropriate
performance configuration.
So, what are these IO workload profiles? Why are they important? How do we identify them?
How do they relate to performance sizing?
This article will explain the fundamental characteristics of IO workload profiles, such
as IOPS, MB/s, read/write ratios, response times, and queue lengths. It will also explore how
different profiles can impact performance sizing and cover some common tools that can be used
to identify workload profile characteristics.
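To make these characteristics concrete, here is a small illustrative calculation deriving IOPS, throughput, read ratio, and average I/O size from raw counters sampled over an interval; the figures are invented for the example.

```python
# Counters accumulated over a 60-second sample interval (assumed values)
interval_s = 60
reads, writes = 96_000, 66_000          # completed I/Os in the interval
read_bytes, write_bytes = 1.5e9, 3.6e9  # bytes transferred in the interval

iops = (reads + writes) / interval_s
throughput_mbs = (read_bytes + write_bytes) / interval_s / 1e6
read_ratio = reads / (reads + writes)
avg_io_size_kb = (read_bytes + write_bytes) / (reads + writes) / 1024

print(f"IOPS:         {iops:,.0f}")
print(f"Throughput:   {throughput_mbs:.1f} MB/s")
print(f"Read/write:   {read_ratio:.0%} reads / {1 - read_ratio:.0%} writes")
print(f"Avg I/O size: {avg_io_size_kb:.1f} KB")
```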
System Tap for Linux Platforms
by Mohamed Sohail
System Tap is a tracing programming language that enables system administrators to trace and
profile a wide range of Linux computing environments, including mission-critical production Linux
servers and even embedded devices, without affecting the reliability, stability, or performance
of those environments. A System Tap (or stap) script is a set of programmatic instructions that the
administrator develops to be compiled and executed on the target system, similar to a scripting
language like Perl or Python. The difference is that those two languages are interpreted, whereas
System Tap scripts are compiled into a binary form known as a Linux kernel module.
System Tap can extract all sorts of information from a live Linux system in order to diagnose
complex performance issues or functional problems in a very safe way through probes, which
can be considered software sensors that the administrator attaches to the kernel components to
be diagnosed, including Linux system calls, low-level kernel functions, and subsystem functions.
System Tap is designed to provide the infrastructure for diagnosing and tracing a running Linux
kernel and providing a detailed report of its status. This enables system administrators and
developers to write efficient scripts to identify where performance issues and/or bugs originate.
Before System Tap, profiling a running Linux kernel was a very long process consisting of adding
instrumentation to the kernel, recompiling the whole kernel to generate a new kernel image with
the custom monitoring instructions, and then rebooting the Linux system into this kernel.
System Tap eliminates all of this, allowing users to gather the same information by simply running
user-written System Tap scripts.
This Knowledge Sharing article:
• Introduces users to System Tap, familiarizes them with its architecture, and provides setup instructions.
• Presents how System Tap can be very useful for diagnosing and developing any of the EMC platforms running Linux, especially when generic tracing and troubleshooting tools cannot provide meaningful insight into a certain problem.
• Provides examples of what System Tap can do with sample general-purpose scripts.
• Offers tips and tricks for EMC Linux-based products.
Embedded QA in Scrum
by Jitesh Anand, Chaithra Thimmappa, Vishwanath Channaveerappa,
Varun AdemaneMutuguppe, and Neil O'Toole
The higher the quality, the higher the customer satisfaction and the greater the increase in sales.
QA is the customer advocate who assures the quality of the product. Thus, the role of QA is
becoming more critical in modern software development life cycles like Scrum, and the success
of QA plays a vital role in the success of Scrum.
In Scrum, projects draw testers out of the background (compared to the Waterfall model) and put
them into the spotlight. Testers can play a distinct role and drive product development by creating
acceptance tests before any code is even written. Working closely with development, the Scrum
team drives the release of the project. Working with the owners of user stories, testers translate
user stories into acceptance tests and help in writing automated test stories.
QA's role was to test the software being built by the Scrum team(s) and assure product quality.
Challenges faced in Scrum:
• QA working in detached mode
• Delays in shipping the product
• Excessive workload
• Reduced or no scope for automation
• Compromised quality
To overcome these issues, fully embedding QA into the Scrum team has the following benefits:
• Each increment of code is tested immediately as it is finished.
• The whole team is responsible for quality.
• Quality objectives are defined by QA with buy-in from the whole team.
• Cross-pollination (QA learns development traits and practices while Development gets exposure to the QA mindset and test techniques).
• Releasable software at the end of a sprint.
• Embedded QA is not one single resource carrying out all quality/automation tasks for the entire Scrum team.
Embedded QA reduces project management overhead, increases Development and QA
interaction, and improves product quality. Finding bugs earlier and fixing them earlier reduces
the overall cost of fixing issues post-deployment. This Knowledge Sharing article covers the
implementation, roles, responsibilities, challenges, and benefits of Embedded QA in detail.
Detailed Application Migration Process
by Randeep Singh and Dr. Narendra Singh Bhati
This Knowledge Sharing article on the application migration process will explain the end-to-end
processes and best practices to be followed during any application migration. It is based on
experience migrating infrastructure and associated applications from three existing data
centres to three new data centres, involving a stack of 4,000 servers and 3 PB of storage across
two of the data centres, with the third, the smallest in the stack, holding around 500 servers and
600 TB of data. The processes and best practices followed are the same across the three GEOs
and have yielded positive results so far.
Any migration project can turn out to be a disaster if it is not planned properly in a structured
manner. Thus, it is essential to clearly identify the requirements and what needs to be captured,
plan in a structured way, follow best practices and processes, and then execute.
Areas of focus in this article:
• Application migration process flow
• Capturing key information
• Application migration planning templates
• Understanding application complexity
• Agreeing on migration methods and the downtime agreement
• Agreeing on the pre-migration and cutover templates to be used for migration
• Communication
• Execution
To summarize, understanding the complexity of the applications to be moved and the associated
infrastructure is vital to any migration project. The business puts in a lot of time, so it
becomes necessary to value their time and build their confidence by following best practices and
processes during any migration project.
A Journey to Power Intelligent IT
by Mohamed Sohail
Sustainability has emerged as a result of significant concerns about the unintended social,
environmental, and economic consequences of rapid population growth, economic growth, and
consumption of our natural resources. Saving our earth and going green is a key factor in future
planning and design across all industries. A very important consideration that affects the
decisions of IT managers is the power consumption and carbon emissions of data centers.
This Knowledge Sharing article provides a cookbook for decision makers and data center
managers to help them on their journey to green data centers. It provides recommendations,
considerations, and tips and tricks on how to design future data centers or enhance current models.
The article concentrates on best practices and the use of EMC products to achieve optimal results
on the journey to Green IT. We will share a winning idea from the EMC Innovation Showcase
2012, "EMC Power Smart: Energy Usage Big Data Model for EMC Products Worldwide: A Smart
Energy Monitoring and Reporting System." We will also illustrate best practices for modern,
sustainability-compliant data centers.
Health Check and Capacity Reporting for Heterogeneous
SAN Environments
by Mumshad Mannambeth and Salem Sampath
Large service delivery accounts often find it difficult to perform health checks and to monitor
capacity on hundreds of arrays and switches spread across different environments, in various
locations. To guarantee Service Level Agreements (SLAs), a health check report is prepared several
times a day so that the arrays can be monitored closely and all failures handled appropriately.
This is a time-consuming and tedious task that requires the effort of multiple engineers dedicated
to this purpose alone. The complexity increases as fabrics may be spread across different
environments and locations. Moreover, due to the heterogeneity of the arrays and switches, a
single tool may not serve the purpose of monitoring the entire environment.
This Knowledge Sharing article lists the methodologies used in implementing a time-saving,
automated health check and capacity report generation process for large numbers of different
types of arrays and switches in a shared cloud environment. This process uses numerous scripts
developed in OS-specific shells, along with presentation tools like Excel, to prepare reports in
minutes. The process eliminates the need for users to log in to each management workstation
to inspect the array. Instead, health check scripts are run on the workstations using scheduled
tasks, and reports are automatically emailed to the administrators' group.
The article explores:
• Health check routines for different types of heterogeneous arrays and switches
> EMC Symmetrix/VMAX
> EMC VNX/CLARiiON
> Hitachi HDS arrays
> IBM XIV arrays
> HP EVA/XP arrays
> Brocade switches
> Cisco switches
• Scripts to perform health checks on the above-mentioned arrays and switches
• Data gathering techniques
> Emailing techniques from different OS platforms
> Automatic FTP uploading and downloading scripts
• Analysis of collated reports using Excel VBA: automation of health check reports from multiple Outlook emails into a single Excel-based report
The procedure described in this article has been successfully implemented in multiple major
accounts reducing manual effort and time taken to perform health checks on a large number
of arrays and switches. It is especially suitable for accounts monitoring multiple domains and
heterogeneous products that cannot be monitored with a single tool.
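As a generic, simplified sketch of the kind of scheduled collection and emailing described above (the authors' actual scripts are OS- and array-specific), a wrapper might run a per-array check command and mail the combined output; the hostnames, script paths, and addresses below are placeholders.

```python
import smtplib
import subprocess
from email.message import EmailMessage

# Placeholder inventory: a label and the health-check command to run for it.
CHECKS = {
    "symm-mgmt01": ["ssh", "symm-mgmt01", "/opt/scripts/array_health.sh"],
    "vnx-mgmt02": ["ssh", "vnx-mgmt02", "/opt/scripts/array_health.sh"],
}

def collect_reports():
    """Run each check and concatenate the outputs into one report body."""
    sections = []
    for name, cmd in CHECKS.items():
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
        sections.append(f"=== {name} ===\n{result.stdout or result.stderr}")
    return "\n\n".join(sections)

def email_report(body):
    msg = EmailMessage()
    msg["Subject"] = "Daily SAN health check"
    msg["From"] = "sanops@example.com"
    msg["To"] = "storage-admins@example.com"
    msg.set_content(body)
    with smtplib.SMTP("mailhost.example.com") as smtp:
        smtp.send_message(msg)

if __name__ == "__main__":
    email_report(collect_reports())
```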
EMC Corporation
Hopkinton, Massachusetts 01748-9103
1-508-435-1000 In North America 1-866-464-7381
www.EMC.com
EMC², EMC, EMC Proven, EMC ControlCenter, Data Domain, NetWorker, Symmetrix, Unisphere, VMAX, VNX, VNXe, VPLEX, ViPR, and the EMC logo
are registered trademarks or trademarks of EMC Corporation in the United States and other countries. VMware is a registered trademark of
VMware, Inc. in the United States and/or other jurisdictions. All other trademarks used herein are the property of their respective owners.
Copyright © 2014 EMC Corporation. All rights reserved. Published in the USA. 4/14 Book of Abstracts H2771.12