CLOUDIQ: INTELLIGENT, PROACTIVE MONITORING AND ANALYTICS

A cloud-native Software-as-a-Service for Dell EMC™ Midrange Storage

ABSTRACT
This white paper introduces Dell EMC™ CloudIQ, a cloud-native Software-as-a-Service
platform that enables administrators to remotely monitor Dell EMC Unity systems from
anywhere and at any time. This paper provides a detailed description of how to use
CloudIQ to proactively monitor and troubleshoot Dell EMC Unity systems.

April 2018

WHITE PAPER
The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of any kind with respect to the
information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable software license.

Copyright © 2018 Dell Inc. or its subsidiaries. All Rights Reserved. Dell EMC, and other trademarks are trademarks of Dell Inc. or its
subsidiaries. Other trademarks may be the property of their respective owners. Published in the USA [04/18] [White Paper] [H15691.1]

Dell EMC believes the information in this document is accurate as of its publication date. The information is subject to change without
notice.

TABLE OF CONTENTS

EXECUTIVE SUMMARY
    Audience
    Terminology

CLOUDIQ
    The Value of CloudIQ to the Customer
    CloudIQ Requirements
        CloudIQ Data Collection
    CloudIQ Features
    Comprehensive Monitoring
    Intelligent Analytics – Anomaly Detection and Capacity Predictions
        Performance Anomaly Detection
        Capacity Trending and Predictions
    Proactive Health Score
    CloudIQ Notification Emails

CLOUDIQ USER INTERFACE
    Navigating CloudIQ
    What’s New in CloudIQ
    Overview Page

SYSTEMS - SUMMARY
    Systems Summary – Health Score View
    Systems Summary – Configuration
    Systems Summary – Capacity
    Systems Summary – Performance

SYSTEM – DETAILS
    System Details – Health Score
    System Details – Configuration
    System Details – Capacity
    System Details – Performance

HEALTH CENTER
    Health Center – Proactive Health
    Health Center – Alerts
    Health Center – Reclaimable Storage

METRICS
    Metric Dashboard Wizard
    Performance Comparison graphs

STORAGE OBJECTS
    Pools Listing
        Pools – Properties
        Pools – Capacity
        Pools – Performance
    Block Objects
        Block Objects – Properties
        Block Objects – Capacity
        Block Objects – Performance
        Block Objects – Data Protection
    File Objects
        File Objects – Properties
        File Objects – Performance
        File Objects – Data Protection

HOSTS
    Hosts – Properties
    Hosts – Capacity

APPENDIX A – CLOUDIQ SECURITY

APPENDIX B - ENABLING CLOUDIQ AT THE SYSTEM
    Dell EMC Unity System
EXECUTIVE SUMMARY
With our busy daily lives, it is important to find easier and faster ways to manage storage. With the Dell EMC Unity storage systems,
Dell EMC seeks to simplify the user experience in every possible way. One key aspect is in providing a simple way to monitor single or
multiple Dell EMC Unity systems.

CloudIQ is designed to deliver these capabilities to customers:

• Centralized Monitoring of Dell EMC Unity systems.
• Proactive Health Score to help users identify potential risks in the environment.
• Predictive Analytics enabling capacity trending, capacity predictions, and performance troubleshooting for Dell EMC Unity systems.

This white paper describes these CloudIQ features, which are presented in a consolidated, user-friendly interface through any HTML5
browser.

As a Software-as-a-Service solution, CloudIQ delivers frequent, dynamic, non-disruptive content updates for the user. CloudIQ is built
in a secure multi-tenant platform to ensure that each customer tenant is completely isolated and secure from other customers.

Audience
This white paper is intended for Dell EMC customers, partners, and employees who are interested in understanding CloudIQ features
and how to monitor Dell EMC Unity Hybrid, Dell EMC Unity All Flash, and Dell EMC UnityVSA – Professional Edition systems.

Terminology
Secure Remote Services (formerly named ESRS) provides the remote connectivity that enables Dell EMC Unity systems to connect
to CloudIQ and to automatically open Service Requests (SRs) for critical issues that arise. Secure Remote Services allows Dell EMC to
securely transfer files, such as logs and dumps, from the systems. There are two types of Secure Remote Services: Integrated and
Centralized.

• Integrated Secure Remote Services is embedded in the Dell EMC Unity Unisphere system and is the recommended
configuration. It provides High Availability (HA) failover of Secure Remote Services from the primary Storage Processor (SP) to the
backup SP. Integrated Secure Remote Services communication uses ports 443 and 8443 (HTTPS) and needs unrestricted access to the
Global Access Servers (GAS).

• Centralized Secure Remote Services connects the system to a Secure Remote Services gateway server installed on a customer
site. Centralized Secure Remote Services does not configure an HA feature. Centralized Secure Remote Services communication
uses ports 443 and 9443 (HTTPS) and needs unrestricted access to the Global Access Servers (GAS).

Unisphere – The graphical management interface that is built into Dell EMC systems for configuring, provisioning, and managing the
systems’ features.

CloudIQ
CloudIQ is a cloud-native, Software-as-a-Service (SaaS) offering by Dell EMC that provides a simple monitoring interface for an
unlimited number of Dell EMC Unity systems. CloudIQ is hosted on Dell EMC infrastructure which is Highly Available, Fault Tolerant,
and guarantees a 4-hour Disaster Recovery SLA.

CloudIQ provides each customer an independent secure portal, and ensures that customers can see only their own environment
in CloudIQ. Each user can see only those systems in CloudIQ which are part of that user’s site access as configured in
Dell EMC Service Center. Customers register their mid-range systems with their Site ID, and CloudIQ collects data from the systems in
near real time, at the polling intervals shown in the table below, to enable monitoring and troubleshooting for Dell EMC Unity systems.
CloudIQ retains 2 years of historical data for systems that are actively being monitored.

The Value of CloudIQ to the Customer


• Reduce TCO - Access using a web browser from anywhere, increase self-service, and expedite quality resolutions - all at no
charge.
• Expedite Time to Value - Get started in minutes with nothing to install or license. New features and capabilities are seamlessly
and non-disruptively provided through CloudIQ.
• Drive Business Value - Deliver higher uptime, increase performance, and perform effective capacity planning.

CloudIQ Requirements
CloudIQ is available to all customers with the following Dell EMC Unity systems running version 4.1 or later: Dell EMC Unity All
Flash, Dell EMC Unity Hybrid, and/or Dell EMC UnityVSA – Professional Edition.

The following requirements must be fulfilled:

• Secure Remote Services established and configured for CloudIQ Data Access.
• Valid Dell EMC support contract and account which the user will use to access CloudIQ.

When these requirements have been met, users can securely connect the system to CloudIQ and start to monitor their Dell EMC Unity
systems.

CloudIQ Data Collection


After the Dell EMC Unity systems have established connection to CloudIQ, data will be collected for the Dell EMC Mid-Range systems.

The frequency with which data is updated in CloudIQ varies, based on the type of information. The following table shows the types of
data and the frequency with which CloudIQ updates this information:

Type of Data       Update Frequency
Alerts             5 minutes
Performance        5 minutes
Capacity           1 hour
Configuration      1 hour
Data Collection    Daily
Details about CloudIQ’s security measures are available in Appendix A, “CloudIQ Security”. Details about initial Secure Remote Services
configuration and CloudIQ access are available in Appendix B, “Enabling CloudIQ at the System”.

CloudIQ Features
CloudIQ makes it faster and easier to analyze and identify storage issues accurately and intelligently, by delivering:
• Comprehensive monitoring of performance, capacity, system components, configuration, and data protection. CloudIQ also
provides details about Systems, Storage Pools, and Block and File Storage Objects.
• Predictive Analytics that enable intelligent planning and optimization of capacity and performance utilization.
• Comprehensive Proactive Health scores for Dell EMC Unity storage. CloudIQ identifies potential issues in the storage environment and
offers practical recommendations based on best practices and risk management.

Comprehensive Monitoring
CloudIQ provides a helpful Overview Page that summarizes the key aspects of a storage environment so that the user can quickly see
what needs to be addressed in the environment. These summaries are especially focused on the Proactive Health Score, Anomaly
Detection, and Capacity Predictions as discussed below. From here, the user can easily navigate to the areas of interest or requiring
attention.

Intelligent Analytics – Anomaly Detection and Capacity Predictions


CloudIQ’s advanced predictive analytics differentiate it from other monitoring and reporting tools.

Performance Anomaly Detection


Using machine learning, CloudIQ analyzes historical performance data to determine the range of normal behavior for each metric and
indicates when a metric is either above or below that range. These ‘norms’ are the baseline against which a system’s current behavior
is compared to identify performance abnormalities. This provides timely information about the risk level of the Dell EMC Unity systems
with insights into conditions and anomalies affecting performance.
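
The idea of comparing a metric against a learned range can be illustrated with a small Python sketch. This paper does not describe the actual machine learning model CloudIQ uses, so the sketch substitutes a simple statistical band (mean plus or minus `k` standard deviations of the historical samples). The function names, the value of `k`, and the sample IOPS values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def normal_range(history, k=3.0):
    """Return (low, high) as mean +/- k sample standard deviations.

    This band is a stand-in for the learned range of normal behavior;
    it is not CloudIQ's actual model.
    """
    centre = mean(history)
    spread = stdev(history)
    return centre - k * spread, centre + k * spread


def classify(value, history, k=3.0):
    """Label a new sample as "High", "Low", or "Normal" against the band."""
    low, high = normal_range(history, k)
    if value > high:
        return "High"
    if value < low:
        return "Low"
    return "Normal"


# Hypothetical historical IOPS samples for one system.
iops_history = [1000, 1040, 980, 1010, 990, 1020, 1005, 995]

print(classify(1500, iops_history))  # High
print(classify(400, iops_history))   # Low
print(classify(1010, iops_history))  # Normal
```

In practice the history would cover a rolling window (the Overview page description mentions a rolling 3-week period), so the band follows gradual changes in workload.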

Capacity Trending and Predictions


CloudIQ provides historical trending and future predictions to give intelligent insight into how capacity is being used, and what future
needs may arise. CloudIQ tracks capacity usage from the time the systems were first connected to CloudIQ, and also
leverages a learning algorithm to predict when Storage Pools will become full. This assists users both with short-term risk mitigation
and longer-term planning.
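
To make the notion of a time-to-full prediction concrete, the following Python sketch fits a straight line to daily usage samples and extrapolates to the pool size. This is only an assumed stand-in: the paper does not specify CloudIQ's learning algorithm, and the function name, units, and sample data are hypothetical.

```python
def days_until_full(used_tb, pool_size_tb):
    """Estimate days until a pool is full from daily usage samples.

    used_tb: one sample per day, oldest first.
    Returns the number of days after the last sample at which a
    least-squares line reaches pool_size_tb, 0.0 if the pool is already
    full, or None if usage is flat or shrinking.
    """
    n = len(used_tb)
    if n < 2:
        raise ValueError("need at least two samples")
    if used_tb[-1] >= pool_size_tb:
        return 0.0

    # Ordinary least-squares fit of usage against day index 0..n-1.
    x_mean = (n - 1) / 2
    y_mean = sum(used_tb) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(used_tb))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    slope = numerator / denominator
    if slope <= 0:
        return None

    intercept = y_mean - slope * x_mean
    day_full = (pool_size_tb - intercept) / slope
    return max(day_full - (n - 1), 0.0)


# Hypothetical pool growing by 1 TB per day toward a 20 TB limit.
print(days_until_full([10, 11, 12, 13, 14], 20))  # 6.0
```

A linear fit is the simplest possible trend model; a production predictor would also need to handle seasonality and sudden changes in growth.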

Proactive Health Score


The Proactive Health Score is another key differentiator for CloudIQ, relative to other monitoring and reporting tools. CloudIQ
proactively monitors the critical areas of each storage system to quickly identify potential issues and provide recommended remediation
solutions. The Health Score is a number ranging from 100 to 0, with 100 being a perfect Health Score. The Health Score is based upon
the five categories shown in the table below.

Category           Sample Health Checks
Components         Physical components with issues, faulty cables, fans, etc.
Configuration      Non-HA Hosts connections
Performance        CPU at high utilization and Service Processors significantly imbalanced
Capacity           Pools that are over-subscribed and reaching full capacity
Data Protection    Recovery Point Objectives not meeting native replication and snapshot policy

Some examples of how the Proactive Health checks mitigate risk are:

• Verifying redundant paths providing High Availability from the System through the SAN to the Hosts.
• Monitoring the capacity and subscription rate of Storage Pools to understand their trending and predicted time to full, to help
the administrator avoid a total stoppage of I/O which could result in application downtime.
• Identifying Data Protection policies that are not being fulfilled, such as Recovery Point Objectives that are not being met.

CloudIQ Notification Emails
CloudIQ sends an email whenever a Health Score changes, in near real time, so immediate action can be taken to resolve any
issues before they lead to a data outage. These emails (see the example email to Mary Kimball – Acme Corporation) bring
attention to the specific systems with issues that have been found. In many cases, the user will be notified about issues that commonly
go unnoticed until a complete data outage happens.

In this example email sent to Mary Kimball, CloudIQ has identified issues
with two hosts (MRApp1_Host1 and MRApp1_Host2) connected to the
Market Research Dell EMC Unity system, that are not logged into both SPs
of the system. This is a loss of redundant (HA) paths which could result in a
data outage should the remaining path also fail. Commonly this condition
goes unnoticed as this is not a system failure, but a Host HBA, switch port,
or cable failure.

By clicking the “Launch CloudIQ” button, the user can quickly go to CloudIQ,
navigate to the system, and view the related details affecting the Health
Score.

CloudIQ User Interface


CloudIQ is a cloud-based application, delivered in an HTML5 browser-based user interface which can be reached at
https://cloudiq.dellemc.com. An alternate method of connecting is by clicking the CloudIQ icon on the top menu bar of the
Unisphere web page.

When connected to CloudIQ, users can securely view their storage systems environment.

The illustrations and use cases discussed in this White Paper can be viewed with the online simulator accessible by using the following
link: https://CloudIQ.dellemc.com/simulator. In the simulator environment, there are Dell EMC Unity systems that display various levels of
operations to show and understand the value of CloudIQ. When viewing the simulator, the dates will always be based on the current
date the simulator is launched.

Navigating CloudIQ
The menu tree on the left shows the high-level sections of CloudIQ. Each section will display key attributes with sortable columns for a
common and simplified user experience across the CloudIQ GUI.

• Overview – Status view of storage environment
• Systems – Card or List View of all the systems with Health Scores
• Health Center – List view of each system with the details related to the Health Score
• Metrics – Customizable metrics dashboard
• Storage – Aggregate Storage listings for Pools, Block, and File storage
• Hosts – Aggregate listing of all hosts connected to the systems with connectivity and capacity
• Settings – The CloudIQ configuration details for your account, User Community, and Customer Support
• Help – Online CloudIQ documentation which is searchable

What’s New in CloudIQ
CloudIQ is updated frequently to continue to deliver helpful new content to users. New CloudIQ features can be found by clicking the
“NEW” icon on the top menu bar.

The “What’s New in CloudIQ” window pops up, showing what has changed and what enhancements have been added. To
see a historical list of the monthly updates, click View All Enhancements. The most recent information is shown first, and you can
scroll down the list to see the monthly evolution of CloudIQ since its introduction. If you do not wish to see this window again, it can
be turned off by sliding the Don’t show again until the next update button.

Once logged into CloudIQ, some key functions are available on the upper right of the menu bar: logging out by selecting your
email, seeing what’s new with a historical roll-up of the CloudIQ updates, and a dropdown menu to select Online Chat with Customer
Service or provide your Feedback.

Overview Page
The Overview page provides a consolidated view of the Dell EMC Unity storage environments. This is the highest-level summary of the
environment providing users with a roll-up of the key factors to understand the overall health and operation of the storage systems.

Selecting Settings provides information about the user account and systems, the users (Team members, Advisors, and Partners) that
have access to this CloudIQ environment, and Customer Support.

Selecting Help provides online help topics with the latest information for CloudIQ.

The Overview page has the following tiles of information:

• System Health Scores – Groups the monitored systems in the environment into three categories based on their Health Score:
Poor (0-73), Fair (74-94), and Good (95-100). The number to the right of each category is the count of systems in that category.
Hovering over that number displays the system names with their Health Scores.

• Systems with Anomalies – Anomalies are defined as deviations from the system “norms” for performance, based on a rolling
3-week period. The performance categories monitored are: IOPS, Bandwidth, Backend IOPS, Block Latency, and SP Utilization.
Clicking the number under each category shows the systems and the direction of the anomaly (High or Low). Selecting the
system takes the user to the system’s detailed Performance graphs.

• System Connectivity – Shows the total systems monitored in CloudIQ, within three categories:
    o Identified systems not configured
    o Systems with lost connectivity
    o Systems which are successfully connected
  The details for each category show the last time the system connected to CloudIQ, and which connection (Remote Support or
  CloudIQ) was lost.

• Pools Running out of Space – Leverages predictive analytics to identify the number of pools at risk of running out of space.
Selecting the subtitle navigates the user to the aggregate Pool listing. Hovering over the number under each of the four
categories pops up a list of pools within that time range:
    o Full
    o Within a week (7 days)
    o Within a month (8 – 30 days)
    o Within a quarter (31 – 90 days)
  Clicking the number navigates the user to the Pool listing, filtered by that time range.

• System Alerts – Summarizes the alerts collected by CloudIQ from the monitored storage systems over the last 24 hours.
Selecting the subtitle “x alerts in the past 24 hours” shows a filtered list of alerts from the last 24 hours. Selecting any of the
numbers under each alert level shows the alert in detail.

• Support – Link to MyService360, which is another cloud-based service dashboard that offers personal insights across the global
storage environment.

• Storage Usage – Graphical summary showing consumption by each storage resource category across all the monitored systems.
Hovering over the colors in the circle gives additional details of the total used storage for that resource category.
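
The four time ranges used by the Pools Running out of Space tile can be sketched as a simple bucketing function in Python. The category names and day ranges come from the tile description; the function name and the handling of fractional days at the range boundaries are assumptions made for this example.

```python
def time_to_full_category(days_to_full):
    """Map a predicted time to full (in days) to an Overview tile category.

    Returns None when the pool is not predicted to fill within 90 days
    or when no prediction is available. Treating fractional values with
    inclusive upper bounds is an assumption for illustration.
    """
    if days_to_full is None:
        return None
    if days_to_full <= 0:
        return "Full"
    if days_to_full <= 7:
        return "Within a week"
    if days_to_full <= 30:
        return "Within a month"
    if days_to_full <= 90:
        return "Within a quarter"
    return None


for days in (0, 5, 21, 60, 200):
    print(days, time_to_full_category(days))
```

Counting the pools in each category would then reproduce the numbers shown on the tile.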

Systems - Summary
The Systems view can be switched among multiple views, including Health Score, Configuration,
Capacity, and Performance, using the View by drop-down menu. The default format of this page is
viewing by Health Score, in the Card view, as shown below.

The Card View shows the five categories that are monitored by CloudIQ: Components,
Configuration, Capacity, Performance, and Data Protection.

Users can alternatively choose the List view, by selecting the icon to the right of the View by drop-down box.

Note: If the List view is selected, this will become the new default view for all multi-system views until the user logs out or changes
back to the Card view.

The Export icon can be found on each of the pages and will export data across all views to a single CSV file.

Each view provides this information:

• Score – CloudIQ Health Score for system
• Name – User-defined name of system
• Model – Specific model of system
• Serial number – Unique serial number for system

Systems Summary – Health Score View


Each system has a Health Score (from 100 to 0), calculated as 100 minus the point deduction of the issue with the greatest impact
across the five categories. The number in the circle represents the most significant issue that needs to be addressed and drives the
Health Score. Each of the five monitored categories is given a score expressed as a negative number, which is the deduction from the
perfect score of 100. This approach is intended to help the user focus first on the most significant issue for the system, so that the user
can resolve the issue to improve the Health Score.
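
The calculation described above can be sketched in a few lines of Python. The rule (100 minus the largest category deduction) and the Poor, Fair, and Good ranges come from this paper; the function names and the example point values are hypothetical, since the paper does not list the deduction assigned to each health check.

```python
def health_score(deductions):
    """Return 100 minus the largest deduction across the categories.

    deductions: mapping of category name to the points deducted for the
    most significant issue in that category (0 if there is none).
    """
    worst = max(deductions.values(), default=0)
    return 100 - worst


def rating(score):
    """Map a Health Score to the ranges shown on the Overview page."""
    if score >= 95:
        return "Good"
    if score >= 74:
        return "Fair"
    return "Poor"


# Hypothetical deductions for one system; the point values are invented.
system = {
    "Components": 0,
    "Configuration": 10,
    "Capacity": 30,
    "Performance": 0,
    "Data Protection": 0,
}

score = health_score(system)
print(score, rating(score))  # 70 Poor
```

Because only the largest deduction counts, resolving the Capacity issue in this example would raise the score to 90 (Fair), with the Configuration issue then driving the score.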

This Systems Card view will link the user to the Health Center (discussed in the next section) when selecting any point deduction.

Systems Summary – Configuration
This view shows the systems’ Configuration details. The information provided is:

• Version – Software version installed
• Last Contact Time – The last time the system data was sent to CloudIQ
• Location – Location where the system is installed
• Site name – Site ID with which the system is associated

Card View is the default view when accessing the multi-systems view.

List View is the alternate view that can be selected with the icon to the left of the Export icon. For large environments, the list view may
be more useful because it allows the user to sort columns.

Systems Summary – Capacity
This view shows the systems’ Capacity details. The information provided is:

• Usable – Total disk capacity, which includes Used and Free space
• Used – Disk capacity used for user data, that is, allocated to an object, such as a LUN, Volume, or file system
• Free – Disk capacity provisioned as a storage pool but not allocated to an object, such as a LUN or file system
• Logical – Total capacity visible to hosts attached to this system
• Savings
    o Overall Efficiency – System-level storage efficiency ratio
    o Thin – Ratio of thin-provisioned objects on the system
    o Snapshots – Ratio of snapshots on the system
    o Data Reduction – Ratio of data that has data reduction applied (using Compression and/or Deduplication)

The Efficiency detail corresponds to the Logical Capacity Guarantee stated for Dell EMC Unity All-Flash systems running version 4.1.0
or above. The Efficiency Details include Logical Capacity, Overall Efficiency Ratio, and the ratios for Thin Objects, Snapshots, and
Data Reduction, which collectively contribute to the Overall Efficiency Ratio.

Note: For Dell EMC Unity systems running version 4.3 and later, Data Reduction includes Compression and/or Deduplication.

Card View

Systems Summary – Performance
This view shows the systems’ Performance details. The information provided is:

• IOPS – Total number of I/O requests over the last 24 hours
• Bandwidth – Current system bandwidth
• Utilization – Current SP utilization at the last polling
• LUN Latency – Time required for a packet to travel from the source to the destination
• Performance Trend graph – IOPS over the past 24 hours with a data point every 5 minutes

CloudIQ offers the additional feature of enabling the user to select multiple systems (up to 10) to compare performance metrics. The
user simply clicks the checkbox to select the systems to compare, and then clicks the Compare Metrics button.

System – Details
Within the System page of CloudIQ, there are detailed views of any individual system monitored by CloudIQ. Selecting any system
from any summary view will show a tab view of that system for Health Score, Configuration, Capacity, and Performance.

System Details – Health Score


This tab shows the details driving the Health Score for a selected system. In this example there are three issues, two in the
Configuration category and one in the Capacity category. Selecting the category and then selecting one of the issues will display
a recommended resolution.

This view also provides any other issues that are found in any of the categories:

• Components
• Configuration
• Capacity
• Performance
• Data Protection

Scrolling down in this view shows a historical timeline and calendar options. This graph displays the historical trend of the health
score and details of any issue(s) over the displayed range of time.

Selecting any of the issues listed to the right of the graph will mark the change on the timeline, and a summary of the active issues
will be displayed. Viewing across a longer-term time range can be helpful in identifying recurring issues in the environment.

Selecting the calendar will open a drop-down allowing users to select one of the predefined ranges or enter a custom time
range. A custom view is the default.

Selecting any of the issues on the right will present details (shown below). Selecting any line item will display the remediation.

System Details – Configuration
This tab shows the physical and logical components of a selected system:

• Pools
• Storage
• Drives
• Hosts

The upper portion of this view provides the hardware identification and location, Uptime, Version, IP address, and important Hotfixes that need to be applied.

The Pools tab shows various information about the configured storage pools including Total Size, Used %, Subscription %, Time To Full, and Free.

This information helps identify pools at risk, where the subscription rate is greater than the total free storage and the Time to Full is predicted within a month.
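The at-risk check described above can be sketched in a few lines of Python, for example against pool data exported from the listing. This is only an illustration: the field names (`subscription_pct`, `days_to_full`), the reading of "at risk" as a Subscription % above 100, and the 30-day window are all assumptions rather than CloudIQ definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    name: str
    subscription_pct: float      # hypothetical field for the Subscription % column
    days_to_full: Optional[int]  # None when no Time to Full prediction is available


def pools_at_risk(pools, max_days=30, max_subscription_pct=100.0):
    """Return pools that are oversubscribed and predicted to fill within max_days."""
    return [
        p for p in pools
        if p.subscription_pct > max_subscription_pct
        and p.days_to_full is not None
        and p.days_to_full <= max_days
    ]


# Illustrative data only; not taken from a real system.
pools = [
    Pool("pool_a", 140.0, 21),
    Pool("pool_b", 80.0, 12),
    Pool("pool_c", 120.0, None),
]
print([p.name for p in pools_at_risk(pools)])
```

Both thresholds are parameters so that a stricter or looser policy can be tried without changing the function.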

The Storage tab shows all the objects in the system: LUNs, File Systems, VMware VMFS, and NFS. For instance, this view can help to
determine if there is a specific object which is consuming the greatest amount of storage.

The Drives tab gives the details on the drives for the given storage system and where they are located in the system.

The Hosts tab gives the details about the hosts attached to this storage system.

System Details – Capacity


This tab shows the storage details for a selected system:

• Total Capacity
• Storage Usage
• Drive Type Usage
• Pools

The Total Capacity is a breakdown of the raw storage: Used, Free, and Unconfigured Drives.

Savings is a breakdown of the Logical and Used savings of the total storage visible to the hosts.

Overall Efficiency is a system-level storage efficiency ratio. This ratio is shown for Thin storage, Snapshots, and Data Reduction (Compression and Deduplication). It indicates how much effective capacity is gained by leveraging the various efficiency features.

Storage Usage shows the consumed capacity of these storage objects: LUNs, File Systems, VMware (VMDK and VMFS), and
Snapshots.

Drive Type Usage shows what drive types are installed in the system, with configured and unconfigured capacity. Hovering over the
rings will show the details related to that configuration.

Pools lists the configured storage pools on the system. It includes the Free, Used, and Time to Full details. Selecting a pool name will
redirect the user to the Pool Details page.

System Details – Performance


This tab shows a selected system’s performance details for all its objects.

• Block Latency
• IOPS
• Bandwidth

Storage Object Activity displays Block Latency, IOPS, and Bandwidth over a 24-hour period. The data is sorted from high to low to quickly show which objects are using the most resources. Below this is a summary view of the system performance metrics with a detailed graphical timeline.

Each performance graph shows a 24-hour timeline with an overlay of historic seasonality. Any anomalies detected will be displayed –
for example, as seen in the SPA and SPB Utilization below. Selecting any point on any of the graphs will give the top five most active
storage objects over that time period.
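As a rough illustration of a "top five most active objects" ranking, the sketch below averages IOPS per object inside a selected time window and returns the highest-ranked names. The paper does not state how CloudIQ measures activity, so ranking by mean IOPS, the `(timestamp, object, iops)` sample layout, and the function name are assumptions.

```python
from collections import defaultdict


def top_active_objects(samples, start, end, n=5):
    """Rank objects by mean IOPS over samples with start <= timestamp <= end."""
    totals = defaultdict(lambda: [0.0, 0])  # object -> [sum of IOPS, sample count]
    for timestamp, obj, iops in samples:
        if start <= timestamp <= end:
            totals[obj][0] += iops
            totals[obj][1] += 1
    means = {obj: total / count for obj, (total, count) in totals.items()}
    return sorted(means, key=means.get, reverse=True)[:n]


# Timestamps are minutes from the start of the graph; values are illustrative.
samples = [
    (0, "lun1", 100), (5, "lun1", 300),
    (0, "lun2", 50), (5, "lun2", 50),
    (0, "fs1", 500), (60, "fs1", 9000),
]
print(top_active_objects(samples, start=0, end=5, n=2))
```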

Click the GO TO ALL METRICS button in the Storage Object Activity upper right corner to access the Metrics page for additional
performance metrics.

The Metrics section provides more information about performance charts and how to create customized performance dashboards.

Health Center
The Health Center has three main sections:

• Proactive Health
• Alerts
• Reclaimable Storage

Health Center – Proactive Health


The Proactive Health section gives a comprehensive view of all the current health issues across all the monitored systems in the
environment. The user can use the Refine button to change the view to a single system or multiple systems, in order to focus on
issues for a particular system. When the user types the name of the system, the Proactive Health section will display the particular
system and its associated issues.

Selecting an individual system takes the user to the details discussed in the Systems section. Refer back to these sections:

• System – Health Score
• System – Configuration
• System – Capacity
• System – Performance

Health Center – Alerts
The Alerts listing displays all the alerts that are associated with the monitored systems. Users have several options for viewing the
alerts.

• Date – Date range
• System – System Name or Site ID
• Severity
    o Critical – Event that has significant impact on the system and needs to be remedied immediately
    o Error – Event that has minor impact on the system and needs to be remedied
    o Warning – Event that you should be aware of but has no significant impact on the system
    o Information – Events that do not impact the system functions
• Acknowledged
    o Acknowledged – Events that have been reviewed and acknowledged on the array
    o Unacknowledged – Events that have not been acknowledged on the array

Note: Alerts shown in CloudIQ come from the array and can only be acknowledged and unacknowledged on the array.

The alerts are grouped in current and weekly sections. When an alert has been acknowledged, there will be a checkmark at the right end of the alert line. More details pertaining to an alert can be seen by selecting the alert.
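The viewing options above (date range, system, severity, and acknowledged state) combine naturally as independent filters. The following sketch shows that combination on in-memory records; the `Alert` class, its field names, and the sample data are hypothetical and do not represent a CloudIQ API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Alert:
    day: date
    system: str
    severity: str       # "Critical", "Error", "Warning", or "Information"
    acknowledged: bool  # state as set on the array


def filter_alerts(alerts, start=None, end=None, system=None,
                  severities=None, acknowledged=None):
    """Keep alerts that match every filter that is not None."""
    result = []
    for alert in alerts:
        if start is not None and alert.day < start:
            continue
        if end is not None and alert.day > end:
            continue
        if system is not None and alert.system != system:
            continue
        if severities is not None and alert.severity not in severities:
            continue
        if acknowledged is not None and alert.acknowledged != acknowledged:
            continue
        result.append(alert)
    return result


# Illustrative alerts only.
alerts = [
    Alert(date(2019, 3, 1), "unity-01", "Critical", False),
    Alert(date(2019, 3, 2), "unity-01", "Warning", True),
    Alert(date(2019, 3, 3), "unity-02", "Error", False),
]
open_urgent = filter_alerts(alerts, severities={"Critical", "Error"}, acknowledged=False)
print([(a.system, a.severity) for a in open_urgent])
```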

Health Center – Reclaimable Storage
The Reclaimable Storage view shows the objects and capacity of storage that may no longer be in use. This can be viewed two ways:
per System and per Rule Type. Users can group by using the drop-down menu to change the display to show the three rules that are
used for Reclaimable Storage, which are:

• Block Objects with no front end I/O activity
• File Objects with no front end I/O activity
• Block Objects with no Hosts attached

View by System (Default) shows reclaimable storage for each system with the number of objects and reclaimable storage presented. A
more detailed view of each can be seen by selecting the line item.

View by Rule shows reclaimable storage for each rule with the number of objects and reclaimable storage presented.
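The two views differ only in the grouping key. A minimal sketch of that grouping, assuming each reclaimable object is a record with hypothetical `system`, `rule`, and `size_gb` fields, could look like this:

```python
from collections import defaultdict

# The three rule names as listed in this section.
RULES = (
    "Block Objects with no front end I/O activity",
    "File Objects with no front end I/O activity",
    "Block Objects with no Hosts attached",
)


def summarize(objects, group_by):
    """Return {group: (object count, reclaimable GB)} grouped by 'system' or 'rule'."""
    if group_by not in ("system", "rule"):
        raise ValueError("group_by must be 'system' or 'rule'")
    totals = defaultdict(lambda: [0, 0.0])
    for obj in objects:
        bucket = totals[obj[group_by]]
        bucket[0] += 1
        bucket[1] += obj["size_gb"]
    return {group: (count, size) for group, (count, size) in totals.items()}


# Illustrative objects only.
objects = [
    {"name": "lun_7", "system": "unity-01", "rule": RULES[0], "size_gb": 100.0},
    {"name": "fs_2", "system": "unity-01", "rule": RULES[1], "size_gb": 50.0},
    {"name": "lun_9", "system": "unity-02", "rule": RULES[2], "size_gb": 200.0},
]
print(summarize(objects, "system"))
print(summarize(objects, "rule"))
```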

Metrics
The Metrics section allows the user to create custom performance dashboards. Different performance metrics are available based
upon the Category Selected (System, Block, File, Drive, or Pool), as shown in this table.

Metric   System   Objects (Block, LUN or Volume)   File   Drive   Pool
Bandwidth (BPS) X X X X X
Object Latency X X X
CPU Utilization X
IO Size X X X X X
IOPS X X X X
% Read X X X X X
Queue Length X X X
VVol Latency X X

Metric Dashboard Wizard


Users can click Add Metrics to open a wizard where a new dashboard can be created. Then users can select from each of the wizard
sections the data to view in the new dashboard.

1. Select the Category.

2. Select the System which is being monitored by CloudIQ.

3. Select the performance metrics from the Metrics list.

4. Select the objects for the System selected from the Objects list.

5. Select Add Metrics.

The new dashboard will show the performance graphs for each selected metric with one or more Objects selected. Scrolling across the timeline graph displays a vertical line on each graph for quick analysis of performance at any given time. These charts can be viewed as a grid pattern (shown) or one graph per line. The timeline can be selected from a pre-defined value ranging from Last Hour to Last 7 Days or the user can enter a custom date range.

In the Metrics View the graphs are dependent upon how the user entered this view. This example displays a comparison of two systems.

Hovering across the performance graph displays a vertical line on all the graphs for the same point in time. The legend to the right of the graph displays the performance measurement related to the graph.

Performance Comparison graphs

Below are performance metrics for multiple systems. The System – Performance view contains an option to compare the performance
details of multiple systems. These are the performance graphs comparing two systems, as referenced in the previous section.

Note: VVol data is not included in object-level (LUN, file system, and drive) metrics because VVol object data is not saved.
Note: Block Latency timing shown is an auto-adjusted field for milliseconds (ms) and microseconds (µs) when appropriate.
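The auto-adjusted latency unit in the note above can be illustrated with a small formatter. The 1 ms switch-over point and the number of decimals are assumptions; the paper only says the unit changes "when appropriate".

```python
def format_latency(microseconds):
    """Format a latency value, using ms at or above 1000 µs and µs below it."""
    if microseconds >= 1000:
        return f"{microseconds / 1000:.2f} ms"
    return f"{microseconds:.0f} µs"


print(format_latency(2500))  # value of 1 ms or more, shown in milliseconds
print(format_latency(750))   # value under 1 ms, shown in microseconds
```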

Storage Objects
Storage views are presented in three categories: Pools, Block, and File objects. The Storage views provide an aggregated listing for
easy comparison of data. For each of the Storage sections, the listing can be refined to show the data per system(s) or by one of the
Time to Full intervals for Pools. The Issues column will display the number of health issues associated with any pool or storage object,
and a green check mark for items with no associated issues. The blue text identifies hyperlinks to the details for the item.

Pools Listing
This listing shows all the Pools across the entire environment of all systems monitored by CloudIQ. The Pools listing represents the raw
storage on the system that has been prepared to be provisioned as either Block or File storage. This listing provides the Pool Total
Size, Provision Used and Subscription percentages, and Free storage within the pool that has not been provisioned as Block or File.

The Subscription % and Time to Full are shown. Time to Full is based upon the storage size measured over time. The longer the pool is
configured, the more accurate the prediction of Time to Full. This Time to Full measurement identifies pools that are at greatest risk of
running out of storage space, and that require attention.
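To show why a longer history improves the estimate, here is one simple way a Time to Full value could be derived from capacity samples: fit a straight line to daily used capacity and extrapolate to the pool size. The paper does not describe CloudIQ's actual prediction method, so the linear trend, the daily sampling, and the function name are assumptions made for illustration.

```python
def days_to_full(used_gb_by_day, total_gb):
    """Extrapolate a least-squares line through daily used-capacity samples.

    Returns the number of days after the last sample until used capacity
    reaches total_gb, or None if usage is flat, shrinking, or too short to fit.
    """
    n = len(used_gb_by_day)
    if n < 2:
        return None
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(used_gb_by_day) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, used_gb_by_day))
    slope = sxy / sxx  # GB added per day
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    full_day = (total_gb - intercept) / slope
    return max(0.0, full_day - (n - 1))


# A pool growing by 10 GB per day with 200 GB total capacity.
print(days_to_full([100, 110, 120, 130], total_gb=200))
```

With only a few samples the fitted slope is sensitive to short-term bursts, which matches the statement above that the prediction becomes more accurate the longer the pool has been configured.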

Pools – Properties
The Properties tab for a Pool provides details for how the Pool was created and the status of the five Health categories with any issues found. Expanding the issue will provide a suggested resolution. Also included in this view is a list of Storage objects using this pool and drives assigned to this pool, each of which can be exported to a CSV file.

In the upper right is a link to “Launch Unisphere”. Selecting this will open the Unisphere element manager for the system hosting this Pool.

Pools – Capacity
The Capacity tab for a Pool provides details for the Pool capacity, showing total used and free storage as well as subscription. There is
a Storage usage ring showing how the Used storage is configured, and a Historical Trend with Predicted Date to Full graph. In the
bottom graph you can view the details by sliding the graph bar and then hovering over the graph line.

If you hide the Predicted Date to Full, the future trend is no longer shown and only the Free and Used trend graph will be shown.

Pools – Performance
The Performance tab for a Pool provides details for the Pool Storage Object Activity. A 24-hour trend graph is shown below for Block
latency (LUNs and Volumes), IOPS, and Bandwidth (LUNs, Volumes, and File Systems).

Scrolling down on this view provides the user with detailed performance graphs for IOPS, Bandwidth, Backend IOPS, and Block
Latency. If an Anomaly is found, this will be shown as either High or Low. To see more details related to a specific range of the graph,
select a start point on the graph and drag right. The Object activity for this period will be shown, as seen in the Bandwidth graph.

Block Objects
This listing shows all the Block Objects across the entire environment of systems monitored by CloudIQ. Click the Refine filter and
specify one or more system names to view the objects for the selected system(s).

The Block Object listing shows:

• Issues – Health of a Block Object: green checkmark (OK) or the number of issues reported by CloudIQ
• Name – Storage object name
• Type – Type of storage object: LUN or VMware VMFS DataStore
• Total Size (GB) – Size of the provisioned storage
• Thin – Yes if provisioned as Thin, No for Thick provisioned
• Pool – Pool where the object resides
• Consistency Group – If applicable, the Consistency Group in which the object is included
• System – System where the object resides
• Model – System model

Block Object types are either LUN or VMware VMFS. As with all listings, the user can sort the list by clicking any of the
column headings, refine the list to a System, and export the list by selecting the Export icon in the upper right.

Block Objects – Properties
The Properties tab for a Block object provides details for how the Block was configured, any Health issues associated with this object,
and the Hosts that are attached to this object. The information in the Hosts table can be exported to a CSV file.

Block Objects – Capacity


The Capacity tab for a Block object provides details for how the Block capacity is being used, including Data Reduction and
Snapshots. The Historical Trend shows the capacity changes over time.

Block Objects – Performance
The Performance tab for Block objects provides performance details for the Block Storage Object Activity over a 24-hour period. This
can be changed to show a predefined time range or a custom date range. The performance graphs available are Workload Changes and
Workload Anomalies for Block Latency, IOPS, and Bandwidth.

By default the Workload Changes graph displays values as a percentage of change. Click the By Value button to see all the
performance metrics. You can add the unchecked metrics by selecting the checkbox.
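The difference between the percentage view and the By Value view is just a normalization against a baseline. As a hedged illustration (the paper does not say which baseline CloudIQ uses, so the start-of-range value here is an assumption):

```python
def percent_change(baseline, current):
    """Return the change from baseline to current as a percentage, or None if baseline is 0."""
    if baseline == 0:
        return None
    return (current - baseline) / baseline * 100.0


# IOPS rising from 2000 to 2500, and latency falling from 4.0 ms to 3.0 ms.
print(percent_change(2000, 2500))
print(percent_change(4.0, 3.0))
```

Expressing each metric as a percentage lets IOPS, bandwidth, and latency share one axis even though their raw values differ by orders of magnitude.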

You can zoom in on a range in any graph by selecting the starting point and dragging to the right. Click Reset zoom to return to the
default view.

Block Objects – Data Protection
The Data Protection tab for a Block object displays how data protection has been configured for an Object. There are two levels of
Data Protection available: Replication from system to system and Snapshots.

The Replication details show a graphic illustration of the object-to-object replication and status of the replicated data.

The Snapshots details show how data is backed up within the system using Snapshots. A custom Snapshot rule can be defined which
determines when the snapshot is taken and how long the data is retained. The Snapshot list can be exported to a CSV file.

File Objects
This listing shows all the File Objects across the entire environment of Dell EMC Unity systems monitored by CloudIQ. Click the Refine
filter and specify the system name to view the objects for just that system.

The File Object listing shows:

• Issues – Health of a File Object: green checkmark (OK) or the number of issues reported by CloudIQ
• Name – File object name
• Type – Type of storage object: File System or VMware NFS DataStore
• Total Size (GB) – Total provisioned storage
• Used (%) – Percentage of File object capacity being used
• Protocol – Configured protocol(s) for the file object
• NAS Server – NAS Server that hosts the file object
• Pool – Pool where the object resides
• System – System where the file object resides
• Model – System model

These File Objects are either a File System or VMware NFS DataStore. As with all listings, the user can sort the list by clicking any of
the column headings, refine the list to a system, and export the list by selecting the Export icon in the upper right.

Dell EMC Unity systems track the data written to the file system which is reflected in the Used % column.

File Objects – Properties
The Properties tab for a File object provides details for how the object was configured and any Health issues found for the object.

File Objects – Capacity


The Capacity tab for a File object provides details for how the File capacity is being used, including Data Reduction savings and
capacity utilization by Snapshots. The File used percentage is based upon the actual data written to the file system. The Historical
Trend shows the capacity changes since the object was created. As you hover across the trend line, details will be shown.

File Objects – Performance
The Performance tab for File objects provides two performance graphs with aggregated metrics for a 4-hour period (default). This can
be changed to show from Last Hour to Last 7 Days or a custom date range. As you hover across the graph, the metrics details will be
shown in pop-up boxes.

• File System Metrics (NFS)
    o IOPS
    o Bandwidth
    o IO Size
    o % Read

• Aggregated File System Metrics (NFS)
    o IOPS
    o Latency

The Aggregated File System Metrics (NFS) graph has additional breakdown information available to show Storage Processor
Read, Write, and Other.

File Objects – Data Protection
The Data Protection tab for File objects shows how data protection has been configured. There are two levels of Data Protection
available: Replication from system to system and Snapshots.

The Replication details show a graphic illustration of the object-to-object replication and status of the replicated data.

The Snapshots details show how data is backed up within the system using Snapshots. A custom Snapshot rule can be defined which
determines when the snapshot is taken and how long the data is retained. The Snapshot list can be exported to a CSV file.

Hosts
The Hosts listing shows all the hosts (ESX, Linux, or Windows) which are attached to provisioned storage on Dell EMC Unity systems
being monitored by CloudIQ. Click the Refine filter and specify one or more system names to view the hosts for the selected system(s).

The Hosts listing shows:

• Issues – Health of the host: green checkmark (OK) or the number of issues reported by CloudIQ
• Name – Host name
• Network Address – IPv4 or IPv6 address
• Operating System – Host operating system version
• Initiator Protocol – Type of initiator used by the Host (FC, iSCSI)
• Initiators (#) – Number of initiators connected between the host and the monitored system(s)
• Total Size – Total size of the object provisioned to the host from the system
• System – System connected to the host
• Model – Model of the system

As with other listings, the user can sort the list by clicking any of the column headings, and export data by selecting the Export icon.

Hosts – Properties
The Properties tab for a Host provides details of the host type, IP address, and how it is connected. Any Health issues are highlighted
with a suggested resolution. More details about the storage objects being used by the Host and Initiators are provided in the tabs at the
bottom of the page. The information in each of the tabs can be exported to a CSV file.

Hosts – Capacity
The Capacity tab for a Host provides details for the current capacity and historical trending.

Appendix A – CloudIQ Security
CloudIQ’s Security Measures are as follows:

CloudIQ uses Dell EMC Secure Remote Services (SRS) to collect data from Dell EMC Unity systems, namely: system logs,
system configuration, and system capacity and performance metrics. Secure Remote Services provides sophisticated point-to-
point encryption over a dedicated VPN, multi-factor authentication, customer-controlled access policies, and RSA digital
certificates to ensure that all customers’ data is securely transported to Dell EMC. CloudIQ stores data received from Dell
EMC Unity systems in a secure Dell EMC IT managed infrastructure.

CloudIQ provides each customer an independent secure portal, and ensures that customers will only be able to see their own
systems via CloudIQ. CloudIQ access requires that each user has a valid Dell EMC support account. Each user can only see
those systems in CloudIQ which are part of that user’s site access as per configuration in Dell EMC Service Center.

CloudIQ uses a leading application security provider to perform continuous vulnerability scans as well as annual penetration
testing of the application. The underlying environment is included in regular infrastructure vulnerability scans, and any required
remediation is handled through an ongoing vulnerability remediation program. CloudIQ will soon begin the process of obtaining
a Service Organization Control (SOC2) report to provide assurance regarding security controls.

CloudIQ will maintain 2 years’ worth of historical data for systems that are actively monitored by CloudIQ. For any system that
is no longer monitored by CloudIQ, configuration, capacity, and performance data for that system is removed from
all CloudIQ Data Stores.

CloudIQ is hosted on Dell EMC infrastructure which is Highly Available, Fault Tolerant, and guarantees a 4-hour Disaster
Recovery SLA. Because it is web-based, CloudIQ is accessible anytime, anywhere.

Appendix B - Enabling CloudIQ at the System


Dell EMC Unity System
CloudIQ is enabled through the Dell EMC Unity Unisphere interface.

Once Secure Remote Services has been configured, CloudIQ must be enabled.

For Unity 4.2 and later, navigate to Settings > Support Configuration > CloudIQ, and then select Send data to CloudIQ.

For Unity 4.1, navigate to Settings > Management > Centralized Management, select the CloudIQ tab in Centralized Management,
select Send data to CloudIQ, and then click Apply.

After this action, the system will appear in CloudIQ after one hour. The user can then simply proceed to CloudIQ.emc.com by clicking
the link on the displayed page, or the user can proceed to CloudIQ.emc.com from the main Unisphere page. On the CloudIQ.emc.com
page, users can log in with their valid service accounts to view their SC and Unity systems in CloudIQ.
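The version-dependent navigation described above can be captured as a small lookup, for example in a runbook script that tells an operator where to click. The function name is hypothetical, it does not call any Unisphere API, and versions earlier than 4.1 are rejected because this paper does not cover them.

```python
def cloudiq_settings_path(unity_version):
    """Return the Unisphere navigation steps for enabling CloudIQ.

    Expects a version string that starts with "major.minor", such as "4.2.1".
    """
    parts = unity_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    if (major, minor) >= (4, 2):
        return ["Settings", "Support Configuration", "CloudIQ",
                "Send data to CloudIQ"]
    if (major, minor) == (4, 1):
        return ["Settings", "Management", "Centralized Management", "CloudIQ tab",
                "Send data to CloudIQ", "Apply"]
    raise ValueError(f"Unity {unity_version} is not covered by these instructions")


print(" > ".join(cloudiq_settings_path("4.2.1")))
print(" > ".join(cloudiq_settings_path("4.1.0")))
```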

For more information about onboarding the Dell EMC Unity arrays, see: https://support.emc.com/kb/481102

The Dell EMC Unity Online Documentation is available at https://support.emc.com/products/39949_Dell-EMC-Unity-Family/Documentation.

Secure Remote Services must be configured successfully before users can send data to CloudIQ. For more information about enabling
SRS, please check the EMC Secure Remote Services for Dell EMC Unity Requirements and Configuration document that can be found
at https://support.emc.com.
