ISM notes

Unit-3
Introduction to Networked Storage:

JBOD: Just a Bunch Of Disks. The term refers to hard disks that aren't configured according to
RAID, a subsystem of disk drives that improves performance and fault tolerance.

JBOD (for "just a bunch of disks," or sometimes "just a bunch of drives") is a derogatory
term - the official term is "spanning" - used to refer to a computer's hard disks that haven't
been configured according to the RAID ("redundant array of independent disks") scheme,
which increases fault tolerance and improves data access performance.

The RAID system stores the same data redundantly on multiple disks that nevertheless appear
to the operating system as a single disk. Although JBOD also makes the disks appear to be a
single one, it accomplishes that by combining the drives into one larger logical drive. JBOD
doesn't deliver any advantages over using separate disks independently and doesn't provide
any of the fault tolerance or performance benefits of RAID.
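As a rough, product-neutral illustration of that trade-off, the sketch below compares the usable capacity of a JBOD spanning set with a RAID 5 set built from the same drives. The disk sizes are invented for the example; only the arithmetic matters.

```python
# Minimal sketch: usable capacity of a JBOD spanning set vs. a RAID 5 set built
# from the same drives. Disk sizes (in GB) are made-up illustrative values.
disks_gb = [500, 750, 1000, 1000]

# JBOD/spanning: one large logical drive; capacities simply add up, no redundancy.
jbod_gb = sum(disks_gb)

# RAID 5: each member contributes up to the size of the smallest disk, and one
# disk's worth of space holds parity, which is what buys the fault tolerance.
raid5_gb = (len(disks_gb) - 1) * min(disks_gb)

print(f"JBOD   usable: {jbod_gb} GB (no fault tolerance)")
print(f"RAID 5 usable: {raid5_gb} GB (survives one disk failure)")
```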

Direct-attached storage (DAS): This type of storage connects directly to a server (host) or a
group of servers in a cluster. Storage can be either internal or external to the server. External
DAS alleviates the challenges of limited internal storage capacity.
Direct-attached storage (DAS) refers to a digital storage system directly attached to a server
or workstation, without a storage network in between. It is a retronym, mainly used to
differentiate non-networked storage from SAN and NAS.

Features:
A typical DAS system is made up of a data storage device (for example, an enclosure holding a
number of hard disk drives) connected directly to a computer through a host bus adapter
(HBA). Between those two points there is no network device (such as a hub, switch, or router), and
this is the main characteristic of DAS.

The main protocols used for DAS connections are ATA, SATA, eSATA, SCSI, SAS, and
Fibre Channel.

Storage features common to SAN, DAS and NAS:

Most functions found in modern storage do not depend on whether the storage is attached
directly to servers (DAS), or via a network (SAN and NAS).

A DAS device can be shared between multiple computers, as long as it provides multiple
interfaces (ports) that allow concurrent and direct access. This way it can be used in computer
clusters. In fact, most SAN-attachable storage devices or NAS devices can be easily used as
DAS devices – all that is needed is to disconnect their ports from the data network and
connect one or more ports directly to a computer (with a plain one-to-one cable).

More advanced DAS devices, like SAN and NAS systems, can offer fault-tolerant designs in many
areas: controller redundancy, cooling redundancy, and storage fault-tolerance schemes known
as RAID. Some DAS systems provide embedded disk array controllers to offload RAID
processing from the server's host bus adapter (HBA). Basic DAS devices do not have such
features.

A DAS can, like SAN or NAS, enable storage capacity extension, while keeping high data
bandwidth and access rate.

Disadvantages:
DAS has been referred to as "islands of information." The disadvantages of DAS include its
inability to share data or unused resources with other servers. Both NAS (network-attached
storage) and SAN (storage area network) architectures attempt to address this, but introduce
some new issues as well, such as higher initial cost, manageability, security, and contention
for resources.

Types of DAS:

DAS is classified as internal or external, based on the location of the storage device with
respect to the host.

Internal DAS:
In internal DAS architectures, the storage device is internally connected to the host by a serial
or parallel bus. The physical bus has distance limitations and can sustain high-speed
connectivity only over short distances. In addition, most internal buses can support only
a limited number of devices, and they occupy a large amount of space inside the host, making
maintenance of other components difficult.

External DAS:

In external DAS architectures, the server connects directly to the external storage device. In
most cases, communication between the host and the storage device takes place over the SCSI or
FC protocol. Compared to internal DAS, external DAS overcomes the distance and device
count limitations and provides centralized management of storage devices.

NAS:

Network-attached storage (NAS) is file-level computer data storage connected to a
computer network, providing data access to heterogeneous clients. NAS not only operates as a
file server, but is specialized for this task either by its hardware, software, or configuration of
those elements. NAS is often made as a computer appliance – a specialized computer built
from the ground up for storing and serving files – rather than simply a general-purpose
computer being used for the role.

As of 2010, NAS devices have been gaining popularity as a convenient method of sharing files
among multiple computers. Potential benefits of network-attached storage, compared to file
servers, include faster data access, easier administration, and simple configuration.

NAS systems are networked appliances that contain one or more hard drives, often
arranged into logical, redundant storage containers or RAID arrays. Network-attached storage
removes the responsibility of file serving from other servers on the network. NAS systems typically
provide access to files using network file-sharing protocols such as NFS, SMB/CIFS, or AFP.

SAN:

Direct-attached storage (DAS) is often referred to as a stovepiped storage environment. Hosts
"own" the storage, and it is difficult to manage and share resources on these isolated storage
devices. Efforts to organize this dispersed data led to the emergence of the storage area
network (SAN). A SAN is a high-speed, dedicated network of servers and shared storage
devices. Traditionally connected over Fibre Channel (FC) networks, a SAN forms a single
storage pool and facilitates data centralization and consolidation. A SAN meets storage
demands efficiently with better economies of scale. A SAN also provides effective
maintenance and protection of data.
The SAN and Its Evolution:
A storage area network (SAN) carries data between servers (also known as hosts) and storage
devices through Fibre Channel switches (see Figure 6-1). A SAN enables storage
consolidation and allows storage to be shared across multiple servers. It enables organizations
to connect geographically dispersed servers and storage.

Content-Addressed Storage:
CAS is an object-based system that has been purposely built for storing fixed content data. It
is designed for secure online storage and retrieval of fixed content. Unlike file-level and
block-level data access, which use file names and the physical location of data for storage and
retrieval, CAS stores user data and its attributes as separate objects. The stored object is
assigned a globally unique address known as a content address (CA). This address is derived
from the object's binary representation. CAS provides an optimized and centrally managed
storage solution that can support single-instance storage (SiS) to eliminate multiple
copies of the same data.
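To make the idea of a content address concrete, here is a minimal, product-neutral sketch: the address is taken to be a SHA-256 digest of the object's bytes (the actual derivation used by a given CAS product may differ). Because identical content always hashes to the same address, storing a duplicate simply maps to the existing object, which is the basis of single-instance storage.

```python
# Toy content-addressed store: content address = SHA-256 of the object's bytes.
import hashlib

store = {}  # content address -> object bytes (stand-in for the CAS object store)

def cas_put(data: bytes) -> str:
    address = hashlib.sha256(data).hexdigest()
    store[address] = data            # writing the same content twice keeps one copy
    return address

def cas_get(address: str) -> bytes:
    return store[address]

ca1 = cas_put(b"fixed content, e.g. a scanned X-ray")
ca2 = cas_put(b"fixed content, e.g. a scanned X-ray")
assert ca1 == ca2 and len(store) == 1    # duplicate content -> same address, one object
```
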
Fixed Content and Archives
Data is accessed and modified at varying frequencies between the time it is created and
discarded. Some data frequently changes, for example, data accessed by an Online
Transaction Processing (OLTP) application. Other data does not typically change but may be
changed if required; bills of material and design documents are examples. Another
category of data is fixed content, which refers to data that cannot be changed. X-rays and
pictures are examples of fixed content data. It is mandatory for all organizations to retain
some data for an extended period of time due to government regulations and legal/contractual
obligations. Fixed data that is retained for future reference or business value is referred to
as a fixed content asset. Some examples of fixed content assets include electronic documents, e-
mail messages, Web pages, and digital media (see Figure 9-1). Organizations make use of
these digital assets to generate new revenues, improve service levels, and leverage historical
value. This demands frequent and quick retrieval of the fixed contents at multiple locations.
An archive is a repository where fixed content is placed. Online data availability in the
archive can further increase the business value of the referenced information.

CAS Architecture:

The CAS architecture is shown in Figure 9-2. A client accesses the CAS-based storage over a
LAN through the server that runs the CAS API (application programming interface). The
CAS API is responsible for performing functions that enable an application to store and
retrieve the data.
CAS architecture is a Redundant Array of Independent Nodes (RAIN). It contains storage
nodes and access nodes networked as a cluster by using a private LAN that is internal to it.
The internal LAN can be reconfigured automatically to detect configuration changes, such
as the addition of storage or access nodes. Clients access the CAS on a separate LAN, which
is used for interconnecting clients and servers to the CAS. The nodes are configured with
low-cost, high-capacity ATA HDDs. These nodes run an operating system with special
software that implements the features and functionality required in a CAS system.

DAS Benefits and Limitations:


DAS requires a relatively lower initial investment than storage networking. Storage
networking architectures are discussed later in this book. DAS configuration is simple and
can be deployed easily and rapidly. Setup is managed using host-based tools, such as the host
OS, which makes storage management tasks easy for small and medium enterprises. DAS is
the simplest solution when compared to other storage networking models and requires fewer
management tasks and fewer hardware and software elements to set up and operate. However,
DAS does not scale well. A storage device has a limited number of ports, which restricts the
number of hosts that can directly connect to the storage. A limited bandwidth in DAS restricts
the available I/O processing capability. When capacities are being reached, the service
availability may be compromised, and this has a ripple effect on the performance of all hosts
attached to that specific device or array. The distance limitations associated with
implementing DAS because of direct connectivity requirements can be addressed by using
Fibre Channel connectivity. DAS does not make optimal use of resources due to its limited
ability to share front end ports. In DAS environments, unused resources cannot be easily re-
allocated, resulting in islands of over-utilized and under-utilized storage pools. Disk
utilization, throughput, and cache memory of a storage device, along with the virtual memory of
a host, govern the performance of DAS. RAID-level configurations, storage controller
protocols, and the efficiency of the bus are additional factors that affect the performance of
DAS. The absence of storage interconnects and network latency gives DAS the potential to
outperform other storage networking configurations.

Limitations Of NAS

Although NAS drives are extremely useful devices, they have some limitations. In larger
networks with a high volume of simultaneous I/O requests, a typical affordable NAS box is
not going to be able to provide adequate performance. Its built-in CPU will be too limited
in performance and, if overstretched, will slow to a crawl. The CPU and network hardware in
a NAS are also rarely upgradable, because the single-purpose CPU is hard-wired into the unit
and the software is stored on a firmware ROM chip. While it's possible to spend a lot of money
on more powerful NAS boxes with the capacity to handle huge quantities of traffic, the lines
between server and NAS then begin to blur considerably, especially in terms of cost.

In a home environment, these limitations are unlikely to be important, since high-volume
simultaneous I/O traffic will be rare. A limitation that will be more important is actual
transfer rate, as it's easy to fall into the trap of assuming a NAS box will be as fast as an
internal hard drive. Although some NAS boxes offer great performance, some will only
transfer data at 5-10MB/s or, in the case of some manufacturers' offerings, even less than
this. While this kind of transfer rate will not be a problem for streaming SD video to a single
system, it won't have the bandwidth available to serve two streams simultaneously. Even for
the best NAS boxes, you will need to invest in your home network infrastructure to ensure
that the transfer rate is not limited by the 100Mbps networks most of us use at home. The
situation only gets worse if you are limited to wireless.
These limitations in speed are not caused by the hard drive technology within, because even a
modest 7200rpm hard drive can sustain transfer rates of well over 100MB/s. Many NAS
drives boast of a 'fast SATA interface' or 'gigabit connection', and yet still fall well short of
even 100Mbps network performance. This limitation is caused by a bottleneck at the NAS
drive's CPU, which can only process data requests so fast. Since these CPUs are only a tiny fraction
as powerful as the CPU at the heart of your home PC, it's not entirely surprising you don't get
the same level of performance. In this day and age of ultra-fast computers, it's easy to forget
just how much overhead is required for good performance over a proper network.

Unit-4

Hybrid Storage solutions:


The notion of hybrid is simple: performance when you need it, economy when you don't. By
virtualizing multiple underlying storage technologies with an intelligent file system, PowerFile
offers the performance of disk, the economy of tape, and superior reliability and longevity in
a purpose-built platform optimized for the long-term storage of fixed content.

Fixed content, also known as persistent data, is the largest and fastest growing data type in the
enterprise. Leading analysts estimate that it accounts for as much as 80% of unstructured data
and is growing at approximately 100% CAGR. This data is not changing, is not mission critical,
and is infrequently accessed; however, much of the data must be retained and retrieved quickly
to satisfy increasing regulatory and legal requirements. In order to meet the online
requirements, fixed content remains intermingled with production data on expensive primary
storage arrays and is the culprit behind a myriad of inefficiencies that are rippling through
data centers worldwide.

UNIQUE REQUIREMENTS:
Disk-based systems offer online accessibility, but are expensive to operate, need to be backed
up, and require expensive data migrations every 3-5 years. Tape-based systems can be cost-
effective to purchase, but don’t offer online accessibility, require regular maintenance, and
deliver questionable reliability. Fixed content has unique requirements that demand a fresh
approach that bridges the gap between disk and tape.

Virtualization:
Virtualization is the technique of masking or abstracting physical resources, which simplifies
the infrastructure and accommodates the increasing pace of business and technological
changes. It increases the utilization and capability of IT resources, such as servers, networks,
or storage devices, beyond their physical limits. Virtualization simplifies resource
management by pooling and sharing resources for maximum utilization and makes them
appear as logical resources with enhanced capabilities.

Memory Virtualization:
Virtual memory makes an application appear as if it has its own contiguous logical memory
independent of the existing physical memory resources. Since the beginning of the computer
industry, memory has been and continues to be an expensive component of a host. It
determines both the size and the number of applications that can run on a host. With
technological advancements, memory technology has changed and the cost of memory has
decreased. Virtual memory managers (VMMs) have evolved, enabling multiple applications
to be hosted and processed simultaneously. In a virtual memory implementation, a memory
address space is divided into contiguous blocks of fixed-size pages. A process known as
paging saves inactive memory pages onto the disk and brings them back to physical memory
when required. This enables efficient use of available physical memory among different
processes. The space used by VMMs on the disk is known as a swap file. A swap file (also
known as page file or swap space) is a portion of the hard disk that functions like physical
memory (RAM) to the operating system. The operating system typically moves the least used
data into the swap file so that RAM will be available for processes that are more active.
Because the space allocated to the swap file is on the hard disk (which is slower than the
physical memory), access to this file is slower.

Network Virtualization:
Network virtualization creates virtual networks whereby each application sees its own logical
network independent of the physical network. A virtual LAN (VLAN) is an example of
network virtualization that provides an easy, flexible, and less expensive way to manage
networks. VLANs make large networks more manageable by enabling a centralized
configuration of devices located in physically diverse locations. Consider a company in
which the users of a department are separated over a metropolitan area with their resources
centrally located at one office. In a typical network, each location has its own network
connected to the others through routers. When network packets cross routers, latency
influences network performance. With VLANs, users with similar access requirements can be
grouped together into the same virtual network. This setup eliminates the need for network
routing. As a result, although users are physically located at disparate locations, they appear
to be at the same location accessing resources locally. In addition to improving network
performance, VLANs also provide enhanced security by isolating sensitive data from the
other networks and by restricting access to the resources located within the networks.

Server Virtualization
Server virtualization enables multiple operating systems and applications to run
simultaneously on different virtual machines created on the same physical server (or group of
servers). Virtual machines provide a layer of abstraction between the operating system and
the underlying hardware. Within a physical server, any number of virtual servers can be
established, depending on hardware capabilities (see Figure 1). Each virtual server seems
like a physical machine to the operating system, although all virtual servers share the same
underlying physical hardware in an isolated manner. For example, the physical memory is
shared between virtual servers but the address space is not. Individual virtual servers can be
restarted, upgraded, or even crashed, without affecting the other virtual servers on the same
physical machine.

With changes in computing from a dedicated to a client/server model, the physical server
faces resource conflict issues when two or more applications running on these servers have
conflicting requirements (e.g., need different values in the same registry entry, different
versions of the same DLL). These issues are further compounded with an application’s high-
availability requirements. As a result, the servers are limited to serve only one application at a
time, as shown in Figure 1(a). On the other hand, many applications do not take full
advantage of the hardware capabilities available to them. Consequently, resources such as
processors, memory, and storage remain underutilized. Server virtualization addresses the
issues that exist in a physical server environment. The virtualization layer, shown in Figure
1(b), helps to overcome resource conflicts by isolating applications running on different
operating systems on the same machine. In addition, server virtualization can dynamically
move underutilized hardware resources to where they are needed most,
improving utilization of the underlying hardware resources.

Storage & appliances:


Storage virtualization is the process of presenting a logical view of the physical storage
resources to a host. This logical storage appears and behaves as physical storage directly
connected to the host. Throughout the evolution of storage technology, some form of storage
virtualization has been implemented. Some examples of storage virtualization are host-based
volume management, LUN creation, tape storage virtualization, and disk addressing (CHS to
LBA). The key benefits of storage virtualization include increased storage utilization, adding
or deleting storage without affecting an application's availability, and nondisruptive data
migration (access to files and storage while migrations are in progress). Figure illustrates a
virtualized storage environment. At the top are four servers, each of which has one virtual
volume assigned, which is currently in use by an application. These virtual volumes are
mapped to the actual storage in the arrays, as shown at the bottom of the figure. When I/O is
sent to a virtual volume, it is redirected through the virtualization at the storage network layer
to the mapped physical array. The discussion that follows provides details about the different
types of storage virtualization, methods of implementation, the challenges associated with the
implementation of storage virtualization, and examples of implementation.
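The redirection described above can be reduced to a lookup table. The sketch below is purely conceptual; the server, array, and LUN names are invented, and a real virtualization layer does far more (caching, striping, migration) than a dictionary lookup.

```python
# Conceptual sketch: each server sees one virtual volume; the virtualization
# layer maps it to a physical array and LUN and redirects I/O accordingly.
virtual_to_physical = {
    "server1:vvol0": ("arrayA", "lun_12"),
    "server2:vvol0": ("arrayA", "lun_13"),
    "server3:vvol0": ("arrayB", "lun_02"),
    "server4:vvol0": ("arrayB", "lun_07"),
}

def route_io(virtual_volume: str, offset: int, data: bytes):
    array, lun = virtual_to_physical[virtual_volume]   # look up the real location
    print(f"write {len(data)} bytes at offset {offset} -> {array}/{lun}")

route_io("server1:vvol0", 4096, b"application data")
```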

HYBRID STORAGE APPLIANCE:

UNMATCHED RELIABILITY:
The Hybrid Storage Appliance uses disk for performance caching and leverages industry-
standard Blu-ray media with a 50-year life for storing fixed content. This allows the HSA to
completely avoid the risk of storing valuable fixed content for many years on volatile
magnetic media. PowerFile developed the patent-pending Extended Verification and Self-
healing Technology (EVAST™) to make Blu-ray reliable for the enterprise. EVAST
establishes a new standard for data integrity by delivering a 100x improvement over RAID 6
with less than a 10% penalty to usable capacity.

GREEN TECHNOLOGY:
The Hybrid Storage Appliance offers the industry's best combination of density, energy
efficiency, and raw-to-usable capacity by delivering up to 500TB in a standard 42U rack,
consuming less than 4 Watts per TB, and offering a fixed 90% raw-to-usable capacity.

DATACENTER READY:
By leveraging quad-core processor technology, intelligent disk-based caching, and a
distributed processing architecture that efficiently scales both capacity and performance, the
HSA is able to ingest data at 375MB/s and handle up to 30,000 file requests per hour from
the archive grid, all while using less than 10% of the energy required for a disk-based archive
solution. The Hybrid Storage Appliance offers up to 1.2PB per system with a distributed
architecture that scales performance with capacity. Ease of management is inherent
throughout the design and includes key features such as file-level WORM capabilities, thin
provisioning, retention management, volume replication, and automated monitoring with
integrated SNMP and e-mail notification.

Data center concepts & requirements:


Organizations maintain data centers to provide centralized data processing capabilities across
the enterprise. Data centers store and manage large amounts of mission-critical data. The data
center infrastructure includes computers, storage systems, network devices, dedicated power
backups, and environmental controls (such as air conditioning and fire suppression). Large
organizations often maintain more than one data center to distribute data processing
workloads and provide backups in the event of a disaster. The storage requirements of a data
center are met by a combination of various storage architectures.

Core Elements:
Five core elements are essential for the basic functionality of a data center:

Application: An application is a computer program that provides the logic for computing
operations. Applications, such as an order processing system, can be layered on a database,
which in turn uses operating system services to perform read/write operations to storage
devices.
Database: More commonly, a database management system (DBMS) provides a structured
way to store data in logically organized tables that are interrelated. A DBMS optimizes the
storage and retrieval of data.

Server and operating system: A computing platform that runs applications and databases.

Network: A data path that facilitates communication between clients and servers or between
servers and storage.

Storage array: A device that stores data persistently for subsequent use.

These core elements are typically viewed and managed as separate entities, but all the elements
must work together to address data processing requirements. Figure 1-5 shows an example of an
order processing system that involves the five core elements of a data center and illustrates their
functionality in a business process.

Key Requirements for Data Center Elements:


Uninterrupted operation of data centers is critical to the survival and success of a business. It
is necessary to have a reliable infrastructure that ensures data is accessible at all times. While
the requirements, shown in Figure 1-6, are applicable to all elements of the data center

infrastructure, our focus here is on storage systems. The various technologies and solutions to
meet these requirements are covered in this book.

Availability: All data center elements should be designed to ensure accessibility. The
inability of users to access data can have a significant negative impact on a business.

Security: Policies, procedures, and proper integration of the data center core elements that
will prevent unauthorized access to information must be established. In addition to the
security measures for client access, specific mechanisms must enable servers to access only
their allocated resources on storage arrays.

Scalability: Data center operations should be able to allocate additional processing
capabilities or storage on demand, without interrupting business operations. Business growth
often requires deploying more servers, new applications, and additional databases. The
storage solution should be able to grow with the business.

Performance: All the core elements of the data center should be able to provide optimal
performance and service all processing requests at high speed. The infrastructure should be
able to support performance requirements.

Data integrity: Data integrity refers to mechanisms, such as error correction codes or parity
bits, that ensure data is written to disk exactly as it was received. Any variation in the data
during retrieval implies corruption, which may affect the operations of the organization. (A
short parity sketch appears after this list of requirements.)

Capacity: Data center operations require adequate resources to store and process large
amounts of data efficiently. When capacity requirements increase, the data center must be
able to provide additional capacity without interrupting availability, or, at the very least, with
minimal disruption. Capacity may be managed by reallocation of existing resources, rather
than by adding new resources.

Manageability: A data center should perform all operations and activities in the most
efficient manner. Manageability can be achieved through automation and the reduction of
human (manual) intervention in common tasks.
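As promised under data integrity, here is a minimal XOR parity sketch. It is a toy model only: one parity block protects a stripe of equal-sized data blocks, and a lost block is rebuilt by XOR-ing the survivors with the parity. Real RAID implementations add rotation, checksumming, and much more.

```python
# Toy XOR parity: one parity block per stripe allows one lost block to be rebuilt.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, byte_tuple) for byte_tuple in zip(*blocks))

stripe = [b"AAAA", b"BBBB", b"CCCC"]          # data blocks written across disks
parity = xor_blocks(stripe)                   # stored on an additional disk

# Simulate losing the second block: XOR of the survivors and the parity
# reproduces it exactly, which is how parity RAID recovers from a disk failure.
recovered = xor_blocks([stripe[0], stripe[2], parity])
assert recovered == stripe[1]
```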

Backup & Disaster Recovery: Principles

Purpose of Backup and Recovery:

As a backup administrator, your principal duty is to devise, implement, and manage a backup
and recovery strategy. In general, the purpose of a backup and recovery strategy is to protect
the database against data loss and reconstruct the database after data loss. Typically, backup
administration tasks include the following:

 Planning and testing responses to different kinds of failures
 Configuring the database environment for backup and recovery
 Setting up a backup schedule
 Monitoring the backup and recovery environment
 Troubleshooting backup problems
 Recovering from data loss if the need arises

As a backup administrator, you may also be asked to perform other duties that are related to
backup and recovery:

 Data preservation, which involves creating a database copy for long-term storage
 Data transfer, which involves moving data from one database or one host to another

The purpose of this manual is to explain how to perform the preceding tasks.

Data Protection

As a backup administrator, your primary job is making and monitoring backups for data
protection. A backup is a copy of data from a database that you can use to reconstruct data. A
backup can be either a physical backup or a logical backup.

Physical backups are copies of the physical files used in storing and recovering a database.
These files include data files, control files, and archived redo logs. Ultimately, every physical
backup is a copy of files that store database information to another location, whether on disk
or on offline storage media such as tape.
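Since a physical backup is, at bottom, a verified copy of the database files to another location, the idea can be sketched as below. The paths are invented examples, and a real Oracle backup would normally be taken with RMAN or equivalent tooling rather than a hand-rolled script.

```python
# Hedged sketch: copy database files to a backup location and verify each copy
# with a checksum. Illustrative only; not a substitute for RMAN.
import hashlib, pathlib, shutil

def sha256(path: pathlib.Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def backup_files(files, backup_dir: str):
    dest_dir = pathlib.Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    for f in map(pathlib.Path, files):
        dest = dest_dir / f.name
        shutil.copy2(f, dest)                 # copy the data file and its metadata
        assert sha256(f) == sha256(dest)      # confirm the copy is intact

# Example call (paths are hypothetical):
# backup_files(["/u01/oradata/orcl/system01.dbf"], "/backup/orcl/2024-01-15")
```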

Logical backups contain logical data such as tables and stored procedures. You can use
Oracle Data Pump to export logical data to binary files, which you can later import into the
database. The Data Pump command-line clients expdp and impdp use the
DBMS_DATAPUMP and DBMS_METADATA PL/SQL packages.

Physical backups are the foundation of any sound backup and recovery strategy. Logical
backups are a useful supplement to physical backups in many circumstances but are not
sufficient protection against data loss without physical backups.

Unless otherwise specified, the term backup as used in the backup and recovery
documentation refers to a physical backup. Backing up a database is the act of making a
physical backup. The focus in the backup and recovery documentation set is almost
exclusively on physical backups.

While several problems can halt the normal operation of an Oracle database or affect
database I/O operations, only the following typically require DBA intervention and data
recovery: media failure, user errors, and application errors. Other failures may require DBA
intervention without causing data loss or requiring recovery from backup. For example, you
may need to restart the database after an instance failure or allocate more disk space after
statement failure because of a full data file.

Media Failures

A media failure is a physical problem with a disk that causes a failure of a read from or write
to a disk file that is required to run the database. Any database file can be vulnerable to a
media failure. The appropriate recovery technique following a media failure depends on the
files affected and the types of backup available.

One particularly important aspect of backup and recovery is developing a disaster recovery
strategy to protect against catastrophic data loss, for example, the loss of an entire database
host.

User Errors

User errors occur when, either due to an error in application logic or a manual mistake, data
in a database is changed or deleted incorrectly. User errors are estimated to be the greatest
single cause of database downtime.

Data loss due to user error can be either localized or widespread. An example of localized
damage is deleting the wrong person from the employees table. This type of damage requires
surgical detection and repair. An example of widespread damage is a batch job that deletes
the company orders for the current month. In this case, drastic action is required to avoid
extensive database downtime.

While user training and careful management of privileges can prevent most user errors, your
backup strategy determines how gracefully you recover the lost data when user error does
cause data loss.

Disaster recovery principles for any organization

Disaster recovery is becoming increasingly important for businesses aware of the threat of
both man-made and natural disasters. Having a disaster recovery plan will not only protect
your organization’s essential data from destruction, it will help you refine your business
processes and enable your business to recover its operations in the event of a disaster. Though
each organization has unique knowledge and assets to maintain, general principles can be
applied to disaster recovery. This set of planning guidelines can assist your organization in
moving forward with an IT disaster recovery project.

Accountability and endorsement:
A key factor in the success of your disaster recovery plan will be holding someone on the
executive management team accountable. It could be the CIO, CTO, COO, or, if your
company is small, the IT director. Whether this person manages the disaster recovery
preparations or delegates them to someone else, it will be necessary for the entire
organization to know that the disaster recovery preparations are deemed essential by
executive management in order to obtain the cooperation you’ll need from all the staff
involved. Without endorsement from higher management, collecting all the information
you’ll need to make the disaster recovery project a success will be much more difficult. Even
if the disaster recovery project is managed by someone who has had the task delegated, upper
management needs to convey to the entire organization the importance and essentiality of the
project.

Identify and organize your data

One of the first steps in putting together a disaster recovery plan is to identify your mission-
critical data. Doing so will help you understand what data you need to back up for off-site
storage. It will also prompt you to document why you need this data and plan how to put it
back where it belongs in the event of a recovery operation.

Next, instruct your users to assist you in organizing the data in intuitive directories on a
central server. Even if you plan to just back up everything, knowing which files are where is
a key part of the recovery process. If, for example, when disaster strikes you have customer
data spread all over your network on different users' hard drives, finding the data will not be
easy. Restoring all the data from backup media is only half the battle. Once the data is
restored, you don't want to be walking around the office saying, "Does anyone know where
we keep the XYZ contract?" The data must be organized before you back it up.

Some data types that you should take into consideration for organization on a central
repository are as follows:

 Key customer files: contracts, agreements, contact information, proposals
 User login data: profiles, UNIX .dot files, Config.sys files, Autoexec.bat files
 Network infrastructure files: DNS, WINS, DHCP, router tables
 User directories
 Application data: databases, Web site files
 Security configuration files: ACLs, firewall rules, IDS configuration files, UNIX
password/shadow files, Microsoft Windows SAM database, VPN configuration files,
RADIUS configuration files
 Messaging files: key configuration files, user mailboxes, system accounts
 Engineer files: source code, release engineering code
 Financial and company files: general ledger, insurance policies, accounts payable
and accounts receivable files, incorporation registration, employee resource planning
(ERP) data
 License files for applications

Asset inventory

Aside from the data itself, your company needs to have an up-to-date hardware and software
asset inventory list on hand at all times. The hardware list should include the equipment
make, model, and serial number, and a description of what each particular piece of equipment
is being used for. The software inventory should be similar, with the vendor name, version
number, patch number, license information, and what the software is being used for. The
information for each piece of equipment and software on the list should be mapped to the
corresponding devices on the company network map. Be sure to include all cables and
connectors, as well as peripheral devices such as printers, fax machines, and scanners.

You might want to submit the asset inventory list to your insurance company once a year.

Restoration and recovery procedures

Imagine that a disaster has occurred. You have the data, now what should you do with it? If
you don’t have any restoration and recovery procedures, your data won’t be nearly as useful
to you. With the data in hand, you need to be able to re-create your entire business from
brand-new systems. You’re going to need procedures for rebuilding systems and networks.
System recovery and restoration procedures are typically best written by the people that
currently administer and maintain the systems. Each system should have recovery procedures
that indicate which versions of software and patches should be installed on which types of
hardware platforms. It's also important to indicate which configuration files should be
restored into which directories. A good procedure will include low-level file execution
instructions, such as what commands to type and in what order to type them.

Document decision-making processes

Recovering your data, systems, and networks is one thing, but when you lose staff,
recovering the knowledge they held is quite different. You will never be able to recover that
knowledge completely. However, you can mitigate this loss by documenting decision-making
processes in flowcharts. To do this, have each of your staff identify decisions that they make
and then create flowcharts for their thought processes. Sample decisions could be:

 How much do you charge for a new service?
 How do you know if taking on a particular new project is worth the return?
 How do you evaluate new business?
 How do you decide whom you should partner with?
 How do you decide who your sales prospects are?
 How do you decide who your suppliers are?
 When a call comes in to your help desk, how does it get routed?
 What are your QA procedures for your product?

It's impossible to document every decision your staff is capable of making. To get started,
don't ask your staff to document every possible decision-making scenario. Ask them to
document the three most important decision-making processes that they use on a consistent
basis. You can add new processes to your disaster recovery plan in the future, and you may
want to have employees write three new decision-making flowcharts each year at the time of
their annual reviews.

Backups are key

As an IT or network administrator, you need to bring all your key data, processes, and
procedures together through a backup system that is reliable and easy to replicate. Your IT
director's most important job is to ensure that all systems are being backed up on a reliable
schedule. This process, though it seems obvious, is often not realized. Assigning backup
responsibilities to an administrator is not enough. The IT department needs to have a written
schedule that describes which systems get backed up when and whether the backups are full
or incremental. You also need to have the backup process fully documented. Finally, test
your backup process to make sure it works. Can you restore lost databases? Can you restore
lost source code? Can you restore key system files?

Finally, you need to store your backup media off-site, preferably in a location at least 50
miles from your present office. Numerous off-site storage vendors offer safe media storage.
Iron Mountain is one example. Even if you’re using an off-site storage vendor, it doesn't hurt
to send your weekly backup media to another one of your field offices, if you have one.

Disaster strikes

Let’s say for a moment that the worst occurs and your business is devastated by a disaster, to
the point where you need to rebuild your business from scratch. Here are some of the key
steps you should take to recover your operations:

1. Notify your insurance company immediately.
2. Identify a recovery site where you will bring your business back up.
3. Obtain your asset inventory list and reorder all lost items.
4. Distribute a network map and asset inventory list to your recovery team.
5. As the new hardware comes in, have your recovery team connect the pieces.
6. Restore your network infrastructure servers first (DNS, routers, etc.).
7. Restore your application servers second.
8. Restore your user data third.
9. Perform any necessary configuration tweaks according to your guidelines.
10. Test all applications for functionality.
11. Test all user logins.
12. Put a notice on your Web site stating that your business was affected by a disaster.

Summary recommendations

It’s likely that in the event of a real disaster, not everything will be recoverable. Your goal
should be to recover enough data, processes, and procedures so that your business can be up
and running as quickly as possible, once you’re in a new office.

Testing your plan is key to ensuring its success. A good way to test your plan is in a lab
setting. With uninstalled systems that aren’t connected to the network, see how fast you can
install your systems, configure them, and restore essential data. The best test is to use a
recovery staff other than the everyday staff that uses and administers the systems. By using
staff that aren’t familiar with everyday usage of your systems and applications, you’ll
uncover deficiencies in the processes and procedures you’ve documented. Time your
recovery scenario and see if you can improve the time it takes for recovery each time you
hold a practice drill.

Conclusion
A disaster recovery plan is essential to your company’s long-term success. Even if you never
have to use the plan, the process of putting it together will by its very nature increase the
security of your assets and improve your overall business efficiency. The preparation of a
disaster recovery plan will teach you what data is important and will necessitate that you
understand how your business works from a decision-making standpoint. Disaster recovery
can be more easily achieved if you follow this simple outline:

 Hold someone accountable for disaster recovery.
 Identify mission-critical data.
 Organize data on a central repository.
 Create procedures for recovering mission-critical servers.
 Create knowledge-based decision-making flowcharts.
 Back up your data on a regular schedule.
 Store your data off-site.
 Test your recovery plan.

Managing & Monitoring: Industry management standards (SNMP, SMI-S, CIM):

Monitoring the Storage Infrastructure:


Monitoring helps to analyze the status and utilization of various storage infrastructure
components. This analysis facilitates optimal use of resources and proactive management.
Monitoring supports capacity planning, trend analysis, and root cause/impact analysis. As the
business grows, monitoring helps to optimize the
storage infrastructure resources. The monitoring process also includes the storage
infrastructure’s environmental controls and the operating environments for key components
such as storage arrays and servers.
Parameters Monitored:
Storage infrastructure components should be monitored for accessibility, capacity,
performance, and security. Accessibility refers to the availability of a component to perform a
desired operation. A component is said to be accessible when it is functioning without any
fault at any given point in time. Monitoring hardware components (e.g., a SAN interconnect
device, a port, an HBA, or a disk drive) or software components (e.g., a database instance) for
accessibility involves checking their availability status by listening to pre-determined alerts
from devices. For example, a port may go down resulting in a chain of availability alerts. A
storage infrastructure uses redundant components to avoid a single point of failure. Failure of
a component may cause an outage that affects application availability, or it may cause serious
performance degradation even though accessibility is not compromised. For example, an
HBA failure can restrict the server to a few paths for access to data devices in a multipath
environment, potentially resulting in degraded performance. In a single-path environment, an
HBA failure results in complete accessibility loss between the server and the storage.
Continuously monitoring for expected accessibility of each component and reporting any
deviations helps the administrator to identify failing components and plan corrective action to
maintain SLA requirements.
Capacity refers to the amount of storage infrastructure resources available. Examples of
capacity monitoring include examining the free space available on a file system or a RAID
group, the mailbox quota allocated to users, or the numbers of ports available on a switch.
Inadequate capacity may lead to degraded performance or affect accessibility or even
application/service availability. Capacity monitoring ensures uninterrupted data availability
and scalability by averting outages before they occur. For example, if a report indicates that
90 percent of the ports are utilized in a particular SAN fabric, a new switch should be added
if more arrays and servers need to be installed on the same fabric. Capacity monitoring is
preventive and predictive, usually leveraged with advanced analytical tools for trend analysis.
These trends help to understand emerging challenges, and can provide an estimation of time
needed to meet them.

Performance monitoring evaluates how efficiently different storage infrastructure
components are performing and helps to identify bottlenecks. Performance monitoring
usually measures and analyzes behavior in terms of response time or the ability to perform at
a certain predefined level. It also deals with utilization of resources, which affects the way
resources behave and respond. Performance measurement is a complex task that involves
assessing various components on several interrelated parameters. The number of I/Os to
disks, application response time, network utilization, and server CPU utilization are examples
of performance monitoring.

SNMP (Simple Network Management Protocol):

Simple Network Management Protocol (SNMP) is an "Internet-standard protocol for
managing devices on IP networks." Devices that typically support SNMP include routers,
switches, servers, workstations, printers, modem racks, and more. It is used mostly in
network management systems to monitor network-attached devices for conditions that
warrant administrative attention. SNMP is a component of the Internet Protocol Suite as
defined by the Internet Engineering Task Force (IETF). It consists of a set of standards for

network management, including an application layer protocol, a database schema, and a set of
data objects.

SNMP exposes management data in the form of variables on the managed systems, which
describe the system configuration. These variables can then be queried (and sometimes set)
by managing applications.

In typical SNMP uses, one or more administrative computers, called managers, have the task
of monitoring or managing a group of hosts or devices on a computer network. Each
managed system executes, at all times, a software component called an agent which reports
information via SNMP to the manager.

Essentially, SNMP agents expose management data on the managed systems as variables.
The protocol also permits active management tasks, such as modifying and applying a new
configuration through remote modification of these variables. The variables accessible via
SNMP are organized in hierarchies. These hierarchies, and other metadata (such as type and
description of the variable), are described by Management Information Bases (MIBs).
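The manager/agent/variable model can be sketched in a few lines. This is a conceptual toy only: real SNMP runs over UDP with ASN.1/BER encoding, community strings or SNMPv3 security, and MIB definitions, none of which is modelled here. The two OIDs shown are the standard sysDescr.0 and sysUpTime.0; the values are invented.

```python
# Toy "agent" exposing variables keyed by OID, and "manager" get/set calls.
class ToyAgent:
    def __init__(self):
        self.variables = {
            "1.3.6.1.2.1.1.1.0": "Storage array controller, firmware 5.x",  # sysDescr.0
            "1.3.6.1.2.1.1.3.0": 123456,   # sysUpTime.0, in hundredths of a second
        }

    def get(self, oid):
        return self.variables[oid]

    def set(self, oid, value):               # "active management": remote modification
        self.variables[oid] = value

agent = ToyAgent()                           # runs on the managed device
print(agent.get("1.3.6.1.2.1.1.1.0"))        # the manager polls the agent for a variable
agent.set("1.3.6.1.2.1.1.1.0", "renamed by the NMS")
```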

An SNMP-managed network consists of three key components:

 Managed device
 Agent — software which runs on managed devices
 Network management system (NMS) — software which runs on the manager

A managed device is a network node that implements an SNMP interface that allows
unidirectional (read-only) or bidirectional access to node-specific information. Managed
devices exchange node-specific information with the NMSs. Sometimes called network
elements, the managed devices can be any type of device, including, but not limited to,
routers, access servers, switches, bridges, hubs, IP telephones, IP video cameras, computer
hosts, and printers.

An agent is a network-management software module that resides on a managed device. An
agent has local knowledge of management information and translates that information to or
from an SNMP specific form.

A network management system (NMS) executes applications that monitor and control
managed devices. NMSs provide the bulk of the processing and memory resources required
for network management. One or more NMSs may exist on any managed network.

SMI-S:

SMI-S defines a method for the interoperable management of a heterogeneous Storage Area
Network (SAN), and describes the information available to a WBEM Client from an SMI-S
compliant CIM Server and an object-oriented, XML-based, messaging-based interface
designed to support the specific requirements of managing devices in and through SANs.
Learn more about SMI-S. Developer support for SMI-S is available through the SNIA SMI-S
Google group.

CIM (Common Information Model):

The Common Information Model (CIM) is an open standard that defines how managed
elements in an IT environment are represented as a common set of objects and relationships
between them. This is intended to allow consistent management of these managed elements,
independent of their manufacturer or provider.

Overview:

One way to describe CIM is to say that it allows multiple parties to exchange management
information about these managed elements. However, this falls short in expressing that CIM
not only represents these managed elements and the management information, but also
provides means to actively control and manage these elements. By using a common model of
information, management software can be written once and work with many implementations
of the common model without complex and costly conversion operations or loss of
information.

The CIM standard is defined and published by the Distributed Management Task Force
(DMTF). A related standard is Web-Based Enterprise Management (WBEM, also defined by
DMTF) which defines a particular implementation of CIM, including protocols for
discovering and accessing such CIM implementations.

Schema and specifications:


The CIM standard includes the CIM Infrastructure Specification and the CIM Schema:

 CIM Infrastructure Specification

The CIM Infrastructure Specification defines the architecture and concepts of CIM,
including a language by which the CIM Schema (including any extension schema) is
defined, and a method for mapping CIM to other information models, such as SNMP.
The CIM architecture is based upon UML, so it is object-oriented: The managed
elements are represented as CIM classes and any relationships between them are
represented as CIM associations. Inheritance allows specialization of common base
elements into more specific derived elements.

 CIM Schema

The CIM Schema is a conceptual schema which defines the specific set of objects and
relationships between them that represent a common base for the managed elements
in an IT environment. The CIM Schema covers most of today's elements in an IT
environment, for example computer systems, operating systems, networks,
middleware, services and storage. The CIM Schema defines a common basis for
representing these managed elements. Since most managed elements have product
and vendor specific behavior, the CIM Schema is extensible in order to allow the
producers of these elements to represent their specific features seamlessly together
with the common base functionality defined in the CIM Schema.
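As a loose analogy for the modelling ideas above (classes, specialization through inheritance, and associations between managed elements), the Python sketch below mimics the structure; it is not CIM or MOF syntax, and the class and property names are simplified stand-ins.

```python
# Illustration of the CIM modelling idea: managed elements as classes,
# specialization via inheritance, relationships via an "association" object.
class ManagedElement:
    def __init__(self, name: str):
        self.name = name

class ComputerSystem(ManagedElement):        # specialization of a common base element
    pass

class StorageVolume(ManagedElement):
    def __init__(self, name: str, size_gb: int):
        super().__init__(name)
        self.size_gb = size_gb

class SystemDevice:                          # stands in for a CIM association
    def __init__(self, system: ComputerSystem, device: ManagedElement):
        self.system, self.device = system, device

host = ComputerSystem("host01")
vol = StorageVolume("vol7", size_gb=512)
link = SystemDevice(host, vol)               # "vol7 is a device of host01"
```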

Key management metrics (Thresholds, availability, capacity, security, performance):

Monitoring Examples:
A storage infrastructure requires implementation of an end-to-end solution to actively
monitor all the parameters of its critical components. Early detection and instant alerts ensure
the protection of critical assets. In addition, the monitoring tool should be able to analyze the
impact of a failure and deduce the root cause of symptoms.

Accessibility Monitoring
Failure of any component may affect the accessibility of one or more components due to their
interconnections and dependencies, or it may lead to overall performance degradation.
Consider an implementation in a storage infrastructure with three servers: H1, H2, and H3.
All the servers are configured with two HBAs, each connected to the storage array through
two switches, SW1 and SW2, as shown in Figure 16-1. The three servers share two storage
ports on the storage array. Path failover software has been installed on all three servers.

Capacity Monitoring
In the scenario shown in Figure 16-4, each of the servers is allocated storage on the storage
array. When a new server is deployed in this configuration, the applications on the new
servers have to be given access to the storage devices from the array through switches SW1
and SW2. Monitoring the available capacity on the array helps to proactively decide whether
the array can provide the required storage to the new server. Other considerations include the
availability of ports on SW1 and SW2 to connect to the new server as well as the availability
of storage ports to connect to the switches. Proactive monitoring also helps to identify the
availability of an alternate fabric or an array to connect to the server.

The following example illustrates the importance of monitoring file system capacity on file
servers. If file system capacity monitoring is not implemented, as shown in Figure 16-5 (a),
and the file system is full, the application most likely will not function properly. Monitoring
can be configured to issue a message when thresholds are reached on file system capacity.
For example, a warning message is issued when the file system reaches 66 percent of its capacity,
and a critical message when it reaches 80 percent of its capacity (see Figure 16-5 [b]). This
enables the administrator to take action, manually or automatically, to extend the file system
before the full condition is reached.
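The 66 percent / 80 percent thresholds above translate directly into a small check. The sketch uses Python's standard shutil.disk_usage; the mount point and the alerting action (printing) are placeholders for whatever the monitoring tool in use actually does.

```python
# Hedged sketch of warning/critical thresholds on file system capacity.
import shutil

WARNING, CRITICAL = 0.66, 0.80

def check_filesystem(mount_point: str):
    usage = shutil.disk_usage(mount_point)
    used_fraction = usage.used / usage.total
    if used_fraction >= CRITICAL:
        print(f"CRITICAL: {mount_point} at {used_fraction:.0%}, extend the file system")
    elif used_fraction >= WARNING:
        print(f"WARNING: {mount_point} at {used_fraction:.0%}")

check_filesystem("/")
```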

Performance Monitoring
The example shown in Figure 16-6 illustrates the importance of monitoring performance on
storage arrays. In this example, servers H1, H2, and H3 (with two HBAs each) are connected
to the storage array through switches SW1 and SW2. The three servers share the same storage
ports on the storage array. A new server, H4, running an application with a high workload, has
to be deployed to share the same storage ports as H1, H2, and H3. Monitoring array port
utilization ensures that the new server does not adversely affect the performance of the other
servers. In this example, utilization for the shared ports is shown by the solid and dotted lines
in the line graph for the storage ports. Notice that the port represented by a solid line is close
to 100 percent utilization. If the actual utilization of both ports prior to deploying the new
server is closer to the dotted line, there is room to add the new server. Otherwise, deploying
the new server will affect the performance of all servers.
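The decision in that example amounts to a headroom check: current utilization of each shared port plus the new server's expected load must stay below some ceiling. The numbers below are invented purely to mirror the scenario in the figure.

```python
# Toy headroom check before cabling new server H4 to the shared storage ports.
current_utilization = {"storage_port_A": 0.95, "storage_port_B": 0.40}  # fraction of bandwidth
expected_new_load = 0.25                                                # H4's estimated share
CEILING = 0.85

for port, used in current_utilization.items():
    ok = used + expected_new_load <= CEILING
    verdict = "room for H4" if ok else "would saturate, rebalance first"
    print(f"{port}: {used:.0%} used -> {verdict}")
```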

Most servers offer tools that enable interactive monitoring of server CPU usage. For example,
Windows Task Manager displays CPU and memory usage, as shown in Figure 16-7. These
interactive tools are useful only when a few servers need to be managed. A storage
infrastructure requires performance monitoring tools that are capable of monitoring many
servers simultaneously. Because it is inefficient to monitor hundreds of servers continuously
in real time, such monitoring usually polls the servers at regular intervals. These monitoring
tools must be able to send alerts whenever a monitored value crosses a predefined threshold.

Security Monitoring
The example shown in Figure 16-8 illustrates the importance of monitoring security breaches
in a storage array. In this example, the storage array is shared between two workgroups, WG1
and WG2. The data of WG1 should not be accessible by WG2, and likewise the data of WG2
should not be accessible by WG1. A user from WG1 may try to make a local replica of the data that
belongs to WG2. Usually, available mechanisms prevent such an action. However, if this
action is not monitored or recorded, it is difficult to track such a violation of security
protocols. Conversely, if this action is monitored, a warning message can be sent to prompt a
corrective action or at least enable discovery as part of regular auditing operations.

An example of host security monitoring involves login failures at the host. These login failures
may be accidental (mistyping) or a deliberate attempt to access a server. Many servers
allow two successive login failures and prohibit additional attempts after three
consecutive login failures. In most environments, login information is recorded in a system
log file. In a monitored environment, three successive login failures usually trigger a
warning message about a possible security threat.
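The three-consecutive-failures rule can be expressed as a short log-scanning sketch; the events list stands in for whatever the system log actually records.

```python
# Minimal sketch: warn after three consecutive login failures for the same account.
from collections import defaultdict

MAX_FAILURES = 3
events = [("alice", "fail"), ("alice", "fail"), ("bob", "ok"),
          ("alice", "fail"), ("alice", "ok")]       # stand-in for parsed log entries

consecutive = defaultdict(int)
for user, outcome in events:
    if outcome == "fail":
        consecutive[user] += 1
        if consecutive[user] >= MAX_FAILURES:
            print(f"SECURITY WARNING: {MAX_FAILURES} consecutive login failures for {user}")
    else:
        consecutive[user] = 0
```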
