
ECS™ Procedure Generator

Solution for Validating your engagement

ECS Appliance Tech Refresh - 3.6.x.x

Topic
ECS PS Procedures
Selections
ECS Professional Services Procedures: ECS Appliance Tech Refresh - 3.6.x.x

Generated: July 7, 2022 6:32 PM GMT

REPORT PROBLEMS

If you find any errors in this procedure or have comments regarding this application, send email to
[email protected]

Copyright © 2022 Dell Inc. or its subsidiaries. All Rights Reserved. Dell Technologies, Dell, EMC, Dell
EMC and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be
trademarks of their respective owners.

The information in this publication is provided “as is.” Dell Inc. makes no representations or warranties of
any kind with respect to the information in this publication, and specifically disclaims implied warranties of
merchantability or fitness for a particular purpose.

Use, copying, and distribution of any software described in this publication requires an applicable
software license.

This document may contain certain words that are not consistent with Dell's current language guidelines.
Dell plans to update the document over subsequent releases to revise these words accordingly.

This document may contain language from third party content that is not under Dell's control and is not
consistent with Dell's current guidelines for Dell's own content. When such third party content is updated
by the relevant third parties, this document will be revised accordingly.

Publication Date: July 2022

Dell Technologies Confidential Information version: 2.3.6.91

Contents
Preliminary Activity Tasks
    Read, understand, and perform these tasks

ECS Appliance Tech Refresh - 3.6.x.x
Preliminary Activity Tasks
This section may contain tasks that you must complete before performing this procedure.

Read, understand, and perform these tasks


1. Table 1 lists tasks, cautions, warnings, notes, and/or knowledgebase (KB) solutions that you need to
be aware of before performing this activity. Read, understand, and when necessary perform any
tasks contained in this table and any tasks contained in any associated knowledgebase solution.

Table 1 List of cautions, warnings, notes, and/or KB solutions related to this activity

2. This is a link to the top trending service topics. These topics may or may not be related to this activity.
This is merely a proactive attempt to make you aware of any KB articles that may be associated with
this product.

Note: There may not be any top trending service topics for this product at any given time.

ECS Top Service Topics

ECS Appliance Tech Refresh - 3.6.x.x

Note: The next section is an existing PDF document that is inserted into this procedure.

ECS Tech Refresh
3.6

October 2021
Rev. 1.3

Notes, cautions, and warnings

NOTE: A NOTE indicates important information that helps you make better use of your product.

CAUTION: A CAUTION indicates either potential damage to hardware or loss of data and tells you how to avoid
the problem.

WARNING: A WARNING indicates a potential for property damage, personal injury, or death.

© 2021 Dell Inc. or its subsidiaries. All rights reserved. Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other
trademarks may be trademarks of their respective owners.

Overview
This document provides guidance on replacing ECS Gen 1 and Gen 2 hardware with Gen 3 hardware.
This document provides information about:
● Simplifying the hardware replacement process
● Extending an existing cluster
● Reducing the size of an existing cluster
The following graphic provides an overview of the process.

Figure 1. ECS Tech Refresh process

Tech Refresh terminology and process overview


Learn about the terms and processes that are referenced throughout this document.

Terminology
Source node: Node to be decommissioned.
Target node: New node to receive data.
Extension: Service procedure to add new nodes to an existing ECS cluster.
Migration: Object data migration from a set of source nodes to target nodes.
Node Evacuation: Service procedure to remove a node from the cluster.

Process overview
The following figures illustrate the processes that are referenced in this document.

Figure 2. Node Extension

Figure 3. Data Migration

Figure 4. Node Evacuation

Contents
Overview
    Tech Refresh terminology and process overview
Figures
Tables

Chapter 1: Extend Nodes and Rack
    ECS software extend overview
    Prerequisites
    ECS software extend limitations
    Software extend readiness checklist
    Connect to the ECS appliance
        Connect from a remote location
        Connect a service laptop to a U-Series or D-Series rack on site
        Connect a service laptop to the EX Series rack
    ECS software extend procedure

Chapter 2: Migrate Data
    Migration planning
    Migration prerequisites, restrictions, and limitations
    Run optional premigration health checks
    Trigger data migration
    Manage migration
        Monitor migration status using ECS Service Console
        Monitor migration status using ECS UI Grafana Dashboard
        Pause and resume migration
        Data migration throttling
    Migration alerts
    Enable migration capacity alerts

Chapter 3: Remove a node from a cluster using ECS Service Console
    Run optional pre-node evacuation health checks
    Remove a node from a cluster using ECS Service Console
    Move licenses to new ECS system
    Carry out post-Tech Refresh checks

Part I: Document feedback

Index

Figures

1. ECS Tech Refresh process
2. Node Extension
3. Data Migration
4. Node Evacuation
5. Fox switch
6. Total capacity on target calculation
7. ECS UI Grafana Dashboard Data Migration Status
8. ECS UI Storage Pools Management

Tables

1. Tasks to complete before arriving onsite
2. Final checks before initiating the software extend

Chapter 1: Extend Nodes and Rack
Node and Rack extension is the required first step in the Tech Refresh. Ensure that you carry out the steps documented.
Topics:
• ECS software extend overview
• Prerequisites
• ECS software extend limitations
• Software extend readiness checklist
• Connect to the ECS appliance
• ECS software extend procedure

ECS software extend overview
This section provides an overview of, and the workflow for, the Node and Rack extend procedures.

Workflow
This section provides the flow of the software extend procedures.
● Connect to the existing VDC.
● Configure and upgrade the Service Console (SC).
● Run the SC health check from R1N1.
● Run the collection on each of the existing racks. You can get the script from KB article KB 531528. Provide the
information to Customer Service (CS) for review.
● The Professional Services (PS) or partner team sends an email with UPDATE in the subject to [email protected] to
get the latest designer spreadsheet.
● The PS team fills out the designer with the new node and rack information.
NOTE: On dark sites where you cannot share the information, you must send an email to [email protected]
for the procedure.
● The PS team installs the operating system:
○ Installs the operating system on the new rack.
○ Installs the operating system on the new nodes in the existing rack.
● The PS team uses the extend scripts, which configure the networking on the new nodes and rack, run the basic
validations, and provide the extend.ini file in the /tmp/extend/ directory.
● The PS team runs the SC extend process using extend.ini and waits until the fabric allocates all the drives.
● The PS team runs health checks.
● The PS team validates that the new nodes are shown in the user interface for VDC endpoints.

Prerequisites
Table 1. Tasks to complete before arriving onsite (each item to engage with the customer is followed by the link to the procedure in these instructions, or the next step)

● Obtain the ECS Portal login credentials from the onsite contact. Next step: Provision new storage for the extended nodes.
● Run the Compatibility Checker. Next step: Run the Compatibility Checker.
● Assign IPs to new nodes. Next step:
○ Static - Configure on the existing VDC using the command: setrackinfo
○ Dynamic Host Configuration Protocol (DHCP) - Add the nodes to the DHCP server.
● Set Domain Name Services (DNS) for the extended nodes. Next step: Add the extended nodes to DNS (both forward and reverse lookup).
● Is network separation implemented? If the customer chose to use network separation, you must have completed all prerequisites and procedures that are outlined in the ECS Networks document in SolVe Desktop or online. Next step: Inquire before arriving onsite.
● Are any custom switch configurations implemented in the existing ECS appliance? Next step: If custom switch configurations exist, you must replicate them across the extended nodes.
● Determine the current ECS software version running at the customer site. Next step: Connect remotely and run the command:

admin@provo-yellow:~> svc_version
v1.0.7 (svc_tools v1.5.0)

NOTE: If the environment is running general patch code, download the production.tgz from a different location.

Example Output:

ECS Version: 3.2.2.1

Object Version        3.2.2.1-102513.515d86e
OS Version            3.2.2.0-1964.f8d017f.44
Fabric Version        1.5.0.0-3545.d53cc93
Fabric-agent Version  1.5.0.0-3545.d53cc93
Syslog Version        <Unknown>
Zookeeper Version     3.4.9.0-82.0ecec52
Registry Version      2.3.1.0-58.3a6dfaf
Utilities Version     1.5.0.0-3545.d53cc93
SC Version            3.0.0.0-19361.2c53303a9*
xDoctor Version       4.7-49
svc_tools Version     1.5.0

*Versions differ between nodes

Patch(es) installed
--------------------
Detected Patch(es)
Config Changes Detected
Mismatched Patch(es)
Invalid Patch(es)
--------------------
<None - running GA release>

● Familiarize yourself with the applicable ECS documentation:
○ ECS Release Notes
○ Pertinent KnowledgeBase (KB) articles
○ Available in SolVe Desktop or online: the ECS Compatibility Checker User Guide and the xDoctor User's Guide

ECS software extend limitations
This section provides the limitations for performing the node and rack extend procedure.
● If there are more than five static routes configured on the existing racks, then you must configure and validate the additional
static routes manually in the NAN before performing the extend operation.
● If the existing VDC contains SSD Read Cache, the extended nodes must have the SSD Read Cache hardware installed
before performing the extend. A mixed configuration is not supported for SSD Read Cache.
● Do not use the Service Console extend process when creating a storage pool with the extended nodes.
● Service Console (SC) extend supports extending one rack at a time. For multiple racks, run the SC extend command
once for each rack.
● Automatic extend node imaging or configuration is not supported.

Software extend readiness checklist
Do not proceed with the software extend until all items are complete.

Table 2. Final checks before initiating the software extend (insert a check as you confirm each item)

● Relevant ECS documentation reviewed?
● Same ECS OS version installed on the new hardware as on the existing hardware?
● Extend node IPs added to DHCP, OR static information available?
● Extend nodes added to DNS?
● Network separation IP addresses available to configure network separation?
● ECS Public switches configured?
● New ECS hardware installed and validated as outlined in the ECS Capacity Expansion section of SolVe Desktop or online?
● Latest version of xDoctor installed? See the xDoctor User's Guide, available in SolVe Desktop or online.
● Latest version of Service Console installed?

Connect to the ECS appliance


This section outlines how to connect to the ECS appliance (Node 1, Rack 1) locally onsite or remotely.

Connect from a remote location


Use Secure Remote Services to connect remotely. See the ECS Software Installation Guide for procedures.

Connect a service laptop to a U-Series or D-Series rack on site


Access the U-Series or D-Series rack using the private network (192.168.219.XXX) from the laptop.

Steps
1. Connect your laptop to port 24 of the 1 GbE Turtle switch.
2. Configure your laptop with the following network parameters:
● IP: 192.168.219.99
● Netmask: 255.255.255.0
● No Gateway
3. Validate the connection by pinging node 1 of the rack that you connected to.
4. Ping node 1 of the rack at 192.168.219.1 to confirm that you can SSH to that node later in the procedure.

ping 192.168.219.1

NOTE: If 192.168.219.1 does not answer, try 192.168.219.2. If there is no response, verify the laptop IP/subnet mask,
network connection, and switch port connection. If the service laptop is connected to the Dell VPN, pings to 192.168.219.x
may not return a response.

Connect a service laptop to the EX Series rack
Access an ECS EX-Series rack using the private (192.168.219.XXX) network from a laptop.

Prerequisites
● Access to the private network IP addresses (192.168.219.1 to 16 and 192.168.219.101 to 116) is limited to the nodes connected
to the rack back-end 1/10/25 GbE Fox management switch.
● The private.4 (NAN) network IP addresses (169.254.x.x) of all nodes in all racks in the ECS Virtual Data Center (VDC) are
accessible from any node in the ECS VDC once you SSH in to a node using a private IP address (192.168.219.x).
● If security lockdown is not enabled, access to the public network IP addresses for all ECS racks is available once you SSH to
one of the ECS nodes.
● Two switches, Fox and Hound, are used for the private network, or Nile Area Network (NAN). For example, node 8 must
connect to Hound port 8 and Fox port 8. For more information, see the ECS EX Series Hardware Guide. (A connectivity
sketch follows this list.)
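
The following is a minimal connectivity sketch of the access path described above. The admin account name matches the svc_version example earlier in this document, and 169.254.89.2 is a stand-in NAN address; both are assumptions for illustration, so substitute the real values for your VDC.

    # From the service laptop, verify the rack's private management network:
    ping -c 4 192.168.219.1

    # SSH to node 1 over the private network:
    ssh admin@192.168.219.1

    # From that node, nodes in the VDC are reachable over the NAN (private.4):
    ping -c 4 169.254.89.2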

Steps
1. Connect your service laptop to the VDC.

● If the cabinet contains a service shelf with a red network cable: Open the service shelf, and connect the red network
cable to the service laptop. The red cable connects to port 34 on the fox switch. The fox switch is the bottom back-end
switch in a dual switch configuration.
● If the cabinet does not contain a service shelf with a red network cable: From the rear of the rack, connect directly to
either port 34 or 36 on the fox switch, whichever port contains a 1 GB SFP.
● If you want to connect a service laptop to the rear of the rack: Locate port 36 on the fox switch. The fox switch is the
bottom back-end switch in a dual switch configuration. Port 36 has a 1 GB SFP that you can connect your service laptop
to with a Cat6 cable.

Figure 5. Fox switch

1 - Port 34 for service tray connection


2 - Port 36 for connection from rear

2. Set the network interface on the laptop to the static address 192.168.219.99, subnet mask 255.255.255.0, with no gateway
required.
3. Verify that the temporary network between the laptop and rack's private management network is functioning by using the
ping command.
NOTE: If 192.168.219.1 does not answer, try 192.168.219.2. If neither responds, verify the laptop IP/subnet mask,
network connection, and switch port connection. If the service laptop is connected to Dell's VPN, ping to 192.168.219.x
may not return a response.

C:\>ping 192.168.219.1
Pinging 192.168.219.1 with 32 bytes of data:
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64

Reply from 192.168.219.1: bytes=32 time<1ms TTL=64
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64
Reply from 192.168.219.1: bytes=32 time<1ms TTL=64

Ping statistics for 192.168.219.1:


Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 0ms, Maximum = 0ms, Average = 0ms

ECS software extend procedure
For detailed node and rack software extend procedures, see the ECS 3.6 Software Extend Instructions guide.

Chapter 2: Migrate Data
This chapter provides information about how to migrate data.
The example outputs in the procedures may not represent the exact output. Use them for reference only.
Topics:
• Migration planning
• Migration prerequisites, restrictions, and limitations
• Run optional premigration health checks
• Trigger data migration
• Manage migration
• Migration alerts
• Enable migration capacity alerts

Migration planning
Read about how to determine how many nodes you need to carry out a migration for a storage pool, and about other planning activities.
Calculate the capacity needed for target hardware, and add enough new hardware to the storage pool as target nodes.
● Estimate whether the new hardware has enough capacity for the migration to complete successfully. Use the following
loose formula to calculate capacity (a sketch of the calculation follows this list):

Figure 6. Total capacity on target calculation


● Use existing capacity tools to calculate total capacity.
● Metadata update garbage is 1% of total capacity on the source.
● Estimate data ingestion during migration from the current load pattern and confirm with the customer.
● The target hardware must have at least five nodes, which is the minimum requirement for an ECS system.
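
Figure 6 is an image in the original PDF and does not reproduce here; the following bash sketch restates one plausible reading of the loose formula implied by the bullets above. All numbers are illustrative assumptions, not values from the figure.

    # Loose capacity estimate for the target storage pool (all values in TiB):
    SOURCE_USED=344      # data currently stored on the source nodes
    SOURCE_TOTAL=500     # total capacity of the source nodes
    INGEST=10            # data expected to arrive during the migration window
    # required = used data + 1% of source total (metadata update garbage) + ingest
    REQUIRED=$(( SOURCE_USED + SOURCE_TOTAL / 100 + INGEST ))
    echo "Target storage pool needs at least ${REQUIRED} TiB"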

Migration prerequisites, restrictions, and limitations


Learn about the prerequisites, restrictions, and limitations for migrating data.

Prerequisites
● The ECS system must be running ECS 3.5 or later for Tech Refresh.
Ensure that the source nodes and target nodes are running the same ECS code version of 3.5 or later.
● Ensure that the target nodes are added into the ECS system storage pool before you start the tech refresh data migration.
● Ensure that the target nodes are clean and not provisioned before adding them to the existing system.
● Ensure that any ongoing PSO or VDC removal from RG is complete before you trigger tech refresh data migration.
● Ensure that any ongoing CAS migration is complete before you trigger tech refresh data migration.
● Ensure that the NTP server is working and that there is no material NTP drift between nodes.

Restrictions and limitations
● ECS does not support upgrading to a major ECS version during Tech Refresh data migration, or triggering Tech Refresh
data migration during an ECS major version operating system upgrade. If you need an ECS patch upgrade, contact the
Dell EMC support team.
● Migration cannot be canceled or reverted once it is triggered.
● Tech Refresh does not support adding new source nodes after you trigger data migration. If the end user fails to include all
the source node IP addresses in the trigger data migration command, they must wait for the command to complete its run,
and then rerun the command with any missed source node IP addresses.
● Tech Refresh data migration does not support performing a user-initiated Planned Site Outage (PSO) during data migration.
● If the environment has Geo Clusters configured in at least three sites with a large Delete load, or has a large XOR DECODE
task backlog, data migration may be limited or blocked by data chunks in GEO DELETING status. Data migration continues
after XOR DECODE tasks complete.

Run optional premigration health checks


If you choose, use the ECS Service Console to run health checks before you trigger data migration. There are two different
commands, which run different checks.

Steps
From the ECS Service Console, run the run Health_Check command with the --tags pre_data_migration option. For example:
service-console run Health_Check --tags pre_data_migration
Output such as the following appears:

service-console run Health_Check --tags pre_data_migration

Service Console is running on node 169.254.89.1 (suite 20200408_200510_Health_Check)


Service console version: 5.0.0.0-20597.e8eda88ed
Debug log: /opt/emc/caspian/service-console/log/20200408_200505_run_Health_Check/
dbg_robot.log
================================================================================
Health Check
20200408 20:05:29.577: Execute Health Checks
20200408 20:05:29.586: | Validate that all nodes are available - OS
20200408 20:05:33.640: | | PASS (4 sec)
20200408 20:05:33.641: | Validate time drift
20200408 20:05:36.099: | | PASS (2 sec)
20200408 20:05:36.101: | Validate that all partitions are under control
20200408 20:07:44.609: | | PASS (2 min 8 sec)
20200408 20:07:44.612: | Check DT status
Checking DT status (with timeout 10 min).
20200408 20:08:25.353: | | PASS (40 sec)
20200408 20:08:25.354: | Check on-going PSO or VDC removal from RG
20200408 20:08:50.959: | | PASS (25 sec)
20200408 20:08:50.961: | Validate that there are no transformation instances
20200408 20:08:55.278: | | PASS (4 sec)
20200408 20:08:55.279: | PASS (3 min 25 sec)
================================================================================
Status: PASS
Time Elapsed: 3 min 56 sec
Debug log: /opt/emc/caspian/service-console/log/20200408_200505_run_Health_Check/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200408_200505_run_Health_Check/log.html
================================================================================

Trigger data migration
Trigger data migration from source nodes to newly extended target nodes.

Steps
1. From the ECS Service Console, run the run Data_Migration command. For example:
service-console run Data_Migration --target-node <source node private.4 IPs for data
migration>
Where the IP addresses used are those of the source nodes, that is, the nodes from which the data is being migrated.
Multiple node IP addresses are comma-separated (see the sketch at the end of this topic for a way to assemble the list). For example:

service-console run Data_Migration --target-node


169.254.89.1,169.254.89.2,169.254.89.3,169.254.89.4,169.254.89.5,169.254.89.6,169.254.
89.7,169.254.89.8

2. In the ECS Service Console, look for output: Do you confirm the nodes selected for the migration?
[yes/no]:
Output such as the following appears:

Data Migration command output for confirmation:


Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-console/log/20200417_172741_run_Data_Migration/
dbg_robot.log
================================================================================
Data Migration Setup
20200417 17:28:11.980: Is migration running
20200417 17:28:19.992: | PASS (8 sec)
20200417 17:28:19.996: Check data migration parameters
20200417 17:28:20.001: | PASS
20200417 17:28:20.048: Check for multiple Storage pools
20200417 17:28:20.049: | PASS
20200417 17:28:20.049: Validate number of VNEST members
20200417 17:28:21.362: | PASS (1 sec)
20200417 17:28:21.365: Check migration capacity
20200417 17:28:41.752: | PASS (20 sec)
20200417 17:28:41.753: Check source and target nodes version
20200417 17:29:09.813: | PASS (28 sec)
20200417 17:29:09.815: System is fully upgraded
20200417 17:29:18.485: | PASS (8 sec)
Data Migration Pre Check
20200417 17:29:18.737: Run health check
20200417 17:29:18.862: | Validate that all nodes are available - OS
20200417 17:29:23.026: | | PASS (4 sec)
20200417 17:29:23.028: | Validate time drift
20200417 17:29:26.503: | | PASS (3 sec)
20200417 17:29:26.505: | Validate that all partitions are under control
20200417 17:31:31.161: | | PASS (2 min 4 sec)
20200417 17:31:31.163: | Check DT status
Checking DT status (with timeout 10 min).
20200417 17:32:34.126: | | PASS (1 min 2 sec)
20200417 17:32:34.128: | Check on-going PSO or VDC removal from RG
20200417 17:32:53.221: | | PASS (19 sec)
20200417 17:32:53.222: | Validate that there are no transformation instances
20200417 17:32:58.036: | | PASS (4 sec)
20200417 17:32:58.038: | PASS (3 min 39 sec)
Data Migration Plan
20200417 17:32:58.438: Format data migration message
20200417 17:32:58.440: | PASS
We are going to start data migration from the nodes 169.254.89.1, 169.254.89.2,
169.254.89.3, 169.254.89.4, 169.254.89.5, 169.254.89.6, 169.254.89.7, 169.254.89.8.
Once data migration is triggered for a node, it could not be reverted.
Do you confirm the nodes selected for the migration? [yes/no]: 20200417 17:33:56.945:
Mark nodes as migration source
20200417 17:41:04.299: | PASS (7 min 7 sec)
Data Migration Trigger
20200417 17:41:04.454: Show migration progress
Host : layton-brass.ecs.lab.emc.com

[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : logan-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : murray-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : sandy-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : ogden-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : lehi-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : orem-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

Host : provo-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
NOT_TRIGGERED 0.00TiB/0.00TiB

20200417 17:41:05.732: | PASS (1 sec)


We are ready to start data migration.
Do you want to continue? [yes/no]:

If you select No, Service Console carries out a health check only and does not mark source nodes for migration.
If you select Yes, Service Console output lists the source nodes and the amount of data that will be migrated from each
node.
3. Verify in the output the amount of data to be migrated and that the migration is triggered.

20200417 17:51:58.660: Stop chunk re-balance


20200417 17:52:05.742: | PASS (7 sec)
20200417 17:52:05.743: Trigger Data Migration
20200417 17:52:07.832: | PASS (2 sec)
Data Migration Main Phase
20200417 17:52:15.133: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.06TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com

[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.04TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.03TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.07TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.02TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[-------------------------------------------------------------------------------------
---------------] 0.00%
ONGOING 43.03TiB/43.03TiB

20200417 17:52:16.284: | PASS (1 sec)


20200417 17:52:16.287: Print data migration instruction and exit
The data migration is running.
20200417 17:52:16.299: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 24 min 38 sec
Debug log: /opt/emc/caspian/service-console/log/20200417_172741_run_Data_Migration/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200417_172741_run_Data_Migration/
log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================

The following output shows a complete migration:

/opt/emc/bin/service-console run Data_Migration


Service Console is running on node 169.254.89.1 (suite 20200422_192851_Data_Migration)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200422_192849_run_Data_Migration/
dbg_robot.log
================================================================================
Data Migration Setup
20200422 19:29:14.784: Is migration running
20200422 19:29:18.853: | PASS (4 sec)
20200422 19:29:18.860: Check data migration parameters
Warning: target node option has no effect on the already started migration.
20200422 19:29:18.862: | PASS
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200422 19:29:33.417: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com

[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

20200422 19:29:33.988: | PASS


20200422 19:29:33.990: Print data migration instruction and exit
The data migration is done.
20200422 19:29:33.991: | PASS
Data Migration Post Check
20200422 19:29:38.220: Run health check
20200422 19:29:38.322: | Validate that all partitions are under control
20200422 19:31:04.722: | | PASS (1 min 26 sec)
20200422 19:31:04.724: | PASS (1 min 26 sec)
Data Migration Teardown
20200422 19:31:09.975: Start chunk re-balance
20200422 19:31:14.696: | PASS (4 sec)
================================================================================
Status: PASS
Time Elapsed: 2 min 36 sec
Debug log: /opt/emc/caspian/service-console/log/20200422_192849_run_Data_Migration/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200422_192849_run_Data_Migration/
log.html
================================================================================
Messages:
The data migration is done.
================================================================================

When the migration is complete, the Service Console shows that the migration is complete in the output. If you rerun
the run Data_Migration command, the output shows a migration failure. This behavior is a known issue, tracked as
CONSOLE-2383.
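
As mentioned in step 1, the source list passed to --target-node is comma-separated. A small sketch for assembling it, assuming a hypothetical nodes.txt file that lists one source private.4 IP address per line:

    # Join the addresses with commas and pass them to the trigger command:
    IPS=$(paste -sd, nodes.txt)
    service-console run Data_Migration --target-node "$IPS"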

Manage migration
Learn how you can manage your data migration in ECS tech refresh.

Monitor migration status using ECS Service Console


Use the ECS Service Console to monitor the migration status.

Steps
1. From the ECS Service Console, run the run Data_Migration command. For example:

service-console run Data_Migration

When the migration is complete, the Service Console shows that the migration is complete in the output.

2. Verify that the output shows that the migration is running and in progress.

/opt/emc/bin/service-console run Data_Migration


Service Console is running on node 169.254.89.1 (suite 20200420_140658_Data_Migration)
Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-console/log/20200420_140654_run_Data_Migration/
dbg_robot.log
================================================================================
Data Migration Setup
20200420 14:07:17.065: Is migration running
20200420 14:07:21.273: | PASS (4 sec)
20200420 14:07:21.281: Check data migration parameters
Warning: target node option has no effect on the already started migration.
20200420 14:07:21.284: | PASS
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200420 14:07:22.127: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.82%
ONGOING 19.02TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.87%
ONGOING 18.99TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.76%
ONGOING 19.05TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.95%
ONGOING 18.96TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.13%
ONGOING 19.31TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.92%
ONGOING 18.98TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[******************************************************-------------------------------
---------------] 54.46%
ONGOING 19.59TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[*******************************************************------------------------------
---------------] 55.63%
ONGOING 19.09TiB/43.03TiB

20200420 14:07:22.889: | PASS


20200420 14:07:22.891: Pause or resume migration
20200420 14:07:22.892: | PASS
20200420 14:07:22.893: Print data migration instruction and exit
The data migration is running.
20200420 14:07:22.895: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 31 sec
Debug log: /opt/emc/caspian/service-console/log/20200420_140654_run_Data_Migration/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200420_140654_run_Data_Migration/
log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
==================================================

3. Verify that migration is complete.

/opt/emc/bin/service-console run Data_Migration


Service Console is running on node 169.254.89.1 (suite 20200422_192851_Data_Migration)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200422_192849_run_Data_Migration/
dbg_robot.log
================================================================================
Data Migration Setup
20200422 19:29:14.784: Is migration running
20200422 19:29:18.853: | PASS (4 sec)
20200422 19:29:18.860: Check data migration parameters
Warning: target node option has no effect on the already started migration.
20200422 19:29:18.862: | PASS
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200422 19:29:33.417: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%

COMPLETE 0.00TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[*************************************************************************************
***************] 100.00%
COMPLETE 0.00TiB/43.03TiB

20200422 19:29:33.988: | PASS


20200422 19:29:33.990: Print data migration instruction and exit
The data migration is done.
20200422 19:29:33.991: | PASS
Data Migration Post Check
20200422 19:29:38.220: Run health check
20200422 19:29:38.322: | Validate that all partitions are under control
20200422 19:31:04.722: | | PASS (1 min 26 sec)
20200422 19:31:04.724: | PASS (1 min 26 sec)
Data Migration Teardown
20200422 19:31:09.975: Start chunk re-balance
20200422 19:31:14.696: | PASS (4 sec)
================================================================================
Status: PASS
Time Elapsed: 2 min 36 sec
Debug log: /opt/emc/caspian/service-console/log/20200422_192849_run_Data_Migration/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200422_192849_run_Data_Migration/
log.html
================================================================================
Messages:
The data migration is done.
================================================================================

Monitor migration status using ECS UI Grafana Dashboard


Use the ECS UI Grafana Dashboard to monitor the migration status.

About this task

NOTE: Allow 5 to 10 minutes for the Grafana dashboard to display all the nodes.

Steps
1. In the ECS UI, go to the Grafana dashboard by going to Advanced Monitoring.
2. Select Tech Refresh: Data Migration from the pulldown menu at the top of the page.
The dashboard provides various migration details.
For example:

Figure 7. ECS UI Grafana Dashboard Data Migration Status

Pause and resume migration


Use the ECS Service Console to pause and resume data migration.

About this task


When data migration is paused for a node, ongoing data migration is stopped, and the migration framework periodically
checks and waits until data migration is resumed.

Pause and resume migration using ECS Service Console


Steps
1. From the ECS Service Console, run the Data_Migration command with the pause operation. For example:
service-console run Data_Migration --operation pause
Output that indicates that the migration is paused, such as the following, appears:

/opt/emc/bin/service-console run Data_Migration --operation pause


Service Console is running on node 169.254.89.1 (suite 20200418_174821_Data_Migration)
Service console version: 5.0.0.0-20640.8c7060970
Debug log: /opt/emc/caspian/service-console/log/20200418_174817_run_Data_Migration/
dbg_robot.log
================================================================================
Data Migration Setup
20200418 17:48:39.484: Is migration running
20200418 17:48:43.963: | PASS (4 sec)
20200418 17:48:43.971: Check data migration parameters
Node(s) to be paused: 169.254.89.1, 169.254.89.2, 169.254.89.3, 169.254.89.4,
169.254.89.5, 169.254.89.6, 169.254.89.7, 169.254.89.8.
20200418 17:48:47.449: | PASS (3 sec)
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase

20200418 17:48:48.072: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.83%
ONGOING 22.89TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.80%
ONGOING 22.90TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.84%
ONGOING 22.89TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.90%
ONGOING 22.86TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.12%
ONGOING 23.19TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.96%
ONGOING 22.84TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*********************************************----------------------------------------
---------------] 45.52%
ONGOING 23.44TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.68%
ONGOING 22.94TiB/43.03TiB

20200418 17:48:48.938: | PASS


20200418 17:48:48.939: Pause or resume migration
The migration is paused
20200418 17:48:51.930: | PASS (2 sec)
20200418 17:48:51.931: Print data migration instruction and exit
The data migration is running.
20200418 17:48:51.933: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 35 sec
Debug log: /opt/emc/caspian/service-console/log/20200418_174817_run_Data_Migration/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200418_174817_run_Data_Migration/
log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================

2. Verify that the output shows that the migration is paused.


3. From the ECS Service Console, run the Data_Migration command with the resume operation. For example:
service-console run Data_Migration --operation resume
Output that indicates that the migration is resumed, such as the following, appears:

/opt/emc/bin/service-console run Data_Migration --operation resume


Service Console is running on node 169.254.89.1 (suite 20200419_003106_Data_Migration)
Service console version: 5.0.0.0-20640.8c7060970

Debug log: /opt/emc/caspian/service-console/log/20200419_003103_run_Data_Migration/
dbg_robot.log
================================================================================
Data Migration Setup
20200419 00:31:23.896: Is migration running
20200419 00:31:27.679: | PASS (3 sec)
20200419 00:31:27.687: Check data migration parameters
Node(s) to be resumed: 169.254.89.1, 169.254.89.2, 169.254.89.3, 169.254.89.4,
169.254.89.5, 169.254.89.6, 169.254.89.7, 169.254.89.8.
20200419 00:31:30.314: | PASS (2 sec)
Action Succeeded on Previous Run
Data Migration Pre Check
Action Succeeded on Previous Run
Data Migration Plan
Action Succeeded on Previous Run
Data Migration Trigger
Action Succeeded on Previous Run
Data Migration Main Phase
20200419 00:31:30.849: Show migration progress
Host : layton-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.90%
PARTIALLY_PAUSED 22.86TiB/43.04TiB

Host : logan-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.87%
PARTIALLY_PAUSED 22.87TiB/43.04TiB

Host : murray-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.91%
PARTIALLY_PAUSED 22.86TiB/43.06TiB

Host : sandy-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.98%
PARTIALLY_PAUSED 22.82TiB/43.04TiB

Host : ogden-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.19%
PARTIALLY_PAUSED 23.16TiB/43.03TiB

Host : lehi-brass.ecs.lab.emc.com
[***********************************************--------------------------------------
---------------] 47.05%
PARTIALLY_PAUSED 22.80TiB/43.07TiB

Host : orem-brass.ecs.lab.emc.com
[*********************************************----------------------------------------
---------------] 45.58%
PARTIALLY_PAUSED 23.41TiB/43.02TiB

Host : provo-brass.ecs.lab.emc.com
[**********************************************---------------------------------------
---------------] 46.75%
PARTIALLY_PAUSED 22.91TiB/43.03TiB

20200419 00:31:31.487: | PASS


20200419 00:31:31.489: Pause or resume migration
The migration is resumed
20200419 00:31:34.190: | PASS (2 sec)
20200419 00:31:34.192: Print data migration instruction and exit
The data migration is running.
20200419 00:31:34.194: | PASS
Data Migration Post Check
Data Migration Teardown
================================================================================
Status: PASS
Time Elapsed: 34 sec
Debug log: /opt/emc/caspian/service-console/log/20200419_003103_run_Data_Migration/
dbg_robot.log

HTML log: /opt/emc/caspian/service-console/log/20200419_003103_run_Data_Migration/
log.html
================================================================================
Messages:
The data migration is running.
Migration is still running
================================================================================

Data migration throttling


ECS provides the ability to throttle data migration throughput, or data movement, from source nodes to target nodes to better
meet customer environment needs. You can prioritize data migration by using the available settings.

About this task


ECS provides three settings:
● Low - This is the default setting, where there is no throttling. Data migration runs at the fastest possible throughput.
● Mid - Data migration is throttled. Data migration throughput is lower than Low, and more resources are available for
the front-end workload.
● High - Data migration is throttled the most. Data migration throughput is the lowest.
A high setting slows down the migration the most of all the throttling settings and leaves the most resources for customer
workloads. A low setting allows for the fastest migration of the available settings, but leaves fewer resources available for
customer workloads.
In most cases, start with throttling set to high and then reduce the throttle over time. You can also use these settings to
throttle more during peak hours and reduce to no throttling during off-peak hours (see the scheduling sketch below).
Dell EMC Professional Services can review the customer requirements and determine the appropriate level of throttling to
configure. For example, an environment or configuration that is sensitive to latency or additional workloads on the system
may call for high throttling. A customer environment that is not sensitive to latency or additional workloads, and that wants
the migration to complete as fast as possible, may opt to leave the default (low) setting.
Select the appropriate setting based on customer expectations and needs.
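
The peak and off-peak pattern described above can be scheduled. The following is a minimal sketch, assuming cron is available on the node that runs the Service Console, that the admin account may invoke service-console, and an illustrative 08:00-20:00 peak window; none of these specifics come from this document, so adapt them to the customer environment.

    # /etc/cron.d/ecs-migration-throttle -- illustrative schedule only
    # Throttle the migration hard during the customer's peak hours...
    0 8 * * *   admin   /opt/emc/bin/service-console run Data_Migration_Throttling --set high
    # ...and remove throttling overnight for maximum migration throughput.
    0 20 * * *  admin   /opt/emc/bin/service-console run Data_Migration_Throttling --set low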

Tune migration throttling


Steps
Run the following command to change the throttle traffic rate:

service-console run Data_Migration_Throttling --set low|mid|high

Where the values are:
● low, or maximum migration throughput. This is the default setting.
● mid, or medium migration throughput.
● high, or lowest migration throughput.
For example,

service-console run Data_Migration_Throttling --set low


Service Console is running on node 169.254.89.1 (suite
20200414_154415_Data_Migration_Throttling)
Service console version: 5.0.0.0-20635.ac5b92b33
Debug log: /opt/emc/caspian/service-console/log/
20200414_154412_run_Data_Migration_Throttling/dbg_robot.log
================================================================================
Data Migration Throttling
20200414 15:44:26.856: Set or get max task number in data movement worker
The max task number in data movement worker is set to 50
20200414 15:44:30.875: | PASS (4 sec)
================================================================================
Status: PASS
Time Elapsed: 22 sec

Debug log: /opt/emc/caspian/service-console/log/
20200414_154412_run_Data_Migration_Throttling/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/
20200414_154412_run_Data_Migration_Throttling/log.html
================================================================================

NOTE: If you want to check the current migration speed before changing the throttle, run this command:
service-console run Data_Migration

For example,

Service console version: 6.7.0.0-21476.9955dc902e


Debug log: /opt/emc/caspian/service-console/log/
20210604_205228_run_Data_Migration_Throttling/dbg_robot.log
================================================================================
Data Migration Throttling
20210604 20:52:38.939: Set or get max task number in data movement worker
The max task number in data movement worker is 50
20210604 20:52:42.053: | PASS (3 sec)
================================================================================
Status: PASS
Time Elapsed: 15 sec
Debug log: /opt/emc/caspian/service-console/log/
20210604_205228_run_Data_Migration_Throttling/dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/
20210604_205228_run_Data_Migration_Throttling/log.html
================================================================================

Migration alerts
Learn about migration alerting.
The ECS UI provides alerts for the following circumstances:
● Data migration for each node has two alert levels: Level-1 and Level-2.
● Alerts are reported when the Level-1 or Level-2 data for each node is migrated.
● Data migration for a node is complete when both its Level-1 data and Level-2 data are migrated.
● Data migration has made no progress for six hours.
● Capacity on target nodes has reached a certain threshold.
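These alerts can also be polled from the command line. A hedged sketch against the ECS Management REST API; the /login
token exchange and /vdc/alerts endpoint follow the public ECS REST API documentation, while the management IP address and
credentials are placeholders:

# Hypothetical example; substitute a real management IP and credentials.
# Log in; ECS returns the session token in the X-SDS-AUTH-TOKEN response header.
TOKEN=$(curl -sk -u admin:ChangeMe -D - https://10.1.1.10:4443/login -o /dev/null \
        | awk '/X-SDS-AUTH-TOKEN/ {print $2}' | tr -d '\r')
# List current alerts and filter for migration-related entries.
curl -sk -H "X-SDS-AUTH-TOKEN: $TOKEN" \
     "https://10.1.1.10:4443/vdc/alerts.json" | grep -i migration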

Enable migration capacity alerts


You can enable migration capacity alerts in the ECS UI Storage Pool Management section.

Steps
1. In the ECS UI, go to Manage > Storage Pools.
The Storage Pool Management section opens.
2. Select the storage pool for which you would like to enable migration capacity alerts, and click Edit.
The Edit Storage Pool window for the selected storage pool opens.
3. Set the alert thresholds to the wanted values, and click Save.

3 Remove a node from a cluster using ECS Service Console
This chapter provides information on removing a node from a cluster by using the ECS Service Console node evacuation
commands.
The example outputs in the procedures may not represent the exact output. Use them for reference only.
NOTE: Remove the source nodes from the load balancers after completing the data migration and before initiating the node
evacuation.

Topics:
• Run optional pre-node evacuation health checks
• Remove a node from a cluster using ECS Service Console
• Move licenses to new ECS system
• Carry out post-Tech Refresh checks

Run optional pre-node evacuation health checks


Optionally, use the ECS Service Console to run health checks before you remove a node from a cluster. There are two
commands, each of which runs a different set of checks.

Steps
1. From the ECS Service Console, run the run Health_Check command with the --tags pre_node_evacuation option. For
example:
service-console run Health_Check --tags pre_node_evacuation
Output such as the following appears:

Service Console is running on node 169.254.89.1 (suite 20200408_201513_Health_Check)


Service console version: 5.0.0.0-20597.e8eda88ed
Debug log: /opt/emc/caspian/service-console/log/20200408_201506_run_Health_Check/
dbg_robot.log
================================================================================
Health Check
20200408 20:15:35.961: Execute Health Checks
20200408 20:15:35.975: | Validate source nodes availability
20200408 20:15:39.431: | | PASS (3 sec)
20200408 20:15:39.432: | PASS (3 sec)
================================================================================
Status: PASS
Time Elapsed: 38 sec
Debug log: /opt/emc/caspian/service-console/log/20200408_201506_run_Health_Check/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200408_201506_run_Health_Check/
log.html
================================================================================

2. Run the run Health_Check command with the --tags pre_evacuation option. For example:
service-console run Health_Check --tags pre_evacuation
Output such as the following appears:

service-console run Health_Check --tags pre_evacuation

Service Console is running on node 169.254.89.1 (suite 20200408_201709_Health_Check)
Service console version: 5.0.0.0-20597.e8eda88ed
Debug log: /opt/emc/caspian/service-console/log/20200408_201703_run_Health_Check/
dbg_robot.log
================================================================================
Health Check
20200408 20:17:31.785: Execute Health Checks
20200408 20:17:31.804: | Validate time drift
20200408 20:17:37.685: | | PASS (5 sec)
20200408 20:17:37.687: | Check DT status
Checking DT status (with timeout 10 min).
20200408 20:18:10.573: | | PASS (32 sec)
20200408 20:18:10.576: | Check on-going PSO or VDC removal from RG
20200408 20:18:33.316: | | PASS (22 sec)
20200408 20:18:33.317: | Validate that there are no transformation instances
20200408 20:18:37.117: | | PASS (3 sec)
20200408 20:18:37.118: | PASS (1 min 5 sec)
================================================================================
Status: PASS
Time Elapsed: 1 min 37 sec
Debug log: /opt/emc/caspian/service-console/log/20200408_201703_run_Health_Check/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200408_201703_run_Health_Check/
log.html
================================================================================

Remove a node from a cluster using ECS Service Console


Use the ECS Service Console run Node_Evacuation command on the current installer node to configure the new installer
node. This command relocates the ECS installer and Service Console to the first target node.

Prerequisites
● Ensure that the cluster is in a healthy state before you run the run Node_Evacuation command. If there are any failed
source nodes, replace them before starting this procedure.
● Ensure that the version of the production package used in the node evacuation operation is the same as the one running on
the source nodes.
● Ensure that you copy the production package and Service Console bundle to the proper locations on the installer node (a
copy sketch follows this list).
● Ensure that you select the first target node as the new installer node.
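A minimal copy sketch, run from a workstation that holds the bundles. The destination paths match the command template in
step 1 below; the installer-node IP address and admin account are placeholders, not values from this procedure:

# Hypothetical example; replace 10.1.1.11 with the current installer node IP.
# Copy the production package to the path used by the evacuation command.
scp production.tgz admin@10.1.1.11:/home/admin/install/production.tgz
# Stage the Service Console bundle.
ssh admin@10.1.1.11 'mkdir -p /tmp/sc'
scp service-console.tgz admin@10.1.1.11:/tmp/sc/service-console.tgz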

Steps
1. From the ECS Service Console, run the run Node_Evacuation command:
service-console run Node_Evacuation --target-node <private.4 IP addresses of nodes to
be evacuated (source nodes)> --production-package /home/admin/install/production.tgz --
service-console-bundle /tmp/sc/service-console.tgz
For example:

service-console run Node_Evacuation --target-node 169.254.19.9,169.254.19.10,169.254.19.11,169.254.19.12 --
production-package /tmp/install/production.tgz --service-console-bundle /tmp/
service_console/service-console.tgz

2. Verify that the output shows that the node is being evacuated and that the Service Console is being relocated.

admin@boston-auburn:~> service-console run Node_Evacuation --target-


node 169.254.19.9,169.254.19.10,169.254.19.11,169.254.19.12 --production-package
/tmp/install/production.tgz --service-console-bundle /tmp/service_console/service-
console.tgz
Service console version: 6.0.0.0-20939.4fee7380c
Debug log: /opt/emc/caspian/service-console/log/20200908_170939_run_Node_Evacuation/
dbg_robot.log
================================================================================

Node Evacuation Setup
20200908 17:09:56.495: Run health check
20200908 17:09:56.573: | Validate source nodes availability
20200908 17:09:58.213: | | PASS (1 sec)
20200908 17:09:58.214: | PASS (1 sec)
20200908 17:09:58.218: Get new installer node for node evacuation
The installer node 169.254.19.9 is going to be evacuated
New installer node: 169.254.104.1
20200908 17:09:58.219: | PASS
Node Evacuation Installer Check
20200908 17:10:03.320: Relocate installer
Extracting /tmp/install/production.tgz to 169.254.104.1:/tmp/
service_console_production_package
20200908 17:12:01.528: | PASS (1 min 58 sec)
Node Evacuation SC Check
20200908 17:12:05.655: Relocate SC
Service Console was installed on node 169.254.104.1
20200908 17:12:31.237: | PASS (25 sec)
Node Evacuation SC Configuration Check
20200908 17:12:35.360: Check cluster.ini after SC relocation
20200908 17:15:03.100: | PASS (2 min 27 sec)
Node Evacuation Next Steps
Installer and SC relocation is done.
Node Evacuation Pre Check
Node Evacuation VNEST Data Migration
Pending
Node Evacuation Disks Removal
Pending
Node Evacuation Store Nodes Mapping
Pending
Node Evacuation Stat Migrate Totals
Pending
Node Evacuation Initiate Fabric Migration
Pending
Node Evacuation Proceed Fabric Migration
Pending
Node Evacuation DC Node Removal
Pending
Node Evacuation Endpoints Removal
Pending
Node Evacuation Node Removal
Pending
Node Evacuation VNEST Node Removal
Pending
Node Evacuation Monitoring Node Removal
Pending
Node Evacuation Post Check
Pending
Node Evacuation Teardown
Pending
================================================================================
Status: PASS
Time Elapsed: 5 min 31 sec
Debug log: /opt/emc/caspian/service-console/log/20200908_170939_run_Node_Evacuation/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200908_170939_run_Node_Evacuation/
log.html
================================================================================
Messages:
To proceed, run the following command from node 169.254.104.1:
service-console run Node_Evacuation
================================================================================

3. Rerun the run Node_Evacuation command from the new installer node:


service-console run Node_Evacuation
For example:

service-console run Node_Evacuation

Output such as the following appears:

service-console run Node_Evacuation


Service Console is running on node 169.254.97.1 (suite
20200422_201550_Node_Evacuation)
Cannot write to ZK server, the operation Node_Evacuation is running locally (suite
20200422_201620_Node_Evacuation)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200422_201547_run_Node_Evacuation/
dbg_robot.log
================================================================================
Node Evacuation Setup
20200422 20:16:38.361: Run health check
20200422 20:16:38.439: | Validate source nodes availability
20200422 20:16:41.653: | | PASS (3 sec)
20200422 20:16:41.654: | PASS (3 sec)
20200422 20:16:41.656: Get new installer node for node evacuation
20200422 20:16:41.658: | PASS
Node Evacuation Installer Check
Completed on previous run
Node Evacuation SC Check
Completed on previous run
Node Evacuation SC Configuration Check
Completed on previous run
Node Evacuation Next Steps
Node Evacuation Pre Check
20200422 20:16:57.203: Check target nodes data migration status
Data migration status for node 169.254.89.1 is COMPLETE
Data migration status for node 169.254.89.2 is COMPLETE
Data migration status for node 169.254.89.3 is COMPLETE
Data migration status for node 169.254.89.4 is COMPLETE
Data migration status for node 169.254.89.5 is COMPLETE
Data migration status for node 169.254.89.6 is COMPLETE
Data migration status for node 169.254.89.7 is COMPLETE
Data migration status for node 169.254.89.8 is COMPLETE
20200422 20:17:01.838: | PASS (4 sec)
20200422 20:17:01.839: Run health check
20200422 20:17:01.846: | Validate time drift
20200422 20:17:07.661: | | PASS (5 sec)
20200422 20:17:07.663: | Check DT status
Checking DT status (with timeout 10 min).
20200422 20:17:36.470: | | PASS (28 sec)
20200422 20:17:36.471: | Check on-going PSO or VDC removal from RG
20200422 20:17:50.986: | | PASS (14 sec)
20200422 20:17:50.987: | Validate that there are no transformation instances
20200422 20:17:54.639: | | PASS (3 sec)
20200422 20:17:54.640: | PASS (52 sec)
20200422 20:17:54.641: Validate DT Table Rebalancing
20200422 20:18:10.956: | PASS (16 sec)
Node Evacuation VNEST Data Migration
20200422 20:18:14.338: VNEST data migration status
20200422 20:18:15.764: | PASS (1 sec)
Number of remaining VNEST member nodes to migrate: 5
Starting migration of VNEST data from node 169.254.89.2 to node 169.254.97.1
20200422 20:18:16.409: Migrate VNEST data
20200422 20:18:17.805: | PASS (1 sec)
In Progress
Node Evacuation Disks Removal
Pending
Node Evacuation Store Nodes Mapping
Pending
Node Evacuation Stat Migrate Totals
Pending
Node Evacuation Initiate Fabric Migration
Pending
Node Evacuation Proceed Fabric Migration
Pending
Node Evacuation DC Node Removal
Pending
Node Evacuation Endpoints Removal
Pending
Node Evacuation Node Removal
Pending

Node Evacuation VNEST Node Removal
Pending
Node Evacuation Monitoring Node Removal
Pending
Node Evacuation Post Check
Pending
Node Evacuation Teardown
Pending
================================================================================
Status: PASS
Time Elapsed: 2 min 1 sec
Debug log: /opt/emc/caspian/service-console/log/20200422_201547_run_Node_Evacuation/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200422_201547_run_Node_Evacuation/
log.html
================================================================================
Messages:
Cluster.ini check is not needed
VNEST data migration is in progress, please wait some time and re-run the operation.
======================================================================

4. Rerun the command on the same node to check the status.


The command must be rerun until the evacuation completes, and larger clusters require more reruns. Monitor the progress and
rerun the command as often as your cycles and bandwidth permit (a polling sketch follows the example output).
Output such as the following appears:

service-console run Node_Evacuation


Service Console is running on node 169.254.97.1 (suite
20200422_203316_Node_Evacuation)
Cannot write to ZK server, the operation Node_Evacuation is running locally (suite
20200422_203346_Node_Evacuation)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200422_203313_run_Node_Evacuation/
dbg_robot.log
================================================================================
Node Evacuation Setup
20200422 20:34:04.269: Run health check
20200422 20:34:04.360: | Validate source nodes availability
20200422 20:34:07.692: | | PASS (3 sec)
20200422 20:34:07.694: | PASS (3 sec)
20200422 20:34:07.696: Get new installer node for node evacuation
20200422 20:34:07.697: | PASS
Node Evacuation Installer Check
Completed on previous run
Node Evacuation SC Check
Completed on previous run
Node Evacuation SC Configuration Check
Completed on previous run
Node Evacuation Next Steps
Completed on previous run
Node Evacuation Pre Check
Completed on previous run
Node Evacuation VNEST Data Migration
20200422 20:34:26.656: VNEST data migration status
20200422 20:34:28.109: | PASS (1 sec)
20200422 20:34:28.127: Finalize VNEST data migration
20200422 20:34:29.824: | PASS (1 sec)
Number of remaining VNEST member nodes to migrate: 3
Starting migration of VNEST data from node 169.254.89.1 to node 169.254.97.5
20200422 20:36:30.683: Migrate VNEST data
20200422 20:36:32.131: | PASS (1 sec)
In Progress
Node Evacuation Disks Removal
Pending
Node Evacuation Store Nodes Mapping
Pending
Node Evacuation Stat Migrate Totals
Pending
Node Evacuation Initiate Fabric Migration
Pending
Node Evacuation Proceed Fabric Migration

Pending
Node Evacuation DC Node Removal
Pending
Node Evacuation Endpoints Removal
Pending
Node Evacuation Node Removal
Pending
Node Evacuation VNEST Node Removal
Pending
Node Evacuation Monitoring Node Removal
Pending
Node Evacuation Post Check
Pending
Node Evacuation Teardown
Pending
================================================================================
Status: PASS
Time Elapsed: 2 min 52 sec
Debug log: /opt/emc/caspian/service-console/log/20200422_203313_run_Node_Evacuation/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200422_203313_run_Node_Evacuation/
log.html
================================================================================
Messages:
VNEST data migration is in progress, please wait some time and re-run the operation.
================================================================================
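Where hands-on monitoring is impractical, the rerun can be scripted. A minimal polling sketch, assuming it runs on the new
installer node and that an in-progress run ends with the "re-run the operation" message shown in the outputs above; the
30-minute interval is an assumption:

# Hypothetical example; rerun the operation until it stops asking for a re-run.
while true; do
    out=$(service-console run Node_Evacuation 2>&1)
    echo "$out" | tail -n 20
    echo "$out" | grep -q 're-run the operation' || break
    sleep 1800   # wait 30 minutes between reruns
done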

5. Verify that the output shows that the VNEST data migration, node evacuation disks removal, and fabric migration are
complete.
Output such as the following appears:

service-console run Node_Evacuation


Service Console is running on node 169.254.97.1 (suite
20200423_131352_Node_Evacuation)
Service console version: 5.0.0.0-20661.84846585e
Debug log: /opt/emc/caspian/service-console/log/20200423_131349_run_Node_Evacuation/
dbg_robot.log
================================================================================
Node Evacuation Setup
20200423 13:14:12.800: Run health check
20200423 13:14:12.890: | Validate source nodes availability
20200423 13:14:16.417: | | PASS (3 sec)
20200423 13:14:16.418: | PASS (3 sec)
20200423 13:14:16.421: Get new installer node for node evacuation
20200423 13:14:16.422: | PASS
Node Evacuation Installer Check
Completed on previous run
Node Evacuation SC Check
Completed on previous run
Node Evacuation SC Configuration Check
Completed on previous run
Node Evacuation Next Steps
Completed on previous run
Node Evacuation Pre Check
Completed on previous run
Node Evacuation VNEST Data Migration
Completed on previous run
Node Evacuation Disks Removal
Completed on previous run
Node Evacuation Store Nodes Mapping
Completed on previous run
Node Evacuation Stat Migrate Totals
Completed on previous run
Node Evacuation Initiate Fabric Migration
Completed on previous run
Node Evacuation Proceed Fabric Migration
Completed on previous run
Node Evacuation DC Node Removal
Completed on previous run
Node Evacuation Endpoints Removal
Completed on previous run
Node Evacuation Node Removal

20200423 13:15:01.075: Delete nodes for evacuation
20200423 18:17:00.881: | PASS (5 hour(s) 1 min 59 sec)
Node Evacuation VNEST Node Removal
20200423 18:17:05.170: Delete nodes from VNEST
20200423 18:17:08.180: | PASS (3 sec)
Node Evacuation Monitoring Node Removal
20200423 18:17:11.438: Delete nodes from Monitoring
20200423 18:17:13.163: | PASS (1 sec)
Node Evacuation Post Check
20200423 18:17:16.414: Validate Fabric nodes evacuation
20200423 18:17:22.613: | PASS (6 sec)
20200423 18:17:22.614: Get nodes from API
20200423 18:17:35.510: | PASS (12 sec)
20200423 18:17:35.511: Validate SSM nodes evacuation
20200423 18:17:35.516: | PASS
Node Evacuation Teardown
20200423 18:17:38.828: Generate var files templates
20200423 18:17:39.825: | PASS
20200423 18:17:39.827: Generate cluster.ini
Saved existing cluster.ini to /opt/emc/config/local/cluster.ini.back
Preserved files:
/opt/emc/config/local/host_vars
/opt/emc/config/local/host_vars/169.254.89.1
/opt/emc/config/local/host_vars/169.254.97.1
/opt/emc/config/local/group_vars
/opt/emc/config/local/group_vars/datanodes
[INFO] generated cluster.ini file with the content:

######
# This file was automatically generated by the Service Console.
# Please verify that it reflects the actual cluster topology.
# Credentials (BMC, Mgmt API, etc) should be set in separate files.
# Use file group_vars/datanodes to set cluster-wide variables.
# Use file host_vars/HOST_IP to set node-specific variables.
######

[datanodes:children]
vdc2

[vdc2:children]
orchid

[orchid:vars]
rack_id=97
rack_name=orchid
rack_psnt=psnt2
rack_dns_server=10.249.255.254
rack_dns_search=ecs.lab.emc.com,lss.emc.com,isus.emc.com,centera.emc.com,corp.emc.com,
emc.com
rack_ntp_server=10.249.255.254,10.243.84.254
rack_ns_switch=files,mdns4_minimal,[NOTFOUND=return],dns,mdns4
sc_collected=True

[orchid:children]
node_169_254_97_1 # Installer / SC node
node_169_254_97_2
node_169_254_97_3
node_169_254_97_4
node_169_254_97_5
node_169_254_97_6
node_169_254_97_7
node_169_254_97_8

[node_169_254_97_1]
169.254.97.1

[node_169_254_97_1:vars]
bmc_ip=10.249.252.201

[node_169_254_97_2]
169.254.97.2

[node_169_254_97_2:vars]
bmc_ip=10.249.252.202

[node_169_254_97_3]
169.254.97.3

[node_169_254_97_3:vars]
bmc_ip=10.249.252.203

[node_169_254_97_4]
169.254.97.4

[node_169_254_97_4:vars]
bmc_ip=10.249.252.204

[node_169_254_97_5]
169.254.97.5

[node_169_254_97_5:vars]
bmc_ip=10.249.252.211

[node_169_254_97_6]
169.254.97.6

[node_169_254_97_6:vars]
bmc_ip=10.249.252.212

[node_169_254_97_7]
169.254.97.7

[node_169_254_97_7:vars]
bmc_ip=10.249.252.213

[node_169_254_97_8]
169.254.97.8

[node_169_254_97_8:vars]
bmc_ip=10.249.252.214

20200423 18:20:44.066: | PASS (3 min 4 sec)


20200423 18:20:44.073: Show SRS notification
20200423 18:20:47.086: | PASS (3 sec)
================================================================================
Status: PASS
Time Elapsed: 5 hour(s) 7 min 4 sec
Debug log: /opt/emc/caspian/service-console/log/20200423_131349_run_Node_Evacuation/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200423_131349_run_Node_Evacuation/
log.html
================================================================================
Found private.4 IPs from zookeeper: 169.254.97.8, 169.254.97.1, 169.254.97.5,
169.254.97.4, 169.254.97.3, 169.254.97.7, 169.254.97.6, 169.254.97.2
Found private.4 IPs from zookeeper: 169.254.97.1, 169.254.97.2, 169.254.97.3,
169.254.97.4, 169.254.97.5, 169.254.97.6, 169.254.97.7, 169.254.97.8
Created cluster configuration file /opt/emc/config/local/cluster.ini at 169.254.97.1
Successfully generated cluster.ini.
However it is suggested to verify the generated cluster.ini before proceeding with
other service procedures.
If the cluster.ini contains WARNING comment or has <vdc>_unknown section(s),
it is mandatory to correct the cluster.ini manually before proceeding with other
service procedures.
================================================================================

If a node fails during node evacuation before the fabric migration completes, replace the node and restart the node
evacuation procedure. If a node fails during node evacuation after the fabric migration, the VNEST data is already migrated
and the failure does not affect it.

Move licenses to new ECS system

Steps
1. Go to eLicensing Central https://powerlinklicensing.emc.com/, regenerate the license, and add capacity as required
to the existing license.
a. Regenerate the license to remove decommissioned product serial number tags (PSNTs) and add new PSNTs.
The license maintains the same software ID (SWID).
b. Check and add unstructured storage capacity to the license as required.
NOTE: Do not use a license file with a different SWID unless directed to do so.

2. In the ECS UI, follow these steps to reapply the license to the tech-refreshed ECS VDC.
a. Go to Settings > Licensing and note the VDC Serial Number before applying the regenerated license.
b. Select Settings > Licensing > New License.
c. Browse to the xxxxx.lic file, and click Open.
d. Click UPLOAD to upload the license file.
e. Verify that the VDC Serial Number has not changed.
f. Click the arrow on any row and verify the PSNT column, the number of racks, and the PSNT serial numbers.
3. In the ECS UI, follow these steps to delete the SRS entries and re-create them as required.
a. In Settings > ESRS, note the IP address and port number of each SRS entry.
b. Under Settings > ESRS > Actions, delete the entry.
c. Click the New Server option to re-create the entry.
d. After the entries are created and the status shows as connected, click Action > Test Dial Home for each entry.
e. Verify that the Test Dial Home status is Passed for each entry.
4. File an IBG report requesting that the decommissioned PSNTs be set to uninstalled.

Carry out post-Tech Refresh checks


After completing the Tech Refresh, verify the results by running the Service Console Health Check and reviewing the ECS UI
Storage Pool page.

About this task


Run the Health Check from the Service Console. Check the ECS UI Storage Pool page for evacuated nodes.

Steps
1. From the ECS Service Console, run the run Health_Check command:
service-console run Health_Check
Output such as the following appears:

service-console run Health_Check


Service console version: 5.0.0.0-20670.6cb91f5d6
Debug log: /opt/emc/caspian/service-console/log/20200504_202021_run_Health_Check/
dbg_robot.log
================================================================================
Health Check
20200504 20:20:37.490: Execute Health Checks
20200504 20:20:37.498: | Check DNS settings
20200504 20:20:45.763: | | PASS (8 sec)
20200504 20:20:45.765: | Check swap memory
20200504 20:20:47.228: | | PASS (1 sec)
20200504 20:20:47.230: | Check MAC 3A patch
20200504 20:20:48.865: | | PASS (1 sec)
20200504 20:20:48.867: | Check network interfaces
20200504 20:20:50.318: | | PASS (1 sec)
20200504 20:20:50.320: | Check PBR consistency
20200504 20:20:53.122: | | PASS (2 sec)
20200504 20:20:53.124: | Check preset.cfg file

20200504 20:21:00.990: | | PASS (7 sec)
20200504 20:21:00.992: | Check static routes config files consistency for racks
20200504 20:21:00.999: | | PASS
20200504 20:21:01.000: | Check that no nodes need a cold or warm power cycle
20200504 20:21:02.701: | | PASS (1 sec)
20200504 20:21:02.703: | DOM disks
Skip on current hardware.
20200504 20:21:04.005: | | PASS (1 sec)
20200504 20:21:04.007: | Check LVRoot size
20200504 20:21:05.503: | | PASS (1 sec)
20200504 20:21:05.505: | NAN Vlan
20200504 20:21:07.889: | | PASS (2 sec)
20200504 20:21:07.891: | Check SATA DOM OS installation
20200504 20:21:36.623: | | PASS (28 sec)
20200504 20:21:36.625: | Static routes validation
20200504 20:21:36.638: | | PASS
20200504 20:21:36.639: | Validate BE switch OS version
20200504 20:22:07.810: | | PASS (31 sec)
20200504 20:22:07.811: | Validate BMC availability
20200504 20:22:17.619: | | PASS (9 sec)
20200504 20:22:17.622: | Validate BMC settings
20200504 20:22:22.733: | | PASS (5 sec)
20200504 20:22:22.735: | Validate that disk SMART self-test is enabled in /etc/
cron.daily
20200504 20:22:23.324: | | PASS
20200504 20:22:23.326: | Validate EX300 NIC FW version
Skip: Validate only EX300 NIC FW version
20200504 20:22:23.332: | | PASS
20200504 20:22:23.333: | Validate FE switch OS version
20200504 20:22:48.205: | | PASS (24 sec)
20200504 20:22:48.207: | Validate FE switches uplink status
20200504 20:24:00.419: | | PASS (1 min 12 sec)
20200504 20:24:00.421: | Validate the kernel version
20200504 20:24:04.323: | | PASS (3 sec)
20200504 20:24:04.325: | Validate NAN config
20200504 20:24:07.061: | | PASS (2 sec)
20200504 20:24:07.063: | Check NIC FW versions for consistency
20200504 20:24:14.563: | | PASS (7 sec)
20200504 20:24:14.565: | Check NIC
[WARN] Check corresponding ECS EX-Series Firmware Matrix along with Firmware Update
Guide for minimum required or latest recommended versions of NIC and other server
firmware. This documentation can be found on Support site or ECS SolVe Desktop or
SolVe online
20200504 20:24:14.567: | | PASS
20200504 20:24:14.568: | Validate root FS free space
20200504 20:24:16.146: | | PASS (1 sec)
20200504 20:24:16.148: | Validate source nodes availability
20200504 20:24:16.150: | | PASS
20200504 20:24:16.151: | Validate that SSH banner is not added on the nodes
Skip - this check is for ECS version below 3.2.1
20200504 20:24:16.523: | | PASS
20200504 20:24:16.525: | Validate that STIG rules were applied
20200504 20:24:16.531: | | PASS
20200504 20:24:16.532: | Check for stuck disk subsystem processes
20200504 20:24:18.866: | | PASS (2 sec)
20200504 20:24:18.868: | Validate that all nodes are available - OS
20200504 20:24:18.870: | | PASS
20200504 20:24:18.871: | Validate that OS version is equal between nodes
20200504 20:24:24.960: | | PASS (6 sec)
20200504 20:24:24.962: | Validate time drift
20200504 20:24:26.464: | | PASS (1 sec)
20200504 20:24:26.467: | Validate swap space consumers
20200504 20:24:39.964: | | PASS (13 sec)
20200504 20:24:39.966: | Confirm that docker health is GOOD and docker exec works
on all nodes
20200504 20:24:53.369: | | PASS (13 sec)
20200504 20:24:53.371: | Validate agent version
20200504 20:24:54.949: | | PASS (1 sec)
20200504 20:24:54.952: | Validate application role operational mode
Skipped: unnecessary for ECS version 3.5.0.0.120936.e86d8252415
20200504 20:24:59.466: | | PASS (4 sec)
20200504 20:24:59.468: | Validate applications and services health
20200504 20:25:10.662: | | PASS (11 sec)

20200504 20:25:10.663: | Validate cluster compliance status
20200504 20:25:17.191: | | Pass (6 sec)
20200504 20:25:17.194: | Validate that there is cluster master
20200504 20:25:21.204: | | PASS (4 sec)
20200504 20:25:21.206: | Validate docker containers are running where they should be
20200504 20:25:27.335: | | PASS (6 sec)
20200504 20:25:27.337: | Validate event streams by sending agent health
20200504 20:25:45.137: | | PASS (17 sec)
20200504 20:25:45.139: | Validate API availability of fabric services
20200504 20:25:53.483: | | PASS (8 sec)
20200504 20:25:53.484: | Validate that ports for fabric services are open
20200504 20:26:00.794: | | PASS (7 sec)
20200504 20:26:00.795: | Validate services owner and that goalstates are equal on
LM and agents
20200504 20:26:28.153: | | PASS (27 sec)
20200504 20:26:28.155: | Validate that expected number of drives is formatted for
Object
20200504 20:26:55.343: | | PASS (27 sec)
20200504 20:26:55.345: | Validate that all lifecycles are active
20200504 20:26:57.130: | | PASS (1 sec)
20200504 20:26:57.132: | Validate that the correct number of disks are mounted
inside the object container
20200504 20:27:17.471: | | PASS (20 sec)
20200504 20:27:17.473: | Validate number of nodes with zookeeper
20200504 20:27:21.489: | | PASS (4 sec)
20200504 20:27:21.490: | Validate object configuration files between nodes
20200504 20:27:24.617: | | PASS (3 sec)
20200504 20:27:24.619: | Validate that all partitions are under control
20200504 20:28:34.955: | | PASS (1 min 10 sec)
20200504 20:28:34.957: | Validate that provisioned drives are GOOD
20200504 20:28:34.962: | | PASS
20200504 20:28:34.963: | Validate services owner and that realized goalstates are
equal on LM and agents
20200504 20:29:03.358: | | PASS (28 sec)
20200504 20:29:03.360: | Validate SSD disks consistency
20200504 20:29:09.933: | | PASS (6 sec)
20200504 20:29:09.935: | Validate that all nodes are available - Fabric
20200504 20:29:09.936: | | PASS
20200504 20:29:09.937: | Validate that diskset is the same for all disks and cache
files
20200504 20:29:14.500: | | PASS (4 sec)
20200504 20:29:14.502: | Validate zookeeper
20200504 20:29:14.503: | | PASS
20200504 20:29:14.504: | Verify BIOS version
20200504 20:29:16.242: | | PASS (1 sec)
20200504 20:29:16.244: | Check that BTree GC is enabled
Checking BTree GC parameters...
com.emc.ecs.chunk.gc.repo.enabled = true
com.emc.ecs.chunk.gc.repo.verification.enabled = true
com.emc.ecs.chunk.gc.btree.scanner.verification.enabled = true
com.emc.ecs.chunk.gc.btree.scanner.copy.enabled = true
com.emc.ecs.chunk.gc.btree.enabled = true
BTree GC is enabled
20200504 20:29:29.283: | | PASS (13 sec)
20200504 20:29:29.285: | Check Upgrade Completion flags
version across the cluster: 3.5.0.0.120936.e86d8252415
Checking flags on VDC vdc_mantis_a-acid...
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_3_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_2_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_5_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_4_0_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.2_2_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_2_2_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.2_2_1_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_2_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.timeFormat.rfc822_date_time_format' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_0_upgrade_complete' is 'true'
[VDC vdc_mantis_a-acid] 'com.emc.ecs.upgrade.3_4_upgrade_complete' is 'true'
20200504 20:29:44.777: | | PASS (15 sec)
20200504 20:29:44.779: | Check counts of replication groups
20200504 20:29:47.938: | | PASS (3 sec)
20200504 20:29:47.940: | Check DT Load Balancing

20200504 20:29:47.942: | | Check Lb Enabled
Checking LB parameters...
com.emc.ecs.ownership.LoadBalanceEnabled = true
LB is enabled
20200504 20:29:50.758: | | | PASS (2 sec)
20200504 20:29:50.759: | | PASS (2 sec)
20200504 20:29:50.761: | Check DT status
Checking DT status (with timeout 10 min).
20200504 20:30:01.718: | | PASS (10 sec)
20200504 20:30:01.719: | Check that rejoin task keys not present in LS table
20200504 20:30:01.723: | | PASS
20200504 20:30:01.724: | Check that the system is not in TSO state
20200504 20:30:01.726: | | PASS
20200504 20:30:01.726: | Check Journal GC
20200504 20:30:05.051: | | PASS (3 sec)
20200504 20:30:05.053: | Check Object version across the cluster
20200504 20:30:13.566: | | PASS (8 sec)
20200504 20:30:13.569: | Check whether each node has reserve SSD
20200504 20:30:19.663: | | PASS (6 sec)
20200504 20:30:19.665: | Validate all OB and LS tables FPP
20200504 20:30:19.667: | | PASS
20200504 20:30:19.668: | Validate BE ECS UI availability
Private IP/Port 192.168.219.254:443 is disabled on installer rack 1
20200504 20:30:20.234: | | PASS
20200504 20:30:20.236: | Validate that all nodes are available - Object
20200504 20:30:20.238: | | PASS
20200504 20:30:20.239: | Validate that data recovery is enabled for all nodes
20200504 20:30:23.461: | | PASS (3 sec)
20200504 20:30:23.463: | Validate that nginx is listening on all nodes
20200504 20:30:27.826: | | PASS (4 sec)
20200504 20:30:27.834: | PASS (9 min 50 sec)
================================================================================
Status: PASS
Time Elapsed: 10 min 10 sec
Debug log: /opt/emc/caspian/service-console/log/20200504_202021_run_Health_Check/
dbg_robot.log
HTML log: /opt/emc/caspian/service-console/log/20200504_202021_run_Health_Check/
log.html
================================================================================

2. Verify that the output shows that all checks pass.
3. In the ECS UI, go to Manage > Storage Pools.
4. Verify that the old cluster nodes are removed, and that the list displays the new storage pool ready for use.
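If a command-line confirmation is preferred, the storage pool list can also be read over the ECS Management REST API. A
hedged sketch; the /vdc/data-services/varrays endpoint is taken from the public ECS REST API documentation, and the
management IP address and credentials are placeholders:

# Hypothetical example; substitute a real management IP and credentials.
TOKEN=$(curl -sk -u admin:ChangeMe -D - https://10.1.1.10:4443/login -o /dev/null \
        | awk '/X-SDS-AUTH-TOKEN/ {print $2}' | tr -d '\r')
# List storage pools (virtual arrays) and confirm only the new pool is present.
curl -sk -H "X-SDS-AUTH-TOKEN: $TOKEN" \
     "https://10.1.1.10:4443/vdc/data-services/varrays.json"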

Figure 8. ECS UI Storage Pools Management

Next steps
● Reconfigure the Secure Remote Services.
○ Delete the Secure Remote Services server from the ECS UI and add it back. This reconfiguration ensures that the old
(evacuated) nodes are removed from Secure Remote Services and the new nodes are added.
○ If this step is not performed, Secure Remote Services disconnects after the evacuation.
● Apply an updated ECS license.
○ Any reference to product serial number tags (PSNTs) of old (evacuated) racks should be removed from the updated license.
○ The capacity of the old rack should be subtracted from the total capacity.
● Disable the rack interconnect (a verification sketch follows this list). This step ensures that:
○ The old (evacuated) rack does not display in getclusterinfo.
○ All references to the old (evacuated) rack are removed from NAN.
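One way to spot-check the first point is to query the cluster view from a surviving node. A hedged sketch; the
getclusterinfo -a form is the one used in ECS upgrade procedures, and the 169.254.89.x addresses of the old rack are taken
from the examples earlier in this chapter (both are assumptions for any given site):

# Hypothetical check; write the current node list to a file and search it.
sudo getclusterinfo -a /tmp/MACHINES
# Expect no matches if the old (evacuated) rack is fully removed.
grep '169\.254\.89\.' /tmp/MACHINES || echo "old rack not present"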

I Document feedback
To provide feedback or suggestions on the document, go to the Content Feedback Router portal. For more information, see
Content Feedback Router - Support.
