x400 Dimm Replacement Guide

Uploaded by

tachyon.20230417

906278724.doc (296.00 KB)
6/30/2025 1:29 PM
Last saved by EMC

DIMM Replacement Guide

Isilon
994-0010-01 Rev H

Replace a DIMM

Replacing a DIMM
You can replace a failed dual in-line memory module (DIMM) in the field.

CAUTION: Perform this procedure on only one node at a time. Performing maintenance on
multiple nodes in parallel may lower the protection level of the cluster, put data at risk, and lead to
the interruption of client workflows.

Working with clusters in SmartLock compliance mode


Clusters running in SmartLock compliance mode require a sudo prefix to run root commands.
If a cluster is running in SmartLock compliance mode, root access is disabled on the cluster. Because of
this, you can run some commands only through the sudo program. Prefixing a command with sudo
enables you to run commands that require root access. For example, if you do not have root access, the
following command fails:
isi drivefirmware status
However, if you are on the sudoers list, the following command succeeds:
sudo isi drivefirmware status
Compliance mode commands that require changes beyond the sudo prefix are noted in the procedure
steps.
For more information on the sudo program and compliance mode commands, see the OneFS CLI
Administration Guide.
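
The sudo rule above can be sketched as a small helper. This is a minimal illustration, not part of OneFS: the function name build_command and the compliance_mode flag are hypothetical, and the sketch only builds the argument list without running anything. It does not cover the commands that need changes beyond the sudo prefix.

```python
import shlex


def build_command(command, compliance_mode):
    """Return the argument list for a command, prefixed with sudo in compliance mode.

    Hypothetical helper: it only prepares the arguments and never executes them.
    """
    args = shlex.split(command)
    # Add the prefix once, and only when the cluster is in compliance mode.
    if compliance_mode and args and args[0] != "sudo":
        args = ["sudo"] + args
    return args


if __name__ == "__main__":
    print(build_command("isi drivefirmware status", compliance_mode=True))
```

Returning a list rather than a string keeps the arguments unambiguous if the result is later passed to a process launcher.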

Task 1: Download a Field Replacement Unit (FRU) package


Before you replace a component in a configure-to-order (CTO) node, obtain a Field Replacement Unit
(FRU) package from the EMC FTP site. The FRU package updates the CTO and as-built information on
the node, then forwards the updated information to Isilon Technical Support.
About this task
Procedure
1. [ ] Download the latest FRU package from ftp://ftp.emc.com/outgoing/Fru_Package/.
2. [ ] Note the name of the FRU package. You will use the name for other commands.
Package names follow this convention:
IsiFru_Package_<date-time-stamp>.tgz


For example: IsiFru_Package_201507072125.tgz


3. [ ] Place the FRU package on the cluster through a network drop, or by asking someone at
the cluster site to place the package for you. If neither of these options is available to you, contact
Isilon Technical Support for assistance.
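
Before using the package name in later commands, it can help to check that the name follows the convention. The sketch below is an illustration only. It assumes the date-time stamp is twelve digits in the order year, month, day, hour, minute, which is inferred from the single example name in this guide and is not confirmed elsewhere. The function name parse_fru_package_name is hypothetical.

```python
import re
from datetime import datetime

# Assumption: the stamp is 12 digits (YYYYMMDDHHMM), inferred from the
# example name IsiFru_Package_201507072125.tgz.
PACKAGE_RE = re.compile(r"^IsiFru_Package_(\d{12})\.tgz$")


def parse_fru_package_name(name):
    """Return the package timestamp as a datetime, or None if the name does not fit."""
    match = PACKAGE_RE.match(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d%H%M")
    except ValueError:
        # Twelve digits that do not form a valid date and time.
        return None


if __name__ == "__main__":
    print(parse_fru_package_name("IsiFru_Package_201507072125.tgz"))
```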

Task 2: Gather logs


Before you begin any maintenance on a cluster, gather cluster logs.
About this task
You must collect cluster logs before all maintenance procedures. Cluster logs provide snapshots of the
cluster, which you can review to make sure that maintenance is successful.
Procedure
1. [ ] Open a secure shell (SSH) connection to any node in the cluster and log in.
2. [ ] Gather cluster logs by running the following command:
isi_gather_info

Install the DIMM replacement


Remove the failed DIMM and install the replacement hardware.

Task 3: Identify a failed DIMM


When performing a DIMM replacement on a node, first determine the location of the failed DIMM in the
node.
Procedure
1. [ ] Open a secure shell (SSH) connection to the node on which the failed DIMM was
reported.
2. [ ] Identify the failed DIMM by typing the following command:
isi_dmilog
The system displays output similar to the following:
04/01/15 02:08:31 COT Correctable ECC memory error:
2 times on P2-DIMM1A

ia32_mc8_status[0] 0x0000000000000000
ia32_mc8_addr[0] 0x0000000000000000
ia32_mc8_misc[0] 0x0000000000000000
ia32_mc8_status[1] 0x0000000000000000
ia32_mc8_addr[1] 0x0000000000000000

This example output shows an error with P2-DIMM1A, so you would replace the DIMM in slot 1A of
the P2 bank of DIMMs.
3. [ ] Write down the slot number of the failed DIMM to ensure that you replace the correct
module.
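
If you saved the isi_dmilog output to a file, the slot labels can be pulled out with a short script. This is a sketch under one assumption: labels always have the form P<bank>-DIMM<number><letter>, as in the P2-DIMM1A example above. Other label formats are not handled. The function name find_dimm_slots is hypothetical.

```python
import re

# Assumption: labels look like P2-DIMM1A (bank P2, slot 1A), as in the sample output.
SLOT_RE = re.compile(r"\bP(\d+)-DIMM(\d+)([A-Z])\b")


def find_dimm_slots(log_text):
    """Return (label, bank, slot) tuples for each distinct DIMM label, in order seen."""
    found = []
    for match in SLOT_RE.finditer(log_text):
        entry = (match.group(0), "P" + match.group(1), match.group(2) + match.group(3))
        if entry not in found:
            found.append(entry)
    return found


if __name__ == "__main__":
    sample = "Correctable ECC memory error:\n2 times on P2-DIMM1A\n"
    for label, bank, slot in find_dimm_slots(sample):
        print(f"{label}: replace the DIMM in slot {slot} of the {bank} bank")
```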

Task 4: Clear the ECC policy history


Before you replace a failed DIMM, clear the ECC policy history.
Procedure


1. [ ] Clear the ECC policy history by typing the following command:


isi_dmilog -z all

Task 5: Power down the node


Power down the node before performing maintenance.
Procedure
1. [ ] Connect to an available node in the cluster with a serial cable or network drop.
2. [ ] Determine the IP address of the node you are powering down by typing the command:
isi status -q
3. [ ] From the node that you connected to, open a secure shell (SSH) connection to the node
that is to be shut down by typing the command:
ssh <node_ip_address>
4. [ ] Power down the node by typing the following command:
shutdown -p now
If the node does not respond to the shutdown command, press the Power button on the node three
times, and then wait five minutes. If the node still does not shut down, you are at risk of losing data.
Do not proceed. Contact EMC Isilon Technical Support for assistance.

CAUTION: A forced power down should be attempted only if a node is unresponsive. Forcing
the power down of a healthy node can result in data loss.

5. [ ] Verify that the node is powered down by typing the following command:
isi status -q
Confirm that the node has a status of D--R (Down, Read Only). See node 3 in the following example.
ID |IP Address     |DASR|   In   Out Total| Used / Size      |Used / Size
---+---------------+----+-----+-----+-----+------------------+-
  1|10.53.217.201  | OK |  48M|    0|  48M|   19G/ 6.2T(< 1%)|(No SSDs)
  2|10.53.217.202  | OK |  46M|    0|  46M|   23G/ 6.2T(< 1%)|(No SSDs)
  3|10.53.217.203  |D--R|  n/a|  n/a|  n/a|    n/a/ n/a( n/a)|n/a/n/a( n/a)
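
The check in step 5 can also be done on captured output. The sketch below assumes the pipe-separated layout shown in the example, with the IP address in the second column and the DASR status in the third. Other OneFS versions may format the table differently. The function names node_status and is_powered_down are hypothetical.

```python
def node_status(status_output, node_ip):
    """Return the DASR field for the node with the given IP address, or None."""
    for line in status_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Assumed column order: ID | IP Address | DASR | ...
        if len(fields) >= 3 and fields[1] == node_ip:
            return fields[2]
    return None


def is_powered_down(status_output, node_ip):
    """True when the node reports D--R (Down, Read Only)."""
    return node_status(status_output, node_ip) == "D--R"


if __name__ == "__main__":
    sample = (
        "ID |IP Address |DASR| In Out Total| Used / Size |Used / Size\n"
        " 1|10.53.217.201 | OK | 48M| 0| 48M| 19G/ 6.2T(< 1%)|(No SSDs)\n"
        " 3|10.53.217.203 |D--R| n/a| n/a| n/a| n/a/ n/a( n/a)|n/a/n/a( n/a)\n"
    )
    print(is_powered_down(sample, "10.53.217.203"))
```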

Task 6: Slide the node out of the rack


Slide the node away from the rack to access the contents of the node.
Procedure
1. [ ] Label the InfiniBand, Ethernet, and power cables connected to the back of the node to
ensure that they are reconnected correctly.
2. [ ] Disconnect all cables from the back of the node.

Note: If transceivers are connected to the ends of your InfiniBand or Ethernet cables, remove them
with the cables. If you are using fiber Ethernet cables, disconnect the cable from the transceiver
first, and then remove the transceiver from the node.

3. [ ] Remove the node front panel.


4. [ ] Remove the retaining screws that secure the node to the rack cabinet.
5. [ ] Slide the node from the rack cabinet to fully extend the slide rails and provide clear
access to the node. Do not remove the node from the slide rails.

DANGER: Slide the node out from the rack slowly. Do not extend the rails completely until
you confirm that the node is latched and safely secured to the rails.

Task 7: Remove the node top panel


You remove the top panel to gain access to the contents of the node.
About this task

WARNING: Properly ground yourself to prevent electrostatic discharge from damaging the node.
For example, attach an ESD strap to your wrist and the node chassis.

Procedure
1. [ ] Loosen the captive screw that secures the node top panel.
2. [ ] Slide the top panel toward the rear of the node, and then lift the top panel to access the
node interior.

Task 8: Remove the cross bracket


Removing the cross bracket provides clear access to the inside of the node.
Procedure
1. [ ] Locate the cross bracket within the node.


Figure 1 Cross bracket

2. [ ] Remove the cross bracket by pressing on the side of the node chassis where the cross
bracket is connected. Unhook the cross bracket from the chassis, then lift straight up to unhook the
other side of the bracket.

Task 9: Remove the air baffle


In order to gain full access to the internal components of the node, you must remove the air baffle.
Procedure
1. [ ] Locate the air baffle within the node.


Figure 2 Air baffle

2. [ ] Raise the front end of the air baffle, unhook the tabs at the back end of the baffle, and
then lift the baffle out of the node.

Task 10: Remove the failed DIMM


Remove the failed DIMM from the node.
Procedure
1. [ ] Locate the slot number of the failed DIMM.


Figure 3 Failed DIMM

2. [ ] Press down on the two DIMM locking arms on either side of the failed DIMM to release
the DIMM from the slot.

CAUTION: If you are replacing a DIMM in slot P2 DIMM 3A or P2 DIMM 3B, remove the network
interface card (NIC) to allow enough space to remove the DIMM without damaging the NIC.


Figure 4 Release the DIMM

Task 11: Install the new DIMM


You must install the new DIMM in the slot from which you removed the old DIMM.
Procedure
1. [ ] Remove the new DIMM from the antistatic package.
2. [ ] Locate the open slot that the old DIMM was removed from. Align the notch in the DIMM
with the tab of the open slot and press down firmly on both ends of the DIMM until the two arms lock
into place, securing the DIMM.

Note: Install the new DIMM in the empty slot that used to hold the old DIMM. A DIMM that is installed
in another open slot runs the risk of not being recognized by the system.

Task 12: Install the air baffle


You must replace the air baffle by inserting the baffle tabs into slots in the chassis.
Procedure
1. [ ] Hook the tabs on the back end of the air baffle into the metal slots at the back of the
node.
2. [ ] Lower the front of the air baffle back into its original position within the node.


Task 13: Install the cross bracket


You must install the cross bracket by hooking it to the bracket holes on the interior of the node and then
snapping the other end into the chassis wall.
About this task

WARNING: The cross bracket sits directly above the boot drives. Use caution when installing the
cross bracket so that the boot drives are not dislodged or damaged.

Task 14: Install the node top panel


You must secure the top panel onto the node.
Procedure
1. [ ] Place the top panel on the node so that the front edge of the top panel is about one inch
behind the drive bays, and then slide the top panel forward into place.

WARNING: The chassis intrusion switch can be damaged if the top panel is slid too far back
on the node.

2. [ ] Tighten the captive top panel screw to secure the top panel to the node.

Task 15: Return the node to the rack


Return the node to the rack after all work is complete.
Procedure
1. [ ] Slide the node back into the rack cabinet.

WARNING: Slide the node slowly so you do not slam the node into the rack and damage the
node.

2. [ ] Reconnect the Ethernet, InfiniBand, and power cables to the back of the node.
3. [ ] Secure the node to the rack cabinet.
4. [ ] Replace the node front panel.

Task 16: Power up the node


Power up the node by pressing the power button on the back panel.
Procedure
1. [ ] Power up the node by pressing the power button on the back panel of the node. It is
located just left of center, toward the upper part of the back panel.


Task 17: Clear DIMM error messages


After you replace a failed DIMM, clear the DIMM error messages from the log.
Procedure
1. [ ] Open a secure shell (SSH) connection to the updated node.
2. [ ] Clear DIMM errors from the log by typing the following command:
isi_dmilog -c

Task 18: Verify healthy DIMMs


Verify that the new DIMM is active after the replacement DIMM is installed in the node.
Procedure
1. [ ] Verify that the DIMM is healthy by typing the following command:
sysctl -n hw.physmem
If the node returns the correct value for the amount of RAM installed in the node, the DIMM is
functioning correctly.
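
The comparison can be sketched as follows. The sketch assumes that hw.physmem is reported in bytes, as on FreeBSD, and that the reported value can be slightly lower than the installed total. It also assumes that all DIMMs in the node are the same size. The function name physmem_matches and the half-a-DIMM threshold are illustrative choices, not values from this guide.

```python
GIB = 1024 ** 3


def physmem_matches(physmem_bytes, dimm_count, dimm_size_gib):
    """Return True when reported memory is within half a DIMM of the installed total.

    Assumptions: physmem_bytes is the output of `sysctl -n hw.physmem` in bytes,
    and every DIMM in the node has the same capacity.
    """
    expected_bytes = dimm_count * dimm_size_gib * GIB
    difference = abs(expected_bytes - physmem_bytes)
    # A missing or unrecognized DIMM shows up as a gap of about one full DIMM.
    return difference < (dimm_size_gib * GIB) / 2


if __name__ == "__main__":
    # Hypothetical node with six 8 GiB DIMMs reporting slightly under 48 GiB.
    print(physmem_matches(51503951872, dimm_count=6, dimm_size_gib=8))
```

Tying the threshold to the DIMM size, rather than to a fixed percentage, keeps the check meaningful on nodes with many small DIMMs.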

Task 19: Gather logs


After you complete maintenance on a cluster, gather cluster logs.
About this task
You must collect cluster logs after all maintenance. Cluster logs provide snapshots of the cluster that you
can review to make sure that maintenance is successful.
Procedure
1. [ ] Gather cluster logs by typing the command:
isi_gather_info

Install the FRU package and run scripts


Update the configure-to-order (CTO) and as-built information on the node by installing a FRU package.

Note: If your cluster is running in SmartLock compliance mode with OneFS 7.0.2.10 or later, 7.0.1.4 or
later, or 7.1.1.0 or later, you must enter the provided compliance mode commands to run the FRU
scripts. If your cluster is running in compliance mode but is not running one of these versions, you
must upgrade your OneFS version to support the compliance mode commands. Contact Isilon
Technical Support.

Task 20: Install the FRU package on the node


Unpack and install the FRU package on the node.
Procedure
1. [ ] Place the FRU package on the node.
2. [ ] Unpack the FRU package by running the following command:
tar -zxvf IsiFru_Package_<date-time-stamp>.tgz
3. [ ] Type cd to change to the directory containing the FRU tar.


4. [ ] Install the package. Depending on your version of OneFS, run one of the following
commands:
OneFS 8.0 or later
isi upgrade patches install IsiFru_Package_<date-time-stamp>.tar
Earlier than OneFS 8.0
isi pkg install IsiFru_Package_<date-time-stamp>.tar
As the package installs, the following messages appear:
Preparing to install the package...
Checking the package for installation...
Installing the package
Committing the installation...
Package is committed.
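
The version-dependent choice in step 4 can be sketched as a small selector. This is an illustration only: the function name fru_install_command is hypothetical, the version string is assumed to start with a numeric major version such as 8 or 7, and the sketch only builds the command without running it.

```python
def fru_install_command(onefs_version, package_tar):
    """Return the install command (as an argument list) for the given OneFS version.

    Assumption: onefs_version looks like "8.0.0.4" or "7.2.1.0", so the text
    before the first dot is the major version.
    """
    major = int(onefs_version.split(".")[0])
    if major >= 8:
        # OneFS 8.0 or later
        return ["isi", "upgrade", "patches", "install", package_tar]
    # Earlier than OneFS 8.0
    return ["isi", "pkg", "install", package_tar]


if __name__ == "__main__":
    print(" ".join(fru_install_command("8.0.0.4", "IsiFru_Package_201507072125.tar")))
```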

Task 21: Run the update script


After the FRU package is installed on the node, run the update script.
Procedure
1. [ ] Move to the FRU package location by running the following command:
cd /var/crash/cto/fruPackages/IsiFru_Package_<date-time-stamp>
2. [ ] Run the update script by typing the following command:
./isi_fru_update_cluster
The system displays confirmation of the following items:
• CTO capability
• Current node hardware configuration

Task 22: Run the ABR script


Run the As Built Record (ABR) script to report the updated hardware to Isilon Technical Support.
Procedure
1. [ ] Verify installation of the updated hardware by running the following command:
./isi_cto_update --abr
The script verifies the update and displays a series of status messages that confirm the node
configuration. If an FTP connection is available, the script sends an updated ABR to Isilon Technical
Support.
2. [ ] If an external connection is not available, manually collect the updated ABR and deliver it
to Isilon Technical Support.
3. [ ] If the cluster is running in SmartLock compliance mode, verify installation of the updated
hardware by running the following command:
sudo /usr/bin/isi_hwtools/isi_cto_update --abr --filepath .

Note: You must include the period at the end of the command.

Sending an ABR to Isilon with no connectivity


If no external connectivity is available, the As Built Record on a Configure to Order (CTO) node cannot be
automatically delivered to Isilon Technical Support.


If external connectivity is available, the ABR is automatically generated and delivered to Isilon Technical
Support. If there is no external connectivity available, you must generate and copy the ABR from the
node, and then send the ABR to Isilon Technical Support through an alternate connection.

Task 23: Generate an ABR


You can manually send an As Built Record (ABR) by copying an XML file from the node and emailing the
file to Isilon Technical Support. You need network access to the node, or you can request that the
customer provide the file to you.
Procedure
1. [ ] Generate an ABR by running the following command:
isi_make_abr
The command generates a temporary file named asbuilt_<serial-number>_<date-time-stamp>.xml.
2. [ ] Identify the full name of the ABR file by running the following command:
isi_inventory_tool --display --itemType asbuilt | grep asbuiltFileName=
The system output contains information about the ABR file.
3. [ ] Place the ABR file where you can copy it by running the following command:
isi_inventory_tool --display --itemType asbuilt > /ifs/asbuilt_<serial-number>_<date-time-stamp>.xml
4. [ ] Copy the generated asbuilt_<serial-number>_<date-time-stamp>.xml file.
5. [ ] If an FTP connection is not available, contact Isilon Technical Support for an alternate
delivery method.

Task 24: Remove the FRU package from the node


After all scripts are run, remove the FRU package from the node.
Procedure
1. [ ] Change out of the FRU package directory by running the following command:
cd /
2. [ ] Delete the FRU package from the node. Depending on your version of OneFS, run one of
the following commands:
OneFS 8.0 or later
isi upgrade patches uninstall IsiFru_Package_<date-time-stamp>
Earlier than OneFS 8.0
isi pkg delete IsiFru_Package_<date-time-stamp>

Task 25: Gather logs


After you complete maintenance on a cluster, gather cluster logs.
About this task
You must collect cluster logs after all maintenance. Cluster logs provide snapshots of the cluster that you
can review to make sure that maintenance is successful.


Procedure
1. [ ] Gather cluster logs by typing the command:
isi_gather_info

Task 26: Return a failed part to Isilon


Return the failed part to Isilon Technical Support.
Procedure
1. [ ] Contact Isilon Technical Support to notify them that you are returning a failed part.
2. [ ] Package the failed part in the packaging materials provided with the replacement part.
3. [ ] Attach the return label that was included with the replacement part.
4. [ ] For the RMA number, write the support case number provided by Isilon Technical
Support.
5. [ ] Ship the failed part to the address specified on the return label.

Task 27: Update the install database


After all work is complete, update the install database.
Procedure
1. [ ] Browse to the EMC Product Registration and Install Base Maintenance service portal,
at: http://emc.force.com/createPSCcase.
2. [ ] Select the Product Registration and Install Base Maintenance option.
3. [ ] To open the form, select the IB Status Change option.
4. [ ] Complete the form with the applicable information.
5. [ ] To submit the form, click Submit.

Where to go for support


Contact EMC Isilon Technical Support for any questions about EMC Isilon products.

Online Support
    Live Chat
    Create a Service Request
Telephone Support
    United States: 1-800-SVC-4EMC (800-782-4362)
    Canada: 800-543-4782
    Worldwide: +1-508-497-7901
    For local phone numbers for a specific country, see EMC Customer Support Centers.
Help with Online Support
    For questions specific to EMC Online Support registration or access, email [email protected].
Isilon Info Hubs
    For the list of Isilon info hubs, see the Isilon Info Hubs page on the EMC Isilon Community
    Network. Isilon info hubs organize Isilon documentation, videos, blogs, and user-contributed
    content into topic areas, making it easy to find content about subjects that interest you.

Support for IsilonSD Edge


If you are running a free version of IsilonSD Edge, community support is available through the EMC Isilon
Community Network. However, if you have purchased one or more licenses of IsilonSD Edge, you can
contact EMC Isilon Technical Support for assistance, provided you have a valid support contract for the
product.
