
Imanager U2000 Troubleshooting - (V100R001C00 - 01)

The document provides a comprehensive troubleshooting guide for the iManager U2000 Unified Network Management System, detailing procedures for fault handling, data collection, and various troubleshooting scenarios. It covers issues related to network elements, operating systems, databases, server and client operations, and includes specific troubleshooting cases for Veritas HA systems and distributed systems. Intended for system administrators and technical support engineers, the document emphasizes the importance of accurate fault identification and resolution.


iManager U2000 Unified Network Management System

V100R001C00

Troubleshooting

Issue 01
Date 2009-09-25

Huawei Proprietary and Confidential


Copyright © Huawei Technologies Co., Ltd.
Huawei Technologies Co., Ltd. provides customers with comprehensive technical support and service. For any
assistance, please contact our local office or company headquarters.

Huawei Technologies Co., Ltd.


Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China

Website: http://www.huawei.com
Email: [email protected]

Copyright © Huawei Technologies Co., Ltd. 2009. All rights reserved.


No part of this document may be reproduced or transmitted in any form or by any means without prior written
consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are the property of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but the statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.


Contents

About This Document
1 Basic Principles of Troubleshooting
2 Troubleshooting Process
3 Fault Data Collection
4 NE Management Troubleshooting
4.1 Failed to Create an NE
4.2 Frequent Change of the Online and Offline Statuses of Certain NEs on the NMS
4.3 Failed to Connect the U2000 Server and NE
4.4 Abnormal Data Generated After the U2000 Restarts
5 Faults of the Operating System
5.1 Solaris OS Troubleshooting
5.1.1 Starting the Operating System Fails
5.1.1.1 Operating System Enters the Single-User Mode After Restart
5.1.1.2 Repeated Startup of the Operating System
5.1.1.3 System Prompts Unadapted Display
5.1.2 Failed to Log In to the GUI of the OS
5.1.3 System Prompts That Interfaces of Graphical Tools Cannot Be Displayed
5.1.4 Failed to Eject the CD-ROM
5.1.5 Operation Anomaly Caused by Insufficient Disk Space
5.1.6 Slow Running of the System Caused by Insufficient Memory
5.1.7 Slow Running of the System Caused by High CPU Usage
5.2 Linux OS Troubleshooting
5.2.1 Failed to Log In to the GUI
6 Faults of the Database
6.1 Sybase Database Troubleshooting
6.1.1 Failed to Back up the Database
6.1.2 Starting the Sybase Database Fails
6.1.2.1 Prompting Permission denied in Logs
6.1.2.2 Prompting Shared memory segment *.krg is in use in Logs
6.1.2.3 Prompting the Incorrect Setting of the Shared Memory in Logs
6.1.2.4 Prompting the Failure of Opening lv_master in Logs
6.1.2.5 Incorrect Configuration File for the sybase User
6.1.3 Database Cannot Be Started Automatically
6.1.4 Sybase Database Is Started Abnormally
6.1.4.1 Prompting dopen: open '/opt/sybase/data/lv_LogDB_dev' in Logs
6.1.4.2 Prompt suspect in Logs
6.1.4.3 Disk of the Database Logs Is Full
6.2 SQL Server Database Troubleshooting
6.2.1 Failed to Re-install the SQL Database
6.2.2 How to Solve the Problem That an Attempt to Log In to the SQL Server Fails After the Windows Password Is Changed
6.2.3 Initializing the Database Fails
6.2.3.1 System Prompts login database failure
6.2.3.2 Prompt Failed to open the database 'U2000DB' Failed to open the database 'VSMDB' in Logs
6.2.3.3 Prompt Cannot insert duplicate key in object 'TrailServiceType' in Logs
6.2.3.4 System Prompts Incorrect Parameter of Java Virtual Machine
6.2.4 Backing up the Database Fails
7 U2000 Server Troubleshooting
7.1 Starting the U2000 Server Fails
7.1.1 Abnormal Termination of the Server Application
7.1.2 System Prompting Connection Failure to the Database
7.1.3 Prompting Invalid License
7.1.4 U2000 Environment Variable Is Set Incorrectly
7.1.5 Startup Failure Because of the Authority Problem of the U2000 Installation Path
7.1.6 Certain Processes of the U2000 Server Fail to Start
7.2 Abnormal NMS Functions Due to Modified OS Time
7.3 U2000 Runs Slowly
8 Faults of the U2000 Client
8.1 Starting the U2000 Client Fails
8.2 U2000 Client Login Failure
8.3 U2000 Client Runs Abnormally
8.4 Main Menu or Icons Cannot Be Loaded in the U2000 Client Window
8.5 The NE Manager GUI of Certain Equipment Is Displayed Abnormally on the U2000 Client
9 Veritas HA System Troubleshooting
9.1 Troubleshooting Policies for the Veritas HA System
9.1.1 Confirming the System Status
9.2 Veritas Troubleshooting Cases
9.2.1 Switching Between Primary and Secondary Nodes Fails
9.2.2 Starting the U2000 HA System Fails
9.2.3 Data Replication Cannot Be Performed Between Primary and Secondary Nodes
9.2.4 Communication Between Primary and Secondary Nodes Fails
9.2.5 Resource in the Frozen State
9.2.6 Resource in the Fault State
9.2.7 Frequent Dual-Host State of the System
9.2.8 Connection Failure Between the Rlink and the Remote Host
9.2.9 Abnormal Status of the Disk Volume
9.2.10 Failed to Start the VCS Because of the Errors in the Configuration File
9.2.11 Faults on the Primary Site
9.2.12 Unstable DCN Between the Primary and Secondary Sites
10 Distributed System Troubleshooting
10.1 Slave Server in the Disconnected State
10.2 Inconsistent Statuses of the U2000s on the Slave and Master Servers
10.3 Other Faults on the Master Server
10.4 Other Faults on the Slave Server
11 NMS System Maintenance Tool Troubleshooting
11.1 Troubleshooting the Inconsistency of the Instance Status
11.2 An Error Message Is Displayed When the U2000 Maintenance Tool Client Is Started
A Obtaining the Technical Support
Index


Figures

Figure 2-1 Troubleshooting process


Tables

Table 3-1 Fault data collection


About This Document

Purpose
This document describes the procedure for handling a fault on the U2000: collecting information,
identifying the fault, and handling the fault. It also provides suggestions on U2000 troubleshooting.

Related Versions
The following table lists the product versions related to this document.

Product Name Product Version

iManager U2000 V100R001C00

Intended Audience
This document is intended for:
• U2000 system administrators
• Technical support engineers

Organization
This document describes the operations that are performed by the NMS administrators on the
U2000.

Chapter    Description

1 Basic Principles of Troubleshooting    You need to locate and clear a fault by observing the troubleshooting principles and cautions.

2 Troubleshooting Process    This topic describes the troubleshooting process.

3 Fault Data Collection    In the case of a system fault, you need to collect the related data in a timely manner, to locate and handle the fault.

4 NE Management Troubleshooting    This topic describes how to troubleshoot NE management.

5 Faults of the Operating System    This topic describes how to troubleshoot the faults of the operating system.

6 Faults of the Database    This topic describes how to troubleshoot the faults of the database.

7 U2000 Server Troubleshooting    This topic describes how to troubleshoot the U2000 server.

8 Faults of the U2000 Client    This topic describes how to troubleshoot the faults of the U2000 client.

9 Veritas HA System Troubleshooting    This topic describes how to troubleshoot the Veritas HA system.

10 Distributed System Troubleshooting    This topic describes how to troubleshoot the distributed system.

11 NMS System Maintenance Tool Troubleshooting    This topic describes how to troubleshoot the NMS system maintenance tool.

A Obtaining the Technical Support    This topic describes how to obtain technical support for problems encountered during routine maintenance.

Conventions
Symbol Conventions
The symbols that may be found in this document are defined as follows.

Symbol    Description

DANGER    Indicates a hazard with a high level of risk, which if not avoided, will result in death or serious injury.

WARNING    Indicates a hazard with a medium or low level of risk, which if not avoided, could result in minor or moderate injury.

CAUTION    Indicates a potentially hazardous situation, which if not avoided, could result in equipment damage, data loss, performance degradation, or unexpected results.

TIP    Indicates a tip that may help you solve a problem or save time.

NOTE    Provides additional information to emphasize or supplement important points of the main text.

General Conventions
The general conventions that may be found in this document are defined as follows.

Convention    Description

Times New Roman    Normal paragraphs are in Times New Roman.

Boldface    Names of files, directories, folders, and users are in boldface. For example, log in as user root.

Italic    Book titles are in italics.

Courier New    Examples of information displayed on the screen are in Courier New.

Command Conventions
The command conventions that may be found in this document are defined as follows.

Convention    Description

Boldface    The keywords of a command line are in boldface.

Italic    Command arguments are in italics.

[]    Items (keywords or arguments) in brackets [ ] are optional.

{ x | y | ... }    Optional items are grouped in braces and separated by vertical bars. One item is selected.

[ x | y | ... ]    Optional items are grouped in brackets and separated by vertical bars. One item is selected or no item is selected.

{ x | y | ... }*    Optional items are grouped in braces and separated by vertical bars. A minimum of one item or a maximum of all items can be selected.

[ x | y | ... ]*    Optional items are grouped in brackets and separated by vertical bars. Several items or no item can be selected.

GUI Conventions
The GUI conventions that may be found in this document are defined as follows.


Convention    Description

Boldface    Buttons, menus, parameters, tabs, window, and dialog titles are in boldface. For example, click OK.

>    Multi-level menus are in boldface and separated by the ">" signs. For example, choose File > Create > Folder.

Keyboard Operations
The keyboard operations that may be found in this document are defined as follows.

Format    Description

Key    Press the key. For example, press Enter and press Tab.

Key 1+Key 2    Press the keys concurrently. For example, pressing Ctrl+Alt+A means the three keys should be pressed concurrently.

Key 1, Key 2    Press the keys in turn. For example, pressing Alt, A means the two keys should be pressed in turn.

Mouse Operations
The mouse operations that may be found in this document are defined as follows.

Action    Description

Click    Select and release the primary mouse button without moving the pointer.

Double-click    Press the primary mouse button twice continuously and quickly without moving the pointer.

Drag    Press and hold the primary mouse button and move the pointer to a certain position.

Update History
Updates between document versions are cumulative. Therefore, the latest document version
contains all updates made to previous versions.

Updates in Issue 01 (2009-09-25)


Initial release.


Updates in Issue 01 (2009-10-01)


Initial release.


1 Basic Principles of Troubleshooting

You need to locate and clear a fault by observing the troubleshooting principles and cautions.

Troubleshooting Principles
To analyze, locate, and clear a fault, observe the following principles:
• Restore the system monitoring as soon as possible.
• Before locating a fault, collect the fault data in a timely manner, and save the collected data
to a mobile storage medium or another computer in the network.
• When determining the troubleshooting scheme, evaluate the impact first, to ensure the
normal transmission of services.
• If the fault point cannot be located or the fault cannot be cleared, contact Huawei to obtain
technical support. Cooperate with the Huawei engineers during troubleshooting, to
minimize the period of service interruption.

Troubleshooting Cautions
• Analyze the fault symptom, and handle the fault after locating the cause. If the cause is
unknown, do not perform operations blindly, to avoid aggravating the problem.
Repairing faults on the U2000 does not affect the running of the NEs.
• Before handling a fault, keep all onsite records concerning the fault and do not delete any
data or logs arbitrarily.
• Before any modification, back up the data of the U2000 by exporting the script or backing
up the NMS data.
• After the system recovers, observe the running status, to make sure that the fault is cleared.
Complete the related handling report in a timely manner.


2 Troubleshooting Process

When the U2000 is abnormal because of misoperations, external causes such as a power failure,
or software and hardware faults of the U2000, the network may fail to be monitored. In this
case, you can locate the fault and repair the system by referring to the troubleshooting process
and observing the troubleshooting principles and cautions. If the problem persists, contact the
local office or customer service center of Huawei.
Figure 2-1 shows the troubleshooting process.


Figure 2-1 Troubleshooting process

[Flowchart. Node labels: Start; Generate an alarm? (Yes/No); Process the alarm; Fault removed? (Yes/No);
Collect fault information; Emergency? (Yes/No); Emergency maintenance; Locate the fault;
Perform troubleshooting; Fault removed? (Yes/No); Contact Huawei technical support;
Record the experience; End.]

NOTE

• Normally, troubleshooting consists of three stages: collecting the information, locating the fault,
and clearing the fault.
• If an alarm or abnormal event occurs on the U2000, clear the fault according to the prompt.


3 Fault Data Collection

In the case of a system fault, you need to collect the related data in a timely manner, to locate
and handle the fault.
When a fault occurs on the U2000, see Table 3-1 to collect the fault data.

NOTE
It is recommended that you use the Quick Step tool to collect the related data. For details, refer to the
Huawei iManager U2000 User Guide (Quick Step).

Table 3-1 Fault data collection


Collection Item    Description

Time and place    Collect the information about the time and place of the fault. The time
should be accurate to the minute.

Symptom description    Describe the symptom when the fault occurs. The more specific the
description, the more easily the fault can be located.

Measures taken and result    After you take some preliminary troubleshooting measures onsite, new
problems may occur. Therefore, you need to record the measures taken and the
subsequent result in detail.


Version information
• View the version information about the U2000.
– In the Solaris and Linux OS, the default directory storing the
imap.cfg file is /opt/U2000/server/conf.
– In the Windows OS, the default directory storing the imap.cfg file
is D:\U2000\server\conf.
The last line of the file displays the version information about the
U2000.
• In the Solaris and Linux OS, do as follows to view the version
information about the OS:
Log in to the OS as the root user. Then, run the # uname -a command.
• In the UNIX OS, do as follows to view the version information about
the database:
Log in to the OS as the sybase user. Then, run the following
commands:
# . /opt/sybase/SYBASE.sh
# cd /opt/sybase/OCS-*/bin
# isql -SDBSVR -Usa -P<sa password>
1>select @@version
2>go

In the Windows OS, run the following commands in the command line interface (CLI):
>isql -SDBSVR -Usa -P<sa password>
1>select @@version
2>go

IP information    Run the following commands to view the IP address and MAC address:
• On Solaris and Linux, log in as user root and run the ifconfig -a
command.
• On Windows, open the command prompt window and run the
ipconfig /all command.

Alarm information Collect the alarm information, especially the U2000 alarms or abnormal
events.


Log information    In the Solaris and Linux OS, do as follows to collect the log information
about the OS, database, and U2000:
• Use the Quick Step tool to collect the information about the OS and
database. For details, refer to the Huawei iManager U2000 User
Guide (Quick Step).
• For the details about collecting the log information about the
U2000, refer to Log Management in the Huawei iManager U2000
Administrator Guide.
In the Windows OS, do as follows to collect the log information about the
operating system, database, and U2000:
• Choose Start > Run from the desktop. Enter eventvwr.msc and then
press Enter. In Event Viewer, select the corresponding event name,
and right-click to save the log information of the operating system.
• In the MSSQLServer_installation_directory\MSSQL\LOG
directory, collect all the logs.
• Collect the U2000 log information. For details, refer to Log Management
in the Huawei iManager U2000 Administrator Guide.

Networking diagram    If the fault is caused by networking problems, you need to view the
networking diagram.

ICMR-related files    If the server runs on Solaris or Linux, you need to collect the
ICMR-related files:
• All files in the /etc/ICMR directory
• Files in the /var/ICMR directory
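
The command-line items in Table 3-1 can be gathered in one pass on a Solaris or Linux server. The following is a minimal sketch, not a Huawei tool: the function names, the output directory /tmp/u2000_fault_data, and the output file names are assumptions made for illustration. It uses only the paths and commands named in the table (imap.cfg, uname -a, ifconfig -a, /etc/ICMR, /var/ICMR). The Quick Step tool remains the recommended way to collect the data.

```shell
#!/bin/sh
# Sketch only: function names, file names, and the default output directory
# are illustrative assumptions, not part of the U2000 product.

# Print the last line of imap.cfg, which holds the U2000 version.
u2000_version() {
    cfg="${1:-/opt/U2000/server/conf/imap.cfg}"
    [ -r "$cfg" ] || { echo "cannot read $cfg" >&2; return 1; }
    tail -n 1 "$cfg"
}

# Print the current time accurate to the minute ("Time and place" item).
fault_timestamp() {
    date '+%Y-%m-%d %H:%M'
}

# Collect the items into a directory. $1 = output directory (optional),
# $2 = path of imap.cfg (optional). Missing tools or files are tolerated.
collect_fault_data() {
    out_dir="${1:-/tmp/u2000_fault_data}"
    mkdir -p "$out_dir" || return 1
    fault_timestamp          > "$out_dir/time.txt"
    u2000_version "$2"       > "$out_dir/u2000_version.txt" 2>&1 || true
    uname -a                 > "$out_dir/os_version.txt" 2>&1 || true
    ifconfig -a              > "$out_dir/ip_info.txt" 2>&1 || true
    # Copy the ICMR directories under distinct names so that they do not
    # overwrite each other.
    for name in etc var; do
        if [ -d "/$name/ICMR" ]; then
            cp -r "/$name/ICMR" "$out_dir/${name}_ICMR" || true
        fi
    done
    echo "$out_dir"
}
```

Run collect_fault_data as user root, then copy the resulting directory to a mobile storage medium or another computer, as the troubleshooting principles recommend.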


4 NE Management Troubleshooting

About This Chapter

This topic describes how to troubleshoot NE management.

4.1 Failed to Create an NE


4.2 Frequent Change of the Online and Offline Statuses of Certain NEs on the NMS
4.3 Failed to Connect the U2000 Server and NE
4.4 Abnormal Data Generated After the U2000 Restarts


4.1 Failed to Create an NE


Symptom
Adding a device on the NMS fails. The system prompts Operation failed. Failure cause: NO
response from device.

Possible Cause
The possible causes are:
• The DCN between the NMS and the NE is faulty.
• The communication parameters of the NMS or the NE are incorrectly set.
• The NE is being restarted and does not respond.

Procedure
• Check the DCN between the U2000 and the NE.
1. Check that the U2000 and the NE can reach each other. Use the ping command to
check the connectivity and the packet loss ratio between the NMS and the NE.
2. Rectify the fault according to the onsite condition.
• Check the settings of the parameters on the NMS and the NE.
1. Check the settings of the NMS communication parameters, including the IP address
and the parameters related to the gateway.
2. Check the settings of the NE parameters, including the IP address, ID, extension ID,
and the parameters related to the gateway.
3. Check whether the name and password of the user logging in to the NE are correct.
4. Make sure that the settings of the parameters for the creation of the NE are the same
as those on the device side.
• If the NE is being restarted and does not respond, add the NE after the restart is complete.

----End
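
The connectivity and packet loss check in the procedure above can be scripted. This is a minimal sketch that assumes a Linux server, where ping -c prints a summary line of the form "N packets transmitted, M received, X% packet loss". The function names and the default of 10 probes are illustrative assumptions. Solaris ping prints a different output by default, so the parser would need to be adapted there.

```shell
#!/bin/sh
# Sketch only: assumes the Linux ping summary format; names are illustrative.

# Read ping output on stdin and print the packet loss percentage.
parse_packet_loss() {
    sed -n 's/.* \([0-9][0-9.]*\)% packet loss.*/\1/p'
}

# Ping an NE and report the loss. $1 = NE IP address, $2 = probe count.
# Returns 0 only when the summary reports exactly 0% loss.
check_ne_reachable() {
    ne_ip="$1"
    count="${2:-10}"
    loss=$(ping -c "$count" "$ne_ip" 2>/dev/null | parse_packet_loss)
    if [ -z "$loss" ]; then
        echo "$ne_ip: no ping summary (unreachable or unsupported ping output)"
        return 2
    fi
    echo "$ne_ip: ${loss}% packet loss"
    [ "$loss" = "0" ]
}
```

For example, check_ne_reachable followed by the IP address of the NE prints the measured loss. Any non-zero loss points to the DCN as the place to look first.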

4.2 Frequent Change of the Online and Offline Statuses of Certain NEs on the NMS

Symptom
The online and offline statuses of certain NEs frequently change.

Possible Cause
• The number of NEs exceeds the maximum management capability of the NMS.
• The disk space is insufficient.


Procedure
Step 1 Check whether the number of NEs exceeds the maximum management capability of the NMS.
For the performance indicators, refer to the Huawei iManager U2000 Product Description.

Step 2 Check the disk space of the server. In normal situations, the disk usage should not exceed
80%. If the disk usage exceeds 80%, clear the disk. You can back up related files and then delete
them to free the disk space.

----End

4.3 Failed to Connect the U2000 Server and NE


Symptom
The U2000 server is normal, but a large number of NEs are disconnected.

Possible Cause
Too many non-gateway NEs are connected to a gateway NE. As a result, the subnets are too
large and an ECC storm occurs.

Procedure
Step 1 Run the ping command to check whether the IP addresses of the disconnected gateway NEs are
available.

Step 2 Check whether the number of non-gateway NEs connected to a gateway NE exceeds the
maximum.
For the maximum number of non-gateway NEs connected to a gateway NE, refer to the product
description of the related version. If the actual number exceeds the maximum, reduce the
number of non-gateway NEs connected to the gateway NE according to the network planning.

----End

4.4 Abnormal Data Generated After the U2000 Restarts


Symptom
After the U2000 restarts, certain NEs are missing from the NMS and the topology is displayed in disorder.

Possible Cause
The NMS database is abnormal.

Procedure
Step 1 Initialize the database. For details, refer to Backing Up and Restoring the U2000 Database in
the Huawei iManager U2000 Administrator Guide.


Step 2 Manually recover the U2000 data. For details, refer to Backing Up and Restoring the U2000
Database in the Huawei iManager U2000 Administrator Guide.

----End


5 Faults of the Operating System

About This Chapter

This topic describes how to troubleshoot the faults of the operating system.
5.1 Solaris OS Troubleshooting
This topic describes how to troubleshoot the Solaris OS.
5.2 Linux OS Troubleshooting
This topic describes how to troubleshoot the Linux OS.


5.1 Solaris OS Troubleshooting


This topic describes how to troubleshoot the Solaris OS.

5.1.1 Starting the Operating System Fails


5.1.2 Failed to Log In to the GUI of the OS
5.1.3 System Prompts That Interfaces of Graphical Tools Cannot Be Displayed
5.1.4 Failed to Eject the CD-ROM
5.1.5 Operation Anomaly Caused by Insufficient Disk Space
5.1.6 Slow Running of the System Caused by Insufficient Memory
5.1.7 Slow Running of the System Caused by High CPU Usage

5.1.1 Starting the Operating System Fails


The operating system cannot be started or is started repeatedly. As a result, users cannot
reach the login interface.

Locate and rectify the fault according to the following sequence:

Sequence  Current Symptom                     Troubleshooting

1         The screen displays nothing.        Check whether the connection between the display
                                              and the server is normal.

2         The screen displays error prompts.  Troubleshoot according to the error prompts.
                                              Rectify the fault according to the following symptoms:
                                              • 5.1.1.1 Operating System Enters the Single-User
                                                Mode After Restart
                                              • 5.1.1.2 Repeated Startup of the Operating System
                                              • 5.1.1.3 System Prompts Unadapted Display

3         In other cases.                     Contact Huawei engineers for troubleshooting.

5.1.1.1 Operating System Enters the Single-User Mode After Restart


5.1.1.2 Repeated Startup of the Operating System
5.1.1.3 System Prompts Unadapted Display


5.1.1.1 Operating System Enters the Single-User Mode After Restart

Symptom
The operating system enters the single-user mode after restart. A message is displayed indicating
"WARNING - Unable to repair the / filesystem. Run fsck manually (fsck -F ufs /dev/rdsk/
c*t*d*s*)."

NOTE
In the warning prompt "Unable to repair the / filesystem", the / may indicate another directory.

Possible Cause
The server is shut down improperly or loses power. As a result, the file system that is in use is
damaged. After the power supply is restored, the system performs a self-check during the
startup of the server. If the file system is found to be damaged, the self-check fails and the
system enters the single-user mode during the startup.

Procedure
Step 1 Log in to the operating system as user root.

Step 2 To restore the file system, run the following command:


# fsck -y

CAUTION
• If the disk capacity is large and the file system is severely damaged, restoring the file
  system with the fsck -y command may take a long time. During the restoration, do not
  perform any operation on the server. Otherwise, the operating system cannot recover.
• The fsck command can rectify only common file system faults. It cannot rectify faults in the
  Solaris startup parameters or kernel damage caused by an abnormal power failure.

Step 3 Observe the information displayed on the screen. Check whether the file systems of all partitions
are correct and whether the file system of the damaged partition is restored.
If the error information or the information that requires restoration is displayed again, run the
fsck -y command repeatedly until such information is not displayed again.

Step 4 To synchronize the files and restart the operating system, run the following commands:
# sync;sync;sync;sync;sync;sync
# init 6

----End

Suggestion and Summary


Do not shut down the server improperly. It is recommended that the server be configured
with a UPS to effectively prevent power failures.


5.1.1.2 Repeated Startup of the Operating System

Symptom
The operating system is started repeatedly. A message is displayed indicating "Cannot open
'/etc/path_to_inst' Program terminated." Then the system is started repeatedly.

Possible Cause
The server is powered off abnormally or other abnormal operations are performed. As a result,
the operating system is damaged and the path_to_inst system file cannot be opened.
Therefore, the operating system cannot be started.

Procedure
Step 1 During the self-check of the operating system (before the operating system is entered),
press STOP+A to exit the startup. The ok prompt is displayed.

Step 2 Insert the installation CD-ROM of Solaris 10. To start from the CD-ROM and enter the
single-user mode, run the following command:
ok boot cdrom -s

NOTE
Wait for about 5 minutes. When SINGLE USER MODE and the # prompt are displayed, the system
has entered the single-user mode.

Step 3 To find the name of the device that corresponds to the system root directory, run the
following command:
# cat /etc/vfstab

The terminal displays:

NOTE
The displayed message changes according to different actual conditions.
#device             device              mount             FS      fsck    mount    mount
#to mount           to fsck             point             type    pass    at boot  options
#
fd                  -                   /dev/fd           fd      -       no       -
/proc               -                   /proc             proc    -       no       -
/dev/dsk/c1t0d0s1   -                   -                 swap    -       no       -
/dev/dsk/c1t0d0s0   /dev/rdsk/c1t0d0s0  /                 ufs     1       no       -
/dev/dsk/c1t0d0s7   /dev/rdsk/c1t0d0s7  /T2000            ufs     2       yes      -
/dev/dsk/c1t0d0s6   /dev/rdsk/c1t0d0s6  /opt              ufs     2       yes      -
/devices            -                   /devices          devfs   -       no       -
ctfs                -                   /system/contract  ctfs    -       no       -
objfs               -                   /system/object    objfs   -       no       -
swap                -                   /tmp              tmpfs   -       yes      -
/dev/dsk/c1t1d0s0   /dev/rdsk/c1t1d0s0  /version          ufs     2       yes      -

In the preceding message, the device that corresponds to the root directory (/) is
/dev/dsk/c1t0d0s0, and its raw device is /dev/rdsk/c1t0d0s0.
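
Reading the device names off the vfstab listing can be sketched as a one-line filter. This is only an illustration: the function name vfstab_devices is hypothetical, and it assumes an awk that supports the -v option (on Solaris this may mean nawk or /usr/xpg4/bin/awk rather than the default awk).

```shell
# Print the "device to mount" and "device to fsck" fields for the file
# system mounted at a given mount point, reading vfstab text from stdin.
# Fields: 1 = device to mount, 2 = device to fsck, 3 = mount point.
vfstab_devices() {
    awk -v mp="$1" '$1 !~ /^#/ && $3 == mp { print $1, $2 }'
}

# Hypothetical usage:
#   vfstab_devices / < /etc/vfstab
```

For the listing above, the filter would print the two device names of the root directory on one line.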

Step 4 Mount the device that corresponds to the root directory on the /mnt directory so that the
damaged operating system can be restored.
# mount <device name> /mnt

For example, run the following command to mount /dev/dsk/c1t0d0s0 on /mnt:
# mount /dev/dsk/c1t0d0s0 /mnt


Step 5 If /etc/path_to_inst is lost, run the following commands to restore it by using the
path_to_inst-INSTALL template that is reserved in the /etc directory by the system.
# cd /mnt/etc
# cp path_to_inst-INSTALL path_to_inst

Step 6 Run the following commands to synchronize the file and restart the operating system:
# sync;sync;sync;sync;sync;sync
# init 6

Step 7 After the system restarts normally, run the fsck -y command to repair the file system.

----End

5.1.1.3 System Prompts Unadapted Display

Symptom
After the workstation is started, a message is displayed indicating that the display is unadapted
and errors occur in the /var/dt/Xerrors file.

Possible Cause
The peripherals of the workstation are incorrectly connected. For example, the mouse or
keyboard is not connected or is connected improperly.

Procedure
Step 1 Repair the connection of the peripherals (such as the mouse, keyboard, and display) according
to the information displayed on the screen.

Step 2 Stop the NMS processes and the database process.

Step 3 To restart the workstation, run the following commands:


# sync;sync;sync;sync;sync
# shutdown -y -g0 -i6

----End

5.1.2 Failed to Log In to the GUI of the OS

Symptom
After the Solaris OS is started, the user cannot log in to the GUI.

Possible Cause
Abnormal shutdown may damage the file system. Consequently, the user cannot log in to the
GUI after the Solaris OS is started. In this case, you can use the fsck command to restore the
file system.

Procedure
Step 1 After the Solaris OS is started, enter the password of the root user according to the prompt to
access the CLI.


Step 2 Run the following command several times to automatically rectify the fault:
# fsck -y

NOTE
The fsck command can rectify only common file system faults. It cannot rectify faults in the
Solaris startup parameters or kernel damage caused by an abnormal power failure.

Step 3 Run the following commands to restart the workstation:


# sync;sync;sync;sync;sync
# shutdown -y -g0 -i6

----End

5.1.3 System Prompts That Interfaces of Graphical Tools Cannot Be Displayed

Symptom
When graphical tools such as smc are used on Solaris, a message is displayed indicating
"can’t open to display."

Possible Cause
The DISPLAY environment variable may not be set in GUI mode.

Procedure
Step 1 Log in to the OS in GUI mode.

Step 2 To query the terminal number, run the following commands as user root:
# set | grep DISPLAY
# xhost +

Step 3 To set the DISPLAY environment variable, run the following commands:
# DISPLAY=<local host name or IP address>:<local terminal No.>
# export DISPLAY

For example:
# set | grep DISPLAY
DISPLAY=10.70.77.62:0.0
# xhost +
# DISPLAY=10.70.77.62:0.0
# export DISPLAY

Step 4 Open the interfaces of the graphical tools again.

----End

5.1.4 Failed to Eject the CD-ROM

Symptom
A CD-ROM is in the CD-ROM drive. When you use the eject command to open the drive, the
system prompts Device busy and the CD-ROM cannot be ejected.

Possible Cause
The data in the CD-ROM is in use.


Procedure
Step 1 Check that the data in the current CD-ROM is not in use.

Step 2 Run the following command as the root user:


# /etc/init.d/volmgt stop

Step 3 Press the eject button on the drive panel to remove the disc from the CD-ROM drive.

Step 4 Run the following command to resume the drive:


# /etc/init.d/volmgt start

----End

5.1.5 Operation Anomaly Caused by Insufficient Disk Space

Symptom
Certain operations are abnormal. For example, users cannot log in to the operating system, the
operating system runs at a low speed, the database cannot be started, or the U2000 cannot be
started.

Possible Cause
The disk space is insufficient. Normally, the disk usage should be 80% or below.

Procedure
Step 1 Check the disk space. Do as follows:
(1) Log in to the Solaris OS as the root user.
(2) Run the following command to check the disk usage:
# df -k

(3) View the usage of the directories including the / directory, /opt directory, and /opt/
U2000 directory in the displayed information.

Step 2 If the disk usage exceeds the normal value (80%), you need to manually clear the disk. For
details, refer to Managing U2000 Files and Disks in the Huawei iManager U2000
Administrator Guide.

----End
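
The check in Step 1 can be sketched as a filter over the df -k output. This is a minimal sketch under stated assumptions: the function name over_threshold is hypothetical, the awk in use must support the -v option, and the df output is assumed to end each line with a percentage column followed by the mount point.

```shell
# List mount points whose usage exceeds a threshold (default 80%),
# reading `df -k` output from stdin.
# Assumption: the second-to-last field has the form "NN%" and the last
# field is the mount point.
over_threshold() {
    awk -v limit="${1:-80}" '
        NR > 1 && $(NF-1) ~ /^[0-9]+%$/ {
            pct = $(NF-1); sub(/%/, "", pct)
            if (pct + 0 > limit + 0) print $NF, pct "%"
        }'
}

# Hypothetical usage:
#   df -k | over_threshold 80
```

An empty result means that no listed file system exceeds the threshold; otherwise each line names a mount point that needs to be cleared.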

5.1.6 Slow Running of the System Caused by Insufficient Memory

Symptom
The U2000 runs at a low speed.

Possible Cause
The memory may be insufficient.


Procedure
Step 1 To check the memory occupancy status, run the following command as user root:
# vmstat 2

The terminal displays:


kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s3 -- in sy cs us sy id
0 0 0 16940400 763008 7 30 20 6 13 0 12 2 -1 0 0 384 1773 380 1 1 98
0 0 0 16968504 737784 2 10 24 0 0 0 0 0 0 0 0 365 450 328 0 0 99
0 0 0 16968504 737832 0 0 0 0 0 0 0 2 0 0 0 386 1416 337 1 1 99
0 0 0 16968504 737832 0 0 0 0 0 0 0 0 0 0 0 369 433 330 0 0 99
......

If the value of the sr column stays between 200 and 300 pages per second, it indicates that the
physical memory may be insufficient.
Step 2 Close unnecessary applications.
Step 3 If the memory occupancy remains high, you need to upgrade the physical memory.

----End
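
Picking the sr column out of the vmstat output can be sketched as follows. The function name vmstat_column is hypothetical, and the sketch assumes an awk that supports the -v option. It locates the column by its name in the header row, because the number of disk columns (s0, s1, and so on) differs between servers.

```shell
# Print the value of a named vmstat column for every sample on stdin.
# The column position is taken from the header row. If a name occurs
# twice in the header (such as sy), the last occurrence is used.
vmstat_column() {
    awk -v name="$1" '
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        $1 ~ /^[0-9]+$/ { print $col }'
}

# Hypothetical usage: take five samples at 2-second intervals.
#   vmstat 2 5 | vmstat_column sr
```

For the sample output shown in Step 1, the sr values are 12, 0, 0, and 0, which are far below the range that indicates insufficient memory.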

5.1.7 Slow Running of the System Caused by High CPU Usage


Symptom
The U2000 runs at a low speed.

Possible Cause
The CPU usage may be too high.

Procedure
Step 1 To check the CPU usage, run the following command as user root:
# vmstat 2

The terminal displays:


kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s3 -- in sy cs us sy id
0 0 0 16940400 763008 7 30 20 6 13 0 12 2 -1 0 0 384 1773 380 1 1 98
0 0 0 16968504 737784 2 10 24 0 0 0 0 0 0 0 0 365 450 328 0 0 99
0 0 0 16968504 737832 0 0 0 0 0 0 0 2 0 0 0 386 1416 337 1 1 99
0 0 0 16968504 737832 0 0 0 0 0 0 0 0 0 0 0 369 433 330 0 0 99
......

In the last column, id indicates the idle CPU ratio. If the idle CPU ratio remains below 10% for
a long time, the CPU is the main bottleneck of the running efficiency.
Step 2 Close unnecessary applications.

----End
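
The idle-ratio check can be sketched as a counter over the vmstat samples. The function name low_idle_samples is hypothetical, and the sketch assumes that id is the last column of each sample line, as in the output shown in Step 1, and that the awk in use supports the -v option.

```shell
# Count vmstat samples whose idle CPU percentage (the last column, id)
# is below a limit (default 10). Reads vmstat output from stdin.
low_idle_samples() {
    awk -v limit="${1:-10}" '
        $1 ~ /^[0-9]+$/ { total++; if ($NF + 0 < limit + 0) low++ }
        END { printf "%d of %d samples below %d%% idle\n", low, total, limit }'
}

# Hypothetical usage: sample for about one minute.
#   vmstat 2 30 | low_idle_samples 10
```

If most samples over a long period fall below the limit, close unnecessary applications as described in Step 2.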

5.2 Linux OS Troubleshooting


This topic describes how to troubleshoot the Linux OS.

5.2.1 Failed to Log In to the GUI


5.2.1 Failed to Log In to the GUI


Symptom
After the Linux OS is started, the user cannot access the GUI.

Possible Cause
The settings of the parameters on the SaX2 tool do not match those of the related parameters on
the video card driver of the OS.

Procedure
Step 1 Log in to the system as the root user. Run the following commands to open the GUI for
configuring the SaX2 tool:
# init 3
# sax2

Step 2 Set the resolution of the monitor to VESA 1024*768@60HZ. Click OK.

----End


6 Faults of the Database

About This Chapter

This topic describes how to troubleshoot the faults of the database.


6.1 Sybase Database Troubleshooting
This topic describes how to troubleshoot the Sybase database.
6.2 SQL Server Database Troubleshooting
This topic describes how to troubleshoot the SQL Server database.


6.1 Sybase Database Troubleshooting


This topic describes how to troubleshoot the Sybase database.
6.1.1 Failed to Back up the Database
6.1.2 Starting the Sybase Database Fails
6.1.3 Database Cannot Be Started Automatically
6.1.4 Sybase Database Is Started Abnormally

6.1.1 Failed to Back up the Database

Symptom
The backup file does not exist in the path specified in the backup task of the database backup
tool.

Possible Cause
The possible causes of the database backup failure are as follows:
• The database is not started.
• The disk space is full.
• The permissions on the backup path may be incorrect.

Procedure
Step 1 Check that the database is normally started.

Run the /opt/sybase/ASE-*/install/showserver command as the sybase user. If the
dataserver process and backupserver process exist, it indicates that the database service
process is started.

Step 2 Check the disk space. For details, see 5.1.5 Operation Anomaly Caused by Insufficient Disk
Space.

Step 3 Check the permissions and owner of the backup directory.

You can run the ls -al command to check the permissions on the backup directory. The owner of
the directory storing the backup file must be sybase. In addition, the directory must be writable,
readable, and executable. For details, refer to the common commands of the Solaris OS or SUSE
Linux OS.

----End
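
The owner and permission check in Step 3 can be sketched as follows. This is an illustration only: the function name check_backup_dir and the path /path/to/backup are hypothetical, and the sketch checks only that the owner matches and that the owner has read, write, and execute permission.

```shell
# Report whether a directory is owned by the expected user and grants
# that owner read, write, and execute permission (mode string "drwx...").
check_backup_dir() {
    dir=$1
    expected_owner=$2
    # Split the `ls -ld` line into fields: $1 = mode string, $3 = owner.
    set -- $(ls -ld "$dir")
    if [ "$3" != "$expected_owner" ]; then
        echo "owner is $3, expected $expected_owner"
        return 1
    fi
    case $1 in
        drwx*) echo "ok" ;;
        *)     echo "permissions are $1, expected drwx for the owner"; return 1 ;;
    esac
}

# Hypothetical usage:
#   check_backup_dir /path/to/backup sybase
```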

6.1.2 Starting the Sybase Database Fails


The dataserver and backupserver processes cannot be found after the Sybase database is started
for a period of time.

Locate and rectify the fault according to the following sequence:


Sequence  Problem Location                 Troubleshooting

1         Check whether the disk usage     Rectify the fault with reference to 5.1.5 Operation
          exceeds the limit.               Anomaly Caused by Insufficient Disk Space.

2         Check whether the configuration  Rectify the fault with reference to 6.1.2.5 Incorrect
          file for user sybase is          Configuration File for the sybase User.
          incorrect.

3         Check whether there is any       Rectify the fault according to the following error
          error message in logs.           messages:
                                           • 6.1.2.1 Prompting Permission denied in Logs
                                           • 6.1.2.2 Prompting Shared memory segment *.krg
                                             is in use in Logs
                                           • 6.1.2.3 Prompting the Incorrect Setting of the
                                             Shared Memory in Logs
                                           • 6.1.2.4 Prompting the Failure of Opening
                                             lv_master in Logs

4         The preceding measures do not    Contact Huawei engineers for troubleshooting.
          work.
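
Sequence 3 of the table can be sketched as a filter that maps error text in the database log to the matching subsection. The function name classify_sybase_log is hypothetical, and the search patterns are assumptions derived from the sample log messages quoted in 6.1.2.1 to 6.1.2.4. Actual log text may differ between Sybase versions.

```shell
# Map error text found in a Sybase error log (read from stdin) to the
# subsection number of this chapter that covers it.
# The patterns are assumptions based on the sample messages in this chapter.
classify_sybase_log() {
    awk '
        /Permission denied/                        { hit["6.1.2.1"] = 1 }
        /Shared memory segment .*\.krg is in use/  { hit["6.1.2.2"] = 1 }
        /os_create_region: can.t allocate/         { hit["6.1.2.3"] = 1 }
        /lv_master.*No such file or directory/     { hit["6.1.2.4"] = 1 }
        END { for (k in hit) print k }' | sort
}

# Hypothetical usage, as user sybase:
#   classify_sybase_log < "$SYBASE/$SYBASE_ASE/install/DBSVR.log"
```

If the filter prints nothing, none of the four known messages is present and the log needs to be read manually.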

6.1.2.1 Prompting Permission denied in Logs


6.1.2.2 Prompting Shared memory segment *.krg is in use in Logs
6.1.2.3 Prompting the Incorrect Setting of the Shared Memory in Logs
6.1.2.4 Prompting the Failure of Opening lv_master in Logs
6.1.2.5 Incorrect Configuration File for the sybase User

6.1.2.1 Prompting Permission denied in Logs

Symptom
In the single-node cluster, the Sybase database cannot be started.
The following message is displayed in the $SYBASE/$SYBASE_ASE/install/DBSVR.log:
00:00000:00000:2004/10/10 00:03:16.63 kernel dopen: open '/dev/rdsk/c1t1d0s3', Permission denied
00:00000:00000:2004/10/10 00:03:16.63 kernel kdconfig: unable to read primary master device
00:00000:00000:2004/10/10 00:03:16.65 kernel kiconfig: read of config block failed

Possible Cause
In the preceding message, Permission denied indicates that the permissions on the file are
insufficient. As a result, the file cannot be read and the database server cannot be
started.


CAUTION
The following operations of rectifying the fault are specific only to the single server system. If
similar faults occur to the HA system, contact the local office or customer service center of
Huawei for troubleshooting.

Procedure
Step 1 Determine the user (nmsuser, sybase, root, or other names) that is used to start the Sybase. The
correct user should be sybase.

Step 2 Check the raw partition or the file that reports Permission denied in the log, and check whether
the user that is used to start the database has the permissions to access the file or raw partition (a
disk partition that has no file system on it). If the user does not have the
permissions, assign the permissions to the user.
NOTE
The device files are placed in the $SYBASE/data directory. You can change the permissions on a
device file by running the chmod 755 <device file name> command.

Step 3 Restart the database.

----End

6.1.2.2 Prompting Shared memory segment *.krg is in use in Logs

Symptom
In the single-node cluster, the Sybase database cannot be started.

The following message is displayed in the $SYBASE/$SYBASE_ASE/install/DBSVR.log:


00:00000:00000:2005/07/15 17:21:32.74 kernel Using config area from primary master device.
00:00000:00000:2005/07/15 17:21:33.01 kernel Warning: Using default file '/opt/sybase/ASE-15_0/DBSVR.cfg' since a configuration file was not specified. Specify a configuration file name in the RUNSERVER file to avoid this message.
00:00000:00000:2005/07/15 17:21:33.13 kernel os_create_keyfile: Shared memory segment /opt/sybase/ASE-15_0/DBSVR.krg is in use. Check if SQL Server is already running. If NOT remove old .srg/.krg files & restart.
00:00000:00000:2005/07/15 17:21:33.18 kernel kbcreate: couldn't get shmid for kernel region.
00:00000:00000:2005/07/15 17:21:33.18 kernel kistartup: could not create shared memory

Possible Cause
The Sybase database server was shut down improperly. Therefore, the stale DBSVR.krg and
DBSVR.srg files remain in the $SYBASE or $SYBASE/$SYBASE_ASE directory.

CAUTION
The following operations of rectifying the fault are specific only to the single server system. If
similar faults occur to the HA system, contact the local office or customer service center of
Huawei for troubleshooting.


Procedure
Step 1 Log in to the operating system as user sybase.
Step 2 Run the following commands, and check whether the DBSVR.krg and DBSVR.srg files exist
in the $SYBASE or $SYBASE/$SYBASE_ASE directory.
$ cd $SYBASE
$ ls -al
$ cd $SYBASE/$SYBASE_ASE
$ ls -al

Step 3 If the DBSVR.krg and DBSVR.srg files exist, run the following commands to delete the files.
$ rm -rf DBSVR.krg
$ rm -rf DBSVR.srg

Step 4 Restart the database.

----End
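
Step 2 can be sketched as a helper that lists any leftover files before they are deleted. The function name list_stale_region_files is hypothetical. As the log message itself advises, confirm first that the database is not already running, for example with the showserver command.

```shell
# List DBSVR.krg and DBSVR.srg files left in the given directories.
# Prints nothing when no such file exists.
list_stale_region_files() {
    for dir in "$@"; do
        for f in "$dir"/DBSVR.krg "$dir"/DBSVR.srg; do
            [ -f "$f" ] && echo "$f"
        done
    done
    return 0
}

# Hypothetical usage, as user sybase with the database stopped:
#   list_stale_region_files "$SYBASE" "$SYBASE/$SYBASE_ASE"
```

Listing the full paths first makes it clear which files the rm commands in Step 3 are meant to remove.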

6.1.2.3 Prompting the Incorrect Setting of the Shared Memory in Logs

Symptom
In the single-node cluster, the Sybase database cannot be started.
The following message is displayed in the $SYBASE/$SYBASE_ASE/install/DBSVR.log:
00:00000:00000:2005/07/20 17:07:15.41 kernel Using config area from primary master device.
00:00000:00000:2005/07/20 17:07:16.65 kernel Warning: Using default file '/opt/sybase1192/DBSVR.cfg' since a configuration file was not specified. Specify a configuration file name in the RUNSERVER file to avoid this message.
00:00000:00000:2005/07/20 17:07:17.39 kernel os_create_region: can't allocate 260775936 bytes
00:00000:00000:2005/07/20 17:07:17.42 kernel kbcreate: couldn't create kernel region.
00:00000:00000:2005/07/20 17:07:17.42 kernel kistartup: could not create shared memory

Possible Cause
The /etc/system file is not configured with correct shared memory.

CAUTION
The following operations of rectifying the fault are specific only to the single server system. If
similar faults occur to the HA system, contact the local office or customer service center of
Huawei for troubleshooting.

Procedure
Step 1 Add set shmsys:shminfo_shmmax=memory (MB) x 1024 x 1024/2 at the end of the /etc/
system file.
(1) To check the memory, run the following command as user root:
# prtdiag

The terminal displays:


NOTE
The displayed message changes according to different on-site equipment configuration.


Memory size:2GB

(2) Add set shmsys:shminfo_shmmax=memory (MB) x 1024 x 1024/2 at the end of the /etc/
system file.
For example, if the memory is 2 GB (2048 MB), the value of 2048 x 1024 x 1024/2
is 1073741824.
Then, add the following contents at the end of the /etc/system file:
set shmsys:shminfo_shmmax=1073741824
TIP

• In the case of GUI, see the methods of opening and editing a file in the Solaris Online Help.
• In the case of CLI, edit the file by running the vi command. For the specific method, see the
  commands that are commonly used on Solaris.
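
The calculation in Step 1 can be sketched as a small helper that prints the line to be added to the /etc/system file. The function names shmmax_bytes and shmmax_line are hypothetical, and the sketch assumes a shell that supports $(( )) arithmetic, such as ksh or bash.

```shell
# Compute the shmmax value in bytes as half of the physical memory,
# following the rule: memory (MB) x 1024 x 1024 / 2.
shmmax_bytes() {
    echo $(( $1 * 1024 * 1024 / 2 ))
}

# Print the complete line for the /etc/system file.
shmmax_line() {
    echo "set shmsys:shminfo_shmmax=$(shmmax_bytes "$1")"
}

# Usage: pass the memory size in MB, for example 2048 for 2 GB.
#   shmmax_line 2048
```

For 2048 MB, the helper prints set shmsys:shminfo_shmmax=1073741824, which matches the example above.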

Step 2 Restart the database.

----End

6.1.2.4 Prompting the Failure of Opening lv_master in Logs

Symptom
In the single-node cluster, the Sybase database cannot be started.

The following message is found in the $SYBASE/$SYBASE_ASE/install/DBSVR.log:


00:00000:00000:2005/07/20 17:43:43.65 kernel dopen: open '/opt/sybase/data/lv_master', No such file or directory
00:00000:00000:2005/07/20 17:43:43.65 kernel kdconfig: unable to read primary master device
00:00000:00000:2005/07/20 17:43:43.65 kernel kiconfig: read of config block failed

Possible Cause
The device file of the master database is lost.

CAUTION
The following operations of rectifying the fault are specific only to the single server system. If
similar faults occur to the HA system, contact the local office or customer service center of
Huawei for troubleshooting.

Procedure
Step 1 Back up the U2000 data to the local server. For details, see the Huawei iManager U2000
Administrator Guide.

Step 2 Reinstall the Sybase database. For details, see the Huawei iManager U2000 Installation Guide.

CAUTION
The U2000 monitoring may be interrupted during the database reinstallation. Therefore, ensure
that the database data is backed up for data restoration.


Step 3 Initialize the U2000 database. For details, see the administrator guide for the corresponding
version and solution.

CAUTION
Data may be lost during the database initialization. Therefore, ensure that the database data is
backed up before the initialization.

Step 4 Restore the U2000 database data. For details, see the administrator guide for the corresponding
version and solution.
Step 5 Restart the database.

----End

6.1.2.5 Incorrect Configuration File for the sybase User

Symptom
In the single-node cluster, the Sybase database cannot be started.
After you switch to the sybase user by running the su - sybase command and then run the
showserver command, the query result does not contain the dataserver and backupserver
processes.

Possible Cause
The configuration of the sybase user may be faulty in one of the following ways:
• The sybase user group does not exist.
• The sybase user does not exist.
• The .profile file does not exist in the home directory of the sybase user.
• The .profile file of the sybase user is incorrect.

CAUTION
The following operations of rectifying the fault are specific only to the single server system. If
similar faults occur to the HA system, contact the local office or customer service center of
Huawei for troubleshooting.

Procedure
Step 1 To check whether the sybase user group exists, run the following command as the root user:
# cat /etc/group

The terminal displays:


...... sybase::101:sybase ......

If sybase is displayed before the first : in the preceding message, it indicates that the sybase
user group exists. Otherwise, run the following command as the root user to create the sybase
user group manually:


# groupadd sybase

Step 2 To check whether the sybase user exists, run the following command as the root user:
# cat /etc/passwd

The terminal displays:


...... sybase:x:101:104::/opt/sybase:/usr/bin/bash ......

If sybase is displayed before the first : in the preceding message, it indicates that the sybase
user exists. Otherwise, run the following command as the root user to create the sybase user
manually:
# useradd -d /opt/sybase -g sybase -s /usr/bin/sh sybase

Step 3 To check whether the .profile file exists in the home directory of the sybase user, run the
following command as the root user:
# su - sybase
$ cd $HOME
$ ls -a

The terminal displays:


...... .profile ......

If the .profile file is displayed, it indicates that the .profile file exists. Otherwise, run the
following command as the root user to create the file manually:
# touch /opt/sybase/.profile

Step 4 To check whether the .profile file is correct, run the following command as the sybase user:
$ cat .profile

The terminal displays:


#!/usr/bin/sh
PS1=$
export PS1
. /opt/sybase/SYBASE.sh
LANG=C
export LANG

If the preceding information is displayed, it indicates that the .profile file is correct. Otherwise,
add the following information to the .profile file in the /opt/sybase/ directory as the root user:
#!/usr/bin/sh
PS1=$
export PS1
. /opt/sybase/SYBASE.sh
LANG=C
export LANG

Step 5 Set the owner and permissions of the /opt/sybase/ directory to the correct values.
# chmod -R 755 /opt/sybase
# chown -R sybase:sybase /opt/sybase

Step 6 Restart the database.

----End
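
The checks in Step 1 and Step 2 can be sketched as a single helper that looks for an entry by its first field. The function name has_account_entry is hypothetical, and the sketch assumes an awk that supports the -v option (on Solaris this may mean nawk or /usr/xpg4/bin/awk).

```shell
# Return success if a colon-separated account file (such as /etc/group or
# /etc/passwd) has an entry whose first field equals the given name.
has_account_entry() {
    awk -F: -v name="$2" '$1 == name { found = 1 } END { exit !found }' "$1"
}

# Hypothetical usage, as user root:
#   has_account_entry /etc/group sybase  || echo "group sybase is missing"
#   has_account_entry /etc/passwd sybase || echo "user sybase is missing"
```

Comparing the whole first field avoids a false match on names that merely contain the string sybase.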

6.1.3 Database Cannot Be Started Automatically

Symptom
In the single server system, the database cannot be started automatically after the Solaris or
SUSE Linux server is started.


Possible Cause
The Sybase database is manually started by different users. As a result, the number of
devices configuration item in the DBSVR.cfg file is reset to its default value, and the
database process cannot restart automatically.

Procedure
Step 1 Check whether the DBSVR.krg file exists in the /opt/sybase/ASE-* directory. If the file exists,
delete the file.
Step 2 Modify the DBSVR.cfg file in the /opt/sybase/ASE-* directory. Change the value of the
number of devices configuration item to 255.
Step 3 Log in to the operating system as user sybase.
Step 4 To start the database manually, run the following commands:
$ cd $SYBASE/$SYBASE_ASE/install
$ ./startserver -f ./RUN_DBSVR
$ ./startserver -f ./RUN_DBSVR_back

----End

6.1.4 Sybase Database Is Started Abnormally


This topic describes how to troubleshoot the startup exception of the Sybase database. Locate
and rectify the fault according to the log information:

Log Information                  Troubleshooting

The log indicates that the       Rectify the fault with reference to 6.1.4.1 Prompting dopen:
device file cannot be opened.    open '/opt/sybase/data/lv_LogDB_dev' in Logs.

The log indicates suspect.       Rectify the fault with reference to 6.1.4.2 Prompt suspect in
                                 Logs.

The log indicates the disk       Rectify the fault with reference to 6.1.4.3 Disk of the
allocated for the database       Database Logs Is Full.
logs is full.

In other cases.                  Contact Huawei engineers for troubleshooting.

6.1.4.1 Prompting dopen: open '/opt/sybase/data/lv_LogDB_dev' in Logs


6.1.4.2 Prompt suspect in Logs
6.1.4.3 Disk of the Database Logs Is Full


6.1.4.1 Prompting dopen: open '/opt/sybase/data/lv_LogDB_dev' in Logs

Symptom
A message is displayed in the $SYBASE/$SYBASE_ASE/install/DBSVR.log indicating that
the equipment file cannot be opened. The message displayed is as follows:
NOTE
The contents in () are explanations of the message.
00:00000:00001:2005/07/20 17:18:29.57 server Activating disk 'LogDB_dev'.
00:00000:00001:2005/07/20 17:18:29.57 kernel Initializing virtual device 13, '/opt/sybase1192/data/lv_LogDBR6'
00:00000:00001:2005/07/20 17:18:29.57 kernel dopen: open '/opt/sybase/data/lv_LogDB_dev', No such file or directory (The equipment file does not exist.)
00:00000:00001:2005/07/20 17:18:29.57 kernel udactivate: error starting virtual disk 13 (The equipment cannot be activated because the equipment file does not exist.)
......
00:00000:00001:2005/07/20 17:18:46.38 kernel udstartio: vdn 13 has not been set up (The equipment 13 is not activated.)
00:00000:00001:2005/07/20 17:18:46.40 server Error: 840, Severity: 17, State: 1 (Error code)
00:00000:00001:2005/07/20 17:18:46.40 server Device 'LogDB_dev' (with physical name '/opt/sybase1192/data/lv_LogDB_dev', and virtual device number 13) has not been correctly activated at startup time. Please contact a user with System Administrator (SA) role. (The equipment cannot be started.)
00:00000:00001:2005/07/20 17:18:46.40 server Unable to proceed with the recovery of dbid <8> because of previous errors. Continuing with the next database. (The database cannot be restored because the equipment cannot be started.)

Possible Cause
The equipment file of the database is lost. The file may have been deleted by mistake or lost
due to a power failure.

Fault Diagnosis
To find the name of the database where the fault occurs, run the following commands as user
root:
# su - sybase
$ isql -Usa -P<sa password> -SDBSVR
1> select name,status from sysdatabases
2> go

The terminal displays:

NOTE
Assume that the physical file of LogDB is deleted by mistake.
name status
------------------------------ ------
Eml_multinesvrDB 12
FaultDB 12
LogDB 76
master 0
model 0
sybsystemdb 0
sybsystemprocs 8
tempdb 12

The status value of LogDB is 76, which indicates that the physical file of LogDB is missing. In
this example, the file was deleted by mistake.
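When many databases are listed, the output can be checked with a short script. The following Python sketch parses the name and status columns printed by isql and returns them as a dictionary. It assumes the two-column layout shown in the preceding example; the function name is hypothetical.

```python
def parse_database_status(output: str) -> dict[str, int]:
    """Parse 'name status' rows printed by isql into {name: status}."""
    statuses = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            # The header row and the dashed separator are not integers
            # and are therefore skipped.
            statuses[parts[0]] = int(parts[1])
        except ValueError:
            continue
    return statuses
```

For the sample output in this topic, filtering the result for the value 76 returns LogDB only.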


Procedure
Step 1 To start the database, run the following commands as user sybase:
$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

Step 2 To log in to the database, run the following command:


$ isql -Usa -P<sa password> -SDBSVR

Step 3 Run the following commands:


1> sp_configure 'allow update', 1
2> go
1> update master..sysdatabases set status = 320 where name = 'database name'
2> go
1> select name,status from sysdatabases
2> go

In the output, if the status value of the database to be restored is 320, the setting is
successful.
Step 4 Run the following commands:
1> shutdown
2> go

Step 5 To start the database, run the following commands as user sybase:
$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

Step 6 To log in to the database, run the following command:


$ isql -Usa -P<sa password> -SDBSVR

Step 7 Run the following commands:


1> dbcc dbrepair(database name, dropdb)
2> go

Step 8 Delete the database devices.


(1) To query the names of all the database devices in the database, run the following commands:
1> select name from sysdevices
2> go

The terminal displays:

NOTE
The following takes the unexpected deletion of the physical file of LogDB as an example.
name
------------------------------
FaultDB_dev
FaultDBlog_dev
LogDB_dev
LogDBlog_dev
NAWdmNemgrDB_994_dev
NAWdmNemgrDB_994log_dev
NgwdmaNemgrDB_6154_dev
NgwdmaNemgrDB_6154log_dev
OAMSDB_dev
OAMSDBlog_dev
SchdDB_dev
SchdDBlog_dev
SecurityDB_dev
SecurityDBlog_dev
TNCOMMONDB_dev


TNCOMMONDBlog_dev
TNOTNDB_dev
TNOTNDBlog_dev
TopoDB_dev
TopoDBlog_dev
TransPerfDB_dev
TransPerfDBlog_dev
master
mcdb_dev
mcdblog_dev
sysprocsdev
tapedump1
tapedump2
tempdb_dev
tempdblog_dev

(2) Find the names of the database devices to be deleted according to the message displayed.

The prefixes of the names of the database devices to be deleted are consistent with the name
of the database to be restored. For example, the name of the database to be restored in this
case is LogDB. Then, the names of the database devices to be deleted are LogDB_dev and
LogDBlog_dev.
(3) To delete the database devices, run the following commands:
1> sp_dropdevice database device name
2> go

For example, the names of the database devices to be deleted in this case are
LogDB_dev and LogDBlog_dev. Run the following commands:
1> sp_dropdevice LogDB_dev
2> go
1> sp_dropdevice LogDBlog_dev
2> go

Step 9 Initialize the database. For the specific method, see the administrator guide for the corresponding
version and solution.

Step 10 Restore the database data. For the specific method, see the administrator guide for the
corresponding version and solution.

----End
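The device selection in Step 8 can be sketched in Python as follows. The sketch assumes the naming convention of the example in this topic, in which a database named X uses the devices X_dev and Xlog_dev. The function name is hypothetical, and the result should be checked manually before any sp_dropdevice command is run.

```python
def devices_to_drop(database: str, devices: list[str]) -> list[str]:
    """Return the devices that belong to the database to be restored.

    Assumption: the data device is '<database>_dev' and the log device
    is '<database>log_dev', as in the LogDB example.
    """
    wanted = {f"{database}_dev", f"{database}log_dev"}
    return [device for device in devices if device in wanted]
```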

Suggestion and Summary


During routine maintenance, it is recommended that you comply with the precautions for the
software and hardware operations mentioned in the suggestions on safe operations. In this way,
you can avoid database exceptions caused by incorrect operations.

6.1.4.2 Prompt suspect in Logs

Symptom
A message is displayed in the $SYBASE/$SYBASE_ASE/install/DBSVR.log indicating that
a database is marked suspect and cannot be opened. The message displayed is as follows:
00:00000:00001:2005/07/20 17:33:25.71 server Error: 926, Severity: 14, State: 1
00:00000:00001:2005/07/20 17:33:25.71 server Database 'database name' cannot be opened.
An earlier attempt at recovery marked it 'suspect'.
Check the SQL Server errorlog for information as to the cause.


Possible Cause
The log contains the keyword suspect. Generally, this fault occurs because the server is
powered off abnormally, the equipment file of the database is damaged, or the database log is
full and is not cleared in a timely manner. In these cases, you need to rectify the fault manually.

CAUTION
If the master database is marked suspect, you need to re-install the database or seek advice from
Sybase engineers.

Procedure
Step 1 Log in to the operating system as user root.
Step 2 To log in to the database as user sa , run the following commands:
# su - sybase
$ isql -Usa -P<sa password> -SDBSVR

Step 3 To reset the status of the database that is marked suspect in the log, run the following commands:
1> sp_configure 'allow update', 1
2> go
1> update master..sysdatabases set status = -32768 where name = 'database name'
2> go
1> shutdown SYB_BACKUP
2> go
1> shutdown
2> go

Step 4 To restart the database server, run the following commands:


$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

Step 5 To log in to the database as user sa , run the following command:


$ isql -Usa -P<sa password> -SDBSVR

Step 6 Run the following commands:


1> dump transaction database name with no_log
2> go
1> sp_configure 'allow update', 1
2> go
1> update master..sysdatabases set status = 12 where name = 'database name'
2> go
1> shutdown SYB_BACKUP
2> go
1> shutdown
2> go

Step 7 To restart the database server, run the following commands:


$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

Step 8 To log in to the database as user sa , run the following command:


$ isql -Usa -P<sa password> -SDBSVR

Step 9 Run the following commands:


1> use master
2> go
1> sp_dboption database name,'trunc. log on chkpt.',true
2> go
1> use database name
2> go
1> checkpoint
2> go
1> sp_configure 'allow update', 0
2> go
1> shutdown SYB_BACKUP
2> go
1> shutdown
2> go

Step 10 Run the following commands to restart the database server. Then you can restore the database.
$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

----End

6.1.4.3 Disk of the Database Logs Is Full

Symptom
The database is started abnormally.

A message is displayed in the $SYBASE/$SYBASE_ASE/install/DBSVR.log indicating that
the log space of the database is full.

Possible Cause
The possible causes that result in full log space of the database are as follows:
• The log truncation is not set.
• The database is set to a small size.

Fault Diagnosis
To find the name of the database with full log space, do as follows:
1. Ensure that the U2000 application is closed and the database is started.
2. To search for the names of all the databases, run the following commands as user root:
# su - sybase
$ isql -Usa -P<sa password> -SDBSVR
1> sp_helpdb
2> go

3. To search for the name of the database with full log space, run the following commands:
# su - sybase
$ isql -Usa -P<sa password> -SDBSVR
1> sp_helpdb database name
2> go
In the output, the number following only log free kbytes indicates the remaining
log space of the database.
4. Find the name of the database with full log space according to the message displayed.
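The remaining log space can be read from the sp_helpdb output with a short script. The following Python sketch assumes that the output contains a line of the form log only free kbytes = N (the wording only log free kbytes is also accepted); this guide does not reproduce the exact output, so verify the format on your system. The function name is hypothetical.

```python
import re


def log_free_kbytes(sp_helpdb_output: str):
    """Return the remaining log space in KB, or None if it is not reported.

    Assumption: the value is printed as '... free kbytes = <number>'.
    """
    match = re.search(
        r"(?:log only|only log) free kbytes\s*=\s*(\d+)", sp_helpdb_output
    )
    return int(match.group(1)) if match else None
```

A value of 0 or a value close to 0 suggests that the log space of that database is full.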

Procedure
Step 1 Log in to the operating system as user root.


Step 2 To log in to the database as user sa , run the following commands:


# su - sybase
$ isql -Usa -P<sa password> -SDBSVR

Step 3 To reset the status of the database whose log space is full, run the following commands:
1> sp_configure 'allow update', 1
2> go
1> update master..sysdatabases set status = -32768 where name = 'database name'
2> go
1> shutdown SYB_BACKUP
2> go
1> shutdown
2> go

Step 4 To restart the database server, run the following commands:


$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

Step 5 To log in to the database as user sa , run the following command:


$ isql -Usa -P<sa password> -SDBSVR

Step 6 Run the following commands:


1> dump transaction database name with no_log
2> go
1> sp_configure 'allow update', 1
2> go
1> update master..sysdatabases set status = 12 where name = 'database name'
2> go
1> shutdown SYB_BACKUP
2> go
1> shutdown
2> go

Step 7 To restart the database server, run the following commands:


$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

Step 8 To log in to the database as user sa , run the following command:


$ isql -Usa -P<sa password> -SDBSVR

Step 9 Run the following commands:


1> use master
2> go
1> sp_dboption database name,'trunc. log on chkpt.',true
2> go
1> use database name
2> go
1> checkpoint
2> go
1> sp_configure 'allow update', 0
2> go
1> shutdown SYB_BACKUP
2> go
1> shutdown
2> go

Step 10 Run the following commands to restart the database server. Then you can restore the database.
$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &

----End


6.2 SQL Server Database Troubleshooting


This topic describes how to troubleshoot the SQL Server database.
6.2.1 Failed to Re-install the SQL Database
6.2.2 How to Solve the Problem That an Attempt to Log In to the SQL Server Fails After the
Windows Password Is Changed
6.2.3 Initializing the Database Fails
6.2.4 Backing up the Database Fails

6.2.1 Failed to Re-install the SQL Database


Symptom
Re-installing the SQL server fails.

Possible Cause
The possible causes that result in the database re-installation failure are as follows:
• The path where the installation software package is located contains spaces, punctuation
marks, or Chinese characters.
• The path where the database is to be installed contains spaces, punctuation marks, or
Chinese characters.
• The database is uninstalled incompletely. Therefore, junk files exist.
• The registry information is faulty or deleted incompletely.
• The computer is infected by viruses.

Procedure
Step 1 Ensure that the following paths do not contain spaces, punctuation marks, or Chinese characters:
• The path where the installation software package is located
• The path where the database is to be installed

Step 2 Ensure that the database is uninstalled completely according to the following method:
NOTE
The Microsoft SQL Server 2000 is considered as an example.
(1) Stop the database server and exit the database service manager before
uninstalling the Microsoft SQL Server 2000.
(2) Click Start and choose Control Panel. The Control Panel window is displayed.
(3) Double-click the Add or Remove Programs icon. The Add or Remove Programs
window is displayed.
(4) Select Microsoft SQL Server 2000, and then click Change/Remove.
(5) Click Yes. A progress bar is displayed.
(6) Perform the remaining operations according to the prompts.
(7) Delete the MSSQL2000 folder in the installation directory of the database.
(8) Delete the Microsoft SQL Server folder in the Program Files folder that is placed in the
installation directory of the operating system.
(9) Delete the MSDesigners7 and MSDesigners98 folders in the Program Files\Common
Files\Microsoft Shared directory that is in the installation directory of the operating
system.
(10) Delete the following registry information.
TIP
For the method of opening the registry, see the Windows Online Help.

a. HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server
b. HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQLServer
c. HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Updates\SQLServer 2000
d. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSSQLServer
e. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SQLSERVERAGENT
f. HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSSQLServerADH

Step 3 After the preceding operations are performed, restart the operating system.

Step 4 Ensure that the registry does not contain the PendingFileRenameOperations key value.
TIP
For the method of opening the registry, see the Windows Online Help.

Step 5 Re-install the database.

Step 6 If the database re-installation fails, the computer may be infected with viruses. Check for and
remove the viruses by using the anti-virus software.

Step 7 If the preceding procedure does not work, contact Huawei technical support personnel.

----End
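The path check in Step 1 can be sketched in Python as follows. This guide does not list the characters that are permitted, so the sketch makes a conservative assumption: a path is accepted only if it consists of a drive letter followed by components that contain ASCII letters, digits, and underscores. The function name is hypothetical.

```python
import re

# Assumption: drive letter, colon, then backslash-separated components made
# of ASCII letters, digits, and underscores. Spaces, other punctuation, and
# non-ASCII characters such as Chinese characters are rejected.
_SAFE_PATH = re.compile(r"[A-Za-z]:(\\[A-Za-z0-9_]+)*\\?")


def is_safe_install_path(path: str) -> bool:
    """Return True if the path contains no spaces, punctuation, or Chinese characters."""
    return _SAFE_PATH.fullmatch(path) is not None
```

Run the check on both the path of the installation software package and the path where the database is to be installed.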

6.2.2 How to Solve the Problem That an Attempt to Log In to the SQL Server Fails After the Windows Password Is Changed

Symptom
After the Windows password is changed, an attempt to log in to the SQL Server fails.

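Possible Cause
The MSSQLSERVER and SQLSERVERAGENT services log on with the Windows account, and the
password saved on the Log On tab of each service is still the old one.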

Procedure
Step 1 Choose Start > Administrative Tools > Services.

Step 2 In the SQL Server services automatically started by Windows, right-click MSSQLSERVER,
and then choose Properties. Click the Log On tab, and change the password to the new one.


Step 3 In the SQL Server services automatically started by Windows, right-click
SQLSERVERAGENT, and then choose Properties. Click the Log On tab, and change
the password to the new one.

Step 4 In the service manager of SQL Server, start the SQL Server and SQL Server Agent services.

----End

6.2.3 Initializing the Database Fails


This topic describes how to troubleshoot the database initialization failure. On Windows, locate
and rectify the fault according to the system prompts or log information:

Current Symptom | Troubleshooting
If prompts are displayed in the DOS window, locate the fault according to the prompts. | If the following information is displayed, rectify the fault with reference to the corresponding solutions: 6.2.3.1 System Prompts login database failure; 6.2.3.4 System Prompts Incorrect Parameter of Java Virtual Machine
If no prompt is displayed, locate the fault by querying the log information in the nms\server\database\log file. | If the following information is displayed, rectify the fault with reference to the corresponding solutions: 6.2.3.2 Prompt Failed to open the database 'U2000DB'Failed to open the database 'VSMDB' in Logs; 6.2.3.3 Prompt Cannot insert duplicate key in object 'TrailServiceType' in Logs
In other cases. | Contact Huawei engineers for troubleshooting.

6.2.3.1 System Prompts login database failure


6.2.3.2 Prompt Failed to open the database 'U2000DB'Failed to open the database 'VSMDB' in
Logs
6.2.3.3 Prompt Cannot insert duplicate key in object 'TrailServiceType' in Logs
6.2.3.4 System Prompts Incorrect Parameter of Java Virtual Machine

6.2.3.1 System Prompts login database failure

Symptom
On Windows, when the U2000 database is initialized, a message is displayed indicating login
database failure.

Possible Cause
The possible causes that result in the database login failure are as follows:
• The alias of the database server is set incorrectly or is not set.
• The ODBC data source is configured incorrectly or is not configured.
• The database is not started.

Procedure
Step 1 Check whether the database is started. If not, start it manually.
(1) Double-click the database icon on the taskbar of Windows. The SQL Server Service
Manager window is displayed.
(2) Check whether the database server is started.

If Start/Continue is grayed out, it indicates that the database is already started. Otherwise,
click Start/Continue to start the database server.

Step 2 Check for and rectify the alias of the database server.
(1) Click Start and then choose Programs > Microsoft SQL Server > Client Network
Utility. On the Alias tab page, view the alias of the database server.
The Server alias should be DBSVR.
(2) Initialize the database again.
If the message indicating login database failure is displayed again, the ODBC data source
may be missing or configured incorrectly.

Step 3 Check for and restore the configuration of the ODBC data source.
(1) Choose Control Panel > Administrative Tools > Data Sources (ODBC).

(2) On the System DSN tab page, view the configuration of U2000DBServer.

• If U2000DBServer already exists, select U2000DBServer and then click Configure
to view the configuration items.
• If U2000DBServer does not exist, click Add to add U2000DBServer.
NOTE
Adding the U2000DBServer is considered as an example.
(3) On the System DSN tab page, click Add. In the Create New Data Source dialog box that
is displayed, select SQL Server.

(4) Click Finish. In the Microsoft SQL Server Configuration dialog box displayed, enter the
following information:


(5) Click Next. In the Microsoft SQL Server Configuration dialog box displayed, set the
parameters as follows:
• Select With SQL Server authentication using a login ID and password entered by the
user. and select the Connect to SQL Server to obtain default setting for the additional
configuration options. check box. The Login ID and Password fields are available only
with SQL Server authentication.
• In the Login ID field, enter the database user name sa. The Password is null by default. If a
password is set, enter the password.


(6) Click Next. In the dialog box displayed, select Change the default database to: and then
select master from the drop-down list.
(7) Click Next. In the dialog box displayed, the default settings are recommended.

(8) Click Finish. Then, ODBC Microsoft SQL Setup is displayed.


(9) Click Test Data Source.... Then, observe the information displayed on the screen. If TEST
COMPLETED SUCCESSFULLY! is displayed, the U2000 application and the database
server are connected.
(10) Initialize the database again.

----End

6.2.3.2 Prompt Failed to open the database 'U2000DB'Failed to open the database
'VSMDB' in Logs

Symptom
Database initialization fails, and the following message is found in the logs in the
nms\server\database\log directory:
2008-08-06_10:27:51(DBConnectionManager.getSingleConnection) finish to getSingleConnection
2008-08-06_10:27:51(CMSSQLConfig.mssqlSetDBOwner) Begin to set database U2000DB's owner to U2000user
2008-08-06_10:27:51(CMSSQLConfig.mssqlSetDBOwner) ERROR:Set database U2000DB's owner to U2000user failed
2008-08-06_10:27:51(CMSSQLConfig.mssqlSetDBOwner) ERROR:java.sql.SQLException: [Microsoft][ODBC SQL Server Driver][SQL Server] Failed to open the database 'U2000DB', because the file cannot be accessed, or the memory or the disk space is insufficient. For details, see the SQL Server error logs.
......

Possible Cause
Certain database files were deleted or the disk space is insufficient.

Procedure
Step 1 Check the disk space. You can locate and rectify the fault with reference to 5.1.5 Operation
Anomaly Caused by Insufficient Disk Space.
Step 2 To delete the database manually, run the following commands:
> isql -Usa -P<sa password> -SDBSVR
1> drop database database name
2> go

For example, to delete the U2000DB database, run the following commands:


> isql -Usa -P<sa password> -SDBSVR
1> drop database U2000DB
2> go

Step 3 Initialize the database again.


----End
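To find out which databases must be deleted in Step 2, the initialization log can be searched for the message quoted in the symptom. The following Python sketch extracts the database names from the text Failed to open the database 'name'. The function name is hypothetical, and the sketch only reads the log. It does not run any drop database command.

```python
import re


def unopenable_databases(log_text: str) -> list[str]:
    """Return the names of databases reported as 'Failed to open' in the log."""
    # \s+ also matches a line break, because long log lines may be wrapped.
    names = re.findall(r"Failed to open the database\s+'([^']+)'", log_text)
    return sorted(set(names))
```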

6.2.3.3 Prompt Cannot insert duplicate key in object 'TrailServiceType' in Logs

Symptom
Database initialization fails, and the following message is found in the logs in the
U2000\server\database\log directory:
2008-04-02_18:20:11(CServerConfig.RunCommand) ERROR:Execute command failed
2008-04-02_18:20:11(CServerConfig.RunCommand) ERROR:java.lang.Exception: MSSQL bcp executes failed
2008-04-02_18:20:11(CServerConfig.LoadDataTable) ERROR:Load data to U2000DB.TrailServiceType from D:\U2000\server\database/staticdata/chinese\TrailServiceType.dat failed
2008-04-02_18:20:11(CServerConfig.LoadDataTable) ERROR:java.lang.Exception: Failed to import the static data.
2008-04-02_18:20:11(CServerConfigManagement.loadAllStaticDatatable) ERROR:load static data failed
2008-04-02_18:20:11(CServerConfigManagement.loadAllStaticDatatable) ERROR:java.lang.Exception: Failed to import the static data .
2008-04-02_18:20:11(CServerConfigManagement.InitializeDatabase) ERROR:Initialize database failed
2008-04-02_18:20:11(CServerConfigManagement.InitializeDatabase) ERROR:java.lang.Exception: Failed to import the static data.
2008-04-02_18:20:11(CServerConfigManagement.InitializeDatabase) ERROR:Error Message is Starting copy...
SQLState = 23000, NativeError = 2627
Error = [Microsoft][ODBC SQL Server Driver][SQL Server]Violation of UNIQUE KEY constraint 'UQ__TrailServiceType__114A936A'. Cannot insert duplicate key in object 'TrailServiceType'.
SQLState = 01000, NativeError = 3621
Warning = [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated.
BCP copy in failed

Possible Cause
The character set used by the Microsoft SQL Server database is not Chinese, while that used by
the U2000 is Chinese.


Procedure
Step 1 To check the character set used by the database, run the following commands at the
command prompt:
> isql -Usa -P<sa password> -SDBSVR
1> sp_helpsort
2> go

The terminal displays:

Server default collation
Chinese-PRC, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive

NOTE
If Chinese-PRC is displayed, it indicates that the character set used by the database is Chinese. Otherwise, the
database needs to be installed again.

Step 2 Initialize the database again.

----End
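The check in Step 1 can be sketched in Python as follows. The sketch assumes that the sp_helpsort output has the layout shown in this topic, that is, a Server default collation line followed by the collation description. The function name is hypothetical.

```python
def uses_chinese_prc(sp_helpsort_output: str) -> bool:
    """Return True if the server default collation starts with 'Chinese-PRC'."""
    lines = [line.strip() for line in sp_helpsort_output.splitlines() if line.strip()]
    for index, line in enumerate(lines):
        if line.startswith("Server default collation"):
            # The collation description is expected on the next non-empty line.
            following = lines[index + 1] if index + 1 < len(lines) else ""
            return following.startswith("Chinese-PRC")
    return False
```

If the function returns False, the database needs to be installed again, as described in the NOTE of Step 1.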

6.2.3.4 System Prompts Incorrect Parameter of Java Virtual Machine

Symptom
Database initialization fails. Check the logs in the vsm\server\database\log directory and the
following message is found:

Possible Cause
The symbol \ exists at the end of the value of the IMAP environment variable.

Procedure
Step 1 Check for and restore the IMAP environment variable. For details, see 7.1.4 U2000
Environment Variable Is Set Incorrectly.
Step 2 Initialize the database again.

----End
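The possible cause can be verified before the database is initialized again. The following Python sketch reports whether the value of the IMAP environment variable ends with a backslash. The function name is hypothetical, and the optional environ parameter exists only so that the check can be tried with sample values.

```python
import os


def imap_has_trailing_backslash(environ=None) -> bool:
    """Return True if the IMAP environment variable ends with a backslash."""
    env = os.environ if environ is None else environ
    # A missing variable is treated as an empty value, which has no backslash.
    return env.get("IMAP", "").rstrip().endswith("\\")
```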

6.2.4 Backing up the Database Fails


Symptom
Backup files are not found in the specified path.


Possible Cause
The possible causes that result in the database backup failure are as follows:
• The database is not started.
• The disk space is full.

Procedure
Step 1 Ensure that the database is started.

If the database icon in the Windows taskbar is displayed as , it indicates that the database is
started.
Step 2 Check the disk space. For details, see 5.1.5 Operation Anomaly Caused by Insufficient Disk
Space.

----End


7 U2000 Server Troubleshooting

About This Chapter

This topic describes how to troubleshoot the U2000 server.


7.1 Starting the U2000 Server Fails
7.2 Abnormal NMS Functions Due to Modified OS Time
7.3 U2000 Runs Slowly


7.1 Starting the U2000 Server Fails


Starting the U2000 server fails or certain processes of the U2000 are started repeatedly. On
Solaris in the single server system, locate and rectify the fault according to the following
sequence:

Sequence | Problem Location | Troubleshooting
1 | Judge whether the fault is caused by the U2000 coredump. | Rectify the fault with reference to 7.1.1 Abnormal Termination of the Server Application.
2 | Locate and rectify the fault according to the following system prompts. | Locate and rectify the fault according to the following system prompts: 7.1.2 System Prompting Connection Failure to the Database; 7.1.3 Prompting Invalid License; 7.1.4 U2000 Environment Variable Is Set Incorrectly
3 | Restarting the U2000 server fails. | Contact Huawei engineers for troubleshooting.

7.1.1 Abnormal Termination of the Server Application


7.1.2 System Prompting Connection Failure to the Database
7.1.3 Prompting Invalid License
7.1.4 U2000 Environment Variable Is Set Incorrectly
7.1.5 Startup Failure Because of the Authority Problem of the U2000 Installation Path
7.1.6 Certain Processes of the U2000 Server Fail to Start

7.1.1 Abnormal Termination of the Server Application

Symptom
The U2000 server application is terminated abnormally.

Possible Cause
The problem may be caused by the U2000 core dump.

Procedure
Step 1 Check whether any file whose name starts with core. exists in the following directories.
On UNIX:
• /opt/U2000
• /opt/U2000/server
• /opt/U2000/server/bin
On Windows:
• D:\U2000
• D:\U2000\server
• D:\U2000\server\bin
NOTE
• In the case of the UNIX OS, the installation of the U2000 in the /opt/U2000 path is taken as an example.
• In the case of the Windows OS, the installation of the U2000 in the D:\U2000 path is taken as an
example.

Step 2 Collect the U2000 core dump file.


Step 3 Send the collected core dump file to Huawei engineers for troubleshooting.

----End
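The search in Step 1 can be sketched in Python as follows. The sketch looks for files whose names start with core. in the installation directory and in its server and server/bin subdirectories. The function name is hypothetical; pass the actual installation path, for example /opt/U2000.

```python
from pathlib import Path


def find_core_files(install_dir: str) -> list[Path]:
    """Return files named core.* in install_dir, server, and server/bin."""
    root = Path(install_dir)
    found = []
    for directory in (root, root / "server", root / "server" / "bin"):
        if directory.is_dir():
            # Only regular files are collected; directories are ignored.
            found.extend(sorted(p for p in directory.glob("core.*") if p.is_file()))
    return found
```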

7.1.2 System Prompting Connection Failure to the Database


Symptom
A message is displayed indicating that connecting to the database fails. In addition, the U2000
server cannot be started.

Possible Cause
• The database is not started.
• The communication connection between the database and the server is set improperly.
• The database password was modified improperly, which damaged the configuration
file.
• Other problems regarding the database occur.

Procedure
• Check whether the database is started. If the database is not started, start the database
manually.
Check and start the database on Windows according to the following procedure:
1. Double-click the database icon on the Windows taskbar.
The SQL Server Service Manager dialog box is displayed.
2. Check whether the database server is started.
– If the Start/Continue option is grayed, it indicates that the database is started.
– If the database is not started, click Start/Continue to start the database server.
NOTE
In the dialog box that is displayed, select the Auto-start service when OS starts option.

Check and start the database on Solaris according to the following procedure:


1. Log in to the operating system as user sybase.


NOTE
If you log in to the operating system as user sybase for the first time, a message is displayed
asking you to set the password. For the system security, periodically change the password of
user sybase by running the passwd sybase command. The password must contain a minimum
of eight characters.
2. To check whether the database is started, run the following commands:
$ cd $SYBASE/$SYBASE_ASE/install
$ ./showserver
Check whether the dataserver and backupserver processes are running. If these two
processes do not exist, it indicates that the database process is not started. Start the
database according to the following procedure:
3. To start the database, run the following commands:
$ cd /opt/sybase/ASE-*/install
$ ./startserver -f ./RUN_DBSVR &
$ ./startserver -f ./RUN_DBSVR_back &


4. To check whether the database process is running, run the following commands:
$ cd $SYBASE/$SYBASE_ASE/install
$ ./showserver
Check whether the dataserver and backupserver processes are running. If these two
processes do not exist, it indicates that the database process is not started. If the
database cannot be started, rectify the database fault with reference to 6.1.3 Database
Cannot Be Started Automatically.
• Check the communication connection between the U2000 and database.
Check and restore the communication connection on Solaris according to the following
procedure:
1. To log in to the Sybase, run the following commands:
$ cd /opt/sybase/OCS-*/bin
$ ./isql -SDBSVR -Usa -P<sa password>
If the 1> prompt is displayed, communication between the U2000 and the database is
normal. Enter quit to exit the Sybase. If the prompt is not displayed, locate the cause
of the connection failure according to the log information and then rectify the
fault.
• The database password was modified improperly, which damaged the configuration file.
Re-set the database password. For details, see Managing a Database User in the Huawei
iManager U2000 Administrator Guide.
• Other exceptions regarding the database.

----End
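
The process check in the procedure above can be sketched as a small shell function. This is only an illustration, assuming a POSIX shell; the function name check_sybase_processes is hypothetical and is not part of the U2000 or Sybase tools. It reads a process listing on standard input and reports whether both the dataserver and backupserver processes appear in it.

```shell
# Hypothetical helper (not shipped with the U2000 or Sybase).
# Reads a process listing on stdin and reports whether the two processes
# named in the procedure, dataserver and backupserver, both appear in it.
check_sybase_processes() {
    listing=$(cat)
    missing=""
    for proc in dataserver backupserver; do
        # Record every expected process name that is absent from the listing.
        printf '%s\n' "$listing" | grep -q "$proc" || missing="$missing $proc"
    done
    if [ -z "$missing" ]; then
        echo "running"
    else
        echo "missing:$missing"
    fi
}
```

A possible use, as user sybase in the $SYBASE/$SYBASE_ASE/install directory, is ./showserver | check_sybase_processes. If the result lists a missing process, start the database as described in the procedure.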

7.1.3 Prompting Invalid License


Symptom
A message is displayed indicating that the license of the U2000 is invalid. In this case, the
U2000 cannot be started or certain functions cannot be used.


Possible Cause
• If the U2000 cannot start or certain functions cannot be used, the possible cause is that the
license item is incorrect.
• If the time setting of the OS is incorrect, the license may also be invalid.

Procedure
• Check for and rectify the fault on Solaris according to the following precautions:
1. Ensure that the date of the OS is the current date.
2. Ensure that only one license file exists in the /opt/U2000/server/license directory.
If more than one license file exists in the directory, delete the redundant license
files manually.
3. The MAC address in the license file must be the same as the MAC address of the NIC
that is actually used on the server.
If the MAC addresses are different, you need to apply for a new license.
4. The license file must be transferred in the ASCII format.
TIP
You can check the license file by running the vi command. If each line of the license file ends
with the ^M symbol, it indicates that the license file is uploaded in binary mode. You need to
re-upload the license file.
5. Check whether the authority of the U2000 is correct.
6. The license file must comply with the U2000 version.
• Check for and rectify the fault on Windows according to the following precautions:
NOTE
Suppose that the U2000 is installed in the D:\U2000 directory.
1. Ensure that the date of the OS is the current date.
2. Ensure that only one license file exists in the D:\U2000\server\license directory.
If more than one license file exists in the directory, delete the redundant license
files manually.
3. The MAC address in the license file must be the same as the MAC address of the NIC
that is actually used on the server.
If the MAC addresses are different, you need to apply for a new license.
4. The license file must comply with the U2000 version.

----End
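
The ^M check described in the TIP for Solaris can also be scripted. The following is a minimal sketch, assuming a POSIX shell and a grep that supports the -q option; the function name license_line_endings is hypothetical. It reports whether any line of a file ends with a carriage return, which vi shows as ^M.

```shell
# Hypothetical helper: prints "crlf" if any line of the given file ends with
# a carriage return (displayed as ^M in vi), otherwise prints "clean".
license_line_endings() {
    # Command substitution keeps the carriage return; only newlines are stripped.
    cr=$(printf '\r')
    if grep -q "${cr}\$" "$1"; then
        echo "crlf"
    else
        echo "clean"
    fi
}
```

For example, run license_line_endings on the license file in the /opt/U2000/server/license directory. A result of crlf suggests that the file needs to be uploaded again in ASCII mode.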

Suggestion and Summary


Do not modify the license file. Any modification to the license file may invalidate the
license.

7.1.4 U2000 Environment Variable Is Set Incorrectly

Symptom
A message is displayed indicating that the environment variable of the U2000 is set incorrectly.


Possible Cause
The environment variable is lost or modified.

Procedure
Step 1 Check the environment variable of the U2000.
• On Windows, right-click My Computer on the desktop and choose Properties from the
shortcut menu. On the Advanced tab page, click Environment variable to query the value
of IMAP.
• On Solaris, run the echo $IMAP command as user nmsuser to query the value of IMAP.

Step 2 Check and rectify the environment variable of the U2000.


NOTE
For the U2000 with other versions, see the installation guide for the corresponding version and solution.
• On Windows: Assume that the U2000 is installed in the D:\U2000 directory. Then, the value of
IMAP must be D:\U2000\server\conf. If the value is different, re-set the environment variable
of the U2000 manually.
• On Solaris: Assume that the U2000 is installed in the /opt/U2000 directory. Then, the value
of $IMAP is /opt/U2000/server/conf by default. If the value is different, re-set the environment
variable of the U2000 by running the IMAP=/opt/U2000/server/conf;export IMAP
command as user nmsuser.

----End
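
The Solaris check in Step 1 and Step 2 can be sketched as a shell function. This is an illustration only, assuming a POSIX shell and the default installation path /opt/U2000 given above; the function name check_imap is hypothetical.

```shell
# Hypothetical helper: compares the IMAP environment variable with the
# expected value (default: the path for an installation in /opt/U2000).
check_imap() {
    expected=${1:-/opt/U2000/server/conf}
    if [ "${IMAP:-}" = "$expected" ]; then
        echo "ok"
    else
        echo "reset needed"
    fi
}
```

If the function prints reset needed, re-set the variable as user nmsuser by running the IMAP=/opt/U2000/server/conf;export IMAP command from Step 2.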

7.1.5 Startup Failure Because of the Authority Problem of the U2000 Installation Path

Symptom
After the U2000 workstation is restarted, the U2000 services fail to be started.

Possible Cause
The owner of the U2000 installation path is incorrect. You can change the owner of the
U2000 installation path to solve this problem.

Procedure
Step 1 Log in to the UNIX OS as user root.
Step 2 To change the owner of the U2000 installation path to nmsuser, run the following commands:
# cd /opt
# chown -R nmsuser U2000

Step 3 To change the owner of the EmfSecuDm process to root, run the following commands:

# cd /opt/U2000/server/bin
# chown -R root EmfSecuDm

Step 4 Restart the U2000.

----End


7.1.6 Certain Processes of the U2000 Server Fail to Start


Symptom
Through the Sysmonitor client, you can find that certain processes of the U2000 server fail to
start.

Possible Cause
These processes were previously started as user root and then exited abnormally before the
current startup attempt.

Procedure
Step 1 Start the processes as user root, and then stop them normally.

Step 2 Restart the OS.

----End

Suggestion and Summary


It is recommended that you start or stop the U2000 as user nmsuser.

7.2 Abnormal NMS Functions Due to Modified OS Time


Symptom
Modifying the OS time causes certain NMS functions to run abnormally.

Possible Cause
If the system time of the server is modified while the NMS is running, the whole system looks
normal. Some functions based on timer principles, however, may be affected, such as the
scheduled dump function of the security Daemon.

Procedure
• Shut down the NMS and the database, and then restart the server.
NOTE
Set the correct system time of the server when installing the NMS. Never modify it while the NMS
is running. If needed, first exit the NMS server, then modify the system time and restart the NMS
server.

----End

7.3 U2000 Runs Slowly


Response to certain operations on the U2000 is slow. For example, opening or closing a window
takes more than three seconds.
Locate and rectify the fault according to the following sequence:


Sequence 1
Problem Location: Check whether the number of non-gateway NEs managed by the gateway NE
exceeds the limit. Generally, each gateway NE is recommended to support a maximum of 50
non-gateway NEs (including the non-gateway NEs that use the extended ECC to connect to the
gateway NE). If the number of non-gateway NEs exceeds 60, it is recommended that the number
of gateway NEs be increased. Otherwise, ECC congestion may occur easily, which causes slow
response to operations in the user interface.
Troubleshooting: Contact Huawei engineers for network division, ECC reconstruction, and DCN
reconstruction.

Sequence 2
Problem Location: Check whether a large number of abnormal events are reported to the U2000.
Troubleshooting: Rectify the fault according to the abnormal events.

Sequence 3
Problem Location: Check whether the communication between the U2000 and gateway NEs is
normal. If a large packet loss ratio (such as 40% or above) exists in the network, the data
packets need to be retransmitted. In this case, the response speed to the commands that are
delivered to the transmission equipment by the U2000 is greatly affected. Therefore, the
response to the operations in the user interface is slow.
Troubleshooting: Restore the communication connection between the U2000 and gateway NEs. You
can rectify the fault with reference to 4.3 Failed to Connect the U2000 Server and NE.

Sequence 4
Problem Location: Check whether the operating system is normal. If the operating system runs
at a low speed, crashes, or is restarted frequently, the problem may be caused by exceptions
of the operating system.
Troubleshooting: If the operating system runs abnormally, rectify the fault with reference to
5.1.1 Starting the Operating System Fails.

Sequence 5
Problem Location: Check whether the disk usage exceeds the limit. Normally, the disk space
occupancy should be 80% or below.
Troubleshooting: If the disk space usage exceeds the normal value, rectify the fault with
reference to 5.1.5 Operation Anomaly Caused by Insufficient Disk Space.

Sequence 6
Problem Location: Check the hardware performance of the U2000 server.
Troubleshooting: Rectify the fault with reference to 5.1.6 Slow Running of the System Caused
by Insufficient Memory and 5.1.7 Slow Running of the System Caused by High CPU Usage.

Sequence 7
Problem Location: The preceding measures do not work.
Troubleshooting: Contact Huawei engineers for troubleshooting.
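
The disk usage check in sequence 5 can be sketched on a UNIX server as follows. This is an illustration only, assuming a POSIX shell and df -k output whose fifth column is the capacity percentage and whose sixth column is the mount point; the function name filesystems_over is hypothetical. On Solaris, an awk that supports the -v option (for example nawk) may be required.

```shell
# Hypothetical helper: reads `df -k` style output on stdin and prints the
# mount point of every file system whose capacity column exceeds the limit
# (default 80, the normal upper bound stated in the table).
filesystems_over() {
    limit=${1:-80}
    awk -v limit="$limit" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)          # strip the percent sign from the column
        if (pct + 0 > limit) print $6
    }'
}
```

A possible use is df -k | filesystems_over 80. Any mount point that is printed exceeds the 80% occupancy that the table treats as normal.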


8 Faults of the U2000 Client

About This Chapter

This topic describes how to troubleshoot the faults of the U2000 client.
8.1 Starting the U2000 Client Fails
8.2 U2000 Client Login Failure
8.3 U2000 Client Runs Abnormally
8.4 Main Menu or Icons Cannot Be Loaded in the U2000 Client Window
8.5 The NE Manager GUI of Certain Equipment Is Displayed Abnormally on the U2000 Client


8.1 Starting the U2000 Client Fails


Symptom
A user double-clicks the shortcut icon of the U2000 client, but the login interface is not
displayed.

Possible Cause
The possible causes that result in the U2000 client startup failure are as follows:
• The files of the operating system and client are abnormal.
• The shortcut icon on the desktop is not updated after upgrade.
• The virtual memory is not set. This may be caused by improper installation of the U2000
client.

Procedure
Step 1 If a prompt is displayed, locate and rectify the fault according to the prompt information.

Step 2 Uninstall the U2000 client and then install it again. For details, see the
Huawei iManager U2000 Client Installation Guide.

----End

8.2 U2000 Client Login Failure


Symptom
The U2000 client fails to log in to the U2000 server after the user name and password are entered
in the login interface.

Possible Cause
The possible causes that result in the U2000 client login failure are as follows:
• The U2000 server is faulty.
• When the server is installed in the Windows OS, the ODBC data source is configured
incorrectly or not configured on the U2000 server.
• When the server is installed in the Windows OS, the database dynamic port setting on the
U2000 is incorrect.
• The network between the client and server is faulty.
• The version of the client is inconsistent with that of the server.
• The communication protocol used by the client is inconsistent with that used by the server.
• The user that logs in to the client is locked. This may be caused by a number of failed login
attempts.
• The number of clients that log in exceeds the number of clients allowed in the license.
• The setting of the system time of the client is incorrect.

Procedure
Step 1 If a prompt is displayed, locate and rectify the fault according to the prompt information.

Step 2 Choose Help > About on the U2000 server to check the number of clients allowed in the license.
If the number of clients to log in exceeds the maximum number of clients allowed in the license,
apply for a new license and update the U2000 license. For details, see the method in the
installation guide for the corresponding version and solution.

Step 3 If the U2000 server is installed in the Windows OS, check and restore the ODBC data source
settings on the U2000. For details, see Step 3 in 6.2.3.1 System Prompts login database
failure.

Step 4 If the U2000 server is installed in the Windows OS, do as follows to modify the dynamic port
number on the U2000:
(1) Choose Start > All Programs > Microsoft SQL Server > Client Network Utility.
(2) Check whether the dynamic port number is 1433 on the Alias tab page. If not, change the
value to 1433.

Step 5 Check whether the versions of the client and server are consistent. If the versions are inconsistent,
replace the client with a version that is consistent with the server version, and then log in to the
client again.

Step 6 Check whether the communication protocols used by the client and the server are consistent. If
the protocols are inconsistent, modify the protocols so that the protocols are consistent.
TIP
Log in to the Sysmonitor Client on the server, and choose System > Communication Settings.... In the
dialog box displayed, view the communication mode of the server.

Step 7 Check the network between the client and server.

Generally, the communication bandwidth between the client and server should be at least
2 Mbit/s and the packet loss ratio should be smaller than 0.1%.
• To check the network between the client and server, run the following command on
Windows:
> ping -t <IP address of the NMS>

• To check the network between the client and server, run the following command on Solaris:
# ping -s <IP address of the NMS>

Step 8 Check whether the client access control is set on the server.
On the U2000 server, you can set the client IP addresses that are allowed to access the server.
If the IP address of a client is not in the permitted range, the client cannot access the server.
For details, see the administrator guide for the corresponding version and solution.

Step 9 Check whether the user is locked. If the number of failed login attempts by the same user
exceeds 3, the user is locked.
The user can log in to the client again in 30 minutes (default), or another user that has the
authority, such as user admin, can unlock the user.

Step 10 Check whether the system time is the current time. If not, modify the system time.

----End
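
The packet loss criterion in Step 7 can be checked against the summary that ping prints when it stops. The following is a minimal sketch, assuming a POSIX shell and a summary line of the form "N% packet loss" (the Windows ping summary is worded differently and is not handled); the function names packet_loss_percent and loss_verdict are hypothetical.

```shell
# Hypothetical helper: extracts the packet loss percentage from ping output
# read on stdin. Assumes a summary line containing "N% packet loss".
packet_loss_percent() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical helper: prints "ok" if the loss is smaller than the limit
# (default 0.1, the value given in Step 7), otherwise prints "high".
loss_verdict() {
    awk -v loss="$1" -v limit="${2:-0.1}" 'BEGIN {
        if (loss + 0 < limit + 0) print "ok"; else print "high"
    }'
}
```

For example, save the ping output to a file, pass the file to packet_loss_percent on standard input, and pass the resulting number to loss_verdict.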


8.3 U2000 Client Runs Abnormally


Symptom
The U2000 client restarts repeatedly and operations are interrupted.

Possible Cause
The computer may be infected with viruses.

Fault Diagnosis
Check for and remove the viruses.

8.4 Main Menu or Icons Cannot Be Loaded in the U2000 Client Window

Symptom
In the U2000 client window, the main menu, toolbar, and icons are displayed abnormally.

Possible Cause
Operations are performed improperly or an abnormality occurs during the installation or upgrade
of the U2000 client. As a result, the index files of earlier versions are not normally cleared.

Procedure
Step 1 Shut down the U2000 client.

Step 2 Delete all the files in the following path on the U2000 client.

In Windows OS, the path is D:\U2000\client\configuration.

In UNIX OS, the path is /opt/U2000/client/configuration.

After you restart the U2000 client, the user configuration file is automatically generated.

----End

8.5 The NE Manager GUI of Certain Equipment Is Displayed Abnormally on the U2000 Client

Symptom
On the U2000 client, the NE manager GUI of certain equipment is grayed out or displayed
abnormally.


Possible Cause
For the NE manager of certain equipment such as the equipment of the PTN series, RTN series,
NG WDM series, and SLM 3160 series, the browser settings result in abnormal display of the
GUI.

Procedure
Step 1 Check whether the browser settings comply with the standards. For the Windows OS, the default
browser needs to be Microsoft Internet Explorer; for the Solaris OS, the default browser needs to
be the Mozilla browser.
Step 2 Check the security settings of Internet Explorer in the Windows OS. If the security level of
Internet Explorer is set to High, the running of scripts is affected and the GUI becomes grayed
out. To make the GUI display normally, you need to set the security level of Internet Explorer to
Medium or a lower level. In the Windows 2003 OS, the Internet Explorer Enhanced Security
Configuration function is installed by default. This function causes the security level to remain
high. Therefore, you need to cancel the function as follows:
(1) Choose Start > Control Panel. The Control Panel dialog box is displayed.
(2) Double-click the Add or Remove Programs icon. The Add or Remove Programs dialog
box is displayed.
(3) Click the Add/Remove Windows Components icon. The Windows Components
Wizard is displayed.
(4) Clear the check box to the left of Internet Explorer Enhanced Security
Configuration.
NOTE
By default, the check box is selected, which indicates that the security level of Internet Explorer
is high.
(5) Click Next.
(6) Click Finish.
(7) Double-click the Internet Explorer icon on the desktop to open Internet Explorer.
(8) Choose Tools > Internet Options.
(9) In the Internet Options dialog box, click the Security tab. Then, move the slider to set the
security level of Internet Explorer to Medium or a lower level.
(10) Click Apply.
(11) Click OK.
Step 3 Check whether Internet Explorer is configured with a proxy server. If Internet Explorer is
configured with a proxy server, cancel the proxy server or disable the connection to the
U2000 server through the proxy server.
Step 4 Check the installation directory of the U2000 client. The directory name must contain only
letters, digits, and underscores (_), and cannot contain spaces or brackets.

----End
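
The naming rule in Step 4 can be sketched as a shell check. This is an illustration only, assuming a POSIX shell; the function name valid_dir_name is hypothetical, and it checks a single directory name rather than a full path.

```shell
# Hypothetical helper: prints "valid" if the directory name consists only of
# letters, digits, and underscores, otherwise prints "invalid".
valid_dir_name() {
    case "$1" in
        ""|*[!A-Za-z0-9_]*) echo "invalid" ;;   # empty, or has a disallowed character
        *) echo "valid" ;;
    esac
}
```

For example, valid_dir_name U2000_client reports a valid name, whereas a name that contains a space or a bracket is reported as invalid.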


9 Veritas HA System Troubleshooting

About This Chapter

This topic describes how to troubleshoot the Veritas HA system.


9.1 Troubleshooting Policies for the Veritas HA System
This topic describes the confirmation of the faults that commonly occur in the Veritas high
availability (HA) system and the troubleshooting policies.
9.2 Veritas Troubleshooting Cases
This topic describes how to troubleshoot the Veritas HA system.


9.1 Troubleshooting Policies for the Veritas HA System


This topic describes the confirmation of the faults that commonly occur in the Veritas high
availability (HA) system and the troubleshooting policies.

9.1.1 Confirming the System Status


You need to check whether the HA system is in the dual-host state or in the healing state before
you determine which fault recovery strategy to adopt.

NOTE

• If the server is configured with one network card, the Host name is the Host IP address of the master
server. In this example, the host names of the master servers are 129.9.1.1 and 129.9.1.2.
• If the server is configured with two network cards and has the IPMP feature enabled, the Host name
is the IP address (floating IP address) of the master server, that is, the IP address of the network card
on the U2000 that is used for external services.
• If the server is configured with two network cards and has the IPMP feature disabled, the Host
name is the Data replication IP address of the master server.

In a Dual-Host State
Run the following command on the master server of the primary site to check the system status:
# vradmin -g datadg repstatus datarvg
Replicated Data Set: datarvg
Primary:
Host name: 129.9.1.1
RVG name: datarvg
DG name: datadg
RVG state: disabled for I/O
Data volumes: 4
SRL name: srl_vol
SRL size: 3.00 G
Total secondaries: 1

Secondary:
Host name: 129.9.1.2<unreacheable>
RVG name: datarvg
DG name: datadg
Replication status: paused due to network disconnection
Current mode: asynchronous
Logging to: SRL
Timestamp Information: N/A
Config Errors:
129.9.1.2: Pri or Sec IP not available or vradmind not running

Run the following command on the master server of the secondary site to check the system status:
# vradmin -g datadg repstatus datarvg
Replicated Data Set: datarvg
Primary:
Host name: 129.9.1.2
RVG name: datarvg
DG name: datadg
RVG state: enabled for I/O
Data volumes: 4
SRL name: srl_vol
SRL size: 3.00 G
Total secondaries: 1
Config Errors:
129.9.1.1: Pri or Sec IP not available or vradmind not running

This output indicates that the system is in the dual-host state: each site reports itself as the
Primary and reports that the other site is not available.

Any of the following causes can interrupt the heartbeat connection between the primary and
secondary sites. The standby server is then started and the system enters the dual-host state:
• Corruption of the network card used for the communication between the two sites
• Fault in the DCN between the primary and secondary sites
• Incorrect configuration of the firewall between the primary and secondary sites

In the dual-host state, the NE users on the client repeatedly force each other to log out. In this
case, shut down the U2000 applications on the primary site and connect the client to the
secondary site.
When the primary site and the communication between the primary and secondary sites are
restored, perform incremental or full synchronization on the site that holds the updated data.
NOTE

• In the dual-host state, if the U2000 client connects to the secondary site, perform incremental or full
synchronization on the secondary site.
• In the dual-host state, if the U2000 client is still running on the primary site, perform incremental or
full synchronization on the primary site.

In a Healing State
Run the following command on the master servers of the primary site and the secondary site to
check the system status:
# vradmin -g datadg repstatus datarvg

If the terminal output contains the acting secondary information as follows, the system is
running in the healing state. (This state usually occurs after the secondary site takes over
forcibly and the network between the primary site and the secondary site then returns to
normal.)
Replicated Data Set: datarvg
Primary:
Host name: 129.9.1.2
RVG name: datarvg
DG name: datadg
RVG state: enabled for I/O
Data volumes: 4
SRL name: srl_vol
SRL size: 3.00 G
Total secondaries: 1

Primary (acting secondary):
Host name: 129.9.1.1
RVG name: datarvg
DG name: datadg
Data status: consistent, behind
Replication status: logging to DCM (needs failback synchronization)
Current mode: asynchronous
Logging to: DCM (contains 0 Kbytes) (failback logging)
Timestamp Information: N/A
Config Errors:
129.9.1.1: Primary-Primary configuration
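
The two outputs above can be told apart by the marker strings they contain. The following is a rough sketch, assuming a POSIX shell and that the vradmin output has been captured as text; the function name ha_state_hint and its result labels are hypothetical. The result is only a hint: the dual-host state still has to be confirmed by comparing the output from both sites as described above.

```shell
# Hypothetical helper: reads `vradmin ... repstatus` output on stdin and
# prints a hint based on the marker strings shown in this section.
ha_state_hint() {
    status=$(cat)
    case "$status" in
        *"acting secondary"*)
            echo "healing" ;;            # marker of the healing state
        *"Pri or Sec IP not available"*)
            echo "peer-unreachable" ;;   # seen on both sites in the dual-host state
        *)
            echo "no-indicator" ;;
    esac
}
```

A possible use is vradmin -g datadg repstatus datarvg | ha_state_hint on the master server of each site.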

9.2 Veritas Troubleshooting Cases


This topic describes how to troubleshoot the Veritas HA system.

9.2.1 Switching Between Primary and Secondary Nodes Fails
9.2.2 Starting the U2000 HA System Fails
9.2.3 Data Replication Cannot Be Performed Between Primary and Secondary Nodes
9.2.4 Communication Between Primary and Secondary Nodes Fails
9.2.5 Resource in the Frozen State
9.2.6 Resource in the Fault State
9.2.7 Frequent Dual-Host State of the System
9.2.8 Connection Failure Between the Rlink and the Remote Host
9.2.9 Abnormal Status of the Disk Volume
9.2.10 Failed to Start the VCS Because of the Errors in the Configuration File
9.2.11 Faults on the Primary Site
9.2.12 Unstable DCN Between the Primary and Secondary Sites

9.2.1 Switching Between Primary and Secondary Nodes Fails


The switching between the primary and secondary nodes in the HA system (Veritas hot backup)
cannot be performed.
Locate and rectify the fault according to the following sequence:

Sequence 1
Problem Location: Check whether the HA system is in the normal state. For the specific method,
see the troubleshooting chapters in the administrator guide for the corresponding version and
solution.
Troubleshooting: If the system is in the revertive state or dual-host state, you need to rectify
the fault manually. For the specific method, see the troubleshooting chapters in the
administrator guide for the corresponding version and solution.

Sequence 2
Problem Location: Check whether the resources are abnormal.
Troubleshooting: Rectify the fault with reference to 9.2.5 Resource in the Frozen State and
9.2.6 Resource in the Fault State.

Sequence 3
Problem Location: Check whether the communication connection between the primary and
secondary nodes is normal.
Troubleshooting: Rectify the fault with reference to 9.2.4 Communication Between Primary and
Secondary Nodes Fails.

Sequence 4
Problem Location: Check whether the data on the primary node is consistent with the data on the
secondary node.
Troubleshooting: Rectify the fault with reference to 9.2.3 Data Replication Cannot Be Performed
Between Primary and Secondary Nodes.

Sequence 5
Problem Location: The preceding measures do not work.
Troubleshooting: Contact Huawei engineers for troubleshooting.

9.2.2 Starting the U2000 HA System Fails


After the primary and secondary nodes are restarted upon power failure, the U2000 HA system
cannot be started.
Locate and rectify the fault according to the following sequence:

Sequence 1
Problem Location: Check whether the files of the operating system are normal.
Troubleshooting: Rectify the fault with reference to 5.1.1 Starting the Operating System Fails.

Sequence 2
Problem Location: Check whether the VCS is normal. Run the hastatus -sum command to query the
status of the VCS. If the reported status of the VCS is ADMIN, it indicates that the VCS fails
to be started.
Troubleshooting: Rectify the fault with reference to 9.2.10 Failed to Start the VCS Because of
the Errors in the Configuration File.

Sequence 3
Problem Location: The preceding measures do not work.
Troubleshooting: Contact Huawei engineers for troubleshooting.

9.2.3 Data Replication Cannot Be Performed Between Primary and Secondary Nodes

After the vxrlink -g datadg -i 5 status datarlk command is run on the primary server, the
output still shows, even after a certain period of time, that a large amount of data is not
synchronized.
Locate and rectify the fault according to the following sequence:


Sequence 1
Problem Location: Check whether the communication connection between the primary and
secondary nodes is normal.
Troubleshooting: Rectify the fault with reference to 9.2.4 Communication Between Primary and
Secondary Nodes Fails.

Sequence 2
Problem Location: Check whether the HA system is in the normal state. For the specific method,
see the troubleshooting chapters in the administrator guide for the corresponding version and
solution.
Troubleshooting: If the system is in the revertive state or dual-host state, you need to rectify
the fault manually. For the specific method, see the troubleshooting chapters in the
administrator guide for the corresponding version and solution.

Sequence 3
Problem Location: The preceding measures do not work.
Troubleshooting: Contact Huawei engineers for troubleshooting.

9.2.4 Communication Between Primary and Secondary Nodes Fails

Symptom
Data replication and switching cannot be performed between the primary and secondary nodes.

Possible Cause
The possible causes that result in the communication failure between the primary and secondary
nodes are as follows:
• The network between the primary and secondary nodes is unstable or a firewall exists.
• The IP addresses and gateways of the primary and secondary nodes are set incorrectly.

Procedure
Step 1 To check the communication status between the primary and secondary nodes, run the following
commands as user root on the primary node:
# ping -s <IP address of the Master NIC on the secondary node>
# ping -s <IP address of the replication NIC on the secondary node>
TIP
To query the IP address of the Master NIC on the secondary node, run cat /etc/hosts | grep
loghost as user root on the secondary node.
Generally, the bandwidth between the primary and secondary nodes should be at least 2 Mbit/s
and the packet loss ratio should be smaller than 0.1%.

Step 2 Check whether all the ports used by the HA system are enabled.
To query the service ports that are enabled in the system, run the following command as user
root:
# netstat -an

----End
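
The loghost lookup mentioned in the TIP of Step 1 can be sketched as a function that returns only the IP address. This is an illustration, assuming a POSIX shell and the usual hosts file layout in which the IP address is the first field of each line; the function name loghost_ip is hypothetical.

```shell
# Hypothetical helper: reads hosts-file content on stdin and prints the IP
# address (first field) of the first non-comment line that mentions loghost.
loghost_ip() {
    awk '$1 !~ /^#/ && /loghost/ { print $1; exit }'
}
```

A possible use on the secondary node is loghost_ip < /etc/hosts. The address that is printed can then be used with ping -s on the primary node.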

9.2.5 Resource in the Frozen State

Symptom
A red lock icon is displayed on a resource or resource group in the VCS Explorer.

Possible Cause
The resource group was frozen manually and was not unfrozen afterward.

Procedure
Step 1 In the VCS Explorer interface, right-click the resource group that is in the frozen state, and then
choose Unfreeze.

----End

9.2.6 Resource in the Fault State

Symptom
In the VCS Explorer, a red cross is displayed for a certain resource. The resource is in the
Fault state.

Possible Cause
The resource is faulty. For example, the U2000 coredump occurs.

Procedure
Step 1 Right-click the name of the resource that is in the Fault state, and then choose Clear Fault to
clear the fault.

Step 2 On the primary server, right-click AppService, and then choose Online. The
AppService resource group enters the Online state.

----End

Suggestion and Summary


If the U2000 still cannot work after the Fault state of the resource is cleared, that is, the
AppService resource group cannot enter the Online state on the primary server, contact the
local office or customer service center of Huawei for troubleshooting.


9.2.7 Frequent Dual-Host State of the System


Symptom
The heartbeat between the primary and secondary nodes is interrupted frequently, which results
in frequent dual-host state of the system. Therefore, the U2000 system cannot work in the normal
state.

Possible Cause
The DCN between the primary and secondary nodes is unstable.

Procedure
Step 1 To modify the heartbeat detection timeout, run the following commands as user root on both
the primary and secondary nodes:
# haconf -makerw
# /opt/VRTSvcs/bin/hahb -local Icmp AYATimeout
# /opt/VRTSvcs/bin/hahb -modify Icmp AYATimeout <heartbeat detection timeout> -clus <Cluster name of the opposite node>
# haconf -dump -makero

NOTE

• The heartbeat detection timeout is 300 seconds by default. You can set a longer heartbeat detection
timeout, such as 600 seconds, according to the duration of network interruption between the primary
and secondary nodes.
• If you use one or two NICs but do not enable the IPMP feature, the Cluster name of the opposite node
is the host name of the opposite node followed by Cluster, such as SecondaryCluster.
• If you use two NICs and enable the IPMP feature, the Cluster name of the opposite node is the host
name of the opposite node, such as Secondary.

Step 2 After the DCN becomes stable, run the following commands as user root on the
primary and secondary nodes to restore the heartbeat detection timeout to the default value:
# haconf -makerw
# /opt/VRTSvcs/bin/hahb -local Icmp AYATimeout
# /opt/VRTSvcs/bin/hahb -modify Icmp AYATimeout 300 -clus <cluster name of the opposite node>
# haconf -dump -makero

----End
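
The naming rule in the NOTE under Step 1 can be sketched as a small shell helper. This is a minimal sketch and not part of the U2000 or VCS software: the function names opposite_cluster_name and print_timeout_commands are hypothetical, and the script only prints the commands from the procedure for review instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: derive the cluster name of the opposite node.
# $1 = host name of the opposite node, $2 = "yes" if IPMP is enabled.
opposite_cluster_name() {
    if [ "$2" = "yes" ]; then
        printf '%s\n' "$1"
    else
        printf '%sCluster\n' "$1"
    fi
}

# Hypothetical helper: print (do not run) the commands from the procedure.
# $1 = timeout in seconds, $2 = cluster name of the opposite node.
print_timeout_commands() {
    printf '%s\n' \
        "haconf -makerw" \
        "/opt/VRTSvcs/bin/hahb -local Icmp AYATimeout" \
        "/opt/VRTSvcs/bin/hahb -modify Icmp AYATimeout $1 -clus $2" \
        "haconf -dump -makero"
}

# Example: 600-second timeout, opposite host "Secondary", IPMP not enabled.
print_timeout_commands 600 "$(opposite_cluster_name Secondary no)"
```

Printing the commands first lets you confirm the timeout value and the cluster name before you run them as user root on each node.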

Suggestion and Summary


Modifying the heartbeat detection timeout is only a temporary workaround for HA system
problems caused by DCN instability. It is recommended that you rectify the unstable DCN
communication between the primary and secondary nodes in a timely manner, and then
restore the heartbeat detection timeout to the default value.
If the DCN fault between the primary and secondary nodes cannot be rectified for a long time,
you can use one node to monitor the network according to the following method. In this case,
the U2000 has no redundancy protection if errors occur.
1. To stop the U2000 application and Sybase on the primary and secondary nodes, run the
following command as user root:
# hagrp -offline AppService -sys <host name>

2. To start the U2000 and Sybase on the node that is to monitor the network, run the following
command as user root:
# hagrp -online AppService -sys <host name>

3. To freeze the U2000 and Sybase on the primary node, run the following commands as user
root:
# hagrp -freeze AppService
# hagrp -freeze VVRService

4. To freeze the U2000 and Sybase on the secondary node, run the following commands as
user root:
# hagrp -freeze AppService
# hagrp -freeze VVRService

5. To restore the protection for the primary and secondary nodes after the network recovers,
run the following commands:
# hagrp -unfreeze AppService
# hagrp -unfreeze VVRService

6. Synchronize data between the primary and secondary nodes.


For details, see the administrator guide for the corresponding version and solution.

9.2.8 Connection Failure Between the Rlink and the Remote Host

Symptom
In the console window, the following error message is displayed:
vxvm:vxrlink: ERROR: Unable to establish connection with remote host
<remote_host>

Possible Cause
• The network connection between the primary site and the secondary site is torn down.
• The vradmind service process is stopped.

Procedure
• Check the network connection between the primary and secondary sites.
Run the following command:
# ping <IP address of the master server on the secondary site>
If the host can be pinged successfully, the network connection is normal.
Otherwise, clear the network fault first.
• Check whether the vradmind process is running on the primary and secondary sites.
Run the following command:
# ps -ef | grep vradmind

The terminal displays:

root 489 1 0 17:36:12 ? 0:00 /usr/sbin/vradmind
root 9717 9662 0 18:08:46 pts/3 0:00 grep vradmind

If /usr/sbin/vradmind is output, the vradmind process is running.
Otherwise, run the following commands to restart it:
# cd /etc/init.d
# ./vras-vradmind.sh start

----End
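
The vradmind check in the procedure can be sketched as a shell function that reads ps-style output and ignores the grep process itself. This is an illustrative sketch only: the function name vradmind_running is hypothetical, and the sample text is the terminal output shown in the procedure.

```shell
#!/bin/sh
# Hypothetical helper: succeed if ps-style output on standard input shows
# the vradmind daemon. The grep process itself is filtered out first.
vradmind_running() {
    grep -v 'grep' | grep -q '/usr/sbin/vradmind'
}

# Sample taken from the terminal output shown in the procedure.
sample='root 489 1 0 17:36:12 ? 0:00 /usr/sbin/vradmind
root 9717 9662 0 18:08:46 pts/3 0:00 grep vradmind'

if printf '%s\n' "$sample" | vradmind_running; then
    echo "vradmind is running"
else
    echo "vradmind is not running; restart it"
fi
```

On a real server, the same function could be fed live data with ps -ef | vradmind_running.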


9.2.9 Abnormal Status of the Disk Volume


Symptom
The status of the data volume is not ACTIVE or ENABLED, and the status of datarvg and
datarlk is RECOVER.

Possible Cause
The server is powered off abnormally or other abnormal operations are performed.

Procedure
• Open a terminal window.
• Run the following commands on the node on which the disk volume is abnormal:
# cd /opt/HWICMR/bin
# ./runtaskflow.sh recover_rvg.tf

• Check whether the status of the disk volume and the data replication status are correct. If so, the
recovery is successful.
----End

9.2.10 Failed to Start the VCS Because of the Errors in the Configuration File

Symptom
After the hastatus -sum command is run, the state of the VCS is reported as ADMIN.

Possible Cause
The VCS startup failure may be caused by a power failure.

Procedure
Step 1 To restore the VCS on the primary site, run the following command on the primary site as the
root user:
# hasys -force <host name of the primary site>

Step 2 If starting the VCS on the secondary site fails, run the following command on the secondary site
as the root user:
# hasys -force <host name of the secondary site>

----End

9.2.11 Faults on the Primary Site


Symptom
The NMS cannot be normally used.

Possible Cause
The NMS cannot be used because of the fault on the primary site.


Procedure
• The connection between the client and server is torn down. In this case, the primary site is
unavailable, and the NMS application processes are automatically switched to the server on
the secondary site. Do as follows:
1. Log in to the U2000 server on the secondary site through the client.
2. Manage NEs through the U2000 server on the secondary site.
• On the client, the NEs on the NMS preempt each other's resources. In this case, the server is in
the dual-host state. Do as follows:
1. Shut down the U2000 server on the primary site. For details, refer to the Huawei
iManager U2000 High Availability System (Veritas) Administrator Guide.
2. Log in to the U2000 server on the secondary site through the client.
3. Manage NEs through the U2000 server on the secondary site.
• The NMS data is damaged, which causes the server to fail. In this case, the primary
and secondary sites are both unavailable. Do as follows:
1. Restore the U2000 from the backup data. For details, refer to the Huawei iManager
U2000 High Availability System (Veritas) Administrator Guide.
2. If there is no backup data, recover the data by using the script. For details, refer to the
Huawei iManager U2000 High Availability System (Veritas) Administrator Guide.

----End

9.2.12 Unstable DCN Between the Primary and Secondary Sites

Symptom
The instability of the data communication network (DCN) between the primary and secondary
nodes leads to the frequent interruption of heartbeat between the two nodes. As a result, the
U2000 cannot work normally.

Possible Cause
The DCN between the primary and secondary nodes is unstable. You can work around the fault
by modifying the timeout period of the heartbeat detection.

Procedure
• To modify the heartbeat detection timeout, run the following commands on both
the primary and secondary nodes:
# haconf -makerw
# /opt/VRTSvcs/bin/hahb -local Icmp AYATimeout
# /opt/VRTSvcs/bin/hahb -modify Icmp AYATimeout <heartbeat detection timeout> -clus <cluster name of the opposite node>
# haconf -dump -makero

NOTE

• The default heartbeat detection timeout is 300 seconds. You can set the heartbeat detection timeout
according to the interruption time of the network between the primary and secondary nodes. For
example, set the value to 600.
• If you use one or two network adapters but do not enable the IPMP feature, the cluster name of
the opposite node is the host name of the opposite node followed by Cluster, such as SecondaryCluster.
• If you use two network adapters and enable the IPMP feature, the cluster name of the opposite
node is the host name of the opposite node, such as Secondary.


• After the DCN becomes stable, run the following commands on the primary
and secondary nodes to restore the heartbeat detection timeout to the default value (300 seconds):
# haconf -makerw
# /opt/VRTSvcs/bin/hahb -local Icmp AYATimeout
# /opt/VRTSvcs/bin/hahb -modify Icmp AYATimeout 300 -clus <cluster name of the opposite node>
# haconf -dump -makero

----End


10 Distributed System Troubleshooting

About This Chapter

This topic describes how to troubleshoot the distributed system.

10.1 Slave Server in the Disconnected State


10.2 Inconsistent Statuses of the U2000s on the Slave and Master Servers
10.3 Other Faults on the Master Server
10.4 Other Faults on the Slave Server


10.1 Slave Server in the Disconnected State


Symptom
After logging in to the MSuite, you find that the slave server is in the disconnected state.

Possible Cause
• The slave server is not started. The possible causes are manual shutdown, abnormal
power-off, and hardware faults.
• The MSuite server of the slave server is not started or is started abnormally.
• The IP address used for connecting the slave server to the master server has changed.
• The network between the slave server and the master server is faulty or the NIC of the slave
server is faulty.

Procedure
Step 1 Check whether the slave server is started successfully.
If the slave server is started abnormally, check the server hardware, such as hard disk, CPU,
memory, and card.
Step 2 To check whether the MSuite server is started successfully, run the following commands as user
root on the slave server:
# cd /opt/HWENGR/engineering
# ./startclient.sh

If the login window of the NMS maintenance tool is displayed, it indicates that the tool is
normally started. Otherwise, run the ./startserver.sh command to start the server of the NMS
maintenance tool.
Step 3 Check whether the IP address used for connecting the slave server to the master server changes.
Run the ifconfig -a command as user root to check whether the displayed IP address is the same
as the IP address in the server list of the MSuite. If the IP addresses are different, select the slave
server, and then choose System > Synchronize the IP and hostname of the slaveserver.
Step 4 Run the ping Floating IP address of the slave server command as user root on the master server
to check whether the network between the master and slave servers is normal.
If the output shows that the floating IP address of the slave server is alive, the network
between the master and slave servers is normal. Otherwise, troubleshoot the network fault.

----End
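
The comparison in Step 3 can be sketched as a shell function that searches ifconfig-style output for the IP address listed in the MSuite. This is a minimal sketch under stated assumptions: the function name has_inet_address is hypothetical, the sample interface output and addresses are made up, and the parsing assumes that each address appears on a line of the form "inet <address> ...".

```shell
#!/bin/sh
# Hypothetical helper: succeed if the expected IP address appears as an
# "inet" address in ifconfig-style output read from standard input.
# $1 = IP address shown in the MSuite server list.
has_inet_address() {
    awk -v ip="$1" '$1 == "inet" && $2 == ip { found = 1 } END { exit !found }'
}

# Made-up sample output; on a real server, pipe in the output of ifconfig -a.
sample='bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 192.168.1.10 netmask ffffff00 broadcast 192.168.1.255'

if printf '%s\n' "$sample" | has_inet_address 192.168.1.10; then
    echo "IP address matches the MSuite server list"
else
    echo "IP address differs; synchronize it in the MSuite"
fi
```

On the slave server, the call would take the form ifconfig -a | has_inet_address <IP address in the MSuite server list>.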

10.2 Inconsistent Statuses of the U2000s on the Slave and Master Servers

Symptom
After logging in to the system maintenance tool, you can find that the status of the U2000 on a
slave server is inconsistent with that on the master server.


Possible Cause
• The slave server is in non-monitored mode.
• The status of the U2000 of the slave server is not refreshed.
• The U2000 on the slave server is abnormal.

Procedure
Step 1 Check whether the slave server is in monitored mode.
Log in to the MSuite client, and check the monitoring status of the slave server in the lower pane
of the main interface. If the system displays Monitoring, the slave server is in
monitored mode. Otherwise, choose System > Start monitoring slave server to restore the
monitoring function of the master server for the slave server.
Step 2 Check whether the status of the U2000 on the slave server is inconsistent with that on the master
server for a long time (more than 10 minutes).
After the master server starts, the slave server starts after a synchronization period of 2 to 10 minutes.
Therefore, it is normal for the status of the U2000 on the slave server to be
inconsistent with that on the master server for a short time.
Step 3 Check whether the configuration of the slave server is correct.
• To check whether the following configuration items are correct, run the following commands
as user root on the slave server:
# cat /opt/sybase/interfaces
SYSDBServer
master tcp ether Floating IP address of the master server 5100
query tcp ether Floating IP address of the master server 5100
SYSDBServer_back
master tcp ether Floating IP address of the master server 5200
query tcp ether Floating IP address of the master server 5200
# cat /opt/U2000/server/conf/imap.cfg | grep MDPAddress
MDPAddress=Floating IP address of the master server
# cat /opt/U2000/server/conf/emfmoni.cfg | grep MONI_DISTRIBUTE_MODE
MONI_DISTRIBUTE_MODE=1

If the displayed floating IP address of the master server is inconsistent with the actual IP
address or the value of MONI_DISTRIBUTE_MODE is not 1, correct the configuration
file manually by using the vi editor.
• Run the ls /opt/U2000/server/conf/sysmoni command to view the files in the directory.
If the following configuration files exist in the directory, delete them manually.
– moniemffault.cfg
– moniperfsrv.cfg
– moniweblct.cfg
– moniemfsecu.cfg
– monipubsvr.cfg
– monizip.cfg
– moniemftopo.cfg
– monisvhdsvr.cfg
– monicau.cfg
– moniiNBXmlFramework.cfg
– monitomcat.cfg


– moniemfalmagent.cfg
– monin2100dcsrv.cfg
– monitoolkit.cfg

----End
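
The two key=value checks in Step 3 can be sketched as shell functions. This is a minimal sketch and not a Huawei tool: the function names cfg_value and check_slave_cfg are hypothetical, and the sketch assumes that each setting appears on its own line in the form KEY=value, as in the cat output shown in the procedure.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first KEY=value line in a file.
# $1 = key, $2 = file path.
cfg_value() {
    grep "^$1=" "$2" | head -n 1 | cut -d= -f2-
}

# Hypothetical helper: report whether the slave-side settings are correct.
# $1 = expected floating IP address of the master server,
# $2 = path to imap.cfg, $3 = path to emfmoni.cfg.
check_slave_cfg() {
    status=0
    if [ "$(cfg_value MDPAddress "$2")" != "$1" ]; then
        echo "MDPAddress does not match $1"
        status=1
    fi
    if [ "$(cfg_value MONI_DISTRIBUTE_MODE "$3")" != "1" ]; then
        echo "MONI_DISTRIBUTE_MODE is not 1"
        status=1
    fi
    return "$status"
}
```

On the slave server, the call would take the form check_slave_cfg <floating IP address of the master server> /opt/U2000/server/conf/imap.cfg /opt/U2000/server/conf/emfmoni.cfg. The function only reports mismatches; any correction is still made manually as described in the procedure.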

10.3 Other Faults on the Master Server


Symptom
Unrecoverable faults occur on the master server. You need to reinstall the master server.

Possible Cause
• The hard disk of the master server is faulty.
• The OS of the master server is faulty.
• A severe fault occurs on the file system of the master server. Consequently, the files on the
master server are lost and reinstalling the NMS is required.

Procedure
• Reinstall the master server where the faults occur.
For details, refer to the Huawei iManager U2000 Installation Guide.
NOTE
During the installation, make sure that the IP address and host name of the reinstalled server are the
same as those of the faulty master server.
• Configure the IP Network Multipathing (IPMP) on the master server. Run the following
commands on the master server as the root user:
# cd /opt/HWICMR/bin
# ./runtaskflow.sh config_distributed_ipmp.tf

Configure the IPMP according to the prompts. Note that the settings of the IPMP parameters
must be the same as those for the master server before the faults occur.
• Log in to the system maintenance tool. Choose System > Mounting a slave server to add
the original slave servers again.
• Choose System > Restoring the NMS information to select the up-to-date backup data.
Then, click OK.

----End

10.4 Other Faults on the Slave Server


Symptom
Unrecoverable faults occur on the slave server. You need to reinstall the slave server.
NOTE
In the distributed system, the slave protection server functions as the backup of the slave server. When a
fault occurs on the slave server, all services on the slave server are switched to the slave protection server.


Possible Cause
• The hard disk of the slave server is faulty.
• The OS of the slave server is faulty.
• A severe fault occurs on the file system of the slave server. Consequently, the files on the
slave server are lost and reinstalling the NMS is required.

Procedure
Step 1 Reinstall the slave server where the faults occur.
For details, refer to the Huawei iManager U2000 Installation Guide.
NOTE
During the installation, make sure that the IP address and host name of the reinstalled server are the same
as those of the faulty slave server.

Step 2 Configure the IPMP of the slave server. Run the following commands on the slave server as the
root user:
# cd /opt/HWICMR/bin
# ./runtaskflow.sh config_distributed_ipmp.tf

Configure the IPMP according to the prompts. Note that the settings of the IPMP parameters
must be the same as those for the slave server before the faults occur.
Step 3 If the slave protection server exists in the distributed system, switch the services on the slave
protection server to the slave server.
(1) On the client of the NMS maintenance tool, click the Server tab. Right-click the server
where the subsystem is to be added and choose Switch Nodes from the shortcut menu. The
Switch Nodes dialog box is displayed.
(2) Click OK to start switchover. Wait until the Switch nodes successfully dialog box is
displayed.
(3) Click OK to complete switchover.
Step 4 If the slave protection server does not exist in the distributed system, log in to the NMS
maintenance tool. Choose System > Restoring the NMS information to select the up-to-date
backup data. Then, click OK.

----End


11 NMS System Maintenance Tool Troubleshooting

About This Chapter

This topic describes how to troubleshoot the NMS system maintenance tool.

11.1 Troubleshooting the Inconsistency of the Instance Status


11.2 An Error Message Is Displayed When the U2000 Maintenance Tool Client Is Started


11.1 Troubleshooting the Inconsistency of the Instance Status

Symptom
The instance status displayed on the client of the network management system maintenance
suite is inconsistent with that displayed on the system monitoring client. You can rectify the
inconsistency by refreshing the information on the network management system.

Possible Cause
The client of the network management system maintenance suite refreshes the instance status
every 15 seconds. Therefore, the instance status between the client of the network management
system maintenance suite and the system monitoring client may be inconsistent in a short time.

Procedure
• On the client of the network management system maintenance suite, click the Instance tab.

• Click the shortcut icon to refresh the information on the network management system.

----End

11.2 An Error Message Is Displayed When the U2000 Maintenance Tool Client Is Started

Symptom
On the Solaris or SUSE Linux platform, the startup file of the U2000 maintenance tool client is
startClient.sh, which is stored in the /opt/U2000/engineering path. When the startup file is executed,
a message indicating a permission modification error is displayed. For example:
chmod: warning: can not be modified /opt/U2000/engineering/conf/launch/client/org.eclipse.osgi

Possible Cause
Before you run the startClient.sh file as the nmsuser user, the file has already been run as the
root user. As a result, a permission error occurs.

Procedure
Step 1 Log in to the OS as the root user.
Step 2 Change the owner of the file named in the error message to nmsuser by running the following
command:
# chown nmsuser /opt/U2000/engineering/conf/launch/client/org.eclipse.osgi

Here, the org.eclipse.osgi file in the /opt/U2000/engineering/conf/launch/client/ path is taken
as an example.

----End
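
If several files are affected, the files whose owner is wrong can be listed before you run chown. The following is a minimal sketch, assuming a standard find command is available; the function name not_owned_by is hypothetical and is not part of the U2000 maintenance tool.

```shell
#!/bin/sh
# Hypothetical helper: list entries under a directory that are not owned
# by the given user. $1 = user name or numeric user ID, $2 = directory.
not_owned_by() {
    find "$2" ! -user "$1"
}
```

For example, not_owned_by nmsuser /opt/U2000/engineering/conf/launch/client prints each file or directory in that path whose owner is not nmsuser. An empty result means that no ownership change is needed there.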


A Obtaining the Technical Support

This topic describes how to obtain the technical support in the case of any problems encountered
during routine maintenance.
During the routine maintenance of the U2000, if there is any problem that is uncertain or hard
to solve, or if you cannot find the solution to a problem from this manual, contact the customer
service center of Huawei or send an email to [email protected]. You can also go to http://
support.huawei.com to obtain the latest technical materials of Huawei.
Before seeking the technical support, collect the relevant information.


Index

B
basic principle, 1-1
