IBM Tivoli UNIX Log Agent User Guide_EN
Version 6.1.0
User’s Guide
SC32-9471-00
Tivoli Monitoring: UNIX Log Agent
Note
Before using this information and the product it supports, read the information in Appendix F, “Notices,” on page 89.
Use the requirements and configuration chapter in this guide along with the IBM
Tivoli Monitoring Installation and Setup Guide to install and set up the software.
Use the information in this guide along with the IBM Tivoli Monitoring User’s Guide
to monitor UNIX (and Linux®) logs.
Publications
This section lists publications relevant to the use of the Monitoring Agent for UNIX
Logs. It also describes how to access these publications online and how to order
these publications.
Prerequisite publications
To use the information in this guide effectively, you must have some knowledge of
IBM Tivoli Monitoring products, which you can obtain from the following
documentation:
v IBM Tivoli Monitoring Administrator’s Guide
v IBM Tivoli Monitoring Installation and Setup Guide
v IBM Tivoli Monitoring Problem Determination Guide
v IBM Tivoli Monitoring: Upgrading from Tivoli Distributed Monitoring
v IBM Tivoli Monitoring User’s Guide
v Introducing IBM Tivoli Monitoring Version 6.1
Related publications
The following documents also provide useful information:
v IBM Tivoli Enterprise Console Adapters Guide
http://publib.boulder.ibm.com/tividd/glossary/tivoliglossarymst.htm
The IBM Terminology Web site consolidates the terminology from IBM product
libraries in one convenient location. You can access the Terminology Web site at the
following Web address:
http://www.ibm.com/ibm/terminology
http://www.ibm.com/software/tivoli/library
Scroll down and click the Product manuals link. In the Tivoli Technical Product
Documents Alphabetical Listing window, click M to access all of the IBM Tivoli
Monitoring product manuals.
Note: If you print PDF documents on other than letter-sized paper, set the option
in the File → Print window that allows Adobe Reader to print letter-sized
pages on your local paper.
Ordering publications
You can order many Tivoli publications online at the following Web site:
http://www.elink.ibmlink.ibm.com/public/applications/publications/cgibin/pbi.cgi
Accessibility
Accessibility features help users with a physical disability, such as restricted
mobility or limited vision, to use software products successfully. With this product,
you can use assistive technologies to hear and navigate the interface. You can also
use the keyboard instead of the mouse to operate most features of the graphical
user interface.
http://www.ibm.com/software/tivoli/education/
Support information
“Support information” on page 83 describes the following options for obtaining
support for IBM products:
v “Searching knowledge bases” on page 83
v “Obtaining fixes” on page 84
v “Contacting IBM Software Support” on page 84
This guide uses several conventions for special terms and actions, and operating
system-dependent commands and paths.
Typeface conventions
This guide uses the following typeface conventions:
Bold
v Lowercase commands and mixed case commands that are otherwise
difficult to distinguish from surrounding text
v Interface controls (check boxes, push buttons, radio buttons, spin
buttons, fields, folders, icons, list boxes, items inside list boxes,
multicolumn lists, containers, menu choices, menu names, tabs, property
sheets), labels (such as Tip:, and Operating system considerations:)
v Keywords and parameters in text
Italic
v Words defined in text
v Emphasis of words
v New terms in text (except in a definition list)
v Variables and values you must provide
Monospace
v Examples and code examples
v File names, programming keywords, and other elements that are difficult
to distinguish from surrounding text
v Message text and prompts addressed to the user
v Text that the user must type
v Values for arguments or command options
The names of environment variables are not always the same in Windows® and
UNIX. For example, %TEMP% in Windows is equivalent to $TMPDIR in UNIX.
The Tivoli Enterprise Portal is the interface for IBM Tivoli Monitoring products. By
providing a consolidated view of your environment, the Tivoli Enterprise Portal
permits you to monitor and resolve performance issues throughout the enterprise.
Each site handles log management differently and can adopt a strategy that falls
between two extremes:
v Discard all log data immediately
v Store all log data indefinitely
While the first choice conserves disk space and the second allows problems to be
diagnosed at a later time, neither strategy allows you to anticipate problems or
respond to them in a timely manner.
The Monitoring Agent for UNIX Logs monitors and provides reports for the
following types of logs:
v Syslogs
v Utmp style logs
v Errlogs (AIX platforms only)
v User-defined ASCII logs
User-defined ASCII logs are supported through the Generic User Log Support
(GULS) feature. GULS requires that you supply a format command in the
configuration file that describes a log’s format to the monitoring agent. See
Appendix A, “Generic user log support,” on page 43 for further details.
For both of the IBM Tivoli Monitoring environments (IBM Tivoli Monitoring 5.x
and IBM Tivoli Monitoring 6.1), IBM Tivoli Enterprise Console is an optional
component, which acts as a central collection point for events from a variety of
sources, including those from other Tivoli software applications, Tivoli partner
applications, custom applications, network management platforms, and relational
When the monitoring agent starts, it looks at two files to determine which logs to
monitor:
v Customer configuration file
v Syslog daemon configuration file
If, for any reason, the monitoring agent is unable to find at least one log to
monitor from either file, it writes a message to the RAS log that contains the text
Agent has no work to do. Exiting... and then automatically terminates.
A default customer configuration file is shipped with the product and is called
kul_configfile. This file is installed into the install_dir/config directory. All entries
in the default file are commented out.
Note: All references to install_dir refer to the destination directory that was
specified when the monitoring agent was installed.
Important! If the Monitoring Agent for UNIX Logs is installed on top of a previous
version of the agent (the OMEGAMON® agent), the file
install_dir/config/kul_configfile will be replaced. Rename the file or copy it to
another location before installing version 6.1 of the monitoring agent.
If debug mode is on, each entry written to the monitored log will be
recorded in a debug log. In addition, the formatted entry that is passed to a
situation is also written to the debug log. All logs that are monitored in
debug mode write to the same debug log.
The debug log is specified in the monitoring agent ul.ini file using the
AGENT_DEBUG_LOG environment variable. If this variable is undefined or
the log cannot be opened, no debug logging occurs. Each time the
monitoring agent is started, new events will be appended to the end of the
existing debug log.
3 Log type (optional: default = S)
v S = syslog
v E = errlog
v A = utmp log
v U = user-defined log
4 Format command. This command is valid only for type ‘E’ (errlog) and type
‘U’ (user-defined) logs.
For type ‘E’ logs, the format command must consist of an errpt command
that includes the ‘-c’ (concurrent mode) option. The default value is:
errpt -c -smmddhhmmyy
For user-defined logs, the format command describes both the format of the
log and how data will be mapped and formatted in the Tivoli Enterprise
Portal Log Entries table view. There is no default.
The actual logging activities are performed by the syslog daemon, syslogd, which
is controlled through a configuration file usually called /etc/syslog.conf. This file
is usually maintained by the system administrator.
The syslog.conf file specifies where syslog messages are to be written, based on
their severity and on the application from which they originate. This allows
The monitoring agent will attempt to build a default list of logs to monitor from
the syslog daemon configuration file under the following circumstances:
v The KUL_CONFIG_FILE environment variable is undefined.
v The specified customer configuration file does not exist or cannot be opened.
v There are no log names in the customer configuration file.
v None of the logs contained in the customer configuration file are valid.
The file that the monitoring agent reads to build the default monitored logs list is
called /etc/syslog.conf, but this can be overridden using the KUL_SYSLOG_CONF
environment variable.
If you are interested only in monitoring syslogs, you can omit the
KUL_CONFIG_FILE environment variable from the ul.ini file, or you can leave the
variable unassigned, thereby letting the monitoring agent determine which syslogs
are active on each system based on the syslogd configuration file.
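The fallback described above can be pictured with the following minimal Python sketch. It is an illustration only, not the agent's implementation. It assumes the classic syslog.conf layout of a selector followed by an action, treats any action that begins with "/" as a log file, and assumes that "#" marks a commented-out entry in the customer configuration file. The function names are hypothetical, and the check that the listed logs are valid is omitted.

```python
import os

DEFAULT_SYSLOG_CONF = "/etc/syslog.conf"

# Hypothetical syslog.conf content used only to exercise the sketch.
SAMPLE = (
    "# sample syslogd configuration\n"
    "*.err;kern.debug\t/var/adm/messages\n"
    "mail.debug\t/var/log/syslog\n"
    "*.emerg\t*\n"
    "auth.notice\t@loghost\n"
)

def customer_config_usable(env):
    """Return True if KUL_CONFIG_FILE names a readable file with an entry."""
    path = env.get("KUL_CONFIG_FILE")
    if not path:
        return False              # variable undefined or left unassigned
    try:
        with open(path) as handle:
            lines = [line.strip() for line in handle]
    except OSError:
        return False              # file does not exist or cannot be opened
    # Assumption: "#" marks a commented-out entry.
    return any(line and not line.startswith("#") for line in lines)

def syslog_conf_path(env):
    """Return the syslog configuration file, honoring KUL_SYSLOG_CONF."""
    return env.get("KUL_SYSLOG_CONF") or DEFAULT_SYSLOG_CONF

def syslog_log_files(conf_text):
    """Collect the file actions named in syslog.conf text, in order."""
    paths = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) < 2:
            continue
        action = parts[1]
        # Only actions that are absolute paths are treated as logs.
        if action.startswith("/") and action not in paths:
            paths.append(action)
    return paths

if not customer_config_usable(dict(os.environ)):
    print("would fall back to", syslog_conf_path(dict(os.environ)))
print(syslog_log_files(SAMPLE))
```

In this sketch, actions such as "*" (all users) and "@loghost" (remote host) are skipped because they do not name a local file.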
If you are using the C shell, the following command would produce the same
result:
setenv VAR VALUE
Examples
If you are using the Bourne or Korn shells, use the following commands to assign
a value of install_dir/config/myconfig to the KUL_CONFIG_FILE variable.
KUL_CONFIG_FILE=install_dir/config/myconfig
export KUL_CONFIG_FILE
Note: A refresh occurs only if the monitoring agent determines that the
configuration file has been modified since the agent was started, or since the
previous refresh. If you have not modified the configuration file, but want to
restart a monitor, change the modification date of the configuration file prior
to sending a refresh signal.
To change the modification date of the configuration file prior to sending a signal,
issue the following command from the install_dir/config directory on the managed
system where the monitoring agent is running:
touch kul_configfile
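The reason the touch command is needed can be shown with a short Python sketch. It assumes that the refresh check is a simple comparison of the file modification time against the time of the last refresh; this guide does not document the agent's actual test, and the function names and time values are hypothetical.

```python
import os
import tempfile

def modified_since(path, last_refresh):
    """Return True if the file changed after the last refresh time."""
    return os.path.getmtime(path) > last_refresh

def touch(path):
    """Set the access and modification times of the file to now."""
    os.utime(path, None)

with tempfile.TemporaryDirectory() as tmp:
    cfg = os.path.join(tmp, "kul_configfile")
    open(cfg, "w").close()
    os.utime(cfg, (1000, 1000))   # pretend the file was last edited at t=1000
    last_refresh = 2000           # pretend the agent last refreshed at t=2000
    before = modified_since(cfg, last_refresh)
    touch(cfg)                    # same effect as: touch kul_configfile
    after = modified_since(cfg, last_refresh)

print(before, after)
```

Before the touch, the unmodified file would be ignored by a refresh; after it, the file appears to be modified even though its content is unchanged.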
To monitor an ASCII log that does not conform to any of the three supported
types, see Appendix A, “Generic user log support,” on page 43.
This chapter provides information about how to use the Monitoring Agent for
UNIX Logs to perform the following tasks:
v “View real-time data about UNIX logs”
v “Investigate an event” on page 12
v “Recover the operation of a resource” on page 12
v “Customize your monitoring environment” on page 13
v “Monitor with custom situations that meet your requirements” on page 14
v “Collect and view historical data” on page 15
For each of these tasks, there is a list of procedures that you perform to complete
the task. For the procedures, there is a cross-reference to where you can find
information about performing that procedure. Information about the procedures is
located in subsequent chapters and appendixes of this user’s guide and in the IBM
Tivoli Monitoring documentation.
Table 4 contains a list of the procedures for viewing the real-time data about UNIX
logs that the monitoring agent collects through the predefined situations. The table
also contains a cross-reference to where you can find information about each
procedure.
Table 4. Viewing real-time data about UNIX logs
Procedure | Where to find information
View the hierarchy of your monitored resources from a system point of view (Navigator view organized by operating platform, system type, monitoring agents, and attribute groups). | IBM Tivoli Monitoring User’s Guide: “Navigating through workspaces” (in “Monitoring: real-time and event-based” chapter)
View the indicators of real or potential problems with the monitored resources (Navigator view). |
View changes in the status of the resources that are being monitored (Enterprise Message Log view). | IBM Tivoli Monitoring User’s Guide: “Using workspaces” (in “Monitoring: real-time and event-based” chapter)
View the status of the agents in the managed enterprise that you are monitoring (Monitoring Agent Status view). | Chapter 4, “Workspaces reference,” on page 17 in this guide
Investigate an event
When the conditions of a situation have been met, an event indicator is displayed
in the Navigator. When an event occurs, you want to obtain information about that
event so you can correct the conditions and keep your enterprise running
smoothly. The situation must be associated with a Navigator Item in order to
appear.
Table 6 on page 13 contains a list of the procedures for recovering the operation of
a resource and a cross-reference to where you can find information about each
procedure.
Note: When you create and run a situation, an IBM Tivoli Enterprise Console
event is created. For information on how to define event severities from
forwarded IBM Tivoli Monitoring situations and other event information,
see the IBM Tivoli Monitoring Administrator’s Guide.
Table 8 contains a list of the procedures for monitoring your resources with custom
situations that meet your requirements and a cross-reference to where you can find
information about each procedure.
Table 8. Monitoring with custom situations
Procedure | Where to find information
Create an entirely new situation. | IBM Tivoli Monitoring User’s Guide: “Creating a new situation” (in “Situations for event-based monitoring” chapter, “Creating a situation” section)
Table 9 on page 16 contains a list of the procedures for collecting and viewing
historical data and a cross-reference to where you can find information about each
procedure.
About workspaces
A workspace is the working area of the Tivoli Enterprise Portal application
window. At the left of the workspace is a Navigator that you use to select the
workspace you want to see.
As you select items in the Navigator, the workspace presents views pertinent to
your selection. Each workspace has at least one view. Some views have links to
workspaces. Every workspace has a set of properties associated with it.
For a list of the predefined workspaces for this monitoring agent and a description
of each workspace, refer to the Predefined workspaces section below and the
information in that section for each individual workspace.
Predefined workspaces
The Monitoring Agent for UNIX Logs provides the following predefined
workspaces:
v Log Entries
v Monitored Logs
The Log Entries table view provides entry data and a description of each entry in
the monitored log. The Log Size chart depicts the size of each monitored log file, in bytes. The
Monitored Logs table view lists a variety of status details associated with the logs
you are monitoring. The Number of Events chart depicts the total number of
events detected by the monitor since the monitor was first started. Based on the
information that this workspace provides, you can make changes, and set up
situations.
Typical scenarios
This section illustrates how you can use the workspaces to monitor logs in some
typical scenarios.
Security issues
A common technique used by hackers to gain unauthorized access to your systems
is to guess the password for a known user ID, often for the superuser. Failed logon
attempts are often recorded in a log. For instance, if someone issues the “su”
command to change the user ID to which they are currently logged on and enters
an invalid password for the new user ID, an entry is usually written to the file
/usr/adm/suaudit. You can create a situation that alerts you to possible break-in
attempts when repeated login failures to a user ID occur within a short period of
time.
Display the Monitored Logs workspace for the system in question to confirm that
you are monitoring the appropriate log. Display the Log Entries workspace for the
log to verify the format of the message written when a login attempt fails, for
example, “BAD SU from user1 to root”. Construct a situation that will fire if a
message indicating a logon failure to “root” is detected more frequently than
would normally be expected.
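The idea behind such a situation, several failures for one user ID within a short interval, can be sketched in Python as a sliding-window count. This is an illustration of the logic only and is not how the situation is evaluated by the product. The message format is the “BAD SU from user1 to root” example above; the limit of three failures in 60 seconds, the function name, and the sample entries are assumptions.

```python
import re
from collections import deque

BAD_SU = re.compile(r"BAD SU from (\S+) to (\S+)")

def detect_bursts(entries, target="root", limit=3, window=60):
    """Return the times at which `limit` failures fall within `window` seconds.

    entries is an iterable of (epoch_seconds, message) pairs in time order.
    """
    recent = deque()
    alerts = []
    for timestamp, message in entries:
        match = BAD_SU.search(message)
        if not match or match.group(2) != target:
            continue
        recent.append(timestamp)
        # Drop failures that are older than the window.
        while recent and timestamp - recent[0] > window:
            recent.popleft()
        if len(recent) >= limit:
            alerts.append(timestamp)
    return alerts

# Hypothetical log entries.
sample = [
    (0, "BAD SU from user1 to root"),
    (10, "BAD SU from user1 to root"),
    (15, "SU from user2 to root"),
    (20, "BAD SU from user3 to root"),
    (200, "BAD SU from user1 to root"),
]
print(detect_bursts(sample))
```

The single failure at 200 seconds does not raise an alert because the earlier failures have aged out of the window.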
Display the Monitored Logs workspace for a client system to confirm that you are
monitoring the appropriate log. Display the Log Entries workspace for the log to
verify the format of the message written when an NFS request fails, for example,
“NFS server system1 not responding”. Construct a situation that fires if a message
indicating a server problem is detected more frequently than would normally be
expected.
If you want to monitor an application that is not already writing to a log file and
you can modify the application, add the necessary syslog system calls (or, on AIX
platforms, errlog system calls), to the code to produce the desired messages and
add the new logs to the configuration file.
If you want to monitor an application that you cannot modify and which is
writing messages to an ASCII log in a non-standard format, add the log to the
configuration file as normal but set its type to ‘U’ (user-defined). User-defined logs
require a format command that the monitoring agent uses to read log entries and
map the data into the Log Entries workspace.
The following is the entry that you would include in the configuration file that will
format the log entries appropriately both for your situation and for report requests.
See “Customer configuration file format” on page 7 for details on the format of the
configuration file.
Analyzing this line one component at a time from left to right, you can see that the
first item specifies the absolute file name of the log you want to monitor, in this
The ‘U’ item indicates that the log type is user-defined, which mandates the
presence of the last item, the format command.
The format command starts with an ‘a’ (ASCII) and a comma. Everything between
the double quotes that follow is part of the format description that tells the
monitoring agent the format of the log. This format description allows you to
break each log entry into arbitrary fields consisting of one or more characters.
Following the format description is a comma that is followed by the mapping
specifications. These indicate into which column of the Log Entries workspace each
field must be mapped.
In this example, the format description breaks each log entry into 11 fields. Each
field is then mapped into one column of the Log Entries workspace by the 11
mapping specifiers.
Table 10. Log entry fields and their descriptions
Field | Scan directive | Mapping and formatting data into the Log Entries workspace
1 | %9s | Consumes the first 9 characters of the message identifier, which are mapped into the “description” column.
2 | %c | Consumes the 10th character, in this example the message severity indicator (I, W, or E). This is mapped into the “type” column.
3 | /%d | Consumes the ‘/’ character and the following integer. The ‘/’ literal causes the ‘/’ character to be discarded, but the integer is mapped into the “class” column. The mapping specification for the class column includes a format specifier “RC=%x”. This inserts the literal “RC=” into the column and converts the integer to hexadecimal.
4 | %d | Consumes the white space and the following integer, mapping it into the “month” component of the Entry Time column.
5 | /%d | Consumes and discards the ‘/’ character and then consumes the following integer, mapping it into the “day” component of the Entry Time column.
6 | /%d | Consumes and discards the ‘/’ character and then consumes the following integer, mapping it into the “year” component of the Entry Time column.
7 | %d | Consumes the next integer, mapping it into the “hour” component of the Entry Time column.
8 | :%d | Consumes and discards the ‘:’ character and then consumes the following integer, mapping it into the “minute” component of the Entry Time column.
9 | :%d | Consumes and discards the ‘:’ character and then consumes the following integer, mapping it into the “second” component of the Entry Time column.
10 | %s | Consumes and discards the white space preceding the next character and then maps the next character string (terminated by white space) into the “source” column.
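The field-by-field consumption described in Table 10 can be imitated with a regular expression, as in the following Python sketch. It is not the agent's scanner and does not use the format command syntax. It covers only fields 1 through 10 of the table, and the sample log line and dictionary keys are hypothetical.

```python
import re

# One group per field in Table 10, in the same order.
ENTRY = re.compile(
    r"(\S{9})"     # field 1, %9s: first 9 characters of the message identifier
    r"(.)"         # field 2, %c: severity indicator (I, W or E)
    r"/(\d+)"      # field 3, /%d: integer shown as RC=<hex> in the class column
    r"\s+(\d+)"    # field 4, %d: month
    r"/(\d+)"      # field 5, /%d: day
    r"/(\d+)"      # field 6, /%d: year
    r"\s+(\d+)"    # field 7, %d: hour
    r":(\d+)"      # field 8, :%d: minute
    r":(\d+)"      # field 9, :%d: second
    r"\s+(\S+)"    # field 10, %s: source
)

def map_entry(line):
    """Map one log line to column values, or None if it does not fit."""
    match = ENTRY.match(line)
    if match is None:
        return None   # such an entry would count as a format error
    ident, sev, rc, mon, day, year, hh, mm, ss, source = match.groups()
    return {
        "description": ident,
        "type": sev,
        "class": "RC=%x" % int(rc),   # decimal integer rendered in hexadecimal
        "entry_time": (int(year), int(mon), int(day), int(hh), int(mm), int(ss)),
        "source": source,
    }

# Hypothetical log line in the layout that Table 10 describes.
row = map_entry("MYAPP0123E/16 03/15/05 14:02:59 server1 disk full")
print(row)
```

An entry that does not match the expected layout returns None here, which corresponds to the entries counted by the Number of Format Errors attribute.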
You can now create a situation that fires if any ‘E’ type messages are written to
this log file by including the following predicates:
v Log_Entries.Log_Name= mylog
v Log_Entries.Type= E
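Both predicates must be true for the situation to fire. As a rough Python analogy, assuming that each log entry is represented as a dictionary keyed by attribute name (a hypothetical representation, not a product interface), the test is equivalent to the following filter:

```python
def e_type_entries(rows, log_name="mylog"):
    """Keep the entries that satisfy both predicates of the situation."""
    return [
        row for row in rows
        if row["Log_Name"] == log_name and row["Type"] == "E"
    ]

# Hypothetical Log Entries rows.
rows = [
    {"Log_Name": "mylog", "Type": "I"},
    {"Log_Name": "mylog", "Type": "E"},
    {"Log_Name": "otherlog", "Type": "E"},
]
matches = e_type_entries(rows)
print(matches)
```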
You can also include an ‘Until’ predicate in the situation that causes it to be reset
automatically. The Until predicate allows you to specify that the situation is to be
reset after a certain time interval or when another situation is true.
This can be useful if you are monitoring events that occur in pairs, for example:
server redwood not responding
server redwood OK
In this case, you might create a situation called “Server_OK” that monitors a
certain log file looking for messages that contain the text “redwood OK”. Then
create another situation called “Server_Error” monitoring the same log but looking
for the text “redwood not responding”. In this second situation, include an Until
predicate. Open the Until settings page and refer to the “Reset this situation when”
Distribute both situations as usual to the managed systems on which you want
them to run. Add the situation called “Server_Error” to one of the states of a new
or existing template and drag the template to the Enterprise icon to create a
managed object. Assign the managed object to one or more of the managed
systems to which you distributed the situations.
When an event is written to the monitored log containing the text “redwood not
responding”, the Server_Error situation fires, causing the managed object to change
state. When a message is written to the log containing the text “redwood OK”, the
Server_Error situation is reset and the managed object state reverts to normal.
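The interaction of the two situations behaves like a two-state machine. The following Python sketch is a simplified model of that behavior only; it does not represent how the Tivoli Enterprise Portal evaluates situations, and the function name and sample messages are assumptions.

```python
def server_error_states(messages):
    """Return the Server_Error state (True = fired) after each message."""
    fired = False
    states = []
    for message in messages:
        if "redwood not responding" in message:
            fired = True          # Server_Error condition is met
        elif "redwood OK" in message:
            fired = False         # Server_OK is true, so Server_Error resets
        states.append(fired)
    return states

# Hypothetical sequence of log messages.
log = [
    "server redwood not responding",
    "unrelated message",
    "server redwood OK",
]
print(server_error_states(log))
```

The situation stays fired across unrelated messages and returns to normal only when the paired message arrives.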
About attributes
Attributes are the application properties being measured and reported by the
Monitoring Agent for UNIX Logs, such as the log size. Some monitoring agents
have fewer than 100 attributes, while others have over 1000.
Attributes are organized into groups according to their purpose. The attributes in a
group can be used in the following two ways:
v Chart or table views
Attributes are displayed in chart and table views. The chart and table views use
queries to specify which attribute values to request from a monitoring agent.
You use the Query editor to create a new query, modify an existing query, or
apply filters and set styles to define the content and appearance of a view based
on an existing query.
v Situations
You use attributes to create situations that monitor the state of your operating
system, database, or application. A situation describes a condition you want to
test. When you start a situation, the Tivoli Enterprise Portal compares the values
you have assigned to the situation attributes with the values collected by the
Monitoring Agent for UNIX Logs and registers an event if the condition is met.
You are alerted to events by indicator icons that appear in the Navigator.
Some of the attributes in this chapter are listed twice, with the second attribute
having a “(Unicode)” designation after the attribute name. These Unicode attributes
were created to provide access to globalized data. Use the globalized attribute
names because this is where the monitoring agent is putting the data. If you were
using a previous Candle® OMEGAMON release of this monitoring agent, you
must run the Application Migration Tool to create globalized attributes for your
customized queries, situations, and policies. See the IBM Tivoli Monitoring
Installation and Setup Guide for more information.
For a list of the attribute groups, a list of the attributes in each attribute group,
and descriptions of the attributes for this monitoring agent, refer to the Attribute
groups and attributes section in this chapter.
The following sections contain descriptions of these attribute groups, which are
listed alphabetically. Each description contains a list of attributes in the attribute
group.
Description The content of the log entry. Valid entry is an alphanumeric text
string, with a maximum length of 256 characters.
Description (Unicode) The content of the log entry. Valid entry is an alphanumeric
text string, with a maximum length of 768 bytes. This attribute is globalized.
Entry Time The date and time, as set on the monitored system, indicating the
instance when the entry was written.
Log Name The name of the monitored log. Valid entry is an alphanumeric text
string, with a maximum length of 128 characters.
Log Name (Unicode) The name of the monitored log. Valid entry is an
alphanumeric text string, with a maximum length of 384 bytes. This attribute is
globalized.
Log Path The absolute path name of the monitored log. Valid entry is an
alphanumeric text string, with a maximum length of 256 characters.
Log Path (Unicode) The absolute path name of the monitored log. Valid entry is an
alphanumeric text string, with a maximum length of 768 bytes. This attribute is
globalized.
Managed System The name of the node that the agent is monitoring. Valid entry is
an alphanumeric text string, with a maximum length of 64 characters.
Period Threshold The interval within which an event must occur at least a
user-specified number of times before a situation is raised. Valid entries are
integers in the range 0 to 31622400. Note that Period Threshold does not display in
the workspace, although it can be used as a situation predicate.
Source (Unicode) The application or resource that logged the entry. Valid entry is
an alphanumeric text string, with a maximum length of 96 bytes. This attribute is
globalized.
System The system on which the entry was written. Valid entry is an
alphanumeric text string, with a maximum length of 32 characters.
Timestamp The date and time, as set on the monitored system, indicating the
instance when the agent collects information.
Type The type of entry for errlogs and utmp logs. For errlog entries, the entry
types include PEND, PERF, PERM, TEMP and UNKN. For utmp logs, the entry
types include Unused space, Run level, System boot time, User logon time, User
idle time, init process, getty waiting, User process, Zombie process, and
Accounting. Valid entry is an alphanumeric text string, with a maximum length of
16 characters.
Date Last Modified The date and time, as set on the monitored system, indicating
the instance when the log was last modified.
Format Command The name of the command to be invoked to format log entries.
For errlogs, this attribute represents the name of the command to be invoked to
format entries in ASCII format. For user logs, this attribute represents the format
command used to describe the log’s format and how the data must be mapped
and formatted in the Log Entries workspace. Valid entry is an alphanumeric text
string, with a maximum length of 256 characters.
Log Name The name of the monitored log. Valid entry is an alphanumeric text
string, with a maximum length of 128 characters.
Log Name (Unicode) The name of the monitored log. Valid entry is an
alphanumeric text string, with a maximum length of 384 bytes.
Log Path The absolute path name of the monitored log. Valid entry is an
alphanumeric text string, with a maximum length of 256 characters.
Log Path (Unicode) The absolute path name of the monitored log. Valid entry is an
alphanumeric text string, with a maximum length of 768 bytes. This attribute is
globalized.
Log Size The size of the monitored log file, in bytes. Valid entries are integers in
the range of 0 to 2147483647.
Managed System The name of the node that the agent is monitoring. Valid entry is
an alphanumeric text string, with a maximum length of 64 characters.
Monitor Start/Stop Time A time stamp indicating the time at which a monitor
started running (if the monitor status is running) or the time at which the monitor
terminated.
Monitor Status If the log monitor is active, the status is running; otherwise, the
status indicates the error that caused the monitor to terminate. Valid entry is an
alphanumeric text string, with a maximum length of 32 characters. Valid values
are:
Number of Events The number of events detected by the monitor since monitoring
started. Valid entries are integers in the range of 0 to 2147483647.
Number of Format Errors The number of events that the monitor was unable to
understand and format (and as a result, were discarded). Valid entries are integers
in the range of 0 to 2147483647.
Timestamp The date and time, as set on the monitored system, indicating the
instance when the agent collects information.
For more information about historical data collection, see the IBM Tivoli Monitoring
Administrator’s Guide.
About situations
A situation is a logical expression involving one or more system conditions.
Situations are used to monitor the condition of systems in your network. You can
manage situations from the Tivoli Enterprise Portal by using the Situation editor.
The IBM Tivoli Monitoring agents that you use to monitor your system
environment are shipped with a set of predefined situations that you can use as-is
or you can create new situations to meet your requirements. Predefined situations
contain attributes that check for system conditions common to many enterprises.
Using predefined situations can improve the speed with which you can begin
using the Monitoring Agent for UNIX Logs. You can examine and, if necessary,
change the conditions or values being monitored by a predefined situation to those
best suited to your enterprise.
Note: The predefined situations provided with this monitoring agent are not
read-only. Do not edit these situations and save over them. Software updates
will write over any of the changes that you make to these situations.
Instead, clone the situations that you want to change to suit your enterprise.
You can display predefined situations and create your own situations using the
Situation editor. The left frame of the Situation editor initially lists the situations
associated with the Navigator item that you selected. When you click a situation
name or create a new situation, the right frame opens with the following tabs:
Formula
Condition being tested
Distribution
List of managed systems (operating systems, subsystems, or applications)
to which the situation can be distributed.
Expert Advice
Comments and instructions to be read in the event workspace
Action
Command to be sent to the system
Until
Duration of the situation
For a list of the predefined situations for this monitoring agent and a description
of each situation, refer to the Predefined situations section below and the
information in that section for each individual situation.
HACMP_acquire_service_addr situation
Configures the boot address to the corresponding service address and starts
TCP/IP servers and network daemons by running the telinit -a command.
HACMP_config_too_long situation
Sends a periodic console message when a node has been in reconfiguration for
more than six minutes.
HACMP_event_error situation
Occurs when an HACMP event script fails for some reason.
HACMP_fail_standby situation
Sends a console message when a standby adapter fails or is no longer available
because it has been used to take over the IP address of another adapter.
HACMP_get_disk_vg_fs situation
Acquires disk, volume group, and file system resources as part of takeover.
HACMP_network_down situation
Occurs when the cluster determines that a network has failed. The event script
provided takes no default action because the appropriate action is site or LAN
specific.
HACMP_network_down_complete situation
Occurs only after a network_down event has successfully completed. The event
script provided takes no default action because the appropriate action is site or
LAN specific.
HACMP_network_up situation
Occurs when the cluster determines that a network has become available. The
event script provided takes no default action because the appropriate action is site
or LAN specific.
HACMP_network_up_complete situation
Occurs only after a network_up event has successfully completed. The event script
provided takes no default action because the appropriate action is site or LAN
specific.
HACMP_node_down_complete situation
Occurs only after a node_down event has successfully completed. Depending on
whether the node is local or remote, either the node_down_local_complete or
node_down_remote_complete subevent is called.
HACMP_node_down_local situation
Releases resources taken from a remote node, stops application servers, releases a
service address taken from a remote node, releases concurrent volume groups,
unmounts file systems, and reconfigures the node to its boot address.
HACMP_node_down_local_complete situation
Instructs the cluster manager to exit when the local node has completed detaching
from the cluster. This event occurs only after a node_down_local event has
successfully completed.
HACMP_node_down_remote situation
Unmounts any NFS file systems and places a concurrent volume group in
nonconcurrent mode when the local node is the only surviving node in the cluster.
If the failed node did not go down gracefully, acquires the failed node’s resources:
file systems, volume groups, disks, and service address.
HACMP_node_down_rmt_complete situation
Starts takeover application servers if the remote node did not go down gracefully.
This event occurs only after a node_down_remote event has successfully completed.
HACMP_node_up situation
Occurs when a node is joining the cluster. Depending on whether the node is local
or remote, either the node_up_local or node_up_remote subevent is called.
HACMP_node_up_complete situation
Occurs only after a node_up event has successfully completed. Depending on whether
the node is local or remote, either the node_up_local_complete or
node_up_remote_complete subevent is called.
HACMP_node_up_local situation
When the local node attaches to the cluster, the HACMP_node_up_local situation
acquires the service address, clears the application server file, acquires file
systems, volume groups, and disk resources, exports file systems, and either
activates concurrent volume groups or puts them into concurrent mode, depending
upon the status of the remote nodes.
HACMP_node_up_local_complete situation
Starts application servers and then checks to see if an inactive takeover is needed.
This event occurs only after a node_up_local event has successfully completed.
HACMP_node_up_remote situation
Causes the local node to do an NFS mount only after the remote node starts and to
place the concurrent volume groups into concurrent mode.
HACMP_node_up_remote_complete situation
Allows the local node to do an NFS mount only after the remote node is
completely up. This event occurs only after a node_up_remote event has
successfully completed.
HACMP_release_service_addr situation
Detaches the service address and reconfigures to the boot address.
HACMP_release_takeover_addr situation
Identifies a takeover address to be released because a standby adapter on the local
node is masquerading as the service address of the remote node. Reconfigures the
local standby into its original role.
HACMP_release_vg_fs situation
Releases volume groups and file systems that the local node took from the remote
node.
HACMP_stop_server situation
Stops application servers.
HACMP_swap_adapter situation
Exchanges or swaps the IP addresses of two network interfaces. NIS and name
serving are temporarily turned off during the event.
HACMP_swap_adapter_complete situation
Occurs only after a swap_adapter event has successfully completed. Ensures that
the local Address Resolution Protocol (ARP) cache is updated by deleting entries
and pinging cluster IP addresses.
UNIX_LAA_Bad_su_to_root_Warning situation
Raises an alert if a logon failure to root message is written to usr/adm/suaudit
more than three times within a minute.
When included in a situation, the command executes when the situation becomes
true. A Take Action command in a situation is also referred to as reflex automation.
When you enable a Take Action command in a situation, you automate a response
to system conditions. For example, you can use a Take Action command to send a
command to restart a process on the managed system or to send a text message to
a cell phone.
About policies
Policies are an advanced automation technique for implementing more complex
workflow strategies than you can create through simple automation.
A policy is a set of automated system processes that can perform actions, schedule
work for users, or automate manual tasks. You use the Workflow Editor to design
policies. You control the order in which the policy executes a series of automated
steps, which are also called activities. Policies are connected to create a workflow.
After an activity is completed, Tivoli Enterprise Portal receives return code
feedback and advanced automation logic responds with subsequent activities
prescribed by the feedback.
Note: For monitoring agents that provide predefined policies, predefined policies
are not read-only. Do not edit these policies and save over them. Software
updates will write over any of the changes that you make to these policies.
Instead, clone the policies that you want to change to suit your enterprise.
For information about using the Workflow Editor, see the IBM Tivoli Monitoring
Administrator’s Guide or the Tivoli Enterprise Portal online help.
Predefined policies
There are no predefined policies for this monitoring agent.
While the data is being mapped into the table view, you can perform data type
conversions (for example, decimal to hexadecimal) and apply formatting to
clarify the table view and facilitate the creation of situations.
Format command
A format command is composed as follows:
The format description and formatting specifications both use a syntax based on
that used by the standard ‘C’ scanf and printf functions. The format command
syntax is, perhaps, best illustrated through a simple example. The format
command syntax is explained in detail after the example.
The following format command enables the monitoring agent to monitor this log
allowing you to both create situations looking for specific messages and to display
the log’s contents within the Tivoli Enterprise Portal Log Entries table view.
A , “%s %s %d %d %d:%d %s %s %[^:] : %[^\n]” , type month day year
hour minute hour system source description
type      MSG123
month     Dec
day       25
year      2004
hour      3 pm
min       15
system    system1
source    myapp
In addition to fields with content that varies from one log entry to another, an
entry can contain fixed character strings that occur in the same relative location in
all log entries. These are termed literals. Literals can be embedded anywhere
within a format description.
A scan directive has the following format. (Items enclosed within brackets are
optional.)
%[(offset)][*][width][size]datatype
All scan directives must include at least a percent sign, ‘%’, and a datatype.
Each scan directive starts from where the previous one ended unless it is the first
(in which case it starts at column 1 in the log entry), or an offset option has been
included in the directive. Each scan directive consumes characters from a log entry
until any of the following occurs:
v An inappropriate character is encountered (that is, one that does not match the
expected data type).
v The field width, if specified, is exhausted.
v The end of the log entry is encountered.
Literals: Literals describe a sequence of one or more characters that occur at the
same relative location in every entry in the log, and which you do not want to
map into a table view column.
Specifying a literal makes the monitoring agent look for and read those characters
from a log entry, and then discard them. If you include a literal, it must match
exactly the character sequence in a log entry, otherwise that entry is ignored.
For example, to read a time from a log that has the format:
03:15
you can use the following format description:
%d:%d
In this example, the colon ‘:’ preceding the second directive is a literal.
Any number of white space characters that immediately precede the start of a field
in a log entry are automatically consumed and discarded (unless the data type of
the next scan directive is a character, for example %c). To consume any number of
white space characters that are embedded within a literal in a log entry, include
one or more white space characters in the format description literal. For example,
suppose a log entry has the following format:
MSG123 < Code 9 > System1
If you want to extract only the message field, the code number, and the system,
use the following format description:
%s < Code%d >%s
The first directive scans in the message field. The single blank following the first
directive consumes all the white space between the message field and the ‘<’ sign
in the log entry. Similarly, the single blank between the ‘<’ sign and the literal
‘Code’ in the format description consumes all the white space between those same
literals within the log entry. No white space is required between the literal ‘Code’
and the ‘%d’ directive or between the literal ‘>’ and the ‘%s’ directive because this
white space is automatically consumed.
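As a rough illustration of how blanks and literals behave in this format description, the following Python sketch approximates “%s < Code%d >%s” with a regular expression. It is an analogy only, not the monitoring agent's implementation; the function name scan_entry and the regular expression are assumptions made for this example.

```python
import re

# Regex stand-in for the format description "%s < Code%d >%s".
# \s* models the white space consumed by a blank in a literal or skipped
# before a %s or %d field; \S+ approximates %s; [-+]?\d+ approximates %d.
PATTERN = re.compile(r"\s*(\S+)\s*<\s*Code\s*([-+]?\d+)\s*>\s*(\S+)")

def scan_entry(entry):
    """Return (message, code, system), or None if the literals do not match."""
    match = PATTERN.match(entry)
    if match is None:
        return None  # an entry whose literals do not match is ignored
    message, code, system = match.groups()
    return message, int(code), system

print(scan_entry("MSG123 <   Code 9 >    System1"))
```

Extra blanks in the entry do not change the result, and an entry that lacks the ‘<’ or ‘>’ literal yields no match.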
To include a percent sign, ‘%’, in a literal, specify two adjacent percent sign
characters in the format description. For example, if you want to extract the
percentages from a log that contains the following three fields:
45% 82% 2%
(The blanks embedded within the description are not required but clarify the
example.)
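The idea of a doubled percent sign acting as a literal can be sketched in Python as follows. The format description assumed here is “%d%% %d%% %d%%”; both that description and the function name scan_percentages are assumptions for illustration, and the regular expression only approximates the agent's scanning.

```python
import re

# Regex stand-in for the assumed description "%d%% %d%% %d%%":
# each "%%" in the description is a literal percent sign in the entry.
PATTERN = re.compile(r"\s*(\d+)%\s*(\d+)%\s*(\d+)%")

def scan_percentages(entry):
    """Return the three percentages as integers, or None if there is no match."""
    match = PATTERN.match(entry)
    return tuple(int(group) for group in match.groups()) if match else None

print(scan_percentages("45% 82% 2%"))
```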
Offsets: Offsets allow you to specify the absolute column within a log entry at
which a scan directive starts; if no offset is specified, each succeeding scan starts
where the previous one ended. The first column in an entry is 1. Offsets can
facilitate the description of fixed field logs; that is, those in which a field starts in
the same column in every entry.
For example, suppose each log entry starts with a message number as in the
following example:
MSG123 Dec 25
If you want to extract only the message number, discarding the text “MSG” that
precedes it, you can use the following scan directive:
%(4)d
This causes the scan to start in column 4, skipping over and discarding the first
three characters.
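A minimal Python sketch of the offset idea follows. It treats the offset as a 1-based column and then reads a decimal integer from that position; scan_int_at is a hypothetical helper, not part of the product.

```python
import re

def scan_int_at(entry, column):
    """Approximate %(column)d: start at a 1-based column and read an integer."""
    remainder = entry[column - 1:]            # column 1 is the first character
    match = re.match(r"\s*([-+]?\d+)", remainder)
    return int(match.group(1)) if match else None

# Column 4 is the first digit of the message number in "MSG123".
print(scan_int_at("MSG123 Dec 25", 4))
```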
If you want to skip over the message number field entirely, you can use this
format description:
%*s %s %d
These directives cause the message number field to be ignored but the month and
day are stored and mapped to a column. Since the data is discarded for scan
directives that are suppressed, such directives have no corresponding data
mapping specifier (which associates log data with a table view column).
Width: The width option allows you to specify the maximum number of
characters that is consumed from a log entry to satisfy a scan directive. For
example, suppose you want to extract only the first 2 digits from the message
number from the following log entry:
MSG123 Dec 25
You can use the following format description:
%*3s%2d%*d
The first directive, “%*3s”, discards the first three characters of the message
number field (the text “MSG”). The second directive, “%2d”, saves the digits “12”
for mapping and the last directive, “%*d”, discards any digits that remain in the
message field. The digit ‘3’ is consumed and discarded.
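The effect of the three width-limited directives “%*3s”, “%2d”, and “%*d” can be approximated in Python. The regular expression below is an assumption that mirrors the widths; it is not how the agent itself is implemented.

```python
import re

# \S{1,3}   approximates %*3s (at most three characters, discarded)
# (\d{1,2}) approximates %2d  (at most two digits, kept)
# \d*       approximates %*d  (remaining digits, discarded)
PATTERN = re.compile(r"\s*\S{1,3}(\d{1,2})\d*")

def first_two_digits(entry):
    """Return the first two digits of the message number as an integer."""
    match = PATTERN.match(entry)
    return int(match.group(1)) if match else None

print(first_two_digits("MSG123 Dec 25"))
```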
For scan directives that have a data type of “string” (that is, %s or %[]), the default
width is 31.
Size: The size option can be used in numeric (that is, non-string) directives and
controls the amount of storage that is reserved to hold a scanned number.
Allocating more storage allows larger numbers to be scanned and stored. Unless
you need to scan very large numbers or need to increase the precision, the default
sizes are probably sufficient. The effect of including a size option in a numeric scan
directive depends on the operating system on which the monitoring agent is run.
The valid size option and data type combinations that you can specify are listed in
the following table.
Table 13. Monitoring Agent for UNIX Logs valid size option and data type combinations
Size option Can be used with these data types
l d, i, o, u, x, e, f, g
ll d, i, o, u, x
L e, f, g
h d, i, o, u, x
If you explicitly include formatting in the data mapping specifier for a scanned
value rather than allowing the print directive to default, the size option that you
specify in the scan and print directives must be consistent.
The following table lists the valid data types you can use in a scan directive to
describe alphanumeric data.
Table 14. Monitoring Agent for UNIX Logs valid alphanumeric data types
Data type Corresponding field in the log entry
s A sequence of nonwhite space characters. Characters from the log entry are
consumed until the first white space character is encountered or until the number of
characters specified in the field width has been exhausted. If no width is specified in
the directive, the default width is 31.
c A sequence of bytes. The number of bytes consumed is determined by the specified
width option. If no width is specified, the default is 1. Unlike all other data type
directives, white space immediately preceding the corresponding field in the log
entry is not automatically skipped. To skip over white space, you must explicitly
include a white space literal immediately preceding a directive with a data type of
character.
This feature is useful for describing fields in a log entry that might be blank
assuming the starting column and width of the optional field is known. For
example, suppose two entries from a log are as shown:
field1a field2a field3a
field1b field3b
In this example, the second directive will store “field2a” when the first entry is
processed and will contain blank when the second entry is processed.
This says that all non-blank, non-tab and non-newline characters will be consumed.
That is, the scan will end on the first white space character in the log entry. (See
“Escape characters” on page 55 for details on specifying escape characters in a
format command.)
A scanset allows a single scan directive to consume multiple log entry fields. For
example, a scanset that you might use frequently is one that reads all the
remaining characters in an entry:
%[^\n]
That is, consume everything from the current position in an entry up to the newline
character, which marks the end of the entry.
A scanset directive can also be used to terminate a scan before a simple type ‘s’
variable would, that is, when a white space character is found. For example,
suppose a log entry has embedded within it either one of the following two field
sequences:
Error code:24
Warning code:16
If you want only to extract the numeric code itself, the following directives could be
used:
%*[^:]:%d
This format description consumes and discards (field is suppressed) all characters
until a colon is found (exclusive scanset), then consumes and discards the colon
itself (an embedded literal) and finally consumes and stores the numeric code.
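A Python sketch of the same three steps (discard up to the colon, discard the colon, keep the number) follows. The regular expression is an approximation of “%*[^:]:%d”, and scan_code is a hypothetical name used only for this example.

```python
import re

# [^:]+ approximates the suppressed exclusive scanset %*[^:],
# ":" is the embedded literal, and ([-+]?\d+) approximates %d.
PATTERN = re.compile(r"[^:]+:\s*([-+]?\d+)")

def scan_code(field_sequence):
    """Return the numeric code that follows the colon, or None."""
    match = PATTERN.match(field_sequence)
    return int(match.group(1)) if match else None

print(scan_code("Error code:24"))
print(scan_code("Warning code:16"))
```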
As with a string data type, if you wish to consume more than 31 characters with a
scanset directive, you must include a maximum width option in the directive. For
example, to consume up to 60 characters from the current location up to the end of
the entry use:
%60[^\n]
Some operating systems support the use of a ‘-’ (dash) to represent a range of
characters, for example:
%[a-z]
This example includes all lowercase letters in the scanset. The character that
precedes the dash must be lexically less than the character following it; otherwise, the
dash stands for itself. Also, the dash stands for itself whenever it is the first or last
character in the scanset.
To include the right bracket in an inclusive scanset, it must immediately follow the
opening left bracket. To include the right bracket in an exclusive scanset, it must
immediately follow the circumflex character. In both cases, a right bracket so placed
is not considered the closing right bracket of the scanset.
Data mapping specifications: The data mapping specifications that comprise the
second component of a format command are separated from the format description
by a single comma. Each mapping specification is separated from the next by
white space. Every non-suppressed scan directive in the format description must
have a single, corresponding data mapping specifier to indicate into which column
of the Log Entries table view the scanned data must be mapped. That is, the
general form of a format command is as follows:
A , “%scan1 %scan2 %*scan3 %scan4” , mapspec1 mapspec2
mapspec4
As the data is mapped into a table view column, you can optionally specify how
you want it formatted and, for data that is read in and stored in numeric form,
convert it to a different type (for example, decimal to hexadecimal
or exponent form). See “Specifying log entry times” on page 57 for further details.
The columns into which scanned data can be mapped correspond to columns in
the Log Entries table view. The following table lists all the valid column names
and minimum abbreviations that can be used.
Table 15. Log entries table view column mapping names
Tivoli Enterprise Portal Log     Format command    Minimum
Entries table view column name   mapping name      abbreviation
Entry Time                       month             mo
                                 day               da
                                 year              ye
                                 hour              ho
                                 minute            mi
                                 second            se
Description                      description       de
Referring again to the example at the beginning of this section (“Example format
command” on page 43), notice that there are 10 scan directives and also that there
are 10 mapping specifications. Each successive scan directive, reading the format
description left to right, is associated with each successive mapping specifier. That
is, the first log entry field, MSG123, is read by the first scan directive, %s, and is
mapped to the first column specifier, type. The second field, Dec, is read by the
second scan directive, %s, and is mapped to the second column specifier, month,
and so on.
The corresponding format description and data mapping specifiers in the example
format command are:
“%d:%d %s” , hour minute hour
This causes ‘03’ to be read by the first scan directive, ‘%d’, and mapped into the
hour column. The next character in the log entry, the colon, matches the colon
literal in the format description and is discarded. The ‘15’ is read by the second
scan directive, ‘%d’, and is mapped into the minute column. The ‘pm’ is read by
the third scan directive, ‘%s’, and is also mapped into the hour column. The result
is that the hour and minute columns contain ‘3pm’ and ‘15’ respectively. (If the
monitoring agent is passed an hour of ‘3pm’ it converts this into 24-hour format
for display in the Log Entries table view. See “12-Hour format times” on page 57
for more information concerning valid date and time formats that can be passed to
the monitoring agent.)
The example above shows that it is necessary only to supply a column name into
which scanned log data is mapped. For each mapping specification that has no
explicit format specifier, default formatting is applied. The monitoring agent
expands the mapping specifiers in the previous example as follows:
“%d:%d %s” , hour=“%d” minute=“%d” hour=“%s”
How to override the default mapping format specifiers is the subject of the next
section.
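The directive-to-column pairing described in this section can be sketched in Python. The sample entry below is reconstructed from the field values quoted in this guide, and its description text is invented for illustration. The regular expression, the stripping of trailing blanks, and the function name map_entry are likewise assumptions, so treat this as an approximation rather than the agent's actual logic.

```python
import re

# Regex stand-in for "%s %s %d %d %d:%d %s %s %[^:] : %[^\n]".
PATTERN = re.compile(
    r"\s*(\S+)\s*(\S+)\s*(\d+)\s*(\d+)\s*(\d+):\s*(\d+)"
    r"\s*(\S+)\s*(\S+)\s*([^:]+):\s*([^\n]+)"
)
# One mapping specifier per scan directive, read left to right.
COLUMNS = ["type", "month", "day", "year", "hour", "minute",
           "hour", "system", "source", "description"]
INT_FIELDS = {2, 3, 4, 5}  # positions of the directives that use %d

def map_entry(entry):
    """Pair each scanned field with its column, applying default formatting."""
    match = PATTERN.match(entry)
    if match is None:
        return None
    row = {}
    for index, (column, text) in enumerate(zip(COLUMNS, match.groups())):
        # Default formatting: "%d" for integers, "%s" for strings.
        value = "%d" % int(text) if index in INT_FIELDS else "%s" % text.strip()
        # A column that is named twice receives both values, concatenated.
        row[column] = row.get(column, "") + value
    return row

sample = "MSG123 Dec 25 2004 03:15 pm system1 myapp : Sample message text"
print(map_entry(sample))
```

Because hour is named twice, the values scanned by the fifth and seventh directives end up in the same column.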
MappingName[=“[literals]%[options][width]
[.precision][size]datatype[literals]”]
Literals: Literals can be included before the scanned data is mapped into a table
view column, afterwards, or both, and can serve to clarify the Log Entries table
view and facilitate the creation of situations. For example, suppose you are
monitoring a log that includes the following three fields:
... 13303 15 4 ...
The first field indicates a process identifier, the second represents a return code
and the third a severity. You might choose to map and format the data as follows:
“... %d %d %d ...” , ... source=“proc id. = %d” desc=“RC = %d ” desc=“;
Severity = %d”...
(The ellipses represent omitted fields.) This causes the following to be displayed in
the source and description columns in the Log Entries table view:
Source              Description
proc id. = 13303    RC = 15 ; Severity = 4
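Because the mapping format specifiers follow printf conventions, the way literals are added around scanned values can be imitated with Python's printf-style operator. The specifier strings are copied from the format command above; the function name map_fields and the dictionary layout are assumptions for this sketch.

```python
# Mapping specifiers from the example, in scan order: (column, format specifier).
SPECIFIERS = [
    ("source", "proc id. = %d"),
    ("desc", "RC = %d "),
    ("desc", "; Severity = %d"),
]

def map_fields(values):
    """Format each scanned value and append it to its column, in order."""
    columns = {}
    for (column, spec), value in zip(SPECIFIERS, values):
        columns[column] = columns.get(column, "") + spec % value
    return columns

print(map_fields([13303, 15, 4]))
```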
For example, suppose a log entry contains the character sequence “65000” and the
format command contains:
“... %d ...” , ... type=“%’0+9d”
Width: A decimal digit string included in a mapping format specifier signifies the
minimum width of the field into which the data is mapped. If the mapped data
contains fewer characters than the minimum field width, it is right justified and
padded on the left to the length specified by the field width. If the ‘-’ (left-justify)
option has been specified, the data is padded on the right.
For example, suppose the log entry contains the following fields:
1 789 82 4567
A field width with a leading zero means that the field is padded with zeros
instead of blanks.
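The three padding behaviors described above can be tried out with Python's printf-style formatting, which follows the same conventions for width, the ‘-’ option, and a leading zero. The width of 5 and the specifier strings are assumptions chosen for illustration; they are not taken from this guide.

```python
# The four sample fields from the log entry above.
fields = [1, 789, 82, 4567]

right_justified = ["%5d" % value for value in fields]   # padded on the left with blanks
left_justified = ["%-5d" % value for value in fields]   # '-' option: padded on the right
zero_padded = ["%05d" % value for value in fields]      # leading zero: padded with zeros

# The '|' separators make the padding visible.
for row in (right_justified, left_justified, zero_padded):
    print("|".join(row))
```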
Size: The valid size options that you can specify in a format specification for
mapped data are the same as those that can be specified in a scan directive in the
log format description. See Table 13 on page 47 for a list of valid size and data
type combinations.
The data size specified in a mapping format specifier must be the same as that
specified in the corresponding scan directive.
Data type: As with the size option, the data type in a mapping format specifier
must be consistent with that in the corresponding scan directive. This means that if
data is scanned and stored as an integer, it must be mapped as an integer. If it is
scanned as a floating point number, it must be mapped as a floating point number.
If it is scanned as a character or character string it must be mapped as a character
or character string.
Numeric data can be scanned and stored in one of two families: the integer family
and the floating point family. Within each family, data can be represented in
different ways. An integer can be displayed in decimal, octal, hexadecimal and
unsigned formats. A floating point number can be displayed in decimal or
exponent notation. When numeric data is mapped, it is legal to use a different data
type from that in the scan directive as long as the format data type comes from the
same family. Put another way, it is not legal to mix data types from different
numeric families.
This feature allows you to perform type conversions as you map data into a table
view column to clarify the table view. For example, if a field in a log entry
contains the size of a file in bytes, you can display the size as a hexadecimal value
in the Log Entries table view by using the following in the format command:
A, “... %d ...” , ... desc = “File size is %#x bytes” ...
If the file size is, for example, 11259375 bytes, the description column will contain:
File size is 0xabcdef bytes
For an example that mixes data types from the floating point family, the size can
be displayed in exponent notation with the following format command:
A, “... %f ...” , ... desc = “File size is %.2e bytes” ...
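Both conversions can be checked with Python's printf-style formatting, which accepts the same %#x and %.2e specifiers. This is only a way to preview the formatted text; it says nothing about how the agent stores the scanned value.

```python
file_size = 11259375

# Integer family: scanned with %d, displayed as hexadecimal with a 0x prefix.
print("File size is %#x bytes" % file_size)

# Floating point family: scanned with %f, displayed in exponent notation.
print("File size is %.2e bytes" % float(file_size))
```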
The valid mapping data types are specified by family in the following tables.
Table 18. Integer family data types
Data type Format of data scanned and stored as integers
d, i Signed decimal numbers.
u Unsigned decimal numbers.
o Unsigned octal numbers.
x, X Unsigned hexadecimal numbers. The letters abcdef are used for x;
the letters ABCDEF are used for X.
The backslash character removes any special meaning from the following
character and causes the latter’s single character code value to be substituted
instead.
For example, suppose you want to monitor a log that contains the following
sample entry:
“http://www.acme.com” : GET /download/Acme.exe
Suppose that you want to extract the character string between the set of double
quotation marks and everything after the colon. Also, assume you want to map the
first string into the source column of the Log Entries table view and that you want
to map the second string into the description column enclosing it in quotation
marks. The following format command accomplishes this:
A,”\”%[^\”]\” :%[^\n]” , source desc = “\”%s\””
Following the double quotation mark that starts the format description is an
escaped double quotation mark literal that consumes the double quotation mark
preceding “http:” in the sample log entry. The exclusive scanset scan directive
contains an escaped double quotation mark that terminates the first scan; that is,
“http://www.acme.com” is stored by the scanset directive. The escaped double
quotation mark, white space character, and colon literal following the scanset
directive consume all characters up to the text “GET” in the sample log entry. The
second scanset consumes and stores all characters until a newline character is
encountered (end of line). The next non-escaped double quotation mark terminates
the format description component of the format command.
The mapping specification for the description column contains two escaped double
quotation mark literals surrounding the scanned string.
The following shows how the sample data above is displayed in the Log Entries
table view using this format command.
Description                 Source
“GET /download/Acme.exe”    http://www.acme.com
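The same extraction can be approximated in Python, where the double quotation marks inside the pattern play the role of the escaped quotation mark literals. Straight quotation marks are used in the sample entry, and the regular expression and the function name map_entry are assumptions made for this sketch.

```python
import re

# Regex stand-in for the format description: a quoted string, a colon
# literal, and then everything up to the end of the line.
PATTERN = re.compile(r'"([^"]+)"\s*:\s*([^\n]+)')

def map_entry(entry):
    """Map the quoted string to source and the remainder, quoted, to description."""
    match = PATTERN.match(entry)
    if match is None:
        return None
    source, text = match.groups()
    # The description specifier wraps the scanned string in quotation marks.
    return {"source": "%s" % source, "description": '"%s"' % text}

print(map_entry('"http://www.acme.com" : GET /download/Acme.exe'))
```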
The following table lists all the characters that can be represented by an escape
sequence and the associated escape sequence.
Table 20. Escape character sequence
Character                    Escape sequence representation
newline                      \n
horizontal tab               \t
vertical tab                 \v
backspace                    \b
carriage return              \r
backslash character (\)      \\
single quotation mark (‘)    \’
double quotation mark (“)    \”
alert                        \a
Since the format of dates and times varies so widely between logs, the Entry Time
column of the Log Entries table view is composed of six components: the year,
month, day, hour, minute, and second. You specify scanning and mapping pairs
explicitly for each component using one of the mapping names in Table 15 on page
50.
When the monitoring agent formats an entry from a log, it attempts to build a time
stamp with a format of:
mm/dd/yy hh:mm:ss
‘yy’ is the 2-digit year, the first ‘mm’ is the month, ‘dd’ is the day of the month,
‘hh’ is the hour (in 24-hour format), the second ‘mm’ is the minute and ‘ss’ is the
second. To do so, the monitoring agent expects that the data that is mapped into
each of the entry time component fields is a valid integer. The data type with
which each component was scanned and mapped is not important; what is
important is that the formatted result is an integer.
For example, suppose the date and time in a log entry is in the form:
MSG123 2005 03 03 10 15 56 ...
Use the following format command to extract and map the entry time components:
A , “%s %s %s %s %s %s %s” , desc year month day hour minute
second
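To make the time stamp layout concrete, the following Python sketch splits the sample entry into its first seven fields and assembles mm/dd/yy hh:mm:ss from them. The function name entry_time is hypothetical, and the sketch assumes that every time component is numeric.

```python
def entry_time(entry):
    """Build an mm/dd/yy hh:mm:ss time stamp from the first seven fields."""
    desc, year, month, day, hour, minute, second = entry.split()[:7]
    # Each component was scanned as a string; all that matters is that
    # it converts cleanly to an integer.
    return "%02d/%02d/%02d %02d:%02d:%02d" % (
        int(month), int(day), int(year) % 100,
        int(hour), int(minute), int(second),
    )

print(entry_time("MSG123 2005 03 03 10 15 56 ..."))
```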
There are two exceptions to the requirement that all time components consist of
numeric data only: text months and 12-hour format times.
Text months: If the month of the log entry is in text form, for example, Jan, Feb
or JAN, FEB, you can read and map the month as a string. When the monitoring
agent is passed a month in this format, it will translate it to the appropriate
numeric value when constructing the entry time.
12-Hour format times: Some logs contain the entry time in 12-hour format, for
example, 03:15 pm. Since the monitoring agent displays entry times in 24-hour
format, for such logs you must pass the ‘am/pm’ indicator to the monitoring agent
in the hour column. An example format command for reading and mapping the
hour and minute from a log with this format follows.
“%d:%d%s” , hour minute hour
This concatenates the ‘am/pm’ indicator to the 12-hour value so that, using the
previous time as an example, the value “3pm” is passed to the monitoring agent.
When constructing the entry time for an event, the monitoring agent will translate
such a time to its equivalent 24-hour format, in this case ‘15’.
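The two translations described above (text months and 12-hour times) might look like the following in Python. The helper names are hypothetical, and the handling of 12 am and 12 pm follows the usual clock convention, which this guide does not spell out.

```python
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def month_number(text):
    """Translate 'Jan', 'JAN', or '1' to a month number from 1 to 12."""
    text = text.strip().lower()
    if text.isdigit():
        return int(text)
    return MONTHS.index(text[:3]) + 1

def hour_24(text):
    """Translate '3pm', '10 pm', or '15' to an hour from 0 to 23."""
    match = re.fullmatch(r"\s*(\d{1,2})\s*([ap]m)?\s*", text.lower())
    if match is None:
        raise ValueError("unrecognized hour: %r" % text)
    hour = int(match.group(1))
    indicator = match.group(2)
    if indicator == "pm" and hour != 12:
        hour += 12      # 1 pm to 11 pm become 13 to 23
    elif indicator == "am" and hour == 12:
        hour = 0        # 12 am is midnight (assumed convention)
    return hour

print(month_number("Dec"), hour_24("3pm"))
```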
If, for instance, the minute is not supplied within a log entry and you do not want
to let the minute default to the current system clock minute for either monitored
events or table view requests, you can hardcode a value such as ‘0’ for the minute
column. Suppose an entry from such a log has the following format:
MSG123 2005 Mar 6 10 pm Text of event ...
The following format command will hardcode a value of zero for the entry time
minute for every event:
A,“%s %d %s %d %d %s%c %[^\n]” , de ye mo da ho ho min=“0”
desc=“:%s”
The Tivoli Enterprise Portal Log Entries table view would contain the following in
the Entry Time and Description columns. (If omitted, seconds defaults to zero.)
In this example, a dummy scan directive, ‘%c’, is supplied to consume the single
space character between the ‘pm’ and the start of the actual message. Since this
space was going to be discarded anyway and does not affect the data of interest,
this scan directive’s sole purpose is to allow the inclusion of a ‘minute’ column
mapping specifier in which a ‘0’ character is forced to be displayed.
When monitoring such logs, the monitoring agent sets the entry-time year
component for each new event to the current system-clock year as described in
“Hardcoding missing entry time components.”
When handling table view requests for logs that do not include the entry-time
year, the monitoring agent attempts to determine the year of an event based on the
date in the next entry. The monitoring agent therefore assumes that a monitored
log has never been inactive for a period of one year or longer. To
show how this works, suppose two entries from a syslog are as follows. (New log
entries are appended to the end of the log so the entry dated December 31st is
older than that dated January 1st.)
Dec 31 23:34:11 bilbo unix: NFS server gandalf not responding
Jan 1 03:34:11 frodo unix: NFS write error on host bilbo
Further, suppose that the current date is March 15th, 2005. If you issued a table
view request and specified in the time span dialog a time range of December 31st,
2004 at 11:00 p.m. to January 1st, 2005 at 4:00 a.m., the above entries are displayed
(in reverse chronological order) in the Log Entries table view as follows:
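One way to picture the year inference is the Python sketch below. It walks the entries from newest to oldest and steps the year back whenever an older entry carries a later month and day than the entry that follows it. This is an interpretation of the behavior described above, not the agent's documented algorithm; the function name infer_years and the data layout are assumptions.

```python
MONTHS = {name: number + 1 for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}

def infer_years(entries, today):
    """Infer a year for each entry.

    entries: oldest-first list of (month_name, day, "hh:mm:ss") tuples.
    today:   (year, month, day) taken from the system clock.
    Returns an oldest-first list of years.
    """
    year, this_month, this_day = today
    later = (this_month, this_day, "23:59:59")  # latest stamp the newest entry can carry
    years = []
    for month_name, day, clock in reversed(entries):
        stamp = (MONTHS[month_name], day, clock)
        if stamp > later:
            year -= 1   # an older entry with a later date belongs to the previous year
        years.append(year)
        later = stamp
    return list(reversed(years))

log = [("Dec", 31, "23:34:11"), ("Jan", 1, "03:34:11")]
print(infer_years(log, (2005, 3, 15)))
```

With the current date set to March 15th, 2005, the December entry is assigned to 2004 and the January entry to 2005, which agrees with the time range used in the example.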
Running the utility invokes the same code as that used by the Monitoring Agent
for UNIX Logs to monitor user logs. This means that a format command that maps
a user log as desired when passed to kulmapper will also work correctly when
used by the Monitoring Agent for UNIX Logs to monitor that log. Conversely, if a
format command is passed to the utility to format a given user log, any errors in
the format command will produce exactly the same messages as those generated
by the monitoring agent if it were given the same format command to monitor the
same log.
It is easier to build and test format commands using the utility instead of the
Monitoring Agent for UNIX Logs for several reasons:
v The format command used by kulmapper is included in the same file that contains
your sample user log entries, which the utility reads. This is of benefit not
only because it is easier to create a format command while you can see actual
log entries but also because it simplifies the task of managing multiple sample
user log files and their corresponding format commands.
v Using sample user log files means you can edit the log entries and play “what
if” scenarios to ensure your format command can handle the different messages
that can be written to the log.
v All messages written by kulmapper are sent to the standard output device,
which, by default, is your terminal. The monitoring agent, on the other hand,
sends any syntax and formatting error messages to its RAS log. If formatting is
successful, the results must be viewed in the Log Entries table view of Tivoli
Enterprise Portal.
v The default kulmapper RAS tracing options cause each line read from the
sample log file, including the format command itself, to be displayed followed
by the results of formatting and mapping that entry. This simplifies the task of
verifying that each sample log entry was formatted as expected. If an entry is
encountered that cannot be formatted, kulmapper stops after displaying the log
entry that caused the error and the error message describing the problem.
v After you have updated the format command in the kulmapper sample log file,
you test it by simply invoking the utility again and observing the results on
your terminal. By contrast, after you update a format command in the monitoring
agent’s configuration file, you must send the monitoring agent a refresh
signal so that it can learn and use the new format.
Using the kulmapper utility, you can significantly reduce the time required to
perform each ″change-test-observe″ cycle as you tune format commands.
To see how the utility operates, simply change to the install_dir/bin directory and
enter the script name, kulmapper, at the command prompt without any
parameters. This formats and maps the first sample log entry in the file according
to the supplied format command.
Generic event mapping provides useful event class and attribute information for
situations that do not have specific event mapping defined. BAROC files are found
on the Tivoli Enterprise Monitoring Server in the installation directory in TECLIB
(that is, install_dir/cms/TECLIB for Windows systems and
install_dir/tables/TEMS_hostname/TECLIB for UNIX systems). IBM Tivoli
Enterprise Console event synchronization provides a collection of ready-to-use rule
sets that you can deploy with minimal configuration. Be sure to install IBM Tivoli
Enterprise Console event synchronization to access the correct Sentry.baroc, which
is automatically included during base configuration of IBM Tivoli Enterprise
Console rules if you indicate that you want to use an existing rulebase. See the
IBM Tivoli Monitoring Installation and Setup Guide for details.
Each of the event classes is a child of KUL_Base. The KUL_Base event class can be
used for generic rules processing for any event from the Monitoring Agent for
UNIX Logs.
Table 21. Overview of event slots to event classes
IBM Tivoli Enterprise Console event class: ITM_Monitored_Logs
Event slots (Monitored_Logs attribute group):
v managed_system: STRING
v log_path: STRING
v log_name: STRING
v log_type: STRING
v monitor_status: STRING
v monitor_start_per_stop_time: STRING
v number_of_events: INTEGER
v number_of_format_errors: INTEGER
v log_size: INTEGER
v date_last_modified: STRING
v debug_mode: STRING
v format_command: STRING
v timestamp: STRING
v log_path_u: STRING
v log_name_u: STRING
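The slot list above can be written down as a small lookup table for sanity-checking event data before you write rules against it. The following is a minimal sketch: the slot names and types are copied from Table 21, while the check_slots helper and the idea of holding an event as a Python dictionary are assumptions made for illustration and are not part of the product.

```python
# Slot names and types copied from Table 21 (STRING -> str, INTEGER -> int).
MONITORED_LOGS_SLOTS = {
    "managed_system": str,
    "log_path": str,
    "log_name": str,
    "log_type": str,
    "monitor_status": str,
    "monitor_start_per_stop_time": str,
    "number_of_events": int,
    "number_of_format_errors": int,
    "log_size": int,
    "date_last_modified": str,
    "debug_mode": str,
    "format_command": str,
    "timestamp": str,
    "log_path_u": str,
    "log_name_u": str,
}


def check_slots(event):
    """Return a list of problems found in a dict of slot values.

    Hypothetical helper: it only compares slot names and Python types
    against the table above.
    """
    problems = []
    for name, expected in MONITORED_LOGS_SLOTS.items():
        if name not in event:
            problems.append(f"missing slot: {name}")
        elif not isinstance(event[name], expected) or isinstance(event[name], bool):
            problems.append(f"{name}: expected {expected.__name__}")
    for name in event:
        if name not in MONITORED_LOGS_SLOTS:
            problems.append(f"unknown slot: {name}")
    return problems
```

An empty list means that every slot is present with the expected type.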
Note: You can resolve some problems by ensuring that your system matches the
system requirements listed in Chapter 2, “Requirements and configuration
for the monitoring agent,” on page 5.
Upload files for review to the following FTP site: ftp.emea.ibm.com. Log in as
anonymous and place your files in the directory that corresponds to the IBM Tivoli
Monitoring component that you use. See “Contacting IBM Software Support” on
page 84 for more information about working with IBM Software Support.
Problem classification
The following types of problems might occur with the IBM Tivoli Monitoring:
UNIX Log Agent:
v Installation and configuration
v General usage and operation
v Display of monitoring data
This appendix provides symptom descriptions and detailed workarounds for these
problems, as well as describing the logging capabilities of the monitoring agent.
See the IBM Tivoli Monitoring Problem Determination Guide for general problem
determination information.
Trace logging
Trace logs capture information about the operating environment when component
software fails to operate as intended. The principal log type is the RAS (Reliability,
Availability, and Serviceability) trace log. These logs are in the English language
only. The RAS trace log mechanism is available for all components of IBM Tivoli
Monitoring. Most logs are located in a logs subdirectory on the host computer. See
the following sections to learn how to configure and use trace logging:
v “Principal trace log files” on page 69
v “Examples: using trace logs” on page 71
v “Setting RAS trace parameters” on page 72
Note: The documentation refers to the RAS facility in IBM Tivoli Monitoring as
″RAS1″.
RAS1 log files are named in the following format:
hostname_product_program_timestamp-nn.log
where:
v hostname is the host name of the machine on which the monitoring component is
running.
v product is the two-character product code. For Monitoring Agent for UNIX Logs,
the product code is ul.
v program is the name of the program being run.
v timestamp is an 8-character hexadecimal timestamp representing the time at
which the program started.
For long-running programs, the nn suffix is used to maintain a short history of log
files for that startup of the program. For example, the kulagent program might
have a series of log files as follows:
server01_ul_kulagent_437fc59-01.log
server01_ul_kulagent_437fc59-02.log
server01_ul_kulagent_437fc59-03.log
As the program runs, the first log (nn=01) is preserved because it contains program
startup information. The remaining logs ″roll.″ In other words, when the set of
numbered logs reaches a maximum size, the remaining logs are overwritten in
sequence.
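Because the timestamp represents the time at which the program started, a later
startup of the same program produces a set of log files with a different
timestamp, for example:
server01_ul_kulagent_537fc59-01.log
server01_ul_kulagent_537fc59-02.log
server01_ul_kulagent_537fc59-03.log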
Each program that is started has its own log file. For example, the Monitoring
Agent for UNIX Logs would have agent logs in this format:
server01_ul_kulagent_437fc59-01.log
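When you collect logs for IBM Software Support, it can help to split these file names into their parts. The following is a minimal sketch: the pattern is inferred from the example file names in this section, and the parse_ras1_name function and Ras1LogName type are hypothetical names. It assumes that program names contain no underscore, and it accepts any run of hexadecimal digits as the timestamp, because the examples show seven digits.

```python
import re
from typing import NamedTuple, Optional

# Pattern for names such as server01_ul_kulagent_437fc59-01.log.
# Assumption: the program name contains no underscore; the host name may.
RAS1_NAME = re.compile(
    r"^(?P<hostname>.+)_(?P<product>[a-z0-9]{2})_(?P<program>[^_]+)"
    r"_(?P<timestamp>[0-9a-f]+)-(?P<nn>\d{2})\.log$"
)


class Ras1LogName(NamedTuple):
    hostname: str
    product: str
    program: str
    timestamp: str  # hexadecimal start time, kept as text
    nn: int         # rolling suffix


def parse_ras1_name(filename: str) -> Optional[Ras1LogName]:
    """Split a RAS1 log file name into its parts, or return None if it does not match."""
    match = RAS1_NAME.match(filename)
    if match is None:
        return None
    return Ras1LogName(
        match["hostname"],
        match["product"],
        match["program"],
        match["timestamp"],
        int(match["nn"]),
    )
```

Grouping the parsed names by timestamp gives one group per program startup.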
Note: When you communicate with IBM Software Support, you must capture and
send the RAS1 log that matches any problem occurrence that you report.
See the IBM Tivoli Monitoring Installation and Setup Guide for more information on
the complete set of trace logs that are maintained on the monitoring server.
Background Information
Monitoring Agent for UNIX Logs uses RAS1 tracing and generates the logs
described in Table 23 on page 70. The default RAS1 trace level is ERROR.
RAS1 tracing has control parameters to manage the size and number of RAS1
logs. Use the procedure described in this section to set the parameters.
Note: The KBB_RAS1_LOG parameter also provides for the specification of the
log file directory, log file name, and the inventory control file directory and
name. Do not modify these values or log information can be lost.
Regularly prune log files other than the RAS1 log files in the logs directory. Unlike
the RAS1 log files, which are pruned automatically, other log types can grow
indefinitely. Examples are the logs in Table 23 on page 70 that include a process ID
number (PID).
Procedure
Specify RAS1 trace options in the install_dir/config/ul.ini file. You can
manually edit the configuration file to set trace logging:
1. Open the trace options file: install_dir/config/ul.ini.
2. Edit the line that begins with KBB_RAS1= to set trace logging preferences.
For example, if you want detailed trace logging, set the Maximum Tracing
option:
export KBB_RAS1='ERROR (UNIT:kul ALL) (UNIT:kra ALL)'
3. Edit the line that begins with KBB_RAS1_LOG= to manage the generation of
log files:
v Edit the following parameters to adjust the number of rolling log files and
their size.
– MAXFILES: the total number of files that are to be kept for all startups of
a given program. Once this value is exceeded, the oldest log files are
discarded. Default value is 9.
– LIMIT: the maximum size, in megabytes (MB), of a RAS1 log file. Default
value is 5.
v IBM Software Support might guide you to modify the following parameters:
– COUNT: the number of log files to keep in the rolling cycle of one
program startup. Default value is 3.
– PRESERVE: the number of files that are not to be reused in the rolling
cycle of one program startup. Default value is 1.
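The interaction of COUNT and PRESERVE can be illustrated with a short simulation. The following is a minimal sketch of the rolling behavior as this section describes it, not the agent's actual implementation. The function names are hypothetical, and the disk figure is only a rough upper bound that assumes every retained file grows to LIMIT.

```python
def rolling_sequence(writes, count=3, preserve=1):
    """Return the nn suffixes used for successive log files of one program startup.

    Assumption: files 1..preserve are written once and kept, and the other
    count - preserve files are reused in order. Defaults mirror COUNT=3, PRESERVE=1.
    """
    if not 0 <= preserve < count:
        raise ValueError("preserve must be smaller than count")
    rolling = count - preserve
    sequence = []
    for i in range(writes):
        if i < count:
            sequence.append(i + 1)  # first pass: 01, 02, 03, ...
        else:
            sequence.append(preserve + 1 + (i - count) % rolling)  # reuse rolling files
    return sequence


def max_disk_mb(maxfiles=9, limit_mb=5):
    """Rough upper bound on RAS1 log space for one program (MAXFILES x LIMIT)."""
    return maxfiles * limit_mb
```

With the default values, rolling_sequence(7) yields 1, 2, 3, 2, 3, 2, 3, so the 01 file with the startup information is never overwritten, and max_disk_mb() gives 45.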
Note: You can resolve some problems by ensuring that your system matches the
system requirements listed in Chapter 2, “Requirements and configuration
for the monitoring agent,” on page 5.
This appendix provides agent-specific problem determination information. See the
IBM Tivoli Monitoring Problem Determination Guide for general problem
determination information.
Note: See Table 24 on page 75 to learn about a problem that affects users who
have a previous version of Monitoring Agent for UNIX Logs.
Problem: Presentation files and customized Omegamon DE screens for Candle
monitoring agents need to be upgraded to a new Linux on z/Series system.
Solution: The upgrade from version 350 to IBM Tivoli Monitoring handles
export of the presentation files and the customized Omegamon DE screens.
Problem: (UNIX only) During a command-line installation, you choose to install
a component that is already installed, and you see the following warning:
WARNING - you are about to install the SAME version of "component"
where component is the name of the component that you are attempting to
install.
Solution: The system prompts you to ignore the warning and re-install the
component (″Yes″) or to stop installation of the component (″No″).
v If you select ″Yes,″ you overwrite the current installation of the
component.
Note: If you had previously applied a fixpack or other modification
to the component, those changes would be overwritten.
v If you select ″No,″ you must exit and restart the installation
process. You cannot return to the list where you selected
components to install. When you run the installer again, do not
attempt to install any component that is already installed, unless
you want the installer to overwrite it.
Problem: (Monitoring Agent for UNIX Logs only) The
install_dir/config/kul_configfile configuration file is empty following the
installation of a new version of the Log Agent. Settings created for a previous
version of this agent are lost.
Solution: This problem occurs because the installer overwrites a pre-existing
copy of the kul_configfile file. Retrieve the prior version of the
kul_configfile file from your archives or recreate the file from scratch.
Note: You should rename the file or copy it to another location before
upgrading to ITM 6.1.
Problem: The product fails to do a monitoring activity that requires read,
write, or execute permissions. For example, the product might fail to read a
log.
Solution: The monitoring agent must have the permissions necessary to
perform requested actions. For example, if the user ID you used to log
onto the system to install the monitoring agent (locally or remotely)
does not have the permission to perform a monitoring operation (such
as running a command), the monitoring agent is not able to perform the
operation.
Problem: While installing the agent from a CD, the following message is
displayed and you are not able to continue the installation:
install.sh warning: unarchive of "/cdrom/unix/cienv1.tar" may have failed
Solution: This error is caused by low disk space. Although the install.sh script
indicates that it is ready to install the agent software, the script
considers the size of all tar files, not the size of all the files that are
contained within the tar file. Run the df -k command to check whether
the file systems have enough space to install agents.
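The disk-space check can also be scripted. The following is a minimal sketch that compares the unpacked size of a tar file with the free space on the target file system, which is the comparison that the solution text says install.sh does not make. The function names are hypothetical, and the sketch is an illustration rather than a replacement for df -k.

```python
import shutil
import tarfile


def free_kilobytes(path):
    """Free space, in 1024-byte units, on the file system that holds path."""
    return shutil.disk_usage(path).free // 1024


def extracted_kilobytes(tar_path):
    """Total size of the files inside a tar archive, rounded up to whole KB."""
    with tarfile.open(tar_path) as archive:
        total = sum(member.size for member in archive.getmembers())
    return -(-total // 1024)


def has_room(path, required_kb):
    """True if the file system that holds path has at least required_kb free."""
    return free_kilobytes(path) >= required_kb
```

For example, has_room(target_dir, extracted_kilobytes(tar_path)) reports whether one archive fits, where target_dir and tar_path are values that you supply.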
About installing as root: Normally, do not use the root user account
to install or to start the Monitoring Agents for UNIX, for Linux, and
for UNIX Logs. If you use the root user account to install the product,
the files do not receive the correct permissions, and product behavior
is unpredictable.
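To check in advance whether a user ID has the permissions that a monitoring activity needs, you can test the file from a session that runs under that user ID. The following is a minimal sketch that uses the standard os.access call. The missing_permissions function is a hypothetical helper, and its result applies only to the user who runs it.

```python
import os


def missing_permissions(path, need="r"):
    """Return the letters in need ('r', 'w', 'x') that the current user lacks on path."""
    flags = {"r": os.R_OK, "w": os.W_OK, "x": os.X_OK}
    return [letter for letter in need if not os.access(path, flags[letter])]
```

An empty list means that the current user has every requested permission. A file that does not exist is reported as lacking all of them.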
Note: When you monitor a multinode system, such as a database, IBM Tivoli
Monitoring adds a subsystem name to the concatenated name, typically a
database instance name.
The length of the name that IBM Tivoli Monitoring generates is limited to 32
Note: You must ensure that the resulting name is unique with respect to any
existing monitoring component that was previously registered with the
Tivoli Enterprise Monitoring Server.
4. Save the file.
5. Restart the agent.
6. If you do not find the files mentioned in Step 1, perform the workarounds
listed in the next paragraph.
If you do not find the files mentioned in the preceding steps, perform the
following workarounds:
1. Change the CTIRA_HOSTNAME environment variable in the configuration file of
the monitoring agent.
v Find the KULENV file in the same path mentioned in the preceding row.
v For z/OS agents, find the RKANPAR library.
v For i5/OS agents, find the QAUTOTMP/KMSPARM library in member
KBBENV.
2. If you cannot find the CTIRA_HOSTNAME environment variable, you must
add it to the configuration file of the monitoring agent:
v On Windows: Use the Advanced > Edit Variables option.
v On UNIX and Linux: Add the variable to the config/product_code.ini and to
config/product_code.config files.
v On z/OS: Add the variable to the RKANPAR library, member
Kproduct_codeENV.
v On i5/OS: Add the variable to the QAUTOTMP/KMSPARM library in
member KBBENV.
3. Some monitoring agents (for example, the monitoring agent for MQ Series) do
not reference the CTIRA_HOSTNAME environment variable to generate
component names. Check the documentation for the monitoring agent that you
are using for information on name generation. If necessary, contact IBM
Software Support.
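Before you add CTIRA_HOSTNAME to a configuration file, you can check whether the file already defines it. The following is a minimal sketch. It assumes that the file holds KEY=value lines, optionally prefixed with export as in the KBB_RAS1 example in this guide, and that lines starting with # are comments. The find_variable function is a hypothetical helper, and the real file format might differ.

```python
def find_variable(config_text, name="CTIRA_HOSTNAME"):
    """Return the value assigned to name in KEY=value text, or None if it is absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        if line.startswith("#"):  # assumed comment syntax
            continue
        if line.startswith("export "):  # optional prefix, as in the KBB_RAS1 example
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            return value.strip().strip("'\"")
    return None
```

A result of None indicates that the variable is not set in that text and must be added as described in step 2.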
This section describes problems and solutions for remote deployment and removal
of agent software with Agent Remote Deploy:
Consider performance impact of each attribute group: Table 28 lists the impact
on performance (high, medium, or low) of each attribute group. The
multiple-instance attributes have been classified at the lowest level. That is, the
performance overhead will increase if you do not specify compare values for one
or more key values.
To prevent any of the attribute groups listed in Table 28 from affecting
performance, avoid referencing that attribute group, as suggested in
this list:
v Disable the attribute group.
v Never select workspaces that reference the attribute group.
v Disable situations that reference the attribute group by using the ″Undistributed
situations″ option in the Situation Editor.
v Disable historical reporting that references the attribute group.
v Avoid using the ″Auto Refresh″ feature in a Workspace because this
option causes a refresh of data for all attribute groups.
See the IBM Tivoli Monitoring User’s Guide for additional information on controlling
attribute group usage.
Table 28. Performance Impact by attribute group
Attribute group High Medium Low
Log Entries U
By default the table associated with the attribute group shows 24 hours of
data. This set of data might be large.
Monitored Logs U
Table 30. Problems with configuration of situations that you solve in the Workspace area
Problem: Situation events are not displayed in the Events Console view of the
workspace.
Solution: Associate the situation with a workspace.
Note: The situation does not need to be displayed in the workspace. It is
sufficient that the situation be associated with any workspace.
Problem: You do not have access to a situation.
Solution:
Note: You must have administrator privileges to perform these steps.
1. Select Edit > Administer Users to access the Administer Users window.
2. In the Users area, select the user whose privileges you want to modify.
3. In the Permissions tab, Applications tab, and Navigator Views tab, select
the permissions or privileges that correspond to the user’s role.
4. Click OK.
Problem: A managed system seems to be offline.
Solution:
1. Select Physical View and highlight the Enterprise Level of the navigator
tree.
2. Select View > Workspace > Managed System Status to see a list of
managed systems and their status.
3. If a system is offline, check network connectivity and status of the specific
system or application.
Table 31. Problems with configuration of situations that you solve in the Manage
Tivoli Enterprise Monitoring Services window
Problem: After an attempt to restart the agents in the Tivoli Enterprise
Portal, the agents are still not running.
Solution: Check the system status and check the appropriate IBM Tivoli Monitoring logs.
Problem: The Tivoli Enterprise Monitoring Server is not running.
Solution: Check the system status and check the appropriate IBM Tivoli Monitoring logs.
The documentation CD contains the publications that are in the product library.
The format of the publications is PDF, HTML, or both.
IBM posts publications for this and all other Tivoli products, as they become
available and whenever they are updated, to the Tivoli software information center
Web site. Access the Tivoli software information center by first going to the Tivoli
software library at the following Web address:
http://www.ibm.com/software/tivoli/library
Scroll down and click the Product manuals link. In the Tivoli Technical Product
Documents Alphabetical Listing window, click M to access all of the IBM Tivoli
Monitoring product manuals.
The IBM Software Support Web site provides the latest information about known
product limitations and workarounds in the form of technotes for your product.
You can view this information at the following Web site:
http://www.ibm.com/software/support
To search for information on IBM products through the Internet (for example, on
Google), be sure to consider the following types of documentation:
v IBM technotes
v IBM downloads
v IBM Redbooks
v IBM developerWorks
v Forums and newsgroups
For more information about the types of fixes that are available, see the IBM
Software Support Handbook at
http://techsupport.services.ibm.com/guides/handbook.html.
Before contacting IBM Software Support, your company must have an active IBM
software maintenance contract, and you must be authorized to submit problems to
IBM. The type of software maintenance contract that you need depends on the
type of product you have:
v For IBM distributed software products (including, but not limited to, Tivoli,
Lotus, and Rational products, as well as DB2 and WebSphere products that run
on Windows, or UNIX operating systems), enroll in Passport Advantage in one
of the following ways:
Online
Go to the Passport Advantage Web site at
http://www.lotus.com/services/passport.nsf/WebDocs/Passport_Advantage_Home
and click How to Enroll.
By phone
For the phone number to call in your country, go to the IBM Software
If you are not sure what type of software maintenance contract you need, call
1-800-IBMSERV (1-800-426-7378) in the United States. From other countries, go to
the contacts page of the IBM Software Support Handbook on the Web at
http://techsupport.services.ibm.com/guides/contacts.html and click the name of
your geographic region for phone numbers of people who provide support for
your location.
Submitting problems
You can submit your problem to IBM Software Support in one of two ways:
Online
Click Submit and track problems on the IBM Software Support site at
http://www.ibm.com/software/support/probsub.html. Type your
information into the appropriate problem submission form.
By phone
For the phone number to call in your country, go to the contacts page of
the IBM Software Support Handbook at
http://techsupport.services.ibm.com/guides/contacts.html and click the
name of your geographic region.
If the problem you submit is for a software defect or for missing or inaccurate
documentation, IBM Software Support creates an Authorized Program Analysis
Report (APAR). The APAR describes the problem in detail. Whenever possible,
IBM Software Support provides a workaround that you can implement until the
APAR is resolved and a fix is delivered. IBM publishes resolved APARs on the
Software Support Web site daily, so that other users who experience the same
problem can benefit from the same resolution.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not give you
any license to these patents. You can send license inquiries, in writing, to:
For license inquiries regarding double-byte (DBCS) information, contact the IBM
Intellectual Property Department in your country or send inquiries, in writing, to:
The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law:
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product and use of those Web sites is at your own risk.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:
IBM Corporation
2Z4A/101
11400 Burnet Road
Austin, TX 78758 U.S.A.
The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement or any equivalent agreement
between us.
All statements regarding IBM’s future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.
This information is for planning purposes only. The information herein is subject to
change before the products described become available.
This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
If you are viewing this information in softcopy form, the photographs and color
illustrations might not appear.
Trademarks
IBM, the IBM logo, AIX, Candle, CICS, DB2®, developerWorks®, eServer™,
HACMP, Hummingbird®, iSeries™, i5/OS, Lotus®, OMEGAMON, OS/390®,
OS/400, Passport Advantage®, pSeries®, Rational®, Redbooks™, Tivoli, the Tivoli
logo, Tivoli Enterprise, Tivoli Enterprise Console, WebSphere®, zOS®, and zSeries
are trademarks or registered trademarks of International Business Machines
Corporation in the United States, other countries, or both.
Intel®, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo,
Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium® are trademarks or
registered trademarks of Intel Corporation or its subsidiaries in the United States
and other countries.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Other company, product, and service names may be trademarks or service marks
of others.