Au Debugtools PDF
This article discusses tools that assist application developers with debugging applications on
AIX®. This is especially helpful when working in a customer environment with minimal debug
information.
Introduction
Customer-reported bugs are not always easily reproducible in a development environment.
Application crashes, hangs, and slow performance are common examples. In such cases, we
need tools that can be used in a customer environment. A guided approach to debugging and
some common problem areas are discussed here, along with the available tools on AIX. Note that
debugging slow performance is not covered here.
AIX environment
The first thing we start with when a problem appears is the environment: the operating system
version and the hardware in use. This is an important step because you might want to check if you
have a reproducible environment where you can debug, or you may want to recreate the exact
environment.
System configuration
Run the prtconf command to see the overall system configuration.
# lslpp -h bos.rte
Fileset Level Action Status Date Time
----------------------------------------------------------------------------
Path: /usr/lib/objrepos
bos.rte
5.3.0.50 COMMIT COMPLETE 10/17/07 16:34:57
5.3.0.60 COMMIT COMPLETE 03/11/08 16:08:59
5.3.7.0 COMMIT COMPLETE 03/12/08 11:28:55
# oslevel -r
5300-07
System uptime
# uptime
05:16PM up 2 days, 1:36, 4 users, load average: 1.95, 1.90, 1.80
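The environment checks above can be gathered in one pass when you need to send a snapshot back to development. The following is a minimal sketch, not a definitive script: the function names run_if_present and collect_env are hypothetical, and any command that is missing on the current system is reported rather than treated as an error.

```shell
#!/bin/sh
# Run one command under a header that names it.
# If the command is not installed, say so instead of failing.
run_if_present() {
    printf '== %s ==\n' "$*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || printf '(%s exited with status %s)\n' "$1" "$?"
    else
        printf '(%s not available on this system)\n' "$1"
    fi
}

# Collect the configuration, fileset history, OS level, and uptime
# using the same commands shown in this section.
collect_env() {
    run_if_present prtconf
    run_if_present lslpp -h bos.rte
    run_if_present oslevel -r
    run_if_present uptime
}
```

Calling collect_env > env-report.txt on the customer system would produce a single file to attach to the problem report.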
• SIGQUIT: Quit
• SIGILL: Invalid instruction
• SIGTRAP: Trace trap
Core files are not always generated when an application crashes, or they may be incomplete. If
this occurs, you may need to enable core file dumps or increase the core file size.
The ulimit -c command displays the current value, called the soft limit, of the core file size for
the shell. This limit applies to all processes started from that shell. If it is zero, run the
following command to raise it; the highest value allowed is called the hard limit:
# ulimit -c <val>
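The check and the adjustment can be combined. The sketch below assumes a shell whose ulimit built-in accepts -S (soft) and -H (hard) together with -c, as ksh and bash do; the function name raise_core_limit is hypothetical.

```shell
#!/bin/sh
# Raise the soft core-file size limit of the current shell to its hard limit.
raise_core_limit() {
    soft=$(ulimit -S -c)
    hard=$(ulimit -H -c)
    echo "core file size: soft=$soft hard=$hard"
    # Only change the limit when the soft value differs from the hard value.
    if [ "$soft" != "$hard" ]; then
        ulimit -S -c "$hard"
    fi
    echo "core file size soft limit is now $(ulimit -S -c)"
}
```

The function must be called in the shell that starts the application (not in a subshell), because the limit is inherited only by processes started from that shell.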
Attributes of interest:
Determine where the core file is created and which program caused it
If a core file has been created, the error-logging process, which is usually started when the
first software failure occurs, should have written an entry to the error log.
Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED
Probable Causes
SOFTWARE PROGRAM
User Causes
USER GENERATED SIGNAL
Recommended Actions
CORRECT THEN RETRY
Failure Causes
SOFTWARE PROGRAM
Recommended Actions
RERUN THE APPLICATION PROGRAM
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE
Detail Data
SIGNAL NUMBER
11
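The SIGNAL NUMBER field in the error report above can be translated into a signal name from the shell. This is a small sketch that relies on the standard kill -l built-in; the function name signal_name is hypothetical.

```shell
#!/bin/sh
# Translate a signal number, as reported under "SIGNAL NUMBER",
# into its symbolic name (printed without the SIG prefix).
signal_name() {
    kill -l "$1"
}

signal_name 11
```

For signal number 11 this prints SEGV, meaning the program ended with a segmentation violation (SIGSEGV).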
The executable name appears between the pipes on the right-hand side of the output; in
the case below, it is uvsh.
Useful attributes:
The following tools can be used to inspect the process or core in question. All of these commands
are named proc<cmd>. Take special care when inspecting a process in a production
environment, because these tools stop the process while they inspect it:
Watching a process
The truss command produces a trace of the system calls a process performs, the signals it receives,
and the machine faults it incurs. By default, user-level functions are not traced. To enable tracing
for all user-level functions, use truss -u '*' -p <pid>.
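On a customer system it is often more practical to write the trace to a file than to the terminal. The sketch below wraps the command given above and adds the -o option of truss to name an output file; the function name trace_process and the default file name are hypothetical, and the function only reports an error when truss is not installed.

```shell
#!/bin/sh
# Attach truss to a running process, trace system calls and all
# user-level functions, and write the trace to a file.
#   $1 = process ID to trace
#   $2 = output file (optional; defaults to truss.<pid>.out)
trace_process() {
    pid=$1
    out=${2:-truss.$pid.out}
    if ! command -v truss >/dev/null 2>&1; then
        echo "truss not found; run this on the AIX host" >&2
        return 127
    fi
    truss -u '*' -o "$out" -p "$pid"
}
```

For example, trace_process 12345 would attach to process 12345 and write the trace to truss.12345.out until truss is interrupted.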
Useful options:
• Log in as the user whom you need to investigate and find the PID of your shell using the ps
command.
If the status field of a client process stays in a FIN_WAIT state for a long time, or the status
field of a server process stays in CLOSE_WAIT for a long time, the process is probably hanging, or
a deadlock may have occurred.
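A quick way to spot such sockets is to filter the netstat output by its last column, which holds the connection state. This is a minimal sketch; the function name stuck_sockets is hypothetical, and it assumes the state is the last whitespace-separated field on each line.

```shell
#!/bin/sh
# Read netstat output on standard input and print only the sockets
# whose state is CLOSE_WAIT or one of the FIN_WAIT states.
stuck_sockets() {
    awk '$NF ~ /^(CLOSE_WAIT|FIN_WAIT)/'
}
```

A typical use is netstat -an | stuck_sockets, repeated a few minutes apart. A socket that appears in every sample is a candidate for closer inspection.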
Socket-to-process ID mapping
Run netstat -Aan, where -A shows the address of any protocol control blocks associated with the
sockets.
Run kdb and issue sockinfo on the address for the socket in question.
Check the time field. If it stays constant over time, a deadlock or hang has probably occurred.
The MAXDATA value greatly affects some of the memory-related issues on AIX. MAXDATA controls the
amount of memory available to malloc, and it can be changed using LDR_CNTRL=MAXDATA=0xN0000000
(where N equals the number of segments).
For 32-bit programs, the default address-space model uses a single segment for user data and the
stack, with a maximum aggregate size close to 256 MB. If your application requires more than
that, a large or very large address-space model can be used by setting MAXDATA.
See AIX documents for more information about large program support.
The ldedit command can also be used to change the MAXDATA settings in the executable itself.
ldedit -bmaxdata:0x80000000 sampleexec
For 32-bit programs under the large address-space model, the maximum value allowed is
0x80000000; under the very-large address-space model, it is 0xD0000000. For 64-bit
programs, any value can be specified, but the data area cannot extend past 0x06FFFFFFFFFFFFF8.
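The relationship between the segment count N and the MAXDATA value is simple arithmetic: each segment is 256 MB (0x10000000 bytes), so N segments correspond to 0xN0000000. The sketch below illustrates this for 32-bit programs; the function names maxdata_for_segments and maxdata_mb are hypothetical.

```shell
#!/bin/sh
# Print the MAXDATA value for N 256 MB segments (32-bit programs).
# N must be between 1 and 13 (0xD), the very-large-model maximum.
maxdata_for_segments() {
    if [ "$1" -lt 1 ] || [ "$1" -gt 13 ]; then
        echo "segment count must be between 1 and 13" >&2
        return 1
    fi
    printf '0x%X0000000\n' "$1"
}

# Print the size in MB that a MAXDATA value represents.
maxdata_mb() {
    echo $(( $1 / (1024 * 1024) ))
}

maxdata_for_segments 8      # prints 0x80000000
maxdata_mb 0x80000000       # prints 2048
maxdata_mb 0xD0000000       # prints 3328
```

The resulting value can then be supplied through LDR_CNTRL or ldedit as described above.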
# export PSALLOC=early
By default, when malloc is called, no paging space is assigned until the memory is referenced. This
allows malloc to overcommit: another process may claim the paging space before the current
process touches its memory, and the current process then fails. Setting PSALLOC to "early" reserves
the requested paging space at the time of the allocation request.
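Exporting PSALLOC affects every program started from the shell afterward. To limit the setting to a single application, it can be passed in that command's environment only. This is a minimal sketch; the function name run_with_early_psalloc is hypothetical, and the variable has an effect only on AIX.

```shell
#!/bin/sh
# Run one command with PSALLOC=early in its environment, leaving the
# calling shell's own environment unchanged.
run_with_early_psalloc() {
    env PSALLOC=early "$@"
}
```

For example, run_with_early_psalloc ./sampleexec starts sampleexec with early paging-space allocation while other programs started from the same shell keep the default behavior.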
Conclusion
You have learned about some tools that can be used in a customer environment to help
debug problems. We have discussed a guided approach to debugging and some common
problem areas, along with the available AIX tools.
Related topics
• Learn about the Large Address-Space Model.