Tools to aid debugging on AIX

Kalyanji Chintakayala February 02, 2010

This article discusses tools that assist application developers with debugging applications on
AIX®. This is especially helpful when working in a customer environment with minimal debug
information.

Introduction
Customer-reported bugs are not always easily reproducible in a development environment.
Application crashes, hangs, and slow performance are common examples. In such cases, we
need tools that can be used in a customer environment. A guided approach to debugging and
some common problem areas are discussed here, along with the available tools on AIX. Note that
debugging slower performance is not discussed here.

AIX environment
The first thing we start with when a problem appears is the environment: the operating system
version and the hardware in use. This is an important step because you might want to check if you
have a reproducible environment where you can debug, or you may want to recreate the exact
environment.

System configuration
Run the prtconf command to see the overall system configuration.

Listing 1. Overall system configuration


#prtconf
System Model: IBM,8204-E8A
Machine Serial Number: 06381D2
Processor Type: PowerPC_POWER6
Number Of Processors: 2
Processor Clock Speed: 4204 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 2 ibmmachine
Memory Size: 9344 MB
Good Memory Size: 9344 MB
Platform Firmware level: Not Available
Firmware Version: IBM,EL320_076
Console Login: enable
Auto Restart: true
Full Core: false

© Copyright IBM Corporation 2010 Trademarks


Tools to aid debugging on AIX Page 1 of 10
developerWorks® ibm.com/developerWorks/

Version and maintenance levels


The following commands display the version, release, and maintenance levels of AIX.

Listing 2. AIX version, release, and maintenance levels


# instfix -i|grep AIX_ML
All filesets for 5.3.0.0_AIX_ML were found.
All filesets for 5300-01_AIX_ML were found.
All filesets for 5300-02_AIX_ML were found.
All filesets for 5300-03_AIX_ML were found.
All filesets for 5300-04_AIX_ML were found.
All filesets for 5300-05_AIX_ML were found.
All filesets for 5300-06_AIX_ML were found.
All filesets for 5300-07_AIX_ML were found.

# lslpp -h bos.rte
Fileset Level Action Status Date Time
----------------------------------------------------------------------------
Path: /usr/lib/objrepos
bos.rte
5.3.0.50 COMMIT COMPLETE 10/17/07 16:34:57
5.3.0.60 COMMIT COMPLETE 03/11/08 16:08:59
5.3.7.0 COMMIT COMPLETE 03/12/08 11:28:55
# oslevel -r
5300-07

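CPU and kernel type

The bootinfo -K command reports whether the running kernel is 32-bit or 64-bit, and bootinfo -y reports whether the hardware is 32-bit or 64-bit.

Listing 3. CPU and kernel type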
# bootinfo -K
64
# bootinfo -y
64

Installed software products


Listing 4. Installed software products
# lslpp -lc|grep -i perl
/usr/lib/objrepos:perl.libext:2.1.0.10::COMMITTED:I:Perl Library Extensions :
/usr/lib/objrepos:perl.rte:5.8.2.71::COMMITTED:F:Perl Version 5 Runtime Environment:

System uptime
#uptime
05:16PM up 2 days, 1:36, 4 users, load average: 1.95, 1.90, 1.80
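
The commands in this section can be gathered into one report to attach to a problem description. The following is a minimal sketch rather than part of the original procedure: the function names collect and env_report are hypothetical, and any command that is not installed is skipped with a note, so the script also runs on systems other than AIX.

```shell
#!/bin/sh
# Hypothetical helper: print a labelled section, running the command only if it exists.
collect() {
    label=$1
    shift
    printf '=== %s ===\n' "$label"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        printf '(%s not found on this system)\n' "$1"
    fi
}

# Gather the environment facts discussed above into one report.
env_report() {
    collect "System configuration" prtconf
    collect "Maintenance level" oslevel -r
    collect "Kernel type (bits)" bootinfo -K
    collect "Hardware type (bits)" bootinfo -y
    collect "Uptime" uptime
}

# Usage: env_report > /tmp/env_report.txt
```

Redirecting the output to a file keeps the environment description together with any core files or traces collected later.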

Tools for an application crash


If a program terminates abnormally, a core file may be generated, depending on the type of termination. A core file is an image of the terminated process: a dump of its memory at the time of the crash. A core file is generated when the process receives any of the following signals:

• SIGQUIT: Quit
• SIGILL: Invalid instruction
• SIGTRAP: Trace trap
• SIGIOT: End process
• SIGEMT: EMT instruction
• SIGFPE: Arithmetic exception, integer divided by 0, or floating-point exception
• SIGBUS: Specification exception
• SIGSEGV: Segmentation violation
• SIGSYS: Parameter not valid to subroutine

Core files are not always generated when an application crashes, or they may be incomplete. If
this occurs, you may need to enable core file dumps or increase the core file size.

Checking core file size


#ulimit -c

This command displays the current core file size limit for the shell, called the soft limit, which applies to all processes started from that shell. If it is zero, no core files are written. Run the following command to raise it; the value can be at most the maximum allowed, called the hard limit:
#ulimit -c <val>

Checking hard limit for core


#ulimit -Hc
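
The two checks above can be combined so that the soft limit is raised to whatever the hard limit allows. This is a minimal sketch; the function name raise_core_limit is hypothetical, and the change applies only to the current shell and the processes started from it.

```shell
#!/bin/sh
# Hypothetical helper: raise the soft core-file limit of this shell to its hard limit.
raise_core_limit() {
    soft=$(ulimit -S -c)
    hard=$(ulimit -H -c)
    echo "core soft limit: $soft, hard limit: $hard"
    if [ "$soft" != "$hard" ]; then
        # The soft limit can be raised only as far as the hard limit.
        ulimit -S -c "$hard"
    fi
}

raise_core_limit
echo "core soft limit is now: $(ulimit -S -c)"
```

If the hard limit itself is too small, it has to be changed by one of the system-wide or per-user methods that follow.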

Setting the core limit system-wide


Edit the /etc/security/limits file and change <value> for soft and hard core size, respectively:

core = <value>
core_hard = <value>

Alternate method of setting soft limit system-wide


Add the following line to /etc/profile to set a soft limit (without a leading #, which would turn the line into a comment):
ulimit -S -c <value> > /dev/null 2>&1

Setting soft or hard limits for a user


chuser attribute=value username

Attributes of interest:

• core: Size of soft limit
• core_hard: Size of hard limit
• core_path: Enable or disable the core file directory path
• core_pathname: Directory in which core files are generated

Changing the core file setting


Use the chcore command to change the settings and lscore to view the current core settings.

Enabling full core dump


chdev -l sys0 -a fullcore=true


Generating core for the running process


The gencore utility creates a core image of each specified process without terminating it. The image can then be used with a debugger such as dbx.

Gathering core files


The snapcore command gathers the core file, program, and libraries used by the program,
then compresses the information into a PAX file. The file can then be transmitted to a debug
environment, and can be used to identify and resolve a problem with the application.
snapcore -r <core file name> <program name>

The PAX file is created in the /tmp/snapcore directory.
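
The gencore and snapcore steps can be chained when a core image of a live process needs to be shipped to a debug environment. The sketch below is illustrative only: the function name capture_core is hypothetical, it assumes the gencore argument order is process ID followed by output file name, and it uses the snapcore invocation shown above. On a system without these commands it stops with a message.

```shell
#!/bin/sh
# Hypothetical helper: dump a core image of a running process, then package it
# with the program and its libraries for transfer.
# $1 = PID, $2 = core file to create, $3 = path of the program binary
capture_core() {
    pid=$1
    core=$2
    prog=$3
    for tool in gencore snapcore; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool not found; this sketch needs an AIX system" >&2
            return 1
        fi
    done
    # Assumed argument order: gencore <pid> <file>; snapcore as shown in the text.
    gencore "$pid" "$core" && snapcore -r "$core" "$prog"
}

# Usage: capture_core 765972 /tmp/core.765972 /path/to/program
```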

Determine where the core file is created and which program caused it
If a core file has been created, there should be an error log entry logged by the error-logging
process, which is usually started when the first software failure occurs.

1. Retrieve the error log


Listing 5. Error log retrieval
# errpt -a
LABEL: CORE_DUMP
IDENTIFIER: C69F5C9B

Date/Time: Fri Nov 13 17:04:55 IST 2009


Sequence Number: 235168
Machine Id: 000381D2D900
Node Id: ibmmachine
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

Recommended Actions
CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

Recommended Actions
RERUN THE APPLICATION PROGRAM
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
11


USER'S PROCESS ID:


765972
FILE SYSTEM SERIAL NUMBER
8
INODE NUMBER
352516
CORE FILE NAME
/opt/IBM/InformationServer/Server/Projects/sample1/core
PROGRAM NAME
dsapi_slave

The program that generated the core is listed under PROGRAM NAME.


2. Displaying errors with reference to time
To display a detailed report of all errors logged in the past 24 hours, use the errpt command,
as follows:
# date
Fri Nov 13 18:18:33 IST 2009
# errpt -a -s 1112181809
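
The argument to -s is a start time in mmddhhmmyy form, so 1112181809 above is 18:18 on November 12, 2009, exactly 24 hours before the date shown. Computing that value by hand is error-prone; the sketch below derives it with perl, which is assumed to be installed (Listing 4 shows the perl.rte fileset). The function name errpt_stamp is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the errpt -s timestamp (mmddhhmmyy) for the moment
# N hours before a given epoch time. The epoch defaults to the current time.
# $1 = hours to go back, $2 = optional epoch seconds
errpt_stamp() {
    hours=$1
    now=${2:-$(perl -e 'print time')}
    perl -MPOSIX -e \
        'print strftime("%m%d%H%M%y", localtime($ARGV[1] - 3600 * $ARGV[0])), "\n"' \
        "$hours" "$now"
}

# Usage on AIX: errpt -a -s "$(errpt_stamp 24)"
```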

Which application created the core?


Listing 6. Core-creating application
#lquerypv -h core 500 64

The executable is located between the pipes on the right hand side of the output and in
the case below, it is uvsh.

00000500 00000001 00000000 00000043 00000003 |...........C....|
00000510 F1000100 3361BFF8 00000000 00000000 |....3a..........|
00000520 00120000 75767368 00000000 00000000 |....uvsh........|
00000530 00000000 00000000 00000000 00000000 |................|
00000540 00000000 00000000 00000000 5A9E9590 |............Z...|
00000550 00000000 00000016 00000000 00000BF1 |................|
00000560 00000000 00000000 00000000 00001019 |................|
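
Picking the name out of the dump by eye works for one core file; for many, the text between the pipes can be filtered. The following sketch uses a heuristic that is an assumption of this example, not a documented rule: it prints the longest run of letters, digits, and underscores found in the right-hand column. The function name core_program_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `lquerypv -h core 500 64` output on stdin and print
# the longest word found between the pipes (a heuristic, not a guarantee).
core_program_name() {
    awk -F'|' 'NF >= 3 { print $2 }' |
        tr -c 'A-Za-z0-9_' '\n' |
        awk 'length($0) > length(best) { best = $0 } END { print best }'
}

# Usage on AIX: lquerypv -h core 500 64 | core_program_name
```

For the dump above, the candidate words are C, 3a, uvsh, and Z, so the function prints uvsh. A program name containing other characters would be split, so treat the result as a hint and confirm it against the error log.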

Examining the core file


Run dbx on the binary executable that caused the core dump. dbx reports where the program failed, and the where subcommand prints the stack trace leading to the offending call.
#dbx exe core

System settings useful for debugging


Listing sys0
lsattr -El sys0

Useful attributes:

• autorestart: Automatically reboot the system after a crash
• fullcore: Enable or disable full core dump
• maxuproc: Maximum number of processes allowed per user

Changing system attributes


chdev -l sys0 -a attribute=value


Process inspection tools


There are myriad tools on AIX for inspecting processes for application errors, hangs, and crashes.
We will discuss some of them here.

The following tools can be used to inspect the process or core in question. All of the command names begin with proc. Take special care when inspecting a process in a production environment, because these tools stop the process while they inspect it:

• procstack prints a stack trace of the process.
• procflags prints pending and held signals for the process.
• procsig prints signal actions and handlers for the process.
• procfiles reports fstat and fcntl information for all open files in each process.
• procwdx prints the current working directory of the process.
• procstop and procrun stop a process and resume a stopped process, respectively.
• proctree prints the process trees containing the specified process IDs (PIDs) or users, with
child processes indented from their respective parent processes.

Watching a process
The truss command traces the system calls a process performs, the signals it receives, and
the machine faults it incurs. By default, user-level functions are not traced. To enable tracing for all
user-level functions, use truss -u '*' -p <pid>.

Useful options:

• -p treats the arguments as PIDs of running processes to trace, rather than as a command to run.
• -u [!] [LibraryName [...]::[!]FunctionName [...] ] traces dynamically loaded user-level
function calls from user libraries.
• -a shows the argument strings passed in each exec() system call.
• -f follows all children created by fork() or vfork() and includes their signals, faults, and
system calls in the trace output.
• -m [!]Fault traces the listed (see the sys/procfs.h header file) machine faults in the process.
• -s [!] Signal permits listing signals to trace or exclude.

Trussing a SUID process

You cannot directly truss a command that runs as another user under SUID, because the system
identifies the process as not belonging to your user. The following error displays:
# truss -deaf -o truss.out program

truss: 0915-015 Cannot create subject process.


wait4all: i: 0, status: 32512, pid: 643282, created: 0

To truss such commands:

• Log in as the user you need to investigate and find the PID of your shell using the ps
command.
• Start a new session as root and truss the PID of that shell.
• The root session now logs all the activity in the original shell. Run the failing command in the
original shell, then stop truss. Examine the truss.out file to find the failure.

Knowing names of the files opened by a process


In a typical database environment, or in applications that make extensive use of file
handling, it can be important to know the names of the files a process has open when debugging a
problem.

1. List the names of the files opened by the process:


procfiles -n <pid>
2. If you know the inode number, then:
• ncheck generates path names from inode numbers
ncheck -i <inode>
• List the files and grep for the inode
ls -ail |grep <inode>
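
When the directory containing the file is not known, the ls approach has to be repeated per directory. A recursive search by inode number is an alternative; the sketch below uses find with -inum, and the function name path_for_inode is hypothetical. The -xdev option keeps the search on one file system, because inode numbers are unique only within a file system.

```shell
#!/bin/sh
# Hypothetical helper: print every path under a directory whose inode number matches.
# $1 = inode number, $2 = directory to search
path_for_inode() {
    inode=$1
    dir=$2
    find "$dir" -xdev -inum "$inode" -print 2>/dev/null
}

# Usage, with the inode number from Listing 5: path_for_inode 352516 /opt/IBM
```

More than one path can be printed if the file has several hard links.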

Process hangs while connecting or accepting TCP connections


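netstat -a |grep <port or service name>

The netstat output does not show process names, so filter on the port or service the process uses. If
the state field for the client connection remains in a FIN_WAIT state for a long period of time, or the
state field for the server connection remains in CLOSE_WAIT for a long time, the process may be
hung, or a deadlock may have occurred.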
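
To see at a glance whether connections are piling up in one state, the netstat output can be summarized per state. This is a minimal sketch; the function name tcp_state_counts is hypothetical, and it assumes the state is the last column of each TCP line, as in the netstat listings in this article.

```shell
#!/bin/sh
# Hypothetical helper: read netstat -an (or -Aan) output on stdin and print
# how many TCP sockets are in each state.
tcp_state_counts() {
    awk '{
             # The protocol is column 1, or column 2 when -A adds an address column.
             for (i = 1; i <= 2; i++)
                 if ($i ~ /^tcp/) { count[$NF]++; break }
         }
         END { for (state in count) print state, count[state] }' | sort
}

# Usage: netstat -an | tcp_state_counts
```

Running it a few minutes apart shows whether the CLOSE_WAIT or FIN_WAIT counts keep growing.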

Socket-to-process ID mapping
Run netstat -Aan, where -A shows the address of any protocol control blocks associated with the
sockets.

Listing 7. Socket-to-process ID mapping


#netstat -Ana|grep 31538
f10006000041c398 tcp4 0 0 *.31538 *.* LISTEN
f10006000677d398 tcp4 0 0 9.122.87.107.31538 9.122.87.51.2500 ESTABLISHED
f100060006affb98 tcp4 0 0 9.122.87.107.31538 9.122.87.51.2511 ESTABLISHED
f1000600066d1398 tcp4 0 0 9.122.87.107.31538 9.122.87.51.2521 ESTABLISHED

Run kdb and issue sockinfo on the address for the socket in question.

Listing 8. Run kdb


(0)> sockinfo f10006000677d398 tcpcb
---- TCPCB ----(@ F10006000677D398)----
seg_next......@F10006000677D398 seg_prev......@F10006000677D398
t_softerror... 00000000 t_state....... 00000004 (ESTABLISHED)
t_timer....... 00000000 (TCPT_REXMT)
....
proc/fd: fd: 4
SLOT NAME STATE PID PPID ADSPACE CL #THS

pvproc+01B000 108*dsapi_sl ACTIVE 006C0D0 00B206C 000000002E707590 0 0001


Check for hangs from CPU usage


#ps -fp <pid>

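Check the TIME field, which shows accumulated CPU time. If it stays constant across repeated runs of the command, the process has probably hung or deadlocked.

To see the state of each thread in the process, run:

#ps -mp <pid> -o THREAD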
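
The comparison of CPU time across samples can be scripted. The sketch below is illustrative: the function names cpu_time and hang_verdict are hypothetical, and cpu_time relies on the POSIX ps option -o time= to print only the accumulated CPU time.

```shell
#!/bin/sh
# Hypothetical helpers for spotting a process whose CPU time has stopped growing.

# Print the accumulated CPU time of a process, with padding removed.
cpu_time() {
    ps -o time= -p "$1" | tr -d ' '
}

# Compare two samples taken some time apart.
hang_verdict() {
    if [ "$1" = "$2" ]; then
        echo "no CPU time consumed: possible hang or deadlock"
    else
        echo "CPU time advanced from $1 to $2"
    fi
}

# Usage: t1=$(cpu_time "$pid"); sleep 60; t2=$(cpu_time "$pid"); hang_verdict "$t1" "$t2"
```

A process that is legitimately idle, for example waiting for input, also shows unchanged CPU time, so the result is a hint to investigate rather than proof of a hang.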

Tools to work on process memory


Data-segment settings
The LDR_CNTRL environment variable controls the number of data segments a process can use.
The following example defines one additional data segment:
export LDR_CNTRL=MAXDATA=0x10000000
start the process
unset LDR_CNTRL

This value strongly affects memory-related behavior on AIX. MAXDATA controls the amount of
memory available to malloc and is set using LDR_CNTRL=MAXDATA=0xN0000000 (where N
equals the number of segments).
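
As a worked example of the 0xN0000000 pattern: each segment is 0x10000000 bytes (256 MB), so N segments give N x 256 MB. The sketch below builds the value for a given segment count; the function names are hypothetical, and the range is limited to 1 through 8 on the assumption that the 32-bit large address-space model, with its 0x80000000 maximum, is the one in use.

```shell
#!/bin/sh
# Hypothetical helper: print the MAXDATA value for N data segments (1 to 8).
maxdata_for_segments() {
    n=$1
    if [ "$n" -lt 1 ] || [ "$n" -gt 8 ]; then
        echo "segment count must be between 1 and 8" >&2
        return 1
    fi
    printf '0x%X0000000\n' "$n"
}

# Each segment is 256 MB, so report the total size in MB.
maxdata_size_mb() {
    echo $(( $1 * 256 ))
}

# Usage: export LDR_CNTRL=MAXDATA=$(maxdata_for_segments 2)
```

For two segments this yields 0x20000000, which is 512 MB of data area.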

For 32-bit programs, the default address-space model uses a single segment for user data and
stack, with a maximum aggregate size close to 256 MB. If your application requires more than
that, a large or very large address-space model can be used by setting MAXDATA.

See AIX documents for more information about large program support.

The ldedit command can also be used to change the MAXDATA settings in the executable itself.
ldedit -bmaxdata:0x80000000 sampleexec

For 32-bit programs under the large address-space model, the maximum value allowed is
0x80000000; under the very-large address-space model, it is 0xD0000000. For 64-bit
programs, any value can be specified, but the data area cannot extend beyond 0x06FFFFFFFFFFFFF8.

Memory usage of a process


The ps command reports malloc'd memory and does not include mmap'd memory. The svmon command
reports complete process memory utilization.
#svmon -P <pid> -m -r -i <interval>

Late and early allocation


Memory and paging space allocation by default is late. The PSALLOC environment variable controls
the mechanism of allocation.


#export PSALLOC=early

By default, when malloc is called, no paging space is assigned until the memory is referenced.
malloc can therefore overcommit: another process may take the paging space first, and the current
process then fails when it references the memory. Setting PSALLOC to "early" reserves as much
paging space as the memory allocation requests, at the time of the request.

Shared memory settings


To print information about active shared-memory segments, use ipcs -mop. To remove
shared-memory segments, use ipcrm [ -m SharedMemoryID ] [ -M SharedMemoryKey ].

Conclusion
You have learned about some tools that can be used in a customer environment to help
debug problems. We have discussed a guided approach to debugging and some common
problem areas, along with the available AIX tools.


Related topics
• Learn about the Large Address-Space Model.
• Download IBM product evaluation versions and get your hands on application development
tools and middleware products from DB2®, Lotus®, Rational®, Tivoli®, and WebSphere®.

© Copyright IBM Corporation 2010


(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)
