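Debugger-Based Debugging
If traces alone are not sufficient, you will need to inspect what gem5 is doing in detail using a debugger (e.g., gdb).
If you reach this point, you will definitely want to use the
gem5.debug binary. Ideally, looking at traces should
at least let you narrow down the range of cycles in which you think
something is going wrong. The fastest way to reach that point is to use a
DebugEvent, which goes on gem5's event queue and forces entry into the
debugger when the specified cycle is reached by sending the process a SIGTRAP
signal. You will need to start gem5 under the debugger or have the debugger
attached to the gem5 process for this to work.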
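The mechanism can be pictured with a toy model. The sketch below is not gem5 code: `ToyEventQueue`, `sched_break`, and the `on_break` callback are hypothetical names used only to show how a break event sits in a tick-ordered queue alongside ordinary events and fires when simulated time reaches its tick.

```python
import heapq
import itertools


class ToyEventQueue:
    """Tiny tick-ordered event queue; an illustration only, not gem5's EventQueue."""

    def __init__(self, on_break):
        self.cur_tick = 0
        self._heap = []
        self._order = itertools.count()  # tie-breaker for events on the same tick
        self._on_break = on_break        # stands in for raising SIGTRAP

    def schedule(self, tick, callback):
        heapq.heappush(self._heap, (tick, next(self._order), callback))

    def sched_break(self, tick):
        # Similar in spirit to a DebugEvent: queue an event whose only job
        # is to stop execution once simulated time reaches `tick`.
        self.schedule(tick, lambda: self._on_break(self.cur_tick))

    def run(self):
        while self._heap:
            tick, _, callback = heapq.heappop(self._heap)
            self.cur_tick = tick
            callback()


breaks = []
queue = ToyEventQueue(on_break=breaks.append)
queue.schedule(1000, lambda: None)  # ordinary simulation events
queue.schedule(2500, lambda: None)
queue.sched_break(2000)             # plays the role of a break at tick 2000
queue.run()
print(breaks)  # [2000]
```

In this sketch the break handler only records the tick. A real process would instead signal itself so that an attached debugger regains control, which is why the debugger must already be running or attached.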
You can create one or more DebugEvents when you invoke gem5 using the
--debug-break=100 parameter. You can also create new DebugEvents from the
debugger prompt using the schedBreak() function. The following example
session illustrates both of these approaches:
% gdb m5/build/ALL/gem5.debug
GNU gdb 6.1
Copyright 2002 Free Software Foundation, Inc.
[...]
(gdb) run --debug-break=2000 configs/run.py
Starting program: /z/stever/bk/m5/build/ALL/gem5.debug --debug-break=2000 configs/run.py
M5 Simulator System
[...]
warn: Entering event queue @ 0. Starting simulation...
Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p curTick
$1 = 2000
(gdb) c
Continuing.
(gdb) call schedBreak(3000)
(gdb) c
Continuing.
Program received signal SIGTRAP, Trace/breakpoint trap.
0xffffe002 in ?? ()
(gdb) p _curTick
$3 = 3000
(gdb)
gem5 includes a number of functions specifically intended to be called from the
debugger (e.g., using the gdb call command, as in the schedBreak() example
above). Many of these are “dump” functions which display internal simulator
data structures. For example, eventqDump() displays the events scheduled on
the main event queue. Most of the other dump functions are associated with
particular objects, such as the instruction queue and the ROB in the detailed
CPU model. These include:
| Function | Effect |
|---|---|
| schedBreak(<tick>) | Schedule a SIGTRAP to occur at <tick> |
| setDebugFlag("<flag>") | Enable a debug flag from the debugger |
| clearDebugFlag("<flag>") | Disable a debug flag from the debugger |
| eventqDump() | Print out all events on the event queue |
| takeCheckpoint(<tick>) | Create a checkpoint at cycle <tick> |
| SimObject::find("system.qualified.name") | Return a pointer to the object with the specified name |
Debugging Python with PDB
You can debug configuration scripts with the Python debugger (PDB) just as you would other Python
scripts. You can enter PDB before your configuration script is executed by
giving the --pdb argument to the gem5 binary. Another approach is to put the
following line in your configuration script wherever you would like to enter the debugger:
import pdb; pdb.set_trace()
Note that the Python files under src are compiled into the gem5 binary, so you
must rebuild the binary if you add this line (or make other changes) in these
files. Alternatively, you can set the M5_OVERRIDE_PY_SOURCE environment
variable to “true” (see src/python/importer.py).
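If you want to leave the breakpoint in a script without stopping every run, one option is to guard it with an environment variable. The sketch below is an assumption-laden illustration: `CONFIG_PDB` is a hypothetical variable name, and `build_settings()` is a stand-in for real configuration code that uses plain values rather than gem5 objects.

```python
import os


def build_settings():
    # Placeholder for real configuration work (hypothetical values).
    settings = {"cpu_type": "timing", "mem_size": "512MB"}
    # Enter PDB only when the hypothetical CONFIG_PDB variable is set to 1,
    # so unattended runs are not interrupted.
    if os.environ.get("CONFIG_PDB") == "1":
        import pdb; pdb.set_trace()
    return settings


if __name__ == "__main__":
    print(build_settings())
```

With the variable unset the function returns normally. With `CONFIG_PDB=1` execution stops inside `build_settings()`, where `settings` can be inspected from the PDB prompt.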
See the official PDB documentation for more details on using PDB.
Using Valgrind
Valgrind is a dynamic analysis tool used (primarily) to profile a target application and detect the source of run-time errors, as well as detect memory leaks.
For Valgrind to function, the target gem5 binary must have been compiled to
include debugging information. Therefore, the gem5.debug binaries must be
used. Because Valgrind has difficulty working with tcmalloc, gem5.debug
must be compiled without tcmalloc by passing the --without-tcmalloc flag:
scons --without-tcmalloc build/ALL/gem5.debug
To run a check using Valgrind, execute the following:
valgrind --leak-check=yes --suppressions=util/valgrind-suppressions build/ALL/gem5.debug {gem5 arguments}
The above will run gem5 under Valgrind and do two things:
- Give a stack trace if a run-time error occurs.
- Give information about potential memory leaks.
The util/valgrind-suppressions file contains a set of warnings that are
reported by Valgrind but are not considered a problem by gem5 developers.
Valgrind is known to provide false positives. util/valgrind-suppressions
should be updated as these false positives are revealed. More information
about suppressing Valgrind warnings can be found in the Valgrind User Manual.
If a run-time error occurs, Valgrind produces output like the following (taken from the Valgrind Quick Start Guide):
==19182== Invalid write of size 4
==19182== at 0x804838F: f (example.c:6)
==19182== by 0x80483AB: main (example.c:11)
In this output:
- 19182 is the process ID.
- Invalid write is the kind of error.
- Below this error is the stack trace. In this example the error occurred at line 6 in example.c. This line is contained within function f, which was called by the main function at line 11 (also in example.c).
- 0x804838F is the code address. This is usually not important.
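The fields described above can be pulled out of a report mechanically. The following is a minimal sketch, not part of gem5 or Valgrind: `parse_report` and its regular expressions are hypothetical and only handle simple frames of the form shown in this example (a plain function name followed by a file and line number).

```python
import re

# First line of a report: "==<pid>== <message>"
HEADER = re.compile(r"^==(?P<pid>\d+)==\s+(?P<message>\S.*)$")
# Stack frame: "==<pid>== at|by <addr>: <func> (<file>:<line>)"
FRAME = re.compile(
    r"^==\d+==\s+(?:at|by)\s+(?P<addr>0x[0-9A-Fa-f]+):\s+"
    r"(?P<func>\S+)\s+\((?P<file>[^:()]+):(?P<line>\d+)\)$"
)


def parse_report(text):
    """Split one Valgrind error report into pid, message, and stack frames."""
    report = {"pid": None, "message": None, "frames": []}
    for raw in text.splitlines():
        line = raw.strip()
        frame = FRAME.match(line)
        if frame:
            report["frames"].append(
                (frame["func"], frame["file"], int(frame["line"]), frame["addr"])
            )
            continue
        header = HEADER.match(line)
        if header and report["message"] is None:
            report["pid"] = int(header["pid"])
            report["message"] = header["message"]
    return report


sample = """\
==19182== Invalid write of size 4
==19182==    at 0x804838F: f (example.c:6)
==19182==    by 0x80483AB: main (example.c:11)
"""
report = parse_report(sample)
print(report["pid"], report["message"])
for func, filename, lineno, _addr in report["frames"]:
    print(f"{func} at {filename}:{lineno}")
```

Frames without a file and line number, or with function names containing spaces, are skipped by this sketch. A more complete parser would need to handle those cases.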
Valgrind may also return warnings about memory leaks, such as:
==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1
==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)
==19182== by 0x8048385: f (a.c:5)
==19182== by 0x80483AB: main (a.c:11)
The stack trace tells you where the leaked memory was allocated. If Valgrind states that a block of memory was “definitely lost”, there is a memory leak. If Valgrind states that a block was “probably lost”, it has reason to believe memory is leaking but cannot be certain (this typically happens when the code is doing something complex with pointers).
If Valgrind returns an output in which a root cause is difficult to determine,
try running Valgrind with --track-origins=yes. This will increase execution
time but will provide more information.
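When Valgrind is run repeatedly with and without the extra option, it can help to assemble the command line in one place. The helper below is a hypothetical convenience, not a gem5 utility: `valgrind_command` is an illustrative name, and the default paths are simply the ones used in the command shown earlier.

```python
import shlex


def valgrind_command(gem5_args, track_origins=False,
                     binary="build/ALL/gem5.debug",
                     suppressions="util/valgrind-suppressions"):
    """Assemble the Valgrind invocation described above as an argv list."""
    cmd = ["valgrind", "--leak-check=yes", f"--suppressions={suppressions}"]
    if track_origins:
        # Slower, but reports where uninitialised values came from.
        cmd.append("--track-origins=yes")
    cmd.append(binary)
    cmd.extend(gem5_args)
    return cmd


cmd = valgrind_command(["configs/run.py"], track_origins=True)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run`, and `shlex.join` gives a copy-pasteable shell string for logs.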
The Valgrind User Manual should be consulted for more advanced features.
