
4.6   Debugging

Note: Some steps in this section require Hindsight. However, you will grasp the basic debugging commands even if Hindsight is not available.

This section demonstrates some source-level debugging facilities that Simics provides.

Your Simics distribution contains an example code snippet called debug_example.c.

Copy the source file and corresponding compiled executable into your workspace using the following commands.

simics>  !cp [simics]/targets/ebony/debug_example.c .
simics>  !cp [simics]/targets/ebony/images/debug_example .
simics>  c

Replace [simics] with the Simics installation directory.

We recommend opening the file debug_example.c in an editor of your choice so that you can follow the debugging example more easily. This file contains the code that we are going to debug. The program is supposed to print some information about the users on a system.

In the previous section we copied the executable into the simulated system by using SimicsFS. Now, run that program by starting it in the simulated terminal.

root@firststeps: ~# ./debug_example
Got segmentation fault!
root@firststeps: ~#

This output indicates that our program crashed. Let us use Simics features to debug it.

Simics needs to know the mapping between addresses and line numbers, and this information is stored in the executable. The symtable module in Simics contains commands related to symbolic debugging.

[press control-C]
simics>  new-symtable file = debug_example
Created symbol table 'debug_example'
[symtable] Symbols loaded at 0x10000000
ABI for debug_example is ppc-elf-32
debug_example set for context primary_context

Note: Remember to write file =, otherwise you will create an empty symtable named debug_example. You can add files to an existing symtable with the <symtable>.add-symbol-file command.

To help you debug your programs, we have introduced magic instructions. These are instructions that have no side effects on a real machine, but can be programmed to do things when run inside Simics, for example stop the simulation.

The debug example code contains such an instruction at the beginning of the main function. Enable breaking on magic instructions with the command magic-break-enable:

simics>  magic-break-enable
simics>  c

Now rerun debug_example:

root@firststeps: ~# ./debug_example

Simics will stop at the magic instruction and show the corresponding source code line and assembly opcode.

[cpu0] v:0x10000634 p:0x00737b634  magic instruction (or r0,r0,r0)
main (argc=1, argv=0x7ffffe34) at /tmp/debug_example.c:73
73              MAGIC_BREAKPOINT;

As you can see, this magic instruction is effectively a no-operation, which means that the program will run as usual on real hardware, or when magic breakpoints are disabled in Simics.

Now let us find the cause of the segmentation fault. Place a breakpoint in the sigsegv_handler() function. The sigsegv_handler() function is called when the program receives the segmentation fault and will allow the program to gracefully exit.

simics>  break (sym sigsegv_handler)
Breakpoint 1 set on address 0x10000520 with access mode 'x'

Resume the simulation. It will stop at the signal handler, and by giving the stack-trace command, you can also see the chain of function calls leading up to this point. This list gives you useful hints about where the crash occurred.

simics>  c
Code breakpoint 1 reached.
[cpu0] v:0x10000520 p:0x00737b520  stwu r1,-32(r1)
sigsegv_handler (sig=0) at /tmp/debug_example.c:35
35      {
simics>  stack-trace
#0 0x10000520 in sigsegv_handler (sig=0) at /tmp/debug_example.c:35
#1 0x7ffff3d8 in ?? ()
#2 0xff06a44 in ?? ()
#3 0xff0ff60 in ?? ()
#4 0x100006ac in main (argc=1, argv=0x7ffffe34) at /tmp/debug_example.c:82
#5 0xfed5fdc in ?? ()
#6 0x0 in ?? ()

Simics prints question marks when no symbol could be found at the address. This can either be a bogus address or a function inside the standard library, for which no symbols have been loaded.

A few frames down you have the main() function, which caused the crash. Now we run the simulation backward into that function. The reverse-step-line command runs backward until the previous known source line is reached.

simics>  reverse-step-line

[cpu0] v:0x100006a8 p:0x00737b6a8  bl 0x10010c54
main (argc=1, argv=0x7ffffe34) at /tmp/debug_example.c:82
82                      printf("Type: %s\n", user.type);

This line causes the crash. Let us examine what user.type contains:

simics>  psym user.type
(char *) 0xa94 (unreadable)
simics>  psym user
{name = 0x7ffffdc0 "shutdown", type = (char *) 0xa94 (unreadable)}

As you can see, the type member points to an unreadable address, which caused the crash. Where does this pointer come from? What we want to do is to find where the last write to this pointer occurred.

Using Hindsight, we can first set a write-access breakpoint on the memory of interest, and then run backward (using reverse) until the breakpoint is reached. After some time, Simics will find the place where the write takes place.

simics>  break -w (sym "&user.type") (sym "sizeof user.type")
Breakpoint 2 set on address 0x7ffffdc8, length 4 with access mode 'w'
simics>  reverse
Breakpoint on write to address 0x7ffffdc8 in primary_context.
Completing instruction @ 0xff365e0 on cpu0.
[cpu0] v:0x0ff365e4 p:0x0075275e4  beqlr

Now, examine the stack trace:

simics>  stack-trace
#0 0xff365e4 in ?? ()
#1 0x100005d8 in read_developer (p=0x7ffffdc0, f=0x10010ca0)
    at /tmp/debug_example.c:60
#2 0x10000674 in main (argc=1, argv=0x7ffffe34) at /tmp/debug_example.c:80
#3 0xfed5fdc in ?? ()
#4 0x0 in ?? ()

In the stack trace, you will see that a call from read_developer() has caused the write. Switch to that frame and display the code being run.

simics>  frame 1
#1 0x100005d8 in read_developer (p=0x7ffffdc0, f=0x10010ca0)
    at /tmp/debug_example.c:60
simics>  list read_developer 15
 48   {
 49           char line[100], *colon;
 51           if (fgets(line, 100, f) == NULL)
 52                   return 0;       /* end of file */
 54           /* Type is always developer */
 55           p->type = "developer";
 57           /* Everything until the first colon is the name */
 58           colon = strchr(line, ':');
 59           *colon = '\0';
 60           strcpy(p->name, line);
 61           return 1;
 62   }

On line 60 you can see that while the name field was filled in using strcpy, our failing pointer was accidentally overwritten (remember that the breakpoint was placed on the type member). If you write psym line you will see that the string copied is "shutdown". A look into the declaration of struct person shows that the name field is only 8 bytes long, and hence has no room for the terminating null byte.

Check the contents of p after and before the actual write to verify that the pointer is overwritten.

simics>  psym "*p"
{name = 0x7ffffdc0 "shutdown", type = (char *) 0xa94 (unreadable)}
simics>  reverse-step-instruction
Completing instruction @ 0xff365e0 on cpu0.
Breakpoint on write to address 0x7ffffdc8 in primary_context.
[cpu0] v:0x0ff365e0 p:0x0075275e0  stb r0,4(r5)
simics>  frame 1; psym "*p"
#1 0x100005d8 in read_developer (p=0x7ffffdc0, f=0x10010ca0)
    at /tmp/debug_example.c:60
{name = 0x7ffffdc0 "shutdown\020", type = (char *) 0x10000a94 "developer"}

To clean up after our debug session, we must remove the breakpoints that we have set. They are identified with a number and can be shown using list-breakpoints.

simics>  delete 1
simics>  delete 2
simics>  magic-break-disable

You can read more about debugging in chapter 12. Magic instructions are described in section 12.1.7.
