Since version 7.0, gdb can execute Python scripts. This makes it very easy to write gdb extensions and new commands, or to manipulate data. It also makes it possible to manipulate graphic data (by spawning commands in threads), change the program, or even write a firewall (ahem ..). I’ll assume you’re familiar with both gdb commands and basic Python scripts.

The first, very basic test is to run a simple command:

(gdb) python print("Hello, world !")
Hello, world !

So far so good. Yet, printing hello world won’t help us to debug our programs :)

The reference documentation can be found here, but it is not much help when it comes to actually manipulating data. I’ll try to give a few examples here.

The Python script

The first thing to do is to write a script (we’ll call it gdb-wzdftpd.py) containing the Python commands.

We will define a command to print GLib’s GList type, with its nodes and their content (which is stored using a void*).

To define a new command, we have to create a new class that inherits from gdb.Command. This class has two mandatory methods, __init__ and invoke.

Gdb redirects stdout and stderr to its own printing methods.

__init__

In __init__, we will define the command name, the arguments, and the type of completion that gdb can use (on files, lines, symbols etc). Gdb will automatically use the docstring from the class as the help message for the command.

class PrintGList(gdb.Command):
    """print Glib Glist: wzd_print_glist list objecttype

Iterate through the list of nodes in a GList, and display
a human-readable form of the objects."""
    def __init__(self):
        gdb.Command.__init__(self, "wzd_print_glist", gdb.COMMAND_DATA, gdb.COMPLETE_SYMBOL, True)

invoke

The invoke function is called when the matching gdb command is called. This is where things become interesting.

    def invoke(self, arg, from_tty):

arg is a string containing all the arguments given to the command. Instead of using split, the documentation recommends using gdb.string_to_argv:

        arg_list = gdb.string_to_argv(arg)
        if len(arg_list) < 2:
            print("usage: wzd_print_glist list objecttype")
            return
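To see why a plain split is not enough, consider the quoted type name used later in this article. The sketch below is not gdb code: it uses Python's shlex.split as a stand-in for gdb.string_to_argv (an assumption made only so that the example runs outside gdb; the two functions are not guaranteed to agree on every input), and parse_args is a hypothetical helper name.

```python
import shlex

def parse_args(arg):
    """Split a command line into (list_name, type_name), or None if too short.

    Inside gdb, gdb.string_to_argv(arg) would be used instead of
    shlex.split, which is only a stand-in here.
    """
    arg_list = shlex.split(arg)
    if len(arg_list) < 2:
        return None
    return arg_list[0], arg_list[1]

line = "list_server_context 'struct context_server_t'"
print(line.split())      # naive split cuts the quoted type name in two
print(parse_args(line))  # quoting is honoured: exactly two arguments
```

With a naive split, the type name would arrive as two separate words and the later type lookup would fail.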

Here, the first argument will be the name of the symbol containing the list, and the second will be the type of the stored data.

Gdb will not allow you to manipulate objects directly (you only know the name); you have to resolve the name to a gdb.Value to get the content. There is a gdb.lookup_symbol function, but it did not seem useful here since it returns a gdb.Symbol, which (at least in the version described here) has no method to get the content!

Let’s resolve the list symbol into its value:

        l = gdb.parse_and_eval(arg_list[0])

Now that we have the list, we have to check that it is indeed a list (only because we’re good guys here; if we don’t, Python will just throw an exception later ..). The gdb.Type object has a code member, which is an integer and can be used for comparisons. To check the type, we ask gdb to look up the GList type and, since we should have a pointer, call the pointer() method of the result. Then we compare the codes. Note that the code only identifies the kind of type (pointer, structure, integer ...), so this test really checks that the value is a pointer, not that it points to a GList:

        if l.type.code != gdb.lookup_type('GList').pointer().code:
            print("%s is not a GList*" % arg_list[0])
            return
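A stricter check could compare the type the pointer targets instead of only the type code. The following is a minimal sketch under stated assumptions: is_pointer_to is a hypothetical helper name, and it assumes that gdb.Type objects describing the same type compare equal with ==, which may depend on the gdb version. The pointer code is passed as a parameter so that the function does not depend on the gdb module itself.

```python
def is_pointer_to(value_type, expected_type, ptr_code):
    """Return True if value_type is a pointer whose target is expected_type.

    value_type and expected_type are expected to be gdb.Type objects and
    ptr_code to be gdb.TYPE_CODE_PTR. Typedefs and const/volatile
    qualifiers are removed on both sides before comparing.
    """
    value_type = value_type.strip_typedefs()
    if value_type.code != ptr_code:
        return False
    target = value_type.target().unqualified().strip_typedefs()
    return target == expected_type.unqualified().strip_typedefs()

# Inside invoke, this could replace the comparison of the two codes:
#     glist = gdb.lookup_type('GList')
#     if not is_pointer_to(l.type, glist, gdb.TYPE_CODE_PTR):
#         ... report the error and return ...
```

Stripping typedefs matters here because GList is itself a typedef for a structure.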

And now, the real work can start: printing the list. It’s a doubly-linked list, and we have the first element. We just have to iterate and call another method to print the nodes:

        # iterate list and print value
        while l:
            self._print_node(l, t)
            l = l['next']
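The same walk can be wrapped in a generator, which keeps the traversal separate from the printing. This is only a sketch: iter_glist is a hypothetical name, and the max_nodes limit is an addition of this example (not part of the original script) to avoid looping forever on a corrupted, circular list.

```python
def iter_glist(node, max_nodes=100000):
    """Yield each node of a GList, following the 'next' pointers.

    node is expected to behave like a gdb.Value holding a GList*: it is
    false when the pointer is NULL, and node['next'] gives the next node.
    """
    count = 0
    while node and count < max_nodes:
        yield node
        node = node['next']
        count += 1

# Inside invoke, the loop could then be written as:
#     for node in iter_glist(l):
#         self._print_node(node, t)
```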

When a gdb.Value refers to a C structure (or a pointer to one), accessing a member (here next) is done using Python’s subscript operator.

_print_node

This function will print a basic description of the node (current address, previous and next), and the data.

To print the data, we have to cast it to the correct type: GList stores it in a void * member, which of course you can’t dereference. In the invoke function, we resolve the type (to do it once and pass the object to the method, instead of resolving it each time):

        try:
            # the second argument is the name of the stored type
            t = gdb.lookup_type(arg_list[1])
        except RuntimeError as e:
            print("type %s not found" % arg_list[1])
            return

There is no clean way to test for the error beforehand; gdb will simply throw an exception …

Now that we have the basic type, we again call the pointer method to get the correct type, and use the cast method to convert it. If the conversion can’t be done, you’ll get an exception.

Then we dereference the result and call the standard Python print function on it. For the moment, it will print exactly the same as if you had called the print command in gdb (we’ll see later how to change that).

    def _print_node(self, node, typeobject):
        print("Node at %s (prev: %s, next %s)" % (node, node['prev'], node['next'],))
        data = node['data']
        # cast the void* to a pointer to the real type, then dereference it
        pdata = data.cast( typeobject.pointer() )
        data = pdata.dereference()
        print(data)

Declaring the command

To finish the script, we just have to create an instance of the class when the script is loaded. Add at the end of the script:

PrintGList()

Loading the script

Inside gdb, loading the script is done as usual with the source command:

(gdb) source gdb-wzdftpd.py

You can also tell gdb to reload automatically sourced files.

(gdb) maint set python auto-load yes

This is very useful when developing, though in practice I had to completely exit gdb several times to fix weird behaviors of the commands …

Testing

In gdb, just call the function with the required arguments:

(gdb) help wzd_print_glist
print Glib Glist: wzd_print_glist list objecttype

Iterate through the list of nodes in a GList, and display
a human-readable form of the objects.
...
(gdb) wzd_print_glist list_server_context 'struct context_server_t'
Node at 0x805eb20 (prev: 0x0, next 0x805eb40)
{type = 0, io = 0x806fe80, mode = 0, name = 0x8061850 "srv1", h = 0x806ec80, af_type = 2, host = 0x80619a8 "127.0.0.1", port = 12345, ssl = 0x0, data = 0x0}
Node at 0x805eb40 (prev: 0x805eb20, next 0x805eb50)
{type = 0, io = 0x806ff98, mode = 1, name = 0x8061b08 "srv2", h = 0x806ec80, af_type = 2, host = 0x8061bc8 "*", port = 12346, ssl = 0x80700a0, data = 0x0}
Node at 0x805eb50 (prev: 0x805eb40, next 0x0)
{type = 0, io = 0x8080438, mode = 0, name = 0x807ec90 "srv3", h = 0x806ec80, af_type = 10, host = 0x8061610 "::1", port = 12347, ssl = 0x0, data = 0x0}

Pretty print of the structure

In the previous example, the structure is pretty simple. When you have many members, embedded structures or unions, pretty-printing the structure is a nice improvement. Pretty-printing with gdb works in two steps:

  • define a class with a to_string method, quite similar to the __repr__ method in Python
  • define a lookup function that checks the type of the value and, if it matches, returns an instance of the above class

Printing class

  • In the __init__ method, we store a reference to the value to print.
  • In the to_string method, we simply get the values and format them (af_type and mode are display strings computed earlier in the method, not shown here):

        port = self.val['port']
        ret = " [%8s] %4s %16s %5d %4s" % (self.val['name'].string(), af_type, self.val['host'].string(), port, mode,)
        return ret
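For reference, here is a minimal sketch of what a complete printing class could look like. Only the format string and the member names come from the example above; how af_type and mode are computed is an assumption of this sketch (address families 2 and 10 are the Linux values of AF_INET and AF_INET6, and a non-NULL ssl pointer is taken to mean TLS), and AF_NAMES is a hypothetical name.

```python
AF_NAMES = {2: "IPv4", 10: "IPv6"}  # assumption: Linux AF_INET / AF_INET6

class ServerCtxPrinter(object):
    """Pretty-printer for struct context_server_t (sketch)."""

    def __init__(self, val):
        self.val = val  # the value to print (a gdb.Value inside gdb)

    def to_string(self):
        # Assumption: show the address family by name, falling back to
        # the raw number for families not listed in AF_NAMES.
        af = int(self.val['af_type'])
        af_type = AF_NAMES.get(af, str(af))
        # Assumption: a non-NULL ssl pointer means the server uses TLS.
        mode = "TLS" if self.val['ssl'] else ""
        port = int(self.val['port'])
        return " [%8s] %4s %16s %5d %4s" % (
            self.val['name'].string(), af_type,
            self.val['host'].string(), port, mode)
```

Converting the port with int() avoids relying on how the value object behaves inside a %d format.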

Lookup function

The lookup function should look at the attributes of the value (we will use the type), and return an instance of the pretty-printing class if it matches, or None to tell gdb to continue searching.

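import re

def serverctx_lookup_function (val):
    lookup_tag = val.type.tag
    # tag is None for types that are not a struct, union or enum
    if lookup_tag is None:
        return None
    regex = re.compile ("^context_server_t$")
    if regex.match (lookup_tag):
        return ServerCtxPrinter (val)
    return None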

Registering the function

The lookup function must be registered in a pretty_printers list. When looking for a pretty-printer, gdb first searches the Objfiles of the current program space. If no pretty-printer is found, it then looks in the program space, and if still not found, in the global list.
I tried to register it in the current Objfile (using the gdb.current_objfile() function), but it always returned None, so I used the global list .. (gdb only sets a current objfile while it auto-loads a script for that objfile, which is not the case for a script loaded with source.)
This code should be called once, when loading the file, so we append it at the end of the file.

def register_printers(objfile):
    gdb.pretty_printers.append(serverctx_lookup_function)

register_printers( gdb.current_objfile () )
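A variant that actually uses its objfile argument could look like the sketch below. The three-argument signature, and the replacement of an older function of the same name (so that sourcing the script again does not leave stale printers behind), are assumptions of this example and not part of the original script.

```python
def register_printers(lookup, objfile, fallback):
    """Register lookup in objfile.pretty_printers, or in fallback's list.

    Inside gdb, objfile would be gdb.current_objfile() (None unless the
    script is being auto-loaded) and fallback the gdb module itself,
    whose pretty_printers attribute is the global list.
    """
    target = objfile if objfile is not None else fallback
    # Drop any previously registered function with the same name, then add
    # the new one (each load of the script creates a new function object).
    target.pretty_printers[:] = [
        p for p in target.pretty_printers
        if getattr(p, '__name__', None) != lookup.__name__
    ]
    target.pretty_printers.append(lookup)
    return target

# At the end of the script, inside gdb:
#     register_printers(serverctx_lookup_function, gdb.current_objfile(), gdb)
```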

Testing

Print the same list:

(gdb) wzd_print_glist list_server_context 'struct context_server_t'
Node at 0x805eb20 (prev: 0x0, next 0x805eb40)
 [    srv1] IPv4        127.0.0.1 12345     
Node at 0x805eb40 (prev: 0x805eb20, next 0x805eb50)
 [    srv2] IPv4                * 12346  TLS
Node at 0x805eb50 (prev: 0x805eb40, next 0x0)
 [    srv3] IPv6              ::1 12347

What’s next

Using Python scripts in gdb is really helpful, especially because gdb’s internal language does not make it easy to automate things or perform complex manipulations. In this example, we only added a pretty-printer and a way to iterate over a container.

Using scripts, we can create a library of helper functions to print the status of a complex program, run checks on the state of the program etc.
