Debugging

Debugging operations are performed by the Debug class. To be notified of debugging events, pass a custom event handler to the Debug object when creating it. Each event is represented by an Event object. The handler can be a plain callback function or an instance of an EventHandler subclass.

Debug objects can also set breakpoints, watches and hooks and support the use of labels.

The Debug class

A Debug object provides methods to launch new processes, attach to and detach from existing processes, and manage breakpoints. It also contains a System snapshot to instrument debugged processes - this snapshot is updated automatically for processes being debugged.

Example #1: starting a new process and waiting for it to finish

from winappdbg import Debug

import sys

# Instance a Debug object
debug = Debug()
try:

    # Start a new process for debugging
    debug.execv( sys.argv[ 1 : ] )

    # Wait for the debuggee to finish
    debug.loop()

# Stop the debugger
finally:
    debug.stop()
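
The try/finally pattern used in these examples guarantees that stop() runs even if the session raises an exception. Below is a minimal sketch of the same guarantee written as a context manager. FakeDebug is a hypothetical stand-in that only records calls, so the snippet runs without WinAppDbg, and debugging_session is an illustrative name, not part of the library.

```python
from contextlib import contextmanager


class FakeDebug(object):
    """Hypothetical stand-in for winappdbg.Debug; it only records calls."""

    def __init__(self):
        self.calls = []

    def execv(self, argv):
        self.calls.append(("execv", list(argv)))

    def loop(self):
        self.calls.append(("loop",))

    def stop(self):
        self.calls.append(("stop",))


@contextmanager
def debugging_session(debug):
    """Yield the debugger and always call stop(), like try/finally does."""
    try:
        yield debug
    finally:
        debug.stop()


debug = FakeDebug()
with debugging_session(debug) as session:
    session.execv(["notepad.exe"])
    session.loop()

# stop() was recorded last, although the body never called it explicitly
print(debug.calls[-1])
```

The same wrapper could be given a real Debug object, since it relies only on the stop() method shown in the examples.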

Example #2: attaching to a process and waiting for it to finish

from winappdbg import Debug

import sys

# Get the process ID from the command line
pid = int( sys.argv[1] )

# Instance a Debug object
debug = Debug()
try:

    # Attach to a running process
    debug.attach( pid )

    # Wait for the debuggee to finish
    debug.loop()

# Stop the debugger
finally:
    debug.stop()

Example #3: attaching to a process by filename

from winappdbg import Debug

import sys

# Get the process filename from the command line
filename = sys.argv[1]

# Instance a Debug object
debug = Debug()
try:

    # Look up the currently running processes
    debug.system.scan_processes()

    # For all processes that match the requested filename...
    for ( process, name ) in debug.system.find_processes_by_filename( filename ):
        print process.get_pid(), name

        # Attach to the process
        debug.attach( process.get_pid() )

    # Wait for all the debuggees to finish
    debug.loop()

# Stop the debugger
finally:
    debug.stop()

Example #4: killing a process by attaching to it

from winappdbg import Debug

import sys
import thread

# Get the process ID from the command line
pid = int( sys.argv[1] )

# Instance a Debug object, set the kill on exit property to True
debug = Debug( bKillOnExit = True )

# Attach to a running process
debug.attach( pid )

# Exit the current thread, killing the attached process
thread.exit()

The Event class

So far we have seen how to attach to or start processes. But a debugger also needs to react to events that happen in the debuggee. This is done by passing a callback function as the eventHandler parameter when instancing the Debug object. When called, the callback receives an Event object that describes the event and contains a reference to the Debug object itself.

Example #5: handling debug events

from winappdbg import Debug, HexDump

def my_event_handler( event ):

    # Get the event name
    name = event.get_event_name()

    # Get the event code
    code = event.get_event_code()

    # Get the process ID where the event occurred
    pid = event.get_pid()

    # Get the thread ID where the event occurred
    tid = event.get_tid()

    # Get the value of EIP at the thread
    pc = event.get_thread().get_pc()

    # Show something to the user
    format_string = "%s (%s) at address %s, process %d, thread %d"
    message = format_string % ( name, HexDump.integer(code), HexDump.address(pc), pid, tid )
    print message

def simple_debugger( argv ):

    # Instance a Debug object, passing it the event handler callback
    debug = Debug( my_event_handler )
    try:

        # Start a new process for debugging
        debug.execv( argv )

        # Wait for the debuggee to finish
        debug.loop()

    # Stop the debugger
    finally:
        debug.stop()

The EventHandler class

A single callback function becomes hard to maintain as your event handling code grows. For that reason, the EventHandler class is provided.

Instead of a function, you can define a subclass of EventHandler in which each method is named after the event it handles. For example, to be notified when a DLL is loaded, define the load_dll method in your class. If you don’t want to receive notifications for a specific event, simply don’t define the method in your class.

These are the most important event notification methods:

Notification name | What does it mean? | When is it received?
create_process | The debugger has attached to a new process. | When attaching to a process, when starting a new process for debugging, or when the debuggee starts a new process and the bFollow flag was set to True.
exit_process | A debuggee process has finished executing. | When a process terminates by itself or when the Process.kill method is called.
create_thread | A debuggee process has started a new thread. | When the process creates a new thread or when the Process.start_thread method is called.
exit_thread | A thread in a debuggee process has finished executing. | When a thread terminates by itself or when the Thread.kill method is called.
load_dll | A module in a debuggee process was loaded. | When a process loads a DLL module by itself or when the Process.inject_dll method is called.
unload_dll | A module in a debuggee process was unloaded. | When a process unloads a DLL module by itself.
exception | An exception was raised by the debuggee. | When a hardware fault is triggered or when the process calls RaiseException().

The event handler can also receive notifications for specific exceptions as a different event. When you define the method for that exception, it takes precedence over the more generic exception method.

These are the most important exception notification methods:

Notification name | What does it mean? | When is it received?
breakpoint | A breakpoint exception was raised by the debuggee. | When a hardware fault is triggered by the int3 opcode, when the process calls DebugBreak(), or when a code breakpoint set by your program is triggered.
single_step | A single step exception was raised by the debuggee. | When a hardware fault is triggered by the trap flag or the icebp opcode, or when a hardware breakpoint set by your program is triggered.
guard_page | A guard page exception was raised by the debuggee. | When a guard page is hit or when a page breakpoint set by your program is triggered.
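
The dispatch rules described in this section can be modelled in a few lines. This is a toy model, not WinAppDbg's implementation: ToyEvent, ToyDispatcher and MyHandler are hypothetical names, and only the two rules stated here are reproduced (an event with no matching method is ignored, and a method for a specific exception takes precedence over the generic exception method).

```python
class ToyEvent(object):
    """Hypothetical event: a notification name plus an optional exception name."""

    def __init__(self, name, exception_name=None):
        self.name = name
        self.exception_name = exception_name


class ToyDispatcher(object):
    """Route each event to the handler method with the matching name."""

    def __init__(self, handler):
        self.handler = handler

    def dispatch(self, event):
        # For exceptions, try the specific method first (e.g. "breakpoint"),
        # then fall back to the generic "exception" method.
        candidates = []
        if event.name == "exception" and event.exception_name:
            candidates.append(event.exception_name)
        candidates.append(event.name)
        for method_name in candidates:
            method = getattr(self.handler, method_name, None)
            if callable(method):
                return method(event)
        # No matching method defined: the event is silently ignored
        return None


class MyHandler(object):
    def load_dll(self, event):
        return "load_dll"

    def breakpoint(self, event):
        return "breakpoint"

    def exception(self, event):
        return "exception"


dispatcher = ToyDispatcher(MyHandler())
results = [
    dispatcher.dispatch(ToyEvent("load_dll")),
    dispatcher.dispatch(ToyEvent("exception", "breakpoint")),
    dispatcher.dispatch(ToyEvent("exception", "guard_page")),
    dispatcher.dispatch(ToyEvent("exit_thread")),
]
print(results)
```

Here the guard_page exception falls through to the generic exception method because MyHandler defines no guard_page method, and the exit_thread event is dropped.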

In addition to all this, the EventHandler class provides a simple method for API hooking: the apiHooks class property. This property is a dictionary that maps each DLL library name to a list of tuples, each giving an API function to hook and the number of parameters it takes. That’s it! The EventHandler class will automatically hook these APIs for you when the corresponding library is loaded, and a method of your subclass will be called when entering and leaving the API function.

Example #6: tracing execution

from winappdbg import Debug, EventHandler, HexDump, CrashDump, win32


class MyEventHandler( EventHandler ):


    # Create process events go here
    def create_process( self, event ):

        # Start tracing the main thread
        event.debug.start_tracing( event.get_tid() )


    # Create thread events go here
    def create_thread( self, event ):

        # Start tracing the new thread
        event.debug.start_tracing( event.get_tid() )


    # Single step events go here
    def single_step( self, event ):

        # Show the user where we're running
        thread = event.get_thread()
        pc     = thread.get_pc()
        code   = thread.disassemble( pc, 0x10 ) [0]
        print "%s: %s" % ( HexDump.address(code[0]), code[2].lower() )


def simple_debugger( argv ):

    # Instance a Debug object, passing it the MyEventHandler instance
    debug = Debug( MyEventHandler() )
    try:

        # Start a new process for debugging
        debug.execv( argv )

        # Wait for the debuggee to finish
        debug.loop()

    # Stop the debugger
    finally:
        debug.stop()
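
The single_step events used for tracing are, according to the exception table above, triggered by the trap flag. On x86 the trap flag is bit 8 of the EFLAGS register. The sketch below only shows that bit arithmetic on a plain integer; the function names are illustrative, and whether start_tracing manipulates the flag in exactly this way is an assumption.

```python
TRAP_FLAG = 0x100  # bit 8 of EFLAGS on x86


def set_trap_flag(eflags):
    """Return eflags with the trap flag set (single-stepping on)."""
    return eflags | TRAP_FLAG


def clear_trap_flag(eflags):
    """Return eflags with the trap flag cleared (single-stepping off)."""
    return eflags & ~TRAP_FLAG


def is_tracing(eflags):
    """Tell whether the trap flag is set in eflags."""
    return bool(eflags & TRAP_FLAG)


eflags = 0x246  # sample register value, made up for the demonstration
traced = set_trap_flag(eflags)
print("0x%x -> 0x%x (tracing: %s)" % (eflags, traced, is_tracing(traced)))
```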

Example #7: intercepting API calls

from winappdbg import EventHandler


class MyEventHandler( EventHandler ):


    # Here we set which API calls we want to intercept
    apiHooks = {

        # Hooks for the kernel32 library
        'kernel32.dll' : [
                           #  Function            Parameters
                           ( 'CreateFileA'     ,   7  ),
                           ( 'CreateFileW'     ,   7  ),
                         ],

        # Hooks for the advapi32 library
        'advapi32.dll' : [
                           #  Function            Parameters
                           ( 'RegCreateKeyExA' ,   9  ),
                           ( 'RegCreateKeyExW' ,   9  ),
                         ],
    }


    # Now we can simply define a method for each hooked API.
    # Methods beginning with "pre_" are called when entering the API,
    # and methods beginning with "post_" when returning from the API.


    def pre_CreateFileA( self, event, ra, lpFileName, dwDesiredAccess,
             dwShareMode, lpSecurityAttributes, dwCreationDisposition,
                                dwFlagsAndAttributes, hTemplateFile ):

        self.__print_opening_ansi( event, "file", lpFileName )

    def pre_CreateFileW( self, event, ra, lpFileName, dwDesiredAccess,
             dwShareMode, lpSecurityAttributes, dwCreationDisposition,
                                dwFlagsAndAttributes, hTemplateFile ):

        self.__print_opening_unicode( event, "file", lpFileName )

    def pre_RegCreateKeyExA( self, event, ra, hKey, lpSubKey, Reserved,
                                        lpClass, dwOptions, samDesired,
                                       lpSecurityAttributes, phkResult,
                                                     lpdwDisposition ):

        self.__print_opening_ansi( event, "key", lpSubKey )

    def pre_RegCreateKeyExW( self, event, ra, hKey, lpSubKey, Reserved,
                                        lpClass, dwOptions, samDesired,
                                       lpSecurityAttributes, phkResult,
                                                     lpdwDisposition ):

        self.__print_opening_unicode( event, "key", lpSubKey )


    def post_CreateFileA( self, event, retval ):
        self.__print_success( event, retval )

    def post_CreateFileW( self, event, retval ):
        self.__print_success( event, retval )

    def post_RegCreateKeyExA( self, event, retval ):
        self.__print_success( event, retval )

    def post_RegCreateKeyExW( self, event, retval ):
        self.__print_success( event, retval )


    # Some helper private methods...

    def __print_opening_ansi( self, event, tag, pointer ):
        string = event.get_process().peek_string( pointer )
        tid    = event.get_tid()
        print  "%d: Opening %s: %s" % (tid, tag, string)

    def __print_opening_unicode( self, event, tag, pointer ):
        string = event.get_process().peek_string( pointer, fUnicode = True )
        tid    = event.get_tid()
        print  "%d: Opening %s: %s" % (tid, tag, string)

    def __print_success( self, event, retval ):
        tid = event.get_tid()
        if retval:
            print "%d: Success: %x" % (tid, retval)
        else:
            print "%d: Failed!" % tid
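
To make the naming convention concrete, the sketch below walks an apiHooks dictionary and reports which pre_ and post_ methods a handler class defines. It is a pure Python illustration with hypothetical names (list_hook_callbacks, DemoHandler) and does not reproduce how EventHandler actually installs the hooks.

```python
def list_hook_callbacks(handler_class):
    """Return {(dll, function): (has_pre, has_post)} for an apiHooks dict."""
    found = {}
    for dll, hooks in handler_class.apiHooks.items():
        for function, param_count in hooks:
            has_pre = callable(getattr(handler_class, "pre_" + function, None))
            has_post = callable(getattr(handler_class, "post_" + function, None))
            found[(dll, function)] = (has_pre, has_post)
    return found


class DemoHandler(object):
    """Hypothetical handler with the same layout as the example above."""

    apiHooks = {
        "kernel32.dll": [("CreateFileA", 7), ("CreateFileW", 7)],
    }

    def pre_CreateFileA(self, event, ra, *params):
        pass

    def post_CreateFileA(self, event, retval):
        pass

    def pre_CreateFileW(self, event, ra, *params):
        pass


for (dll, function), (has_pre, has_post) in sorted(list_hook_callbacks(DemoHandler).items()):
    print("%s!%s pre=%s post=%s" % (dll, function, has_pre, has_post))
```

A check like this can catch a misspelled method name, which would otherwise mean a hook callback is never called.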

Breakpoints, watches and hooks

A Debug object provides a small set of methods to set breakpoints, watches and hooks. These methods in turn use an underlying, more sophisticated interface that is described at the wiki page HowBreakpointsWork.

The break_at method sets a code breakpoint at the given address. Every time the code is run by any thread, a callback function is called. This is useful to know when certain parts of the debuggee’s code are being run (for example, set it at the beginning of a function to see how many times it’s called).

The hook_function method sets a code breakpoint at the beginning of a function and allows you to set two callbacks: one when entering the function and another when returning from it. It works much like the apiHooks property of the EventHandler class, only it doesn’t need the function to be exported by a DLL library. It’s useful for intercepting calls to internal functions of the debuggee, if you know where they are.

The watch_variable method sets a hardware breakpoint at the given address. Every time a read or write access is made to that address, a callback function is called. It’s useful for tracking accesses to a variable (for example, a member of a C++ object in the heap). It works only on specific threads; to monitor the variable across the entire process you must set a watch for each thread.

Finally, the watch_buffer method sets a page breakpoint at the given address range. Every time a read or write access is made to that part of the memory a callback function is called. It’s similar to watch_variable but it works for the entire process, not just a single thread, and it allows any range to be specified (watch_variable only works for small address ranges, from 1 to 8 bytes).
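
Guard page protection applies to whole memory pages, so a page breakpoint necessarily involves every page the buffer touches. Here is a small sketch of that arithmetic, assuming 4 KiB pages (the usual size on x86 Windows). pages_spanned is a hypothetical helper, not a WinAppDbg function.

```python
PAGE_SIZE = 0x1000  # assumed page size: 4 KiB


def pages_spanned(address, size, page_size=PAGE_SIZE):
    """Return the base addresses of all pages touched by [address, address + size)."""
    if size <= 0:
        raise ValueError("size must be positive")
    last_byte = address + size - 1
    first_page = address - (address % page_size)
    last_page = last_byte - (last_byte % page_size)
    return list(range(first_page, last_page + page_size, page_size))


# A 32 byte buffer that straddles a page boundary touches two pages
print([hex(page) for page in pages_spanned(0x401FF0, 0x20)])
```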

Debug objects also allow stalking. Stalking means setting one-shot breakpoints, that is, breakpoints that are automatically disabled after they’re hit for the first time. The term was originally coined by Pedram Amini for his Process Stalker tool, and this technique is key to differential debugging.

The stalking methods and their equivalents are the following:

Stalking method | Equivalent to
stalk_at | break_at
stalk_function | hook_function
stalk_variable | watch_variable
stalk_buffer | watch_buffer
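
The difference between a regular and a stalking breakpoint can be shown with a toy model. ToyBreakpoint is hypothetical and runs without a debugger; it only mimics the rule stated above, that a one-shot breakpoint is disabled after its first hit.

```python
class ToyBreakpoint(object):
    """Hypothetical breakpoint: calls `action` on each hit while enabled."""

    def __init__(self, action, one_shot=False):
        self.action = action
        self.one_shot = one_shot
        self.enabled = True

    def hit(self, event):
        """Simulate the breakpoint being reached; return True if it fired."""
        if not self.enabled:
            return False
        if self.one_shot:
            # A stalking breakpoint disables itself on its first hit
            self.enabled = False
        self.action(event)
        return True


regular_hits = []
stalked_hits = []
regular = ToyBreakpoint(regular_hits.append)
stalked = ToyBreakpoint(stalked_hits.append, one_shot=True)

# Simulate the same address being reached three times
for call_number in range(3):
    regular.hit(call_number)
    stalked.hit(call_number)

print("regular: %r, stalked: %r" % (regular_hits, stalked_hits))
```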

Example #8: setting a breakpoint

from winappdbg import EventHandler, HexDump


# This function will be called when our breakpoint is hit
def action_callback( event ):

    # Get the address of the top of the stack
    stack   = event.get_thread().get_sp()

    # Get the return address of the call
    address = event.get_process().read_pointer( stack )

    # Get the process and thread IDs
    pid     = event.get_pid()
    tid     = event.get_tid()

    # Show a message to the user
    message = "kernel32!CreateFileW called from %s by thread %d at process %d"
    print message % ( HexDump.address(address), tid, pid )


class MyEventHandler( EventHandler ):

    def load_dll( self, event ):

        # Get the new module object
        module = event.get_module()

        # If it's kernel32.dll...
        if module.match_name("kernel32.dll"):

            # Get the process ID
            pid = event.get_pid()

            # Get the address of CreateFile
            address = module.resolve( "CreateFileW" )

            # Set a breakpoint at CreateFile
            event.debug.break_at( pid, address, action_callback )

            # If you use stalk_at instead of break_at,
            # the message will only be shown once
            #
            # event.debug.stalk_at( pid, address, action_callback )

Example #9: hooking a function

from winappdbg import EventHandler


# This function will be called when the hooked function is entered
def wsprintf( event, ra, lpOut, lpFmt ):

    # Get the format string
    lpFmt = event.get_process().peek_string( lpFmt, fUnicode = True )

    # Get the vararg parameters (an escaped "%%" takes no argument)
    count      = lpFmt.replace( '%%', '' ).count( '%' )
    parameters = event.get_thread().read_stack_dwords( count, offset = 3 )

    # Show a message to the user
    showparams = ", ".join( [ hex(x) for x in parameters ] )
    print "wsprintf( %r, %s );" % ( lpFmt, showparams )


class MyEventHandler( EventHandler ):

    def load_dll( self, event ):

        # Get the new module object
        module = event.get_module()

        # If it's user32...
        if module.match_name("user32.dll"):

            # Get the process ID
            pid = event.get_pid()

            # Get the address of wsprintf
            address = module.resolve( "wsprintfW" )

            # Hook the wsprintf function
            event.debug.hook_function( pid, address, wsprintf, paramCount = 2 )

            # Use stalk_function instead of hook_function
            # to be notified only the first time the function is called
            #
            # event.debug.stalk_function( pid, address, wsprintf, paramCount = 2 )
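
The number of vararg parameters in Example #9 is derived from the format string. The helper below isolates that step so it can be tried without a debugger. count_format_args is an illustrative name, and the count is a rough heuristic: it assumes every '%' that is not part of a '%%' escape consumes exactly one stack argument.

```python
def count_format_args(fmt):
    """Count format specifiers, ignoring escaped percent signs ("%%")."""
    return fmt.replace("%%", "").count("%")


for fmt in ("%s: %d items", "100%% done", "%d%% of %s"):
    print("%r takes %d argument(s)" % (fmt, count_format_args(fmt)))
```

Removing the '%%' escapes before counting matters: replacing them with a single '%' would count each literal percent sign as an extra parameter.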

Example #10: watching a variable

from winappdbg import EventHandler


# This function will be called when the breakpoint is hit
def entering( event ):

    # Get the thread object
    thread = event.get_thread()

    # Get the thread ID
    tid = thread.get_tid()

    # Get the return address location (the top of the stack)
    stack_top = thread.get_sp()

    # Get the return address and the parameters from the stack
    return_address, hModule, lpProcName = thread.read_stack_dwords( 3 )

    # Get the string from the process memory
    procedure_name = event.get_process().peek_string( lpProcName )

    # Show a message to the user
    message = "%.08x: GetProcAddress(0x%.08x, %r);"
    print message % ( return_address, hModule, procedure_name )

    # Watch the DWORD at the top of the stack
    try:
        event.debug.stalk_variable( tid, stack_top, 4, returning )
        #event.debug.watch_variable( tid, stack_top, 4, returning )

    # If no more slots are available, set a code breakpoint at the return address
    except RuntimeError:
        event.debug.stalk_at( event.get_pid(), return_address, returning_2 )


# This function will be called when the variable is accessed
def returning( event ):

    # Get the address of the watched variable
    variable_address = event.breakpoint.get_address()

    # Stop watching the variable
    event.debug.dont_stalk_variable( event.get_tid(), variable_address )
    #event.debug.dont_watch_variable( event.get_tid(), variable_address )

    # Get the return address (in the stack)
    return_address = event.get_process().read_uint( variable_address )

    # Get the return value (in EAX)
    return_value = event.get_thread().get_context() [ 'Eax' ]

    # Show a message to the user
    message = "%.08x: GetProcAddress() returned 0x%.08x"
    print message % ( return_address, return_value )


# This function will be called if we ran out of hardware breakpoints,
# and we ended up setting a code breakpoint at the return address
def returning_2( event ):

    # Get the return address from the breakpoint
    return_address = event.breakpoint.get_address()

    # Remove the code breakpoint
    event.debug.dont_stalk_at( event.get_pid(), return_address )

    # Get the return value (in EAX)
    return_value = event.get_thread().get_context() [ 'Eax' ]

    # Show a message to the user
    message = "%.08x: GetProcAddress() returned 0x%.08x"
    print message % ( return_address, return_value )


# This event handler sets a breakpoint at kernel32!GetProcAddress
class MyEventHandler( EventHandler ):

    def load_dll( self, event ):

        # Get the new module object
        module = event.get_module()

        # If it's kernel32...
        if module.match_name("kernel32.dll"):

            # Get the process ID
            pid = event.get_pid()

            # Get the address of GetProcAddress
            address = module.resolve( "GetProcAddress" )

            # Set a breakpoint at the entry of the GetProcAddress function
            event.debug.break_at( pid, address, entering )
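
Example #10 falls back to a code breakpoint when no hardware breakpoint slot is free. x86 processors provide four debug address registers (DR0 to DR3), so a thread can hold at most four hardware breakpoints at a time. The toy model below mimics that limit and the fallback. ToySlots and watch_or_fall_back are hypothetical, and nothing here touches real debug registers.

```python
class ToySlots(object):
    """Hypothetical per-thread bookkeeping of hardware breakpoint slots."""

    MAX_SLOTS = 4  # DR0 to DR3 on x86

    def __init__(self):
        self.used = {}  # thread ID -> list of watched addresses

    def watch(self, tid, address):
        slots = self.used.setdefault(tid, [])
        if len(slots) >= self.MAX_SLOTS:
            raise RuntimeError("no hardware breakpoint slots left for thread %d" % tid)
        slots.append(address)


def watch_or_fall_back(slots, tid, stack_top, return_address):
    """Prefer a hardware watch; use a code breakpoint when slots run out."""
    try:
        slots.watch(tid, stack_top)
        return ("hardware", stack_top)
    except RuntimeError:
        return ("code", return_address)


slots = ToySlots()
outcomes = [
    watch_or_fall_back(slots, 1, 0x12F000 + 4 * i, 0x401000)
    for i in range(5)
]
print([kind for kind, _ in outcomes])
```

The fifth request on the same thread falls back to a code breakpoint, while a different thread would still have all four of its own slots available.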

Example #11: watching a buffer

from winappdbg import EventHandler


class MyHook (object):

    # Keep record of the buffers we watch
    def __init__(self):
        self.__watched = dict()


    # This function will be called when entering the hooked function
    def entering( self, event, ra, hModule, lpProcName ):

        # Ignore calls using ordinals instead of names
        if lpProcName & 0xFFFF0000 == 0:
            return

        # Get the procedure name
        procName = event.get_process().peek_string( lpProcName )

        # Ignore calls using an empty string
        if not procName:
            return

        # Show a message to the user
        print "GetProcAddress( %r );" % procName

        # Watch the procedure name buffer for access
        pid     = event.get_pid()
        address = lpProcName
        size    = len(procName) + 1
        action  = self.accessed
        event.debug.watch_buffer( pid, address, size, action )

        # Use stalk_buffer instead of watch_buffer to be notified
        # only of the first access to the buffer.
        #
        # event.debug.stalk_buffer( pid, address, size, action )

        # Remember the location of the buffer
        self.__watched[ event.get_tid() ] = ( address, size )


    # This function will be called when leaving the hooked function
    def leaving( self, event, return_value ):

        # Get the thread ID
        tid = event.get_tid()

        # Nothing to do if entering() ignored this call
        if tid not in self.__watched:
            return

        # Get the buffer location
        ( address, size ) = self.__watched[ tid ]

        # Stop watching the buffer
        event.debug.dont_watch_buffer( event.get_pid(), address, size )
        #event.debug.dont_stalk_buffer( event.get_pid(), address, size )

        # Forget the buffer location
        del self.__watched[ tid ]


    # This function will be called every time the procedure name buffer is accessed
    def accessed( self, event ):

        # Show the user where we're running
        thread = event.get_thread()
        pc     = thread.get_pc()
        code   = thread.disassemble( pc, 0x10 ) [0]
        print "0x%.08x: %s" % ( code[0], code[2].lower() )


class MyEventHandler( EventHandler ):

    # Called on guard page exceptions NOT raised by our breakpoints
    def guard_page( self, event ):
        print event.get_exception_name()

    # Called on DLL load events
    def load_dll( self, event ):

        # Get the new module object
        module = event.get_module()

        # If it's kernel32...
        if module.match_name("kernel32.dll"):

            # Get the process ID
            pid = event.get_pid()

            # Get the address of GetProcAddress
            address = module.resolve( "GetProcAddress" )

            # Hook the GetProcAddress function, using one MyHook instance for
            # both callbacks so they share the record of watched buffers
            hook = MyHook()
            event.debug.hook_function( pid, address, hook.entering, hook.leaving, paramCount = 2 )
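
The ordinal test at the top of the entering method relies on a documented GetProcAddress convention: when the high-order word of lpProcName is zero, the low-order word is an ordinal rather than a pointer to a name. Here is a standalone sketch. is_ordinal is an illustrative helper, and it uses a shift instead of the 0xFFFF0000 mask because that mask only inspects the low 32 bits of the value.

```python
def is_ordinal(lp_proc_name):
    """Tell whether a GetProcAddress lpProcName argument is an ordinal value."""
    return (lp_proc_name >> 16) == 0


# Sample values made up for the demonstration
for value in (0x0000002A, 0x00402010, 0x7FF600000010):
    kind = "ordinal %d" % value if is_ordinal(value) else "name pointer"
    print("0x%x: %s" % (value, kind))
```

The last sample is a 64-bit pointer whose low 32 bits look like an ordinal, a case the 32-bit mask would misclassify.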

Labels

Labels represent memory locations in a more user-friendly way than raw addresses. This is useful to provide a better user interface, both for input and output. Labels also help when DLL libraries in a debuggee are relocated on each run: memory addresses change every time, but labels don’t.

For example, the label “kernel32!CreateFileA” always points to the CreateFileA function of the kernel32.dll library. The actual memory address, on the other hand, may change across Windows versions.

In addition to exported functions, debugging symbols are used whenever possible.

A complete explanation on how labels work can be found at the wiki page HowLabelsWork.
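
The relocation point can be illustrated with a toy resolver. It assumes the module!function label form used elsewhere in this section, with made-up base addresses and export offsets. toy_resolve_label and both tables are hypothetical; real labels are resolved by Process.resolve_label, as Example #13 shows.

```python
# Made-up export offsets, relative to each module's base address
EXPORT_OFFSETS = {
    "kernel32": {"CreateFileA": 0x1A28, "CreateFileW": 0x1B40},
}


def toy_resolve_label(label, module_bases):
    """Resolve a "module!function" label using the given module base addresses."""
    module, function = label.split("!", 1)
    module = module.lower()
    return module_bases[module] + EXPORT_OFFSETS[module][function]


# The same module loaded at two different (made-up) base addresses
first_run = {"kernel32": 0x7C800000}
second_run = {"kernel32": 0x76D20000}

label = "kernel32!CreateFileA"
for bases in (first_run, second_run):
    print("%s == 0x%.08x" % (label, toy_resolve_label(label, bases)))
```

The label stays the same across both runs while the resolved address follows the module's base address.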

Example #12: getting the label for a given memory address

from winappdbg import System, Process

def print_label( pid, address ):

    # Request debug privileges
    System.request_debug_privileges()

    # Instance a Process object
    process = Process( pid )

    # Look up its modules
    process.scan_modules()

    # Get the label for the requested address
    label = process.get_label_at_address( address )

    # Print the label
    print "%s == 0x%.08x" % ( label, address )

Example #13: resolving a label back into a memory address

from winappdbg import System, Process

def print_label_address( pid, label ):

    # Request debug privileges
    System.request_debug_privileges()

    # Instance a Process object
    process = Process( pid )

    # Look up its modules
    process.scan_modules()

    # Resolve the requested label address
    address = process.resolve_label( label )

    # Print the address
    print "%s == 0x%.08x" % ( label, address )