Advanced Python Debugging with pdb

Published Jan 29, 2018. Last updated Jul 28, 2018.

Python's built-in pdb module is extremely useful for interactive debugging, but has a bit of a learning curve. For a long time, I stuck to basic print-debugging and used pdb on a limited basis, which meant I missed out on a lot of features that would have made debugging faster and easier.

In this post I will show you a few tips I've picked up over the years to level up my interactive debugging skills.

First, why would you want to use an interactive debugger instead of inserting print or logging statements into your code?

With pdb, you have a lot more flexibility to run, resume, and alter the execution of your program without touching the underlying source. Once you get good at this, it means more time spent diving into issues and less time context switching back and forth between your editor and the command line.

Also, because breakpoints don't require editing source files, you can step into third-party code (e.g. modules installed from PyPI) and the standard library.

Post-mortem debugging

The first workflow I used after moving away from print debugging was pdb's "post-mortem debugging" mode. This is where you run your program as usual, but whenever an unhandled exception is thrown, you drop down into the debugger to poke around in the program state. After that, you attempt to make a fix and repeat the process until the problem is resolved.

You can run an existing script with the post-mortem debugger by using Python's -mpdb option:

python3 -mpdb path/to/script.py

From here, you are dropped into a (Pdb) prompt. To start execution, you use the continue or c command. If the program executes successfully, you will be taken back to the (Pdb) prompt where you can restart the execution again. At this point, you can use quit / q or Ctrl+D to exit the debugger.

If the program throws an unhandled exception, you'll also see a (Pdb) prompt, but with the program execution stopped at the line that threw the exception. From here, you can run Python code and debugger commands at the prompt to inspect the current program state.

Testing our basic workflow

To see how these basic debugging steps work, I'll be using this (buggy) program:

import random

MAX = 100

def main(num_loops=1000):
    for i in range(num_loops):
        num = random.randint(0, MAX)
        denom = random.randint(0, MAX)
        result = num / denom
        print("{} divided by {} is {:.2f}".format(num, denom, result))

if __name__ == "__main__":
    import sys
    arg = sys.argv[-1]
    if arg.isdigit():
        main(arg)
    else:
        main()

We're expecting the program to do some basic math operations on random numbers in a loop and print the result. Try running it normally and you will see one of the bugs:

$ python3 script.py
2 divided by 30 is 0.07
65 divided by 41 is 1.59
0 divided by 70 is 0.00
...
38 divided by 26 is 1.46
Traceback (most recent call last):
  File "script.py", line 16, in <module>
    main()
  File "script.py", line 7, in main
    result = num / denom
ZeroDivisionError: division by zero

Let's try post-mortem debugging this error:

$ python3 -mpdb script.py
> ./src/script.py(1)<module>()
-> import random
(Pdb) c
49 divided by 46 is 1.07
...
Traceback (most recent call last):
  File "/usr/lib/python3.4/pdb.py", line 1661, in main
    pdb._runscript(mainpyfile)
  File "/usr/lib/python3.4/pdb.py", line 1542, in _runscript
    self.run(statement)
  File "/usr/lib/python3.4/bdb.py", line 431, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "./src/script.py", line 1, in <module>
    import random
  File "./src/script.py", line 7, in main
    result = num / denom
ZeroDivisionError: division by zero
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> ./src/script.py(7)main()
-> result = num / denom
(Pdb) num
76
(Pdb) denom
0
(Pdb) random.randint(0, MAX)
56
(Pdb) random.randint(0, MAX)
79
(Pdb) random.randint(0, 1)
0
(Pdb) random.randint(1, 1)
1

Once the post-mortem debugger kicks in, we can inspect all of the variables in the current frame and even run new code to help us figure out what's wrong and attempt to make a fix.
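
As a rough sketch of where a session like this can lead: the experiments with random.randint suggest drawing the denominator from 1 to MAX so it can never be zero. The version below also converts the command-line argument to an int before calling main, since range() does not accept a string. The helper name parse_num_loops is hypothetical and not part of the original script, and this is only one possible fix.

```python
import random

MAX = 100


def main(num_loops=1000):
    for _ in range(num_loops):
        num = random.randint(0, MAX)
        # Start the range at 1 so the denominator can never be zero.
        denom = random.randint(1, MAX)
        result = num / denom
        print("{} divided by {} is {:.2f}".format(num, denom, result))


def parse_num_loops(argv, default=1000):
    """Hypothetical helper: use the last argument as a loop count if it is numeric."""
    arg = argv[-1]
    # isdigit() only confirms the string looks like a number; int() converts it.
    return int(arg) if arg.isdigit() else default


if __name__ == "__main__":
    import sys
    main(parse_num_loops(sys.argv))
```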

Dropping into the debugger from Python code using pdb.set_trace

Another technique that I used early on, after starting to use pdb, was forcing the debugger to run at a certain line of code before an error occurred. This is a common next step after learning post-mortem debugging because it feels similar to debugging with print statements.

For example, in the above code, if we want to stop execution before the division operation, we could add a pdb.set_trace call to our program here:

    import pdb; pdb.set_trace()
    result = num / denom

And then run our program without -mpdb:

$ python3 script.py
> ./src/script.py(10)main()
-> result = num / denom
(Pdb) num
94
(Pdb) denom
19

The problem with this method is that you have to constantly drop these statements into your source code, remember to remove them afterwards, and switch between running your code with python vs. python -mpdb.

Using pdb.set_trace gets the job done, but breakpoints are an even more flexible way to stop the debugger at any line (even third party or standard library code), without needing to modify any source code. Let's learn about breakpoints and a few other useful commands.
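
If you do stay with in-source stops for a while, two things can reduce the friction. This is a minimal sketch, assuming Python 3.7 or newer (the built-in breakpoint() did not exist in the Python 3.4 used for the sessions in this post). The function name ratio and its None fallback are hypothetical, added only so the example is self-contained.

```python
def ratio(num, denom):
    """Divide num by denom, pausing in the debugger only for the failing case."""
    if denom == 0:
        # Guarding the call means the debugger opens only when the
        # interesting case occurs, not on every loop iteration.
        # By default breakpoint() behaves like: import pdb; pdb.set_trace()
        breakpoint()
        return None  # hypothetical fallback so the caller can keep going
    return num / denom


# A normal call never reaches the debugger.
print(ratio(6, 3))
```

Running the program as PYTHONBREAKPOINT=0 python3 script.py turns every breakpoint() call into a no-op, so a forgotten call is less harmful than a forgotten pdb.set_trace().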

Debugger commands

There are over 30 commands you can give to the interactive debugger, a list that can be seen by using the help command when at the (Pdb) prompt:

(Pdb) help

Documented commands (type help <topic>):
========================================
EOF    c          d        h         list      q        rv       undisplay
a      cl         debug    help      ll        quit     s        unt
alias  clear      disable  ignore    longlist  r        source   until
args   commands   display  interact  n         restart  step     up
b      condition  down     j         next      return   tbreak   w
break  cont       enable   jump      p         retval   u        whatis
bt     continue   exit     l         pp        run      unalias  where

You can use help <topic> for more information on a given command.

Instead of walking through each command, I'll list out the ones I've found most useful and what arguments they take.

Setting breakpoints:

  • l(ist): displays the source code of the currently running program, with line numbers, for the 11 lines around the current statement. Repeating the command continues the listing from where it left off.
  • l 1,999: displays the source code of lines 1-999. I regularly use this to see the source for the entire program. If your program only has 20 lines, it'll just show all 20 lines.
  • b(reak): displays a list of current breakpoints.
  • b 10: set a breakpoint at line 10. Breakpoints are referred to by a numeric ID, starting at 1.
  • b main: set a breakpoint at the function named main. The function name must be in the current scope. You can also set breakpoints on functions in other modules in the current scope, e.g. b random.randint.
  • b script.py:10: sets a breakpoint at line 10 in script.py. This gives you another way to set breakpoints in another module.
  • clear: clears all breakpoints (pdb asks for confirmation first).
  • clear 1: clear breakpoint 1.

Stepping through execution:

  • c(ontinue): execute until the program finishes, an exception is thrown, or a breakpoint is hit.
  • s(tep): execute the next line, whatever it is (your code, stdlib, third party code, etc.). Use this when you want to step down into function calls you're interested in.
  • n(ext): execute the next line in the current function (will not step into downstream function calls). Use this when you're only interested in the current function.
  • r(eturn): execute the remaining lines in the current function until it returns. Use this to skip over the rest of the function and go up a level. For example, if you've stepped down into a function by mistake.
  • unt(il) [lineno]: without an argument, execute until a line with a number greater than the current one is reached. With a line number, execute until a line with a number greater than or equal to it is reached. In both cases execution also stops when the current frame returns. This is useful when you've stepped into a loop but want to let the loop finish without manually stepping through every iteration. Without an argument it behaves like next, except that it does not stop again when the loop jumps back to an earlier line.

Moving up and down the stack:

  • w(here): shows an annotated view of the stack trace, with your current frame marked by >.
  • u(p): move up one frame in the current stack trace. For example, when post-mortem debugging, you'll start off on the lowest level of the stack and typically want to move up a few times to help figure out what went wrong.
  • d(own): move down one frame in the current stack trace.
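
To make the idea of "frames" more concrete, here is a small sketch that collects the same information without pdb, by walking the traceback attached to an exception. It is only an illustration of what u and d navigate during post-mortem debugging. The functions inner, outer, and frames_from_traceback are made up for the example.

```python
def inner(denom):
    return 10 / denom


def outer():
    value = 0
    return inner(value)


def frames_from_traceback(tb):
    """Return (function name, line number, locals) per frame, oldest first."""
    frames = []
    while tb is not None:
        frame = tb.tb_frame
        frames.append((frame.f_code.co_name, tb.tb_lineno, dict(frame.f_locals)))
        tb = tb.tb_next  # move one level "down" towards the failing call
    return frames


try:
    outer()
except ZeroDivisionError as exc:
    stack = frames_from_traceback(exc.__traceback__)

# The last entry is where post-mortem debugging starts; each "u" moves one
# entry earlier in this list, and each "d" moves one entry later.
for name, lineno, local_vars in stack[-2:]:
    print(name, local_vars)
```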

Additional commands and tips:

  • pp <expression>: This will "pretty print" the result of the given expression using the pprint module. Example:
(Pdb) stuff = "testing the pp command in pdb with a big list of strings"
(Pdb) pp [(i, x) for (i, x) in enumerate(stuff.split())]
[(0, 'testing'),
 (1, 'the'),
 (2, 'pp'),
 (3, 'command'),
 (4, 'in'),
 (5, 'pdb'),
 (6, 'with'),
 (7, 'a'),
 (8, 'big'),
 (9, 'list'),
 (10, 'of'),
 (11, 'strings')]
  • !<python code>: sometimes the Python code you run in the debugger will be confused for a command. For example c = 1 will trigger the continue command. To force the debugger to execute Python code, prefix the line with !, e.g. !c = 1.

  • Pressing the Enter key at the (Pdb) prompt will execute the previous command again. This is most useful after the s/n/r/unt commands to quickly step through execution line-by-line.

  • You can run multiple commands on one line by separating them with ;;, e.g. b 8 ;; c.

  • The pdb module can take multiple -c arguments on the command line to execute commands as soon as the debugger starts.

Example:

python3 -mpdb -cc script.py # run the program without you having to enter an initial "c" at the prompt
python3 -mpdb -c "b 8" -cc script.py # sets a breakpoint on line 8 and runs the program
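
Because -c takes ordinary debugger commands, a session can also be scripted end to end. The sketch below is one way to try that from Python: it writes a tiny stand-in script to a temporary directory (not the script from this post), starts it under pdb with two -c options, and feeds the remaining commands on standard input. The "b 3, denom == 0" form is pdb's conditional breakpoint syntax, which this post does not otherwise cover. The file name demo_script.py is an arbitrary choice.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Deterministic stand-in for the random script: the third pair divides by zero.
SCRIPT = """\
pairs = [(8, 4), (6, 2), (4, 0), (5, 5)]
for num, denom in pairs:
    result = num / denom
    print(num, denom, result)
"""


def run_pdb_batch():
    """Run the stand-in script under pdb and return everything pdb printed."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "demo_script.py"
        script.write_text(SCRIPT)
        completed = subprocess.run(
            [
                sys.executable, "-m", "pdb",
                "-c", "b 3, denom == 0",  # stop on line 3 only when denom is 0
                "-c", "c",                # start running immediately
                str(script),
            ],
            input="p (num, denom)\nq\n",  # typed at the (Pdb) prompt once stopped
            capture_output=True,
            text=True,
            timeout=60,
        )
    return completed.stdout


output = run_pdb_batch()
print(output)
```

The first two iterations run normally, the breakpoint fires on the pair with a zero denominator, and quitting ends the program before the last pair is reached.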

Restart behavior

Another thing that can shave time off debugging is understanding how pdb's restart behavior works. You may have noticed that after execution stops, pdb will give a message like, "The program finished and will be restarted," or "The script will be restarted." When I first started using pdb, I would always quit and re-run python -mpdb ... to make sure that my code changes were getting picked up, which was unnecessary in most cases.

When pdb says it will restart the program, or when you use the restart command, code changes to the script you're debugging will be reloaded automatically. Breakpoints will still be set after reloading, but may need to be cleared and re-set due to line numbers shifting. Code changes to other imported modules will not be reloaded — you will need to quit and re-run the -mpdb command to pick those up.
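
Quitting and re-running remains the reliable way to pick up changes in imported modules. If that is inconvenient, importlib.reload can sometimes help from the (Pdb) prompt. The sketch below shows only the reload mechanics on a throwaway module (the name helper_mod is hypothetical). Keep in mind that reload does not update names already bound with "from module import name" or objects created before the reload.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc caching so the rewritten source is always recompiled in this demo.
sys.dont_write_bytecode = True


def reload_demo():
    """Import a throwaway module, edit it on disk, and reload it."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "helper_mod.py"  # hypothetical module
        module_path.write_text("MAX = 100\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            helper_mod = importlib.import_module("helper_mod")
            before = helper_mod.MAX

            # Simulate editing the imported module while the debugger is open.
            module_path.write_text("MAX = 5\n")
            importlib.reload(helper_mod)
            after = helper_mod.MAX
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("helper_mod", None)
    return before, after


print(reload_demo())  # prints (100, 5)
```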

Watches

One feature you may miss from other interactive debuggers is the ability to "watch" a variable change throughout the program's execution. pdb does not include a watch command by default, but you can get something similar by using commands, which lets you run arbitrary Python code whenever a breakpoint is hit.

To watch what happens to the denom variable in our example program:

$ python3 -mpdb script.py
> ./src/script.py(1)<module>()
-> import random
(Pdb) b 9
Breakpoint 1 at ./src/script.py:9
(Pdb) commands
(com) silent
(com) print("DENOM: {}".format(denom))
(com) c
(Pdb) c
DENOM: 77
71 divided by 77 is 0.92
DENOM: 27
100 divided by 27 is 3.70
DENOM: 10
82 divided by 10 is 8.20
DENOM: 20
...

We first set a breakpoint (which is assigned ID 1), then use commands to start entering a block of commands. These commands function as if you had typed them at the (Pdb) prompt. They can be either Python code or additional pdb commands.

Once we start the commands block, the prompt changes to (com). The silent command suppresses the usual message pdb prints when it stops at the breakpoint (the current file, line number, and source line), which makes reading the output a little easier.

After that, we run a print statement to inspect the variable, similar to what we might do when print debugging. Finally, we end with a c to continue execution, which ends the command block. Typing c again at the (Pdb) prompt starts execution and we see our new print statement running.

If you'd rather stop execution instead of continuing, you can use end instead of c in the command block.
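
For comparison, here is a different technique, not the commands approach shown above: a small "watch" written directly on top of sys.settrace, the tracing hook that pdb has traditionally been built on. It is a sketch under simple assumptions (one function, one local variable, values comparable with !=), and the names watch_local and halve_repeatedly are made up for the example. pdb's own display command, listed in the help output earlier, is another built-in option worth trying.

```python
import sys


def watch_local(func, name, *args, **kwargs):
    """Call func and record each new value bound to its local variable `name`."""
    changes = []
    missing = object()
    last = missing

    def tracer(frame, event, arg):
        nonlocal last
        # Only trace the frame that is running func's own code.
        if frame.f_code is not func.__code__:
            return None
        if event in ("line", "return"):
            value = frame.f_locals.get(name, missing)
            if value is not missing and (last is missing or value != last):
                changes.append(value)
                last = value
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        # Always restore whatever tracer was active before.
        sys.settrace(previous)
    return changes


def halve_repeatedly(start):
    denom = start
    while denom > 1:
        denom //= 2
    return denom


print(watch_local(halve_repeatedly, "denom", 16))  # prints [16, 8, 4, 2, 1]
```

A "line" event fires just before a line runs, so each change is reported when the following line is about to execute.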

Running pdb from the interpreter

Another way to run pdb is via the interpreter, which is useful when you're experimenting interactively and would like to drop into pdb without running a standalone script.

For post-mortem debugging, all you need is a call to pdb.pm() after an exception has occurred:

$ python3
>>> import script
>>> script.main()
17 divided by 60 is 0.28
...
56 divided by 94 is 0.60
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "./src/script.py", line 9, in main
    result = num / denom
ZeroDivisionError: division by zero
>>> import pdb
>>> pdb.pm()
> ./src/script.py(9)main()
-> result = num / denom
(Pdb) num
4
(Pdb) denom
0
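
The same post-mortem entry point can be called from code by passing a traceback to pdb.post_mortem. The wrapper below is a minimal sketch, and call_with_post_mortem is a hypothetical helper name. The post_mortem parameter exists only so the demo can swap in a stand-in that records the traceback instead of opening an interactive prompt.

```python
import pdb


def call_with_post_mortem(func, *args, post_mortem=pdb.post_mortem, **kwargs):
    """Call func; on an exception, open the post-mortem debugger, then re-raise."""
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        # By default this opens a (Pdb) prompt at the frame that raised.
        post_mortem(exc.__traceback__)
        raise


# Unattended demo: record the traceback instead of starting pdb.
seen = []
try:
    call_with_post_mortem(lambda: 1 / 0, post_mortem=seen.append)
except ZeroDivisionError:
    print("debugger hook received", len(seen), "traceback")
```

In real use you would leave the default in place, e.g. call_with_post_mortem(script.main).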

If you want to step through normal execution instead, use the pdb.run() function:

$ python3
>>> import script
>>> import pdb
>>> pdb.run("script.main()")
> <string>(1)<module>()
(Pdb) b script:6
Breakpoint 1 at ./src/script.py:6
(Pdb) c
> ./src/script.py(6)main()
-> for i in range(num_loops):
(Pdb) n
> ./src/script.py(7)main()
-> num = random.randint(0, MAX)
(Pdb) n
> ./src/script.py(8)main()
-> denom = random.randint(0, MAX)
(Pdb) n
> ./src/script.py(9)main()
-> result = num / denom
(Pdb) n
> ./src/script.py(10)main()
-> print("{} divided by {} is {:.2f}".format(num, denom, result))
(Pdb) n
66 divided by 70 is 0.94
> ./src/script.py(6)main()
-> for i in range(num_loops):

This one is a little trickier than -mpdb because you don't have the ability to step through an entire program. Instead, you'll need to manually set a breakpoint, e.g. on the first statement of the function you're trying to execute.
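
If the goal is simply to start at the top of one function, pdb.runcall is worth knowing about: it calls the function under the debugger and shows the prompt as soon as the function is entered, so no manual breakpoint is needed. Interactively that would look like pdb.runcall(script.main). The sketch below uses a Pdb instance with scripted input and captured output purely so it can run unattended. The function total is a made-up stand-in.

```python
import io
import pdb


def total(values):
    result = 0
    for value in values:
        result += value
    return result


# Commands that would normally be typed at the (Pdb) prompt.
commands = io.StringIO("p values\nc\n")
transcript = io.StringIO()

# readrc=False skips any personal .pdbrc; nosigint=True leaves signal handlers alone.
debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False, nosigint=True)

# The debugger stops at the first line of total(), runs the scripted
# commands, then continues and hands back the function's return value.
answer = debugger.runcall(total, [1, 2, 3])

print(answer)
print(transcript.getvalue())
```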

Conclusion

Hopefully these tips have given you a few new ideas on how to use pdb more effectively. After getting a handle on these, you should be able to pick up the other commands and start customizing pdb via a .pdbrc file.

You can also look into other front-ends for debugging, like pdbpp, pudb, and ipdb, or GUI debuggers like the one included in PyCharm. Happy debugging!

Discover and read more posts from Steven Kryskalla

Comments

Narendra Kumar S S
4 years ago

Hi Steven,

This is a very useful post.
I am looking for one more thing. I have a Python script that calls a function from a C library.
How can I debug the functions from the C library using this debugger?
How do I set breakpoints in these C functions?
Please advise.

Regards,
Narendra

Steven Kryskalla
4 years ago

Hi Narendra. You need to use gdb (or whatever is used for C debugging on your platform of choice, e.g. Visual Studio on Windows) to debug C extensions and core CPython code.

Here’s an example. Say I want to debug the StringIO code in Python 3, which is implemented in C. Let’s say I want to set a breakpoint on this method:

https://github.com/python/cpython/blob/4a21e57fe55076c77b0ee454e1994ca544d09dc0/Modules/_io/stringio.c#L345-L373

I will use this program to trigger the execution of this method:

from io import StringIO

def main():
    x = StringIO()
    x.write("hello world\n")
    x.seek(0)
    print(x.readline())

if __name__ == "__main__":
    main()

If I have gdb and Python 3 installed (with debug symbols) I can set a breakpoint on _stringio_readline, execute my program, have it break into the debugger when that method is executed, and then use gdb's commands and some Python extensions to step through the code, examine the contents of variables, grab a Python traceback, etc.

$ gdb python3
GNU gdb (Ubuntu 8.3-0ubuntu1) 8.3
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from python3...
Reading symbols from /usr/lib/debug/.build-id/68/345a54ae0e51abc3a24750d0ad5ee6620525dd.debug...
(gdb) b _stringio_readline
Breakpoint 1 at 0x5ee7c0: file ../Modules/_io/stringio.c, line 352.
(gdb) run x.py
Starting program: /usr/bin/python3 x.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, _stringio_readline (self=0x7ffff7500410, limit=-1) at ../Modules/_io/stringio.c:352
352	../Modules/_io/stringio.c: No such file or directory.
(gdb) py-bt
Traceback (most recent call first):
  File "x.py", line 8, in main
    print(x.readline())
  File "x.py", line 11, in <module>
    main()
(gdb) step
355	in ../Modules/_io/stringio.c
(gdb) step
356	in ../Modules/_io/stringio.c
(gdb) step
359	in ../Modules/_io/stringio.c
(gdb) info locals
start = 0x972630
end = <optimized out>
old_char = <optimized out>
len = <optimized out>
consumed = 140737342082224
(gdb) p consumed
$1 = 140737342082224
(gdb) p limit
$2 = 26
(gdb) p self->string_size
$3 = 26
(gdb) p self->pos
$4 = 0

Chaim Koifman
6 years ago

Hi Steven,
Nice article about debugging Python.
I’m looking for a way to stop the program execution (and jump into pdb) for any exception (handled & unhandled).
Thanks,
Chaim.

Steven Kryskalla
6 years ago

Hi Chaim. See the post-mortem debugging section of this article. Running your program with python -mpdb script.py then using the c command will run your program and drop you into a debugger prompt for unhandled exceptions.

Steven Kryskalla
6 years ago

Re: triggering the debugger on handled exceptions, can you explain more about why you want to do that? Exceptions are commonly used for control flow, so your program might handle hundreds or thousands of exceptions during normal execution (especially in the standard library and installed packages).

That said you could probably do it using the trace module. Here’s one way to display handled exceptions (this is a small program that imports a few modules from the standard library, then tries to run 1/0 and catch the ZeroDivisionError):

$ python3 -mtrace --trace x.py | grep "except "
re.py(277):     except KeyError:
sre_parse.py(237):         except IndexError:
re.py(277):     except KeyError:
sre_parse.py(237):         except IndexError:
re.py(277):     except KeyError:
sre_parse.py(237):         except IndexError:
functools.py(55):         except AttributeError:
functools.py(55):         except AttributeError:
functools.py(55):         except AttributeError:
functools.py(55):         except AttributeError:
functools.py(55):         except AttributeError:
x.py(6):     except ZeroDivisionError as exc:

You can see how the standard library raised and handled 11 exceptions before one line of my code even ran.

It might be possible to subclass trace.Trace to trigger the debugger on caught exceptions, but I would want to know more about why you want to do that before pursuing that.

Chaim Koifman
6 years ago

Hi Steven,
My problem is: I’m running third-party Python packages (Django), which in turn call other Python packages. I’m getting the following error from Django: TypeError: unsupported operand type(s) for &: ‘str’ and ‘int’. Because of this error, the application fails to start (open a web page). I want to know where the error comes from so I can fix it or avoid it! Because it is a handled exception, I want to force the debugger (pdb) to stop on ‘handled’ exceptions!

Steven Kryskalla
6 years ago

How are you getting that error? Is it in the console when you run Django’s runserver command, or does it display in the browser on Django’s error page? Do you get a full traceback, or just that one “TypeError” line? I wouldn’t jump to the debugger yet, especially trying to debug all caught exceptions. I think there’s a simpler way to solve it. Let me know if you’d like to start a session so I can take a look at it.

Chaim Koifman
6 years ago

Hi Steven,
Thanks for your help. I finally found the problem. It came from the ‘flup’ package. After I downgraded ‘flup’ from 1.0.3 to 1.0.2 the problem disappeared. I ran the program via the debugger, and before starting the Django service I set ‘option.traceback’ to 1 (in pdb), which caused the traceback to be printed. (BTW, I assume that there should be a configuration parameter to set it, but I didn’t find it yet.)
Anyway, for next time… I’m still interested in a way to force pdb to stop (or hook) on any exception (handled & unhandled). I think it could be very useful for debugging problems like this.
Thanks,
Chaim.
