Now that we have a debugger process up and
running,
let start adding some actual features. First up, we want to be able to
set breakpoints. One of the nice things MDbg does is auto-set a
breakpoint on the entrypoint function. For ipydbg, we’re going to
auto-set a breakpoint on the first line of the python file being
debugged.
In order to set a breakpoint, we need debugger symbols. They allow us to
translate between “line one of simpletest.py” and the actual location in
the code and back. We’re all used to seeing the PDB files that are
produced when we compile a C# assembly. Unsurprisingly, the symbol
store binder
provides a method to load these PDB files from
disk. But where
do IronPython debug symbols come from? I know from my extensive reading
of the ipy.exe command line parameters that you pass –D to enable
application debugging, but since all the IL is being generated in
memory, how does the debugger get access to the PDB files?
It turns out the debugger API includes a UpdateModuleSymbols callback
method that the
runtime uses to notify the debugger when the symbols change. The
debugger symbols are provided in an IStream, and then you use the
symbol binder to
get a symbol
reader.
The .NET Framework already provides a managed
API
for reading and writing debug symbols. However, that API doesn’t support
loading symbols from a stream, so the MDbg code includes it’s own
wrapper
around the symbol binder API to include that functionality. Here’s some
code to get the debug symbol reader for an updated module and iterate
through the associated files:
sym_binder = SymbolBinder()
def OnUpdateModuleSymbols(s,e):
print "OnUpdateModuleSymbols"
metadata_import = e.Module.GetMetaDataInterface[IMetadataImport]()
reader = sym_binder.GetReaderFromStream(metadata_import, e.Stream)
for doc in reader.GetDocuments():
print "t", doc.URL
process.OnUpdateModuleSymbols += OnUpdateModuleSymbols
If we run this version of
ipydbg
on simpletest.py with the IPy 2.0.1
release
and the Python standard library installed, OnUpdatedModuleSymbols gets
called six times, once for each python file that gets loaded when
simpletest runs. (site.py, os.py, ntpath.py, stat.py, UserDict.py and
simpletest.py). BTW, I tried running this code on the latest build of
IPy
(changeset
47624)
and I’m getting a COM Interop exception. So for now, stick with 2.0.1.
Now that we can get these dynamically generated debug symbols, we can
use them to create a breakpoint on the first line of the script being
debugged. Everytime OnUpdateModuleSympols is called, I try to bind the
initial breakpoint (unless it’s already been bound of course) by calling
the following create_breakpoint function.
def create_breakpoint(doc, line, module, reader):
line = doc.FindClosestLine(line)
method = reader.GetMethodFromDocumentPosition(doc, line, 0)
function = module.GetFunctionFromToken(method.Token.GetToken())
for sp in get_sequence_points(method):
if sp.doc.URL == doc.URL and sp.start_line == line:
bp = function.ILCode.CreateBreakpoint(sp.offset)
bp.Activate(True)
return bp
bp = function.CreateBreakpoint()
bp.Activate(True)
return bp
This code translates a given document/line into a function/offset where
we can set a breakpoint. To do this, we use sequence points which as
per Rick
Byers
are “used to mark a spot in the IL code that corresponds to a specific
location in the original source”. So once we find the function that
corresponds to a given line of code, we iterate over the sequence points
until we find the one that matches the line we want to break on. If we
find a matching sequence point, we set the breakpoint there. If we
don’t, we set the breakpoint on the function itself.
get_sequence_points is a simple wrapper around ISymbolMethod
GetSequencePoints.
The original API is pretty ugly to use – managing six separate arrays of
information – so get_sequence_points turns it into a generator
function you can iterate over.
Now that the breakpoint is set, we want to trap the breakpoint event as
well. That’s easy enough, we create an event handler for
process.OnBreakpoint similar to the OnUpdateModuleSymbols event above.
Eventually, we’ll have the ability to step when we break, but for now
I’m just going to print out the current location when the breakpoint is
hit. This is kind of the reverse of the operation above. Setting a
breakpoint means going from a source location to an IL offset within a
function. Printing the current location means going from an IL offset in
a function back to the source location. Here’s the function to do that:
def get_location(reader, thread):
frame = thread.ActiveFrame
function = frame.Function
offset, mapping_result = frame.GetIP()
method = reader.GetMethod(SymbolToken(frame.Function.Token))
real_sp = None
for sp in get_sequence_points(method):
if sp.offset > offset:
break
if sp.start_line != 0xfeefee:
real_sp = sp
if real_sp == None:
return "Location (offset %d)" % (offset)
return "Location %s:%d (offset %d)" % (
Path.GetFileName(real_sp.doc.URL), real_sp.start_line, offset)
def OnBreakpoint(s,e):
print "OnBreakpoint", get_location(
symbol_readers[e.Thread.ActiveFrame.Function.Module], e.Thread)
Given a symbol reader and a debug thread, get_location returns a
location string. It loops thru the sequence points, similar to
create_breakpoint, in order to find the closest corresponding line of
python code to the current offset (check out Mike Stall’s blog as for
why I’m checking for
0xfeefee).
In order to make this work, I need the symbol reader for the module that
I retrieved in OnUpdateModuleSymbols. For now, I’m stashing the reader
in a global dictionary keyed by the module named symbol_readers where
OnBreakpoint can access it.
Ipydbg isn’t interactive yet, but it is now running, setting a
breakpoint and successfully breaking at that breakpoint. As usual, the
latest version of
ipydbg
is up on GitHub.