Series: IronPython And *

16 posts

IronPython and WPF Background Processing Revisited

Yesterday, I blogged about using decorators to indicate if a given function should execute on the UI or background thread. While the solution works, I wrote “I’m thinking there might be a way to use SynchronizationContext to marshal it automatically, but I haven’t tried to figure that out yet.” I had some time this morning so I figured out how to use SynchronizationContext instead of the WPF dispatcher.

Leslie Sanford wrote a pretty good overview, but the short version is that SyncContext is an abstraction for concurrency management. It lets you write code that is ignorant of specific synchronization mechanisms in concurrency-aware managed frameworks like WinForms and WPF. For example, while my previous version worked fine, it was specific to WPF. If I wanted to provide similar functionality that worked with WinForms, I’d have to rewrite my decorators to use Control.Invoke. But if I port them over to use SyncContext, they would work with WinForms, WPF and any other library that plugs into SyncContext.

SyncContext abstracts away both initially obtaining the sync context as well as marshaling calls back to the UI thread. SyncContext provides a static property to access  current context, instead of a framework specific mechanism like accessing the Dispatcher property of the WPF Window class. Once you have a context, you can call Send or Post to marshal the call back to the UI thread (Send blocks the calling thread, Post doesn’t).

With that in mind, here’s the new version of BGThread and UIThread. Slightly more complex, but still pretty simple clocking in at just under 30 lines.

def BGThread(fun):  
  def argUnpacker(args):  
    oldSyncContext = SynchronizationContext.Current
    try:
      SynchronizationContext.SetSynchronizationContext(args[-1])
      fun(*args[:-1])
    finally:
      SynchronizationContext.SetSynchronizationContext(oldSyncContext)

  def wrapper(*args):
    args2 = args + (SynchronizationContext.Current,)
    ThreadPool.QueueUserWorkItem(WaitCallback(argUnpacker), args2)

  return wrapper

def UIThread(fun):
  def unpack(args):  
    ret = fun(*args)
    if ret != None:
      import warnings
      warnings.warn(fun.__name__ + " function returned " + str(ret) + " but that return value isn't propigated to the calling thread")

  def wrapper(*args):
    if SynchronizationContext.Current == None:
      fun(*args)
    else:
      SynchronizationContext.Current.Send(SendOrPostCallback(unpack), args)

  return wrapper

In the BGThread wrapper, I add the current SyncContext to the parameter tuple that I pass to the background thread. Once on the background thread, I set the current SyncContext to the last element of the the parameter tuple then call the decorated function with the remaining parameters. (for the non pythonic: args[:-1] is Python slicing syntax that means “all but the last element of args”). Using a try/finally block is probably overkill – I expect the current SyncContext to be either None or leftover garbage – but the urge to clean up after myself is apparently much stronger on the background thread than it is in say my office. 😄

In the UIThread wrapper, I grab the current context and invoke the decorated method via the Send method. Like QueueUserWorkItem, SyncContext Send and Post only support a single parameter, so I use the same *args trick I described in my last post. (I changed the name to unpack in the code above for blog formatting purposes)

One major caveat about this approach is that there’s no way to return a value from a function decorated as UIThread. I understand why SyncContext.Post doesn’t return a value (it’s async) but SyncContext.Send is synchronous call, so why doesn’t it marshal the return value back to the calling thread? WPF’s Dispatcher.Invoke and WinForm’s Control.Invoke both return a value. I didn’t handle the return value in my original version of UIThread, but now that I’ve moved over to using SyncContext, I can’t. Not sure why the SyncContext is designed that way – seems like a design flaw to me. Since the return value won’t propagate, I sniff the result decorated function’s return value and raise a warning if it’s not None.

I’ve uploaded the SyncContext version to my SkyDrive in case you want the code for yourself. Note, I’ll thinking I’ll revise code this one more time – I want to rebuild the WPF version so that it propagates return values and picks up an dispatcher via Application.Current.MainWindow rather than having to have a dispatcher property on my class.

IronPython and WPF Part 5: Interactive Console

One of the hallmarks of dynamic language programming is the use of the interactive prompt, otherwise known as the Read-Eval-Print-Loop or REPL. Even though I’m building a WPF client application, I’d still like to have the ability to poke around and even modify the app as it’s running from the command prompt, REPL style.

If you work thru the IronPython Tutorial, there are exercises for interactively building both a WinForms and a WPF application. In both scenarios, you create a dedicated thread to service the UI so it can run while the interactive prompt thread is blocked waiting for user input. However, as we saw in the last part of this series, UI elements in both WinForms and WPF can only be accessed from the thread they are created on. We already know how to marshal calls to the correct UI thread – Dispatcher.Invoke. However, what we need is a way to intercept commands entered on the interactive prompt so we can marshal them to the correct thread before they execute.

Luckily, IronPython provides just such a mechanism: clr module’s SetCommandDispatcher. A command dispatcher is a function hook that gets called for every command the user enters. It receives a single parameter, a delegate representing the command the user entered. In the WPF and WinForms tutorials, you use this function hook to marshal the commands to the right thread to be executed. Here’s the command dispatcher from the WPF tutorial:

def DispatchConsoleCommand(consoleCommand):
    if consoleCommand:
        dispatcher.Invoke(DispatcherPriority.Normal, consoleCommand)

The dispatcher.Invoke call looks kinda like the UIThread decorator from the Background Processing part of this series, doesn’t it?

Quick aside: I looked at using SyncContext here instead of Dispatcher, since I don’t care about propagating a return value back to the interactive console thread. However, SyncContext expects a SendOrPostDelegate, which expects a single object parameter. The delegate passed to the console hook function is an Action with no parameters. I could have built a wrapper function that took a single parameter which it would ignore, but I decided it wasn’t worth it. The more I look at it, the more I believe SyncContext is a good idea with a bad design.

I wrapped all the thread creation and command dispatching into a reusable helper class called InteractiveApp.

class InteractiveApp(object):
  def __init__(self):
    self.evt = AutoResetEvent(False)

    thrd = Thread(ThreadStart(self.thread_start))
    thrd.ApartmentState = ApartmentState.STA
    thrd.IsBackground = True
    thrd.Start()

    self.evt.WaitOne()
    clr.SetCommandDispatcher(self.DispatchConsoleCommand)

  def thread_start(self):
    try:
      self.app = Application()
      self.app.Startup += self.on_startup
      self.app.Run()
    finally:
      clr.SetCommandDispatcher(None)

  def on_startup(self, *args):
    self.dispatcher = Threading.Dispatcher.FromThread(Thread.CurrentThread)
    self.evt.Set()

  def DispatchConsoleCommand(self, consoleCommand):
    if consoleCommand:
        self.dispatcher.Invoke(consoleCommand)

  def __getattr__(self, name):
    return getattr(self.app, name)

The code is pretty self explanatory. The constructor (__init__) creates the UI thread, starts it, waits for it to signal that it’s ready via an AutoResetEvent and then finally sets the command dispatcher. The UI thread creates and runs the WPF application, saves the dispatcher object as a field on the object, then signals that it’s ready. DispatchConsoleCommand is nearly identical to the earlier version, I’ve just made it an instance method instead of a stand-alone function. Finally, I define __getattr__ so that any operations invoked on InteractiveApp are passed thru to the contained WPF Application instance.

In my app.py file, I look to see if the module has been started directly or if it’s been imported into another module. If the module is run directly (aka ‘ipy app.py’) then the global __name__ variable will be ‘__main__’. In that case, we start the application up normally (i.e. without the interactive prompt) by just creating an Application then running it with a Window instance. Otherwise, we are importing this app into another module (typically, the interactive console), so we create an InteractiveApp instance and we create an easy to use run method that can create the instance of the main window.

if __name__ == '__main__':
  app = wpf.Application()
  window1 = MainWin.MainWindow()
  app.Run(window1.root)

else:  
  app = wpf.InteractiveApp()

  def run():
    global mainwin
    mainwin = MainWin.MainWindow()
    mainwin.root.Show()

If you want to run the app interactively, you simply import the app module and call run. Here’s a sample session where I iterate thru the items bound to the first list box. Of course, I can do a variety of other operations I can do such as manipulate the data or create new UI elements.

IronPython 2.0 (2.0.0.0) on .NET 2.0.50727.3053
>>> import app
>>> app.run()
#at this point the app window launches
>>> for i in app.mainwin.allAlbumsListBox.Items:
...     print i.title
...
Harvest Festivals
Mrs. Gardner's Art
Riley's Playdate
August 13
Camp Days
July 14
May Photo Shoot
Summer Play 2006
Lake Washington With The Gellers
Camp Pierson '06
January 28

One small thing to keep in mind: if you exit the command prompt, the UI thread will also exit since it’s marked as a background thread. Also, it looks like you could shut the client down then call run again to restart it, but you can’t. If you shut the client down, the Run method in InteractiveApp.thread_start exits, resets the Command Dispatcher to nothing and the thread terminates. I could fix it so that you could run the app multiple times, but I find I typically only run the app once for a given session anyway.

IronPython and Linq to XML Part 1: Introduction

Shortly after I joined the VS Languages team, we had a morale event that included a Rock Band tournament. I didn’t play that day in the tournament since I had never played before, but I was hooked just the same. I got Rock Band for my birthday, Rock Band 2 shortly after it came out in September and I’m hoping to get the AC/DC Track Pack for Christmas.

There are lots of songs available for Rock Band – 461 currently available between on-disc and downloadable tracks – with more added every week. Frankly, there’s lots of music on that list that I don’t recognize. Luckily, I’m also a Zune Pass subscriber, so I can go out and download all the Rock Band tracks and listen to them on my Zune. But who has time to manually search for 461 songs? Not me. So I wrote a little Python app to download the list of Rock Band songs and save it as a Zune playlist.

I ended up use Linq to XML very heavily in this project. Zune playlists use the same XML format as Windows playlists, Zune exposes the backend music catalog via a Atom feeds and I used Chris Lovett’s SgmlReader to expose the HTML list of Rock Band songs as XML. I realize Linq to XML wasn’t on “the list”, but I had a specific need so it got bumped to the head of the line.

BTW, for those who just want the playlist, I stuck it on my Skydrive. Unfortunately, there’s no Skydrive API right now, so I can’t automate uploading the new playlist every week. If anyone has alternative suggestions or a way to programmatically upload files to SkyDrive, let me know.

IronPython and Linq to XML Part 2: Screen Scraping

First, I need to convert the HTML list of Rock Band songs into a machine readable format. That means doing a little screen scraping. Originally, I used Beautiful Soup but I found that UnicodeDammit got confused on names like Blue Öyster Cult and Mötley Crüe. I’m guessing it’s broken because IronPython doesn’t have non-unicode strings.

Instead, I used SgmlReader to provide an XmlReader interface over the HTML, then queried that data via Linq to XML. I used the version of SgmlReader from MindTouch since they include a compiled binary and it seems to be the only active maintained version. I wrapped it all up in a function called load that loads HTML from either disk or the network (based on the URI scheme) into an XDocument.

def loadStream(streamreader):
  from System.Xml.Linq import XDocument
  from Sgml import SgmlReader

  reader = SgmlReader()
  reader.DocType = "HTML"
  reader.InputStream = streamreader
  return XDocument.Load(reader)

def load(url):
  from System import Uri
  from System.IO import StreamReader

  if isinstance(url, str):
    url = Uri(url)

  if url.Scheme == "file":
    from System.IO import File
    with File.OpenRead(url.LocalPath) as fs:
      with StreamReader(fs) as sr:
        return loadStream(sr)
  else:
    from System.Net import WebClient
    wc = WebClient()
    with wc.OpenRead(url) as ns:
      with StreamReader(ns) as sr:
        return loadStream(sr)

def parse(text):
  from System.IO import StringReader
  return loadStream(StringReader(text))

I call load, passing in the URL to the list of songs. The “official” Rock Band song page loads the actual content from a different page via AJAX, so I just load the actual list directly via my load function.

Once the HTML is loaded as an XDocument, I need a way to find the specific HTML nodes I was looking for. As I said earlier, XDocument uses Linq to XML – there is not other API for querying the XML tree. In the HTML, there’s a div tag with the id “content” that contains all the song rows as table row elements. I built a simple function that uses the LINQ Single method to find the tag by it’s id attribute value.

def FindById(node, id):
  def CheckId(n):
    a = n.Attribute('id')
    return a != None and a.Value == id

  return linq.Single(node.Descendants(), CheckId)

(Side note – I didn’t like the verbosity of the a != None and a.Value == id line of code, but XAttributes are not comparable by value. That is, I can’t write node.Attribute('id') == XAttribute('id', id). And writing ``node.Attribute('id').Value == id11 only works if every node has an id attribute. Not making XAttribute comparable by value seems like a strange design choice to me.)

LINQ to objects works just fine from IronPython, with a few caveats. First, IronPython doesn’t have extension methods, so you can’t chain calls together sequentially like you can in C#. So instead of collection.Where(…).Select(…), you have to write Select(Where(collection, …), …). Second, all the LINQ methods are generic, so you have to use the verbose list syntax (for example: Single[object] or Select[object,object]). Since Python doesn’t care about the generic types, I wrote a bunch of simple helper functions around the common LINQ methods that just use object as the generic type. Here are a few examples:

def Single(col, fun):
  return Enumerable.Single[object](col, Func[object, bool](fun))

def Where(col, fun):
  return Enumerable.Where[object](col, Func[object, bool](fun))

def Select(col, fun):
  return Enumerable.Select[object, object](col, Func[object, object](fun))

Once I have the content node, all the songs are in tr nodes beneath it. I wrote a function called ScrapeSong that transforms a song tr node into a Song object (which I’ll talk about in the next installment of this series). I use LINQ methods Select, OrderBy and ThenBy to provide me an enumeration of Song objects, ordered by date added (descending) than artist name.

def ScrapeSong(node):
  tds = list(node.Elements(xhtml.ns+'td'))
  anchor = list(tds[0].Elements(xhtml.ns+'a'))[0]

  title = anchor.Value
  url = anchor.Attribute('href').Value
  artist = tds[1].Value
  year = tds[2].Value
  genre = tds[3].Value
  difficulty = tds[4].Value
  _type = tds[5].Value
  added = DateTime.Parse(tds[6].Value)

  return Song(title, artist, added, url, year, genre, difficulty, _type)

songs = ThenBy(OrderByDesc(
          Select(content.Elements(xhtml.ns +'tr'), ScrapeSong),
          lambda s: s.added), lambda s: s.artist)

And that’s pretty much it. Next, I’ll iterate thru the list of songs and get the details I need from Zune’s catalog web services in order to write out a playlist that the Zune software will understand.

IronPython and Linq to XML Part 3: Consuming Atom Feeds

Now that I have my list of Rock Band songs, I need to generate a Zune playlist. I wrote that Zune just uses the WMP playlist format, but that’s not completely true. Media elements in a Zune playlist have several attributes that appear unique to Zune.

Because of Zune Pass, Zune supports the idea of streaming playlists where the songs are downloaded on demand instead of played from the local hard drive. In order to enable this, media elements in Zune playlists can have a serviceID attribute, a GUID that uniquely identifies the song on the Zune service. We also need the song’s album and duration – the Zune software summarily removes songs that don’t include the duration.

Of course, the Rock Band song list doesn’t include the Zune song service ID. It also doesn’t include the song’s album or duration. So we need a way, given the song’s title and artist (which we do have) to get its album, duration and service ID. Luckily, the Zune service provides a way to do exactly this, albeit an undocumented way. Via Fiddler2, I learned that Zune exposes a set of Atom feed web services on catalog.zune.net that the UI uses when you search the marketplace from the Zune software. There are feeds to search by artist and by album but the one we care about is the search by track. For example, here’s the track query for Pinball Wizard by The Who.

Since these feeds are real XML, I can simply use XDocument.Load to suck down the XML. Then I look for the first Atom entry element using similar LINQ to XML techniques I wrote about last time. If there’s no Atom elements, that means that the search failed – either Zune doesn’t know about the song or it can’t find it via the Rock Band provided title and artist. Of the 461 songs on Rock Band right now, my script can find 417 of them on Zune automatically.

Of course, since the Zune data is in XML instead of HTML, finding the data I’m looking for is much easier that it was to find the Rock Band song data. Here’s the code pull the relevant information out of the Zune catalog feed that we need.

def ScrapeEntry(entry):
  id = entry.Element(atomns+'id').Value  
  length = entry.Element(zunens+'length').Value  

  d = {}  
  d['trackTitle'] = entry.Element(atomns+'title').Value  
  d['albumArtist'] = entry.Element(zunens+'primaryArtist').Element(zunens+'name').Value  
  d['trackArtist'] = d['albumArtist']  
  d['albumTitle'] = entry.Element(zunens+'album').Element(zunens+'title').Value  

  if id.StartsWith('urn:uuid:'):  
    d['serviceId'] = "{" + id.Substring(9) + "}"  
  else:  
    d['serviceId'] = id  

  m = length_re.Match(length)  
  if m.Success:  
    min = int(m.Groups[1].Value)  
    sec = int(m.Groups[2].Value)  
    d['duration'] = str((min * 60 + sec) * 1000)  
  else:  
    d['duration'] = '60000'  

  return d  

trackurl = catalogurl + song.search_string
trackfeed = XDocument.Load(trackurl)  
trackentry = First(trackfeed.Descendants(atomns+'entry'))  
track = ScrapeEntry(trackentry)

A few quick notes:

  • song.search_string returns the song title and artist as a plus delimited string. i.e. pinball+wizard+the+who. However, many Rock Band songs end in a parenthetical like (Cover Version) so I automatically strip that off for the search string
  • duration in the Atom feed is stored like PT3M23S, which means the song is 3:23 long. The playlist file expect the song length in milliseconds, so I use a .NET regular expression to pull out the minutes and seconds and do the conversion. It’s not exact – songs lengths usually aren’t exactly a factor of seconds, but as far as I can understand, Zune just uses that to display in the UI – it doesn’t affect playback at all.

Now I have a list of songs with all the relevant metadata, next time I’ll write it out into a Zune playlist file.

← All series