Monday, 19 June 2017

External Entity Vulnerability in Expat 2.2.0 And Earlier

(First posted on the LibExpat Documentation Site)

I'm in the intriguing position of being employed to work on a free software project by the Linux Foundation.  I'm expanding the test coverage of the Expat library under the eagle eye of maintainer Sebastian Pipping, which is an excellent way of discovering how the brilliant, twisted and almost entirely undocumented internals work.  (There will be articles.  Many articles.)

One thing you expect to come across when writing tests is the occasional bug.  I came across a serious one in the parsing of external parameter entities.  CVE-2017-9233 (to give it its formal identification) says that bad XML in an external entity will cause the parser to go into an infinite loop and never return control to the application.

For those of you who haven't been saturated in XML terminology for the last however long, an example is in order.  Suppose you have the following trivial piece of XML:

    <!DOCTYPE doc SYSTEM "">

And the DTD level1.dtd that it reads:

    <!ELEMENT doc EMPTY>
    <!ENTITY % e SYSTEM "">

And the external entity definition in yet another resource, level2.ent:

    <el/>

Now a quick riffle through the XML standards will tell you that ordinary elements such as <el> aren't allowed in DTDs.  All you can legally use are the DTD declarations <!ELEMENT>, <!ATTLIST>, <!ENTITY> and <!NOTATION>.  When our entity %e in level1.dtd gets substituted in, it will put the <el/> element straight into the DTD, leaving us with a malformed DTD.  The parser should detect this and reject the whole shebang.

What actually happened is hard to follow in the source code.  (Many articles.)  When parsing reached the 'e' of <el/>, the tokenizer recognised it as a valid start of an ordinary element (which is true) and returned the appropriate value, XML_TOK_INSTANCE_START.

The calling code knew how to recognise the limited number of tokens that are legal and most of the tokens that aren't legal in a DTD, but unfortunately it missed this specific case.  Without any better instructions from its decision logic, the code assumed that it had found something valid and tried tokenising again, in case there was more text to be parsed.

It did this expecting that its internal pointers were updated to point to that unparsed text.  That would have been true if it had dealt with something legal, but since "<e" isn't a legal part of a DTD the pointers had been left alone.  The tokenizer therefore saw "<e" again, returned XML_TOK_INSTANCE_START again, and the whole thing repeated until the application was killed.

In brief:
  • The lowest levels of the parser recognise "<e" as starting an element.
  • The next level up correctly doesn't recognise it as a valid case
  • ...but also fails to recognise it as an invalid case.
  • The parser assumes it was successful and tries again on the same string.
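
To make that loop concrete, here is a toy model in Python.  It is emphatically not Expat's code: the token names, `next_token` and `parse_dtd` are all invented for illustration, and the `max_steps` cap exists only so that the demonstration terminates where the real parser didn't.

```python
# Toy model only: none of these names come from Expat itself.
TOK_DECL = "decl"                # something legal in a DTD, e.g. <!ELEMENT ...>
TOK_INSTANCE_START = "instance"  # "<e", the start of an ordinary element
TOK_NONE = "none"                # nothing left to parse


def next_token(text, pos):
    """Return (token, end) for the token that starts at or after pos."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        return TOK_NONE, pos
    if text.startswith("<!", pos):
        return TOK_DECL, text.index(">", pos) + 1
    return TOK_INSTANCE_START, pos + 2


def parse_dtd(text, fixed, max_steps=1000):
    """Walk the DTD text; 'fixed' switches on the missing error case."""
    pos = 0
    for _ in range(max_steps):
        token, end = next_token(text, pos)
        if token == TOK_NONE:
            return "ok"
        if token == TOK_DECL:
            pos = end  # legal token: consume it and move on
        elif fixed and token == TOK_INSTANCE_START:
            return "syntax error"
        # Buggy path: the token matched no case, pos was never advanced,
        # so the next iteration sees exactly the same text again.
    return "stuck"


print(parse_dtd("<!ELEMENT doc EMPTY>\n<el/>", fixed=False))
print(parse_dtd("<!ELEMENT doc EMPTY>\n<el/>", fixed=True))
```

With `fixed=False` the malformed input reports "stuck"; with `fixed=True` the unexpected token is rejected, which is what the parser should have done all along.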

Why Do I Care?

"How does this affect me?" you may well ask.  "I don't use Expat.  I don't even write in C.  How can I possibly be affected?"  The answer is, you may well be using Expat unknowingly.  There are wrapper libraries around Expat for Python, Java and many other languages.  Many other applications and libraries (particularly C and C++ libraries like Poco, libDOM or libwww) use Expat under the hood.

The libraries often enable external entity parsing, which does potentially allow this bug to bite your application.  Some of them will, or used to, download the URIs for you, which is a serious problem.  Some of the applications are just parsing local configuration files, but even those might follow a DTD if one was inserted into the configuration file somehow.  Other applications explicitly use external DTDs, leaving you vulnerable if those sites are compromised or malicious.

In short, it's entirely possible for you to be using Expat and not know it.

What Should I Do?

First and most obviously, upgrade.  The current version of libexpat has patched this bug, and a few other things; read the change log for details.

The other thing you should always do is consider how you use your XML parser.  I am not a security expert by any stretch of the imagination, but even I know not to download arbitrary URIs, for example.  Reading "" is probably safe; reading "" probably isn't.  You should check that any library you use doesn't automatically download arbitrary URIs for you, and you should be careful about what URIs you do allow to be downloaded.
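
As a rough illustration of being careful about which URIs get fetched, here is a sketch using Python's standard `xml.parsers.expat` wrapper.  The allowlist, the host name `dtd.example.org` and the helper names are assumptions made for the example rather than a recommended policy, and the actual fetching is left out entirely.

```python
import xml.parsers.expat as expat
from urllib.parse import urlsplit

# Hypothetical allowlist: the only host we are prepared to fetch DTDs from.
ALLOWED_HOSTS = {"dtd.example.org"}


def system_id_allowed(system_id, allowed_hosts=ALLOWED_HOSTS):
    """Accept only https URIs on an explicitly trusted host."""
    parts = urlsplit(system_id)
    return parts.scheme == "https" and parts.hostname in allowed_hosts


def make_parser():
    """Return (parser, refused); refused collects the system IDs we rejected."""
    refused = []
    parser = expat.ParserCreate()
    # Ask Expat to report the external DTD subset and parameter entities
    # to our handler instead of silently skipping them.
    parser.SetParamEntityParsing(
        expat.XML_PARAM_ENTITY_PARSING_UNLESS_STANDALONE)

    def on_external_entity(context, base, system_id, public_id):
        if not system_id_allowed(system_id):
            refused.append(system_id)
            return 0  # makes Expat report an external entity handling error
        # A real handler would fetch the resource here and feed it to
        # parser.ExternalEntityParserCreate(context); omitted in this sketch.
        return 1

    parser.ExternalEntityRefHandler = on_external_entity
    return parser, refused
```

The point of the design is that nothing is downloaded unless your own code decides to do it, so the decision about what is "probably safe" lives in one small function that you can test.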


If you want a moral from this article: always test.  Murphy's Law ensures there will always be bugs in your code, and you should make at least some effort to find them before they annoy other people.  And if by chance you should be one of those people annoyed by a bug, let the author know.  He or she will usually be glad for the feedback and want it fixed as much as you do!

Thursday, 17 November 2016

Join us!

Just a heads-up to say that we'll be at the Cambridge University Computer Lab recruiting event tomorrow (um, today), showing off a few things we've knocked together and trying to persuade all you lovely people to come and work for us.

Do drop by if you're about - we will literally have cookies - and if you're after work or an internship, get in touch at .

In the meantime, here's a photo of an oscilloscope playing (3-d) pong:

Source code (for a RasPI with the scope tied to audio out and the AC decouplers removed) is at .

Sunday, 9 October 2016

Fun facts about ethernet debugging, number 3+4i in an occasional series

So, I've been writing a driver for an OS-less machine which uses (for reasons you will be relieved that I am not going to go into) a LAN9221 as its ethernet interface.

Now, this chip is moderately spiffy - in particular, it seems Microchip have now encountered every possible bus-related design cock-up and have a handy register ready for reversing out most of them. However, it requires quite careful FIFO handling and turns out to be exquisitely read-sensitive. So far, so good.

So, one thing I tried was doing a Tx test. Send a packet every couple of seconds, read status back out of the status FIFO, report. Check that all is well.

Now, I didn't have a LAN port that didn't have too much chatter on it, so I used an ASIX AX88178 I had lying around to do the capture (and incidentally, what is it with Linux desktops these days that they just can't seem to shut up? You no sooner plug something in than you have kilobits of traffic asking if, against all knowledge and probability, an ethernet interface with no IP is a link back to some mothership or other. Sheesh).

Anyway, you look at the trace in wireshark, and all is well, except that if you send:

0000  68 05 ca 1e 35 18 00 50 9a 00 00 00 08 00 01 02   h...5..P........
0010  03 04 05 76 40 30 00 00 40 00 0b 00 00 10 01 02   ...v@0..@.......
0020  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................
0030  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................

You get your original packet, but immediately after it, you also get:

0000  40 00 bf ff 68 05 ca 1e 35 18 00 50 9a 00 00 00   @...h...5..P....
0010  08 00 01 02 03 04 05 76 40 30 00 00 40 00 0b 00   .......v@0..@...
0020  00 10 01 02 03 04 05 06 07 08 01 02 03 04 05 06   ................
0030  07 08 01 02 03 04 05 06 07 08 01 02 03 04 05 06   ................
0040  07 08 01 02                                       ....

which is your packet, with 0x4000bfff prepended to it. No status word in the Tx status FIFO, no IRQ_SIS TXE bit, nothing. Weird, huh?

It's a replay, so can't be bad FIFO management (well, not obvious bad FIFO management), and it's not the CPU double-writing or you'd get word dups, not packet dups. If you vary the size of your packet, you find that the first byte is the length of the original, and the third and fourth byte are always something like 0xNfff where N seems to be something to do with the top nybble of your packet length.
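
If you want to poke at the prefix yourself, the sketch below reads it as two little-endian 16-bit words.  That layout is a guess from this one dump, not something taken from a datasheet, and `split_prefix` is just an illustrative name.

```python
import struct

# The 64-byte packet from the first dump above.
SENT = bytes.fromhex(
    "68 05 ca 1e 35 18 00 50 9a 00 00 00 08 00 01 02 "
    "03 04 05 76 40 30 00 00 40 00 0b 00 00 10 01 02 "
    "03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02 "
    "03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02"
)
# The runt that followed it: the same packet with four extra bytes in front.
RUNT = bytes.fromhex("40 00 bf ff") + SENT


def split_prefix(frame):
    """Read the 4-byte prefix as two little-endian 16-bit words."""
    length, check = struct.unpack_from("<HH", frame)
    return length, check, frame[4:]


length, check, payload = split_prefix(RUNT)
print(hex(length), hex(check), len(payload), payload == SENT)
```

For the dump above the first word is 0x40, the length of the original packet, and the second is 0xffbf, which happens to be its 16-bit complement.  Whether that holds for other lengths isn't checked here.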

So, you try sending two packets back to back. Send:

0000  68 05 ca 1e 35 18 00 50 9a 00 00 00 08 00 01 02   h...5..P........
0010  03 04 05 77 40 30 00 00 40 00 0c 00 00 10 01 02   ...w@0..@.......
0020  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................
0030  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................

0000  68 05 ca 1e 35 18 00 50 9a 00 00 00 08 00 01 02   h...5..P........
0010  03 04 05 78 40 30 00 00 40 00 0c 00 00 10 01 02   ...x@0..@.......
0020  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................
0030  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................

You get:

0000  40 00 bf ff 68 05 ca 1e 35 18 00 50 9a 00 00 00   @...h...5..P....
0010  08 00 01 02 03 04 05 77 40 30 00 00 40 00 0c 00   .......w@0..@...
0020  00 10 01 02 03 04 05 06 07 08 01 02 03 04 05 06   ................
0030  07 08 01 02 03 04 05 06 07 08 01 02 03 04 05 06   ................
0040  07 08 01 02 40 00 bf ff 68 05 ca 1e 35 18 00 50   ....@...h...5..P
0050  9a 00 00 00 08 00 01 02 03 04 05 78 40 30 00 00   ...........x@0..
0060  40 00 0c 00 00 10 01 02 03 04 05 06 07 08 01 02   @...............
0070  03 04 05 06 07 08 01 02 03 04 05 06 07 08 01 02   ................
0080  03 04 05 06 07 08 01 02                           ........
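
Taking the first prefix byte at its word as a length, a combined capture like the one above can be cut back into the packets that were sent.  This is a sketch under that assumption only (the prefix read as a little-endian 16-bit length plus two bytes we ignore); `packet` and `split_records` are illustrative helpers, not part of any driver.

```python
import struct


def packet(seq):
    """Rebuild one 64-byte test packet; seq is the byte at offset 0x13."""
    head = bytes.fromhex(
        "68 05 ca 1e 35 18 00 50 9a 00 00 00 08 00 01 02 03 04 05")
    tail = bytes.fromhex("40 30 00 00 40 00 0c 00 00 10")
    filler = (bytes(range(1, 9)) * 5)[:34]  # 01..08 repeated, cut to fit
    return head + bytes([seq]) + tail + filler


def split_records(capture):
    """Yield each payload from a run of [length16, word16, payload] records."""
    pos = 0
    while pos < len(capture):
        if pos + 4 > len(capture):
            raise ValueError("truncated prefix at offset %d" % pos)
        length, _ignored = struct.unpack_from("<HH", capture, pos)
        start = pos + 4
        if length == 0 or start + length > len(capture):
            raise ValueError("implausible length at offset %d" % pos)
        yield capture[start:start + length]
        pos = start + length


# The combined 0x88-byte capture: prefix + packet, twice over.
capture = b"".join(bytes.fromhex("40 00 bf ff") + packet(seq)
                   for seq in (0x77, 0x78))
for payload in split_records(capture):
    print(len(payload), hex(payload[0x13]))
```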

Sometimes you get two of these curious runt packets, and sometimes one. Awooga! You then spend all sodding night debugging the cursed thing, convinced that your MMU has somehow rewritten the FIFO, or you've accidentally written it a negative length and it's wrapped, or that you're trying to send a packet in the middle of a reset.

Anyway, finally, in desperation, you plug your ASIX adaptor into a different machine, running kernel 3.13, rather than the 3.11 (sheesh, that old?) on your original box and you get your original packets. Two at a time, all fine and dandy - and my onboard adaptors on two different laptops seem to agree with that.

It seems that older ASIX drivers and/or 3.11 will insert comedy packets into your wireshark captures, for fun and profit. Now, the reason I switched to the ASIX in the first place was because I was seeing these runt packets on another interface, so I'm not sure if I blame the ASIX driver or not. My desktop has both 802.1Q and IPv6 enabled, so it may be that some component of the network stack is simply failing to cope with an attempt to configure 802.1Q and IFF_PROMISC at the same time. Be warned!

Hopefully, if you found this post via google at 3am, you can now change adaptor, go to bed and sleep the sleep of the justly offended.

Gah. Onward to lwip! (which is at least reasonably well-behaved)

Friday, 29 July 2016

Console application for MTAPI

No, this is still not the power measurement post.  Richard will get around to it soon, honest.

We spend a lot of time playing around with TI's CC2538 Zigbee chip running Z-Stack, which means that we spend a lot of time programming other embedded chips to talk TI's Monitor and Test API serial protocol to it.  Needless to say, debugging at more than one CPU's remove can be a tedious exercise, and there's nothing (that I know about, at least) that will allow you to talk MTAPI to a CC2538 from your nice, comfortable Linux development environment.

Enter MTConsole, in the repository.  In a fit of enthusiasm, and a strong desire to be able to script tests, I've put together a Python program to translate to and from MTAPI.  It is limited at the moment, and some areas are, to be blunt, not very pretty.  It falls over with an exception on a parse error, for example, because that was what I wanted when testing the parser.  That will get visited with fire and the sword when it first annoys me in real use.

So far I've put a lot of effort into parsing binary input (from the CC2538) into text, and not much into parsing text into MTAPI commands.  That will be changing as I need more commands (fire and the sword, people), but at the moment it's not useful for much more than proving that your chip is up and talking to the outside world.  Useful as that is, I would eventually like to get to the point where I can script it well enough to mock up complex command sequences.
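
For anyone who hasn't stared at MTAPI on the wire, the sketch below shows the UART framing as described in TI's Monitor and Test API documentation: a 0xFE start byte, a length, two command bytes, the data, and an XOR checksum over everything after the start byte.  It is not MTConsole's code, and `build_frame` and `parse_frame` are names made up for the example.

```python
from functools import reduce
from operator import xor

SOF = 0xFE  # start-of-frame marker


def fcs(body):
    """XOR of the length, command and data bytes."""
    return reduce(xor, body, 0)


def build_frame(cmd0, cmd1, data=b""):
    """Wrap a command and its data in SOF, length and checksum."""
    body = bytes([len(data), cmd0, cmd1]) + data
    return bytes([SOF]) + body + bytes([fcs(body)])


def parse_frame(frame):
    """Check one complete frame and split it into its fields."""
    if len(frame) < 5 or frame[0] != SOF:
        raise ValueError("not an MT frame")
    if len(frame) != frame[1] + 5:
        raise ValueError("length byte does not match frame size")
    if fcs(frame[1:-1]) != frame[-1]:
        raise ValueError("bad checksum")
    cmd0, cmd1 = frame[2], frame[3]
    return {
        "type": cmd0 >> 5,         # 1 = SREQ, 2 = AREQ, 3 = SRSP
        "subsystem": cmd0 & 0x1F,  # 1 = SYS
        "command": cmd1,
        "data": frame[4:-1],
    }


ping = build_frame(0x21, 0x01)  # SYS_PING request
print(ping.hex(), parse_frame(ping))
```

`build_frame(0x21, 0x01)` produces the five bytes FE 00 21 01 20, a SYS_PING request, which is about the simplest way of proving that the chip is talking.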

Please feel free to grab and use as the mood takes you.  Patches, bug reports, comments and suggestions are always welcome, just bring your own sword.

Wednesday, 20 July 2016

Hacking Wireshark for fun and profit

We've been working on some low-power Zigbee sensors recently, based on CC2538 / ZStack 1.2.2 and running Zigbee HA 1.2.1; whilst we were doing this it occurred to us that Wireshark could decode a few more of the IAS Zigbee messages into something useful.

So we did that.

It also turns out that some of our clients (and compliance test houses) use the very nice Ubiqua protocol analyser tool.

However, we are a Wireshark shop.

Step up Vadim, who contributed a bunch of patches in Wireshark bug 7426. Those had rotted a bit, so I resurrected them and the upshot is that the repository now contains a bunch of things that Zigbee hackers might find fun:

  • Better decoding for Zigbee IAS messages - from Rhodri James.
  • Support for CUBX, TI SmartRF Studio and Ember Insight Desktop file formats as per Vadim's patch.
  • Support for more recent Ubiqua 3 file formats (at least, on the traces I have here).
  • A nasty backdoor mechanism so that you can decode Ubiqua traces which don't contain the TC key transport packet in the trace (Ubiqua stashes this in a separate table, and we pass it round the back to the Zigbee packet dissector).

One day I will get around to trying to push this lot upstream, but I suspect we will want a better way to do the backdoor key transport than the ugly hack I have in there at the moment.

Anyway, if you feel minded, grab it, enjoy and do report any bugs you come across (and I will do another post on power measurement for low-power radio, honest).

Tuesday, 12 April 2016

It's been a while ..

It looks as though the last post on this blog was waaay back in 2013, which goes to show how busy we've been. So, what's happened between then and now? Well, our open source stuff has migrated to github - . There you'll find a bunch of stuff you might find useful, including:
  • The venerable tstools and muddle.
  • Our local variants of ccsniffpiper - a program which allows you to sniff Zigbee with a TI CC2531 USB dongle, and wireshark - which contains some patches from Rhodri that help decode Zigbee HA IAS ACE commands in a more friendly way.
  • The current version of kbus - still 10% the size of kdbus :-)
  • upc2 - an easy to use, easy to cross-compile, terminal program that can speak xmodem and Andrew Gordon's semi-proprietary grouch protocol.
And lately we've been playing with some automotive powertrain stuff, some very low power Zigbee sensors based on TI ZStack Home 1.2.2a (under some circumstances, you can get battery life projections as long as the shelf life of the batteries), BT LE dynamic advertising with Android, and TI's newer, even lower power CC2650 - but those will have to be the subject of their own posts.

Wednesday, 9 October 2013

Changing the "theme" in Enthought Traits UI with Qt as the backend

I'm currently writing code for a customer using Enthought Traits, and in particular Traits UI. Traits as a whole is a very nice toolkit for use in scientific programming, but you can find out about that by googling in the normal manner.

The specific issue, though, is that we've moved to using pyside/Qt4 as the GUI backend, instead of wx(Python) - enabled, for instance, by doing:

    try:
        from traits.etsconfig.api import ETSConfig
        ETSConfig.toolkit = 'qt4'
    except ValueError:
        # Assumed handling: a toolkit was already selected, so carry on with it.
        pass

Some things are done a little bit differently with the Qt backend, though, and it's not always obvious how. In particular, traits UI has always allowed "theming", and the normal way to do this with the traditional wx backend is using something like:

    from traitsui.api import Theme
    Item('move_to_loading_area_btn', item_theme=Theme('@std:BE5')),

(Note that this is not the syntax for the item_theme argument shown in the documentation, which doesn't seem to work, so this is also "a useful reference".)

This doesn't work with the Qt backend. It turns out that the way to do it is to specify a Qt stylesheet, which can be done as a string:

    Item('move_to_loading_area_btn', style_sheet='* { color: red }'),

Useful Qt references are then:
(blogged here because no-one else seems to have described doing this)