On the 1st of January 2010, many German bank customers found that their banking smart cards had stopped working. Details of why are still unclear, but indications are that the cards believed that the date was 2016, rather than 2010, and so refused to process a transaction supposedly after their expiry dates. This problem could turn out to be quite expensive for the cards’ manufacturer, Gemalto: their shares dropped almost 4%, and they have booked a €10 m charge to handle the consequences.
These cards implement the EMV protocol (the same one used for Chip and PIN in the UK). Here, the card is sent the current date in 3-byte YYMMDD binary-coded decimal (BCD) format, i.e. “100101” on 1 January 2010. If, however, the card interprets these bytes as hexadecimal (plain binary) values, it will read the year byte 0x10 as 16 and think the year is 2016 (in hexadecimal, 1 January 2010 would have been “0a0101”). Since the digits 0–9 are encoded identically in BCD and hexadecimal, years up to 2009 decode correctly either way, which explains why this problem only appeared in 2010*.
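The suspected misreading can be reproduced in a few lines. This is only a sketch of the hypothesis above, not the card's actual firmware; the function names are made up for illustration, and adding 2000 to the two-digit year is an assumption.

```python
def bcd_to_int(byte: int) -> int:
    """Decode one BCD byte (two decimal digits) into an integer."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError(f"not a BCD byte: {byte:#04x}")
    return high * 10 + low


def decode_date_bcd(raw: bytes) -> tuple[int, int, int]:
    """Correct reading: each of the three bytes holds two decimal digits."""
    yy, mm, dd = (bcd_to_int(b) for b in raw)
    return 2000 + yy, mm, dd  # assumption: two-digit years are 20YY


def decode_date_as_binary(raw: bytes) -> tuple[int, int, int]:
    """Hypothesised faulty reading: each byte taken as a plain binary number."""
    yy, mm, dd = raw
    return 2000 + yy, mm, dd


new_year_2010 = bytes.fromhex("100101")
new_year_2009 = bytes.fromhex("090101")

print(decode_date_bcd(new_year_2010))        # (2010, 1, 1)
print(decode_date_as_binary(new_year_2010))  # (2016, 1, 1)
print(decode_date_as_binary(new_year_2009))  # (2009, 1, 1)
```

The two decoders agree for every byte whose digits are both in the range 0–9 with a zero high digit, which is why a bug of this kind could stay hidden through 2009.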
In one sense, this looks like a foolish error, and should have been caught in testing. However, before criticizing too harshly, one should remember that EMV is almost impossible to implement perfectly. I have written a fairly complete implementation of the protocol and frequently find edge cases which are insufficiently documented, making dealing with them error-prone. Not only is the specification vague, but it is also long — the first public version in 1996 was 201 pages, and it grew to 765 pages by 2008. Moreover, much of the complexity is unnecessary. In this article I will give just one example of this — the fact that there are nine different ways to encode integers.
Compliant implementations must be able to handle all of these encodings; which one is used in any given place depends on context.
- TLV field lengths (< 128)
- When data is sent from the card to the terminal, they are encoded in tag-length-value format (TLV), inherited from ASN.1. The tag specifies the type, and the length specifies how many bytes follow in the value. When the length is less than 128 bytes, the single length-byte has the most-significant bit cleared, and bits 7–1 encode the length.
- TLV field lengths (≥ 128)
- When a TLV field value is 128 bytes or longer, the length is encoded using multiple bytes. The first byte has the most-significant bit set, and bits 7–1 encode the number of length bytes that follow. These are then interpreted as a big-endian integer.
- TLV tags (< 31)
- EMV tags are also variable length. For tag numbers less than 31, a single byte is used, with bits 5–1 encoding the tag number (bits 8–7 specify the namespace of the tag, and bit 6 specifies whether the value is itself encoded in TLV).
- TLV tags (≥ 31)
- When the tag is 31 or greater, bits 5–1 of the first byte are set to “11111” and the actual tag number is encoded in bits 7–1 of subsequent bytes. The last of these bytes has the most-significant bit cleared, whereas preceding ones have it set.
- Compact TLV
- Alternatively, data can be sent from the card to the terminal in compact TLV format. Here, tag and length are combined into a single byte: bits 8–5 (after adding 0x40) become the tag, and bits 4–1 specify the length. This is used for encoding data in the historical bytes of the answer-to-reset.
- Numeric
- Many data items are encoded as binary-coded decimal, with two digits per byte and left-padded with zeros (e.g. used for dates, amounts of currency, some protocol flags).
- Compressed numeric
- Some data items are encoded in “compressed” binary-coded decimal, with two digits per byte and right-padded with ‘F’s (e.g. used for account number, copy of track two magstrip data).
- Binary
- Other data items are big-endian integer encoded (e.g. used for encoding the transaction counter, public key parameters, and some protocol flags). Fields are fixed-length, but not necessarily on byte-boundaries.
- ASCII
- Some integers are encoded as their corresponding ASCII representation, with one digit per byte (e.g. used for encoding the copy of track one magstrip data).
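To give a feel for how differently these forms have to be decoded, here is a minimal sketch of parsers for several of them. The function names are invented for this example, error handling is omitted, and the compact TLV split assumes the ISO 7816-4 layout of a 4-bit tag number followed by a 4-bit length; treat it as an illustration rather than a compliant implementation.

```python
def parse_tag(data: bytes, pos: int = 0):
    """Return (namespace, constructed, tag_number, next_pos)."""
    first = data[pos]
    pos += 1
    namespace = first >> 6            # bits 8-7: namespace (class) of the tag
    constructed = bool(first & 0x20)  # bit 6: value is itself TLV-encoded
    number = first & 0x1F             # bits 5-1: tag number if below 31
    if number == 0x1F:                # "11111": number follows in later bytes
        number = 0
        while True:
            byte = data[pos]
            pos += 1
            number = (number << 7) | (byte & 0x7F)  # 7 bits per byte
            if not byte & 0x80:       # most-significant bit clear: last byte
                break
    return namespace, constructed, number, pos


def parse_length(data: bytes, pos: int = 0):
    """Return (length, next_pos) for a TLV length field."""
    first = data[pos]
    pos += 1
    if first < 0x80:                  # short form: the byte is the length
        return first, pos
    count = first & 0x7F              # long form: number of length bytes
    length = int.from_bytes(data[pos:pos + count], "big")
    return length, pos + count


def parse_compact_header(byte: int):
    """Return (tag, length) for a compact TLV header byte."""
    return 0x40 + (byte >> 4), byte & 0x0F


def decode_compressed_numeric(raw: bytes) -> str:
    """Return the decimal digits, dropping the right-hand 'F' padding."""
    return raw.hex().upper().rstrip("F")


print(parse_tag(bytes.fromhex("9f02")))       # multi-byte tag
print(parse_length(bytes.fromhex("820100")))  # long-form length
print(parse_compact_header(0x73))
print(decode_compressed_numeric(bytes.fromhex("1234567fff")))
```

Even in this simplified form, four of the encodings need four unrelated code paths, and each has its own way of going wrong (for example, a length field that claims more bytes than the buffer holds).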
Given this wide variety of subtly different encodings (and I have probably missed some), it is not that surprising that a smart card developer could slip up, as appears to have occurred with the German EMV cards. Handling variable-length integers is a frequent source of bugs (ASN.1 is a particularly bad offender in this regard). This also makes me wonder whether there are security vulnerabilities in the cards or terminals.
Update (2012-07-30): Indeed, it looks like there are vulnerabilities in terminals. Nils and Rafael Dominguez Vega from MWR Labs successfully took over Chip & PIN terminals using a specially designed smart card, and made them play a racing game (as well as capture card details and PINs). There aren’t any details yet on what vulnerability they exploited.
* Actually, while this scenario explains the wide-scale problems occurring in 2010, if the month and day were interpreted as hexadecimal too, then cards which expired in September–December would have had problems in their last year of validity. The wrap-around for the day would cause a problem for cards expiring in September, during 20–30 September, because the BCD for 20 September 2009 is “090920”, which is greater than 30 September 2009 in hexadecimal, “09091e”. Similarly, the month would become a problem for cards expiring in October–December, during 1 October to 31 December, because the BCD for 1 October 2009 is “091001”, which is greater than the last day of October/November/December 2009 in hexadecimal (“090a1f”/“090b1e”/“090c1f”). So either these glitches occurred but were attributed to random failure, or there is something special about the handling of the year in the failing Gemalto implementation (perhaps related to a conversion between two and four digit years).
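The comparisons in this footnote can be checked mechanically. The sketch below models a purely hypothetical card that stores its expiry date as plain binary (year, month, day) and misreads every byte of the terminal's BCD date as binary; nothing here is known about the real Gemalto code.

```python
def appears_expired(terminal_date_hex: str, expiry: tuple) -> bool:
    """Compare a misread terminal date against a binary (yy, mm, dd) expiry."""
    misread = tuple(bytes.fromhex(terminal_date_hex))  # BCD bytes read as binary
    return misread > expiry


# Card expiring on 30 September 2009
print(appears_expired("090919", (9, 9, 30)))    # 19 September: False
print(appears_expired("090920", (9, 9, 30)))    # 20 September: True

# Card expiring on 31 December 2009
print(appears_expired("090930", (9, 12, 31)))   # 30 September: False
print(appears_expired("091001", (9, 12, 31)))   # 1 October: True

# Card expiring on 31 December 2012, date sent on 1 January 2010
print(appears_expired("100101", (12, 12, 31)))  # True
```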