The role of software engineering in electronic elections

Many designs for trustworthy electronic elections use cryptography to assure participants that the result is accurate. However, it is a system’s software engineering that ensures a result is declared at all. Both good software engineering and cryptography are thus necessary, but so far cryptography has drawn more attention. In fact, the software engineering aspects could be just as challenging, because election systems have a number of properties which make them almost a pathological case for robust design, implementation, testing and deployment.

Currently deployed systems are lacking in both software robustness and cryptographic assurance — as evidenced by the English electronic election fiasco. Here, in some cases the result was late and in others the electronic count was abandoned due to system failures resulting from poor software engineering. However, even where a result was returned, the black-box nature of auditless electronic elections brought the accuracy of the count into doubt. In the few cases where cryptography was used it was poorly explained and didn’t help verify the result either.

End-to-end cryptographically assured elections have generated considerable research interest and the resulting systems, such as Punchscan and Prêt à Voter, allow voters to verify the result while maintaining their privacy (provided they understand the maths, that is; the rest of us will have to trust the cryptographers). These systems will permit an erroneous result to be detected after the election, whether caused by maliciousness or more mundane software flaws. However, should this occur, or should no result be returned at all, the election may need to fall back on paper backups or even be re-run, which would be a highly disruptive and expensive failure.

Good software engineering is necessary but, in the case of voting systems, may be especially difficult to achieve. In fact, such systems have more in common with the software behind rocket launches than with conventional business productivity software. We should thus expect correspondingly high costs and, despite all this extra effort, accept that the occasional catastrophe is inevitable. The remainder of this post will discuss why I think this is the case, and how manually-counted paper ballots circumvent many of these difficulties.

I think the most significant challenges in electronic elections come from the nature of deployment. The election date is immovable and, ready or not, the software must be deployed then. The 1995 Standish Report found that only 16.2% of IT projects were delivered on-time and on-budget, which is representative of the situation both before and since. Re-use of election software can help, but different regions and countries have different requirements and they change over time. The US has write-in votes, ballot papers in the UK must be retained after the election, linked to the voter’s name, and Scotland introduced STV this year. These customizations need to be implemented and tested in time for the election.

Another factor is that in the long gap between elections, staff with experience of previous elections will move on and know-how will be lost. The resulting unfamiliarity increases the risk of mistakes, and there may be nobody who remembers the previous problems and how they could be worked around or prevented. In the Bedford e-counting trial, a significant source of problems was in the production of ballot papers (wrong size, wrong ink and a tendency to tear). No doubt someone at the contractor got into trouble for that, but when the next election comes in three years, will there be anyone who remembers?

Furthermore, hardware, operating systems and middleware will evolve between elections so the vote counting software will need to be adapted. The cost of this should not be underestimated — one survey reported that adaptation to new platforms accounted for 18% of software maintenance. All these changes, as well as ones due to changing requirements, must be tested, but the cost of performing a full system test, with a realistic number of votes, voters and staff, would be prohibitive. Instead, only unit tests and small integration tests are feasible, which risk missing feature interactions, race conditions and scaling problems. The last two appeared to be behind the delayed Bedford elections.

Another case where full testing is costly, deployments infrequent and failure expensive is rocket-launch control software. Such software is developed using expensive, high-integrity software development methods. These involve robust programming techniques, extensive testing and use of reliable hardware components (which also typically come with extended manufacturer support, to reduce the maintenance costs discussed above). Despite these measures, failures do occur. One well-known example is Ariane 5 Flight 501. The details of the failure are not relevant here, but testing did not catch the problem, and the reasons behind this also apply to voting systems.

Every component in Ariane was tested individually, but the failure occurred because of an interaction between two components and high g-forces which could not be reproduced outside of a test-flight. Even simulating the input from the accelerometer would have been costly, so the decision was made to rely on the test results from Ariane 4, which had a lower acceleration. When exposed to the Ariane 5 flight profile a software component failed, which was non-critical in itself, but the knock-on effect caused the destruction of the rocket. This closely matches the Bedford experience, where the voting system passed a small scale test but, when faced with a high number of manually adjudicated votes (due partially to paper problems), first slowed down then exhibited failures.
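
As a minimal sketch of how a component can pass every test drawn from an old operating profile and still fail on a new one: the Flight 501 inquiry report traced the failing component to an unprotected conversion of a 64-bit floating-point value into a 16-bit signed integer. The function names and the sample readings below are hypothetical and chosen only for illustration; they are not taken from the flight software.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def to_int16(value: float) -> int:
    """Narrow a float to a 16-bit signed integer, failing loudly if it does not fit."""
    n = int(value)  # truncates towards zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a 16-bit signed integer")
    return n

def survives(profile) -> bool:
    """Return True if every reading in the profile can be converted."""
    try:
        for reading in profile:
            to_int16(reading)
        return True
    except OverflowError:
        return False

# Hypothetical readings: the old profile never leaves the 16-bit range,
# the new profile does.
OLD_PROFILE = [1200.0, 15000.5, 30000.0]
NEW_PROFILE = [1200.0, 15000.5, 64000.0]

print("old profile survives:", survives(OLD_PROFILE))
print("new profile survives:", survives(NEW_PROFILE))
```

A test suite built only from the old profile would report success, which is the trap: the code is unchanged, but the environment it is deployed into is not.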

Where operators are under stress from dangerous events occurring, they make errors in judgement around 20–30% of the time. Elections are also stressful, and this increases the probability of mistakes. Moreover, in the case of e-counting, the processing is often done in the night following the election and run without breaks, causing operator fatigue and further increasing the error rate. Operators are also inexperienced, because elections are infrequent. If exceptional events occur they have no experience to draw from, and so are more likely to make the wrong decision. Usability is thus even more critical, yet electronic elections are more complex: in Punchscan the poll staff must follow around 16 steps per ballot, rather than the 3 or 4 for UK paper elections. In one demonstration I saw, even a designer of the system performed two critical actions in the wrong order.
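
To see why the number of steps matters, here is a rough sketch under two loudly stated assumptions: every step has the same chance of being done wrong, and steps fail independently of each other. The 1% per-step rate is a hypothetical figure for illustration only (it is not the 20–30% judgement-error rate quoted above, and it is not a measurement); the step counts are the ones given in the text.

```python
def p_at_least_one_error(p_step: float, n_steps: int) -> float:
    """Chance that at least one of n independent steps goes wrong."""
    return 1.0 - (1.0 - p_step) ** n_steps

P_STEP = 0.01  # hypothetical per-step error rate, assumed for illustration

paper = p_at_least_one_error(P_STEP, 4)       # 3 or 4 steps for UK paper ballots
punchscan = p_at_least_one_error(P_STEP, 16)  # around 16 steps per ballot

print(f"paper: {paper:.1%}, Punchscan: {punchscan:.1%}")
```

Under these assumptions the chance of at least one slip per ballot rises from about 4% to about 15%. The absolute numbers depend entirely on the assumed rate; the point is only that the risk grows with every extra step.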

Finally, all of this assumes only accidents, but elections are subject to attack. Murphy has proved more than capable of disrupting the English election trials, but what happens if someone is malicious? The cryptography will prevent them from altering the result undetected, but if they can hack into the computers, disrupt the communications or destroy critical infrastructure, the entire election could be halted. Backups can help, but as any experienced sysadmin can tell you, good backups are expensive and even then failures do occur.

These factors will result in electronic voting systems being unreliable, and the cryptographic solutions will only make it seem worse because wrong results will be detected. For example, the Breckland e-counting system seemed to be working until a manual re-count discovered the computer had lost 368 ballots. Expensive high-integrity software development practices will reduce, but not eliminate, these problems. One alternative is to remain with paper ballots and manual counting, but these come with problems too. However, I argue that they have advantages when considered from a “software” engineering perspective.

It’s hard to perform non-reversible actions with paper by accident, and it gets harder with scale. By contrast, accidentally deleting or corrupting all files on a network filesystem, rather than just one, could be a matter of an extra space character, whether the result of a slip when entering commands manually or hidden in the depths of an unexercised, unexamined code path. Accidentally damaging a room full of paper is harder than damaging the equivalent number of electronic records.
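
The extra-space hazard can be shown without running anything destructive. The sketch below uses Python’s `shlex.split` to tokenize two command lines the way a POSIX-style shell would; the paths are hypothetical and the commands are only parsed, never executed.

```python
import shlex

# Hypothetical command lines that differ by a single space.
INTENDED = "rm -rf /srv/election/tmp"
SLIPPED = "rm -rf /srv/election/ tmp"

def targets(command_line: str) -> list:
    """Return the non-option arguments, i.e. the paths the command would act on."""
    args = shlex.split(command_line)
    return [arg for arg in args[1:] if not arg.startswith("-")]

print("intended:", targets(INTENDED))
print("slipped: ", targets(SLIPPED))
```

With the stray space, the command is handed two targets: the whole `/srv/election/` directory and an unrelated relative path `tmp`. One keystroke turns a clean-up of a scratch directory into removal of everything above it.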

Paper wins when it comes to the principle of least astonishment: your average poll worker understands how paper behaves, but even experts are regularly caught out by unexpected computer behaviour. This factor, coupled with the fact that humans are adaptable, makes it far easier to change procedures in a manual count than in an electronic one. In response to unexpected circumstances, for example voters filling in the ballot incorrectly, an announcement can be made on how to treat this case. In contrast, making an equivalent change to the software, without the opportunity for even cursory testing, risks introducing new bugs and could harm the integrity of the election.

In summary, cryptographically verifiable electronic elections have advantages: they have the potential to run more complex voting systems, such as Condorcet, speed up counting and give voters better assurances that their vote has been counted. However, the involvement of computers introduces complexity and a consequently higher risk of failure. Spending more on development can mitigate this problem, but paper votes and manual counting side-step many of the risk factors and are transparent and robust, so they are an option that should not be discarded solely in the interest of apparent modernization.

15 thoughts on “The role of software engineering in electronic elections”

  1. I agree absolutely. The community up to now has tended to ignore the issues of robust engineering, recovery procedures etc., I suppose regarding them as somewhat secondary to the challenge of coming up with voter-friendly, verifiable, trustworthy schemes. I think that we are now getting close to achieving the goal of trustworthy schemes (at least in the sense of being able to detect malfunctions, corruption etc.). The time is certainly ripe, indeed overdue, to address the issues of robust engineering.

    And of course, the issues of public (and stakeholder) confidence still pose challenges.

    BTW, Rivest has recently come up with a scheme that strives to provide voter-verifiability without crypto:

    http://en.wikipedia.org/wiki/ThreeBallot

    It’s cool as hell (for example it can provide unconditional privacy), but the interface is non-trivial and a number of vulnerabilities have been identified.

    Peter

  2. Anecdotal connection between voting and high-reliability engineering (if you need another example):

    When designing systems that go to space (and some sensitive terrestrial projects), single event upsets (SEUs) caused by ambient radiation become a significant concern. One method to deal with this is called triple modular redundancy (TMR), where everything is triplicated, and voters are used to determine the correct answer. As an added guarantee for correctness, the hardware is scrubbed often to clear away the SEUs, which are “soft”, along with other modes of failure. For example, the Mars Rover carrier, when it was en-route, survived the strongest solar storm in recorded history using some of these techniques.

    All of this comes at a cost overhead, both in hardware and engineering, but is deemed necessary for a successful mission (there is only one attempt). In high-reliability projects, cost is secondary to robustness; when this occurs for voting system engineering, I suppose we can start expecting better results.

  3. Even if we somehow manage to overcome all the software engineering obstacles that you mentioned, e-voting will remain, nevertheless, a black-box to the majority of the public without offering any important benefit to them. Even if we come up with a really reliable cryptosystem, the non-expert citizen will never stop doubting that there might be some fraud in the results of the e-elections.

    I believe that it will be far more difficult for e-voting to make its way through the public than it was for e-commerce, where unknown merchants from some random place of the world request the customers to provide them with their credit card details in order to send them the goods. Yet, as regards e-commerce, the customers had much to gain: less time to waste on shopping, better prices, and the ability to buy goods that were locally unavailable. As regards e-voting, what are we going to gain that traditional elections don’t provide us? Maybe quicker results. And what is the government going to gain? Cheaper elections? As Steven’s article cleverly shows, this will not happen. So why is everybody so determined to establish e-voting in the US and Europe? Is it “the interest of apparent modernization” or the fact that digital information is always easier to manipulate?

  4. The accurate counting of ballots is not particularly important for the democratic process. What is vital is the trustworthy counting of ballots.

    It can never be acceptable to say “the result of the election is X and you have to believe me because I am an expert”, and this is effectively what is meant by saying “the machine says the result of the election is X and I, an expert, certify it to be correct”.

    When the result of an election is both narrow and the opposite of what everyone knew was going to happen (for instance, in the UK in 1992), no expert IT security consultant is going to placate the angry mob; but inviting their children in to repeat the manual count just might.

  5. Unlike with space rockets, robust engineering and recovery procedures are not essential for voting machines, because they don’t burn when they crash. But error reporting is absolutely vital for both.

    We could get by with voting machines that sometimes fail, provided they can be trusted to report all errors. Then a human can correct the problem, for example by a manual count.

    Or take cars, for example. You don’t need a wonder vehicle which never runs out of petrol; it’s enough that it has a fuel gauge.

  6. @giafly

    Plenty of aerospace projects don’t burn when they crash, for example the control software for sensors on probes. They are still developed using high-integrity software engineering techniques because if they fail you’ve sent an expensive brick across the solar system for nothing.

    A failure of a voting machine is also extremely expensive. It’s not just the staff who have to come out and operate the polling stations, but voters have to waste time voting a second time. If you factor in the economic cost of that, you will end up with a very large number.

    Also remember, electronic voting is supposed to make things better, not worse. I can’t think of a paper election which has had to be re-run due to failures, but if the keys are lost in a cryptographic scheme, you need to re-do it from scratch.

  7. Excellent post, which I agree with completely.

    The scale issue, which you touch upon, is an important one which often resonates with politicians. Logistically it is much harder to change a significant number of paper ballots than electronic ballots. This has to be a good feature of paper elections that we remember and retain.

    Secondly, not only does understanding the crypto require maths, as you point out, but even when open sourced, the e-voting or e-counting systems as a whole remain a black box to most people. Unless they have the skills and the time, and are willing to do the work, most voters will not examine the source code before voting on a system. Still, even if we did view the source, how can we be sure that source is actually running on the system or that a flaw isn’t in the hardware?

    There are too many opportunities to raise doubts over electronic elections to make them suitable for public use.

  8. @Steven,

    Even though paper ballots are a well known quantity, they, like most other systems, have scale issues, which is one of the reasons they will need to be replaced at some point in the not too distant future.

    Another reason is that currently we do not have a democracy in the true sense but a Parliamentary Democracy, that is, you vote for a person, not on issues.

    A lot of people feel that our current Parliamentary Democracy is full of crooks, shysters and other undesirables, which might be one of the reasons voter turnout is about as low as it has ever got in this country.

    If we are to move towards a more Democratic system we would effectively need more occasions on which people vote on issues (i.e. a referendum). This would require not just trustworthy voting machines but a whole new rethink on how we carry a vote out.

    Paper has been good for several hundred years but it needs to be updated to the needs of the modern world. If we do not get experience of implementing voting machines or E-Voting then we will not be able to move forward.

    It took the Germans many many trial flights to get the V2 to launch and travel in a predictable fashion. Likewise I can see many many attempts being made to get electronic voting working.

  9. Interesting post full of good points.

    My only difference is that I am more optimistic regarding crypto voting. SSL is a black-box to the majority of internet shoppers or online bank customers who are content to “trust the cryptographers.”

    I can also say that Punchscan takes reasonable measures to prevent having an election rerun as a result of errors or tampering. It includes audits before the election to ensure the parameters needed to conduct the election are correct, and after the election, a brand new tallying function can be created and used without having to reconduct the entire election.

    That said, I am perfectly content with my country’s system (Canada) of hand-counting the ballots.

  10. @Jeremy

    Regarding Punchscan, it is true that the audit will help catch certain errors, but the accuracy still depends on the software engineering of the application. The output I saw displayed by the auditor was a dialog box, saying that all selected votes had been successfully audited. If a bug in the program caused it to erroneously report that all is well, a problem would be missed.

    Essentially this is a variation of N-version programming. One application generates the ballots and a different one checks them. Provided software errors are independent this should hugely reduce the error rate. The fact that they’re written in different languages will help too (was this intentional?)

    Unfortunately Knight and Leveson showed that software errors are not independent, so it is fairly likely that the same flaw will exist in both the generator and auditor. There might even be code-sharing between the two applications, further increasing the chance of missing problems.

    So auditing helps, but not quite as much as it might intuitively seem. It should be a component of a robust process (and it needs to be there for other reasons too), but is not a panacea and needs to be implemented as part of an integrated development process.

  11. @Steven

    1) The software is open source: if you don’t trust the report, step through the program with a debugger and watch it perform the operations.

    2) The software is open spec: if you don’t like the implementation, write your own. If you can parse XML, and call a Crypto library, then you can do it with the language of your choice.

    At the end of the day you’re saying a program that checks 1+1=2 can’t be trusted. But 1+1 is an open spec just like AES-128 and SHA-256. Plus you can buy any calculator you wish to perform it (or just do it by hand). Checking that something matches its commitment is just like checking an addition.

  12. @Aleks

    Think about the whole design process and the resulting product.

    At one end you have the specification at the other the hardware, in between you have the software.

    As far as I am aware there is no software out there that checks its own specification for correctness (it is in fact probably impossible for it to do so), and there are very very few devices where the software can 100% test the hardware.

    If you have a specification with faults in it, then the software that performs the vote count and the software that audits the count, irrespective of the programming language or the entities writing them, are both likely to process the fault according to the specification (except by chance).

    Likewise, hardware changes without changes to the software; think, for instance, of DOS software written in the 1980s for the 8086 that still runs on the latest Intel CPUs.

    I think you will agree that if the software tested certain aspects of the 8086 operation then there is, due to backwards compatibility, some expectation the modern CPU will respond in the correct way.

    But the software will have no knowledge or ability to test the more modern features of the CPU that might well affect the outcome of the hardware usage.

    The software is just one small part of the overall system, and it is all the individual parts of the system that have to work correctly 100% of the time. Ask yourself how likely this is to happen on the tenth, let alone the first, revision of the system…

  13. @Clive

    First I need to be sure that anyone reading this is aware that we’re talking about independent verification. You, someone not associated with the election authority, can write/use the software tool of your choosing to verify the correctness of a cryptographic identity used to prove the election tally.

    “As far as I am aware there is no software out there that checks its own specification for correctness”

    Agreed. But that’s not what we’re doing.

    Think about an AES-128 implementation. It doesn’t check itself for correctness. You do. And it’s easy… a dozen test vectors and you’re done.

    “If you have a specification with faults in it…”

    We’re not talking about software specs. We’re talking about algorithmic (esp. cryptographic) specs.

    1+1=2… that’s the spec by definition. There’s no room for philosophical argument. It is what it is, by definition. AES-128 has a specification too. And given its cryptographic nature, you can PROVE correctness of implementation with a dozen test vectors.

    So if you write software that gives you 1+1=3, then you know you botched your implementation; it doesn’t matter what platform you use. 8088 or Core2… it doesn’t matter.

    In this E2E (e.g. Punchscan) voting system audit business, that’s the bulk of what you’re doing: checking that something encrypted the way it was supposed to.

    So it’s trivial to get assurance of correctness of implementation, regardless of the platform. You can establish the cryptographic correctness with a dozen test vectors.

    Once you have done so, you can audit the election with the assurance that you will catch any incorrect encryptions or tampering perpetrated by the election authority.

    AND you can do it using any UTM you choose. 8086 or otherwise.

  14. Going back to Steven Murdoch’s post, I find the comparison with rocket launch software to be quite an overstatement. I agree entirely with the critical need for software verification, and with Steven’s account of some of the general software engineering challenges. However, unlike rocket control, e-voting …

    – is not a real time control problem
    (at any rate, it is not a Hard Real Time problem)
    – has very few inputs
    – has very few outputs
    – enjoys very simple reasonableness tests on its input data.

    It beats me why the infamous US voting solutions apparently run to hundreds of thousands of lines of new code.

    I also fear that the benefits of e-voting have been under-stated, which may trivialise the pursuit of a solution. More than merely saving cost to the government and delivering quicker results, e-voting …

    – is more accessible for absentee voting
    – is possibly more inclusive
    – has the massive scalability required for citizen initiated referenda and deliberative polling.
