In previous work we have shown how stolen bitcoins can be traced if we simply apply existing law. If bitcoins are “mixed”, that is to say if multiple actors pool together their coins in one transaction to obfuscate which coins belong to whom, then the precedent in Clayton’s Case says that FIFO ordering must be used to track which fragments of coin are tainted. If the first input satoshi (atomic unit of Bitcoin) was stolen then the first output satoshi should be marked stolen, and so on.
This led us to design Taintchain, a system for tracing stolen coins through the Bitcoin network. However, we quickly discovered a problem: while it was now possible to trace coins, it was harder to spot patterns. A decent way of visualizing the data is important to make sense of the patterns of splits and joins that are used to obfuscate bitcoin transactions. We therefore designed a visualization tool that interactively expands the taint graph based on user input. We first came up with a way to represent transactions and their associated taints in a temporal graph. After realizing the sheer number of hops that some satoshis go through and the high outdegree of some transactions, we came up with a way to do graph generation on-the-fly while assuming some restrictions on maximum hop length and outdegree.
Using this tool, we were able to spot many of the common tricks used by bitcoin launderers. A summary of our findings can be found in the short paper here.
I am at the 2018 Open Source Summit Europe in Edinburgh where I’ll be speaking about Hyperledger projects. In follow-ups to this post, I’ll live-blog security related talks and workshops.
The first workshop of the summit I attended was a crash course introduction to EdgeX Foundry by Jim White, the organization’s chief architect. EdgeX Foundry is an open source, vendor neutral software framework for IoT edge computing. One of the primary challenges that it faces is the sheer number of protocols and standards that need to be supported in the IoT space, both on the south side (various sensors and actuators) as well as the north side (cloud providers, on-premise servers). To do this, EdgeX uses a microservices based architecture where all components interact via configurable APIs and developers can choose to customize any component. While this architecture does help to alleviate the scaling issue, it raises interesting questions with respect to API security and device management. How do we ensure the integrity of the control and command modules when those modules themselves are federated across multiple third-party-contributed microservices? The team at EdgeX is aware of these issues and is a major theme of focus for their upcoming releases.
I was at The Fifth International Workshop on Graphical Models for Security (part of FLoC 2018) this weekend where I presented a paper. Following is a summarized account of the talks that took place there. Slides can be found here.
The first speaker was Mike Fisk who was giving an invited talk on Intrusion Tolerance in Complex Cyber Systems. Mike started off by elaborating the differences in the construction of physically secure systems such as forts versus the way software engineers go about creating so-called secure systems. He then made the case for thinking in terms of intrusion tolerance rather than just intrusion resistance – even if an intruder gets in, your system should be designed in such a way that it impedes the intruder’s exploration of your network. He then instantiated this idea by talking about credentials for accessing network resource and how they’re stored. He noted that normal users (with the notable exceptions of sysadmins) show predictable access patterns whereas attackers show wildly different access patterns; an intrusion tolerant system should take these into account and ask for re-authentication in case of abnormal patterns. He then talked about metrics for figuring out which nodes in a network are most interesting to an attacker. While some of these are expected – say, the ActiveDirectory server – others are quite surprising such as regular desktops with very high network centrality. He concluded by giving advise on how to use these metrics to direct resources for intrusion resistance most effectively.
Sabarathinam Chockalingam gave a talk on using Bayesian networks and fishbone diagrams to distinguish between intentional attacks and accidental technical failures in cyber-physical systems. His work focused specifically on water level sensors used in floodgates. He first gave an introduction to fishbone diagrams highlighting their salient features such as the ability to facilitate brainstorming sessions while showcasing all the relevant factors in a problem. He then presented a way to translate fishbone diagrams into Bayesian networks. He utilized this technique to convert the risk factor fishbone diagram for the water level sensors into a Bayesian network and generated some predictions. These predictions were mostly based on expert knowledge and literature review. He concluded by pointing at some possible future research directions primary of which was exploring the conversion of fishbone diagrams into conditional probability tables.
I gave a talk on visualizing the diffusion of stolen bitcoins. This works builds upon our previous work on applying the FIFO algorithm to tainting bitcoins, presented at SPW2018. Here, I focused on the challenges facing effective visualization of the tainting dataset. I highlighted the size of the dataset (>450 GB for just 56 kinds of taint), the unbounded number of inputs and outputs as well as the unbounded number of hops a satoshi can take. All these make visualization without abstraction challenging. We refused to use lossy abstractions since what is interesting to the user might be something that we abstract away. Instead, we made two prototypes that, for the most part, convey the underlying information in an accessible manner to the end-user without using any abstractions. The first provides a static map of the taint-graph, useful for getting a global view of the graph; the second provides an interactive way to explore individual transactions. I concluded by pointing out that this is a much more general problem since what we are trying to do is visualize a large subset of transactions in a massive dataset – something that is encountered in many other domains.
Ross Horne presented a specialization of attack trees where he took into consideration of an attacker about the underlying system that they are trying to compromise. He pointed out that existing attack trees assume perfect knowledge on the part of the attacker whereas this is not realistic. The attacker often acts under uncertainty. To model this, he introduced a new operator to act between branches of an attack tree that conveys ignorance on the effectiveness and possible outcomes in case the attacker chooses to traverse that sub-tree. He then introduced a way of reasoning about the specialization of such trees and showed how the placement of the newly introduced operator has varying impact on the capabilities of the attacker. He concluded by remarking how these new attack trees could be used for moving target defence.
Harley Eades III gave a talk on applying linear logic to attack trees. He started off by pointing out that when understanding the difficulty of execution of an attack, we only care about the weights assigned to the leaves of the tree, the root nodes only serve as combinatorial operators. He then presented an exhaustive list of operators and provided a representation to convert attack trees into linear logic statements. He then introduced Maude, a quarternary semantics of attack trees followed by the introduction of Lina, an embedded domain specific programming language. Lina is used to do automated reasoning about attack trees using Maude. He presented Lina’s functionalities and showed an example application of Lina: automated threat analysis. He concluded by talking about future work conjecturing different formal models of causal attack trees specifically mentioning a petri net model.