There aren’t that many serious spammers any more

I’ve recently been analysing the incoming email traffic data for Demon Internet, a large(ish) UK ISP, for the first four weeks of March 2007. The raw totals show a very interesting picture:

Email & Spam traffic at Demon Internet, March 2007

The top four lines are the amount of incoming email that was detected as “spam” by the Cloudmark technology that Demon now uses. The values lie in a range of 5 to 13 million items per day, with the day of the week being irrelevant, and huge swings from day to day. See how 5 million items on Saturday 18th is followed by 13 million items on Monday 20th!

The bottom four lines are the amount of incoming email that was not detected as spam (and it also excludes incoming items with a “null” sender, which will be bounces, almost certainly all “backscatter” from remote sites “bouncing” spam with forged senders). The values here are between about 2 and 4 million items a day, with a clear pattern being followed from week to week, with lower values at the weekends.

There’s an interesting rise in non-spam email on Tuesday 27th, which corresponds to a new type of “pump and dump” spam (mainly in German) which clearly wasn’t immediately spotted as spam. By the next day, things were back to normal.

The figures and patterns are interesting in themselves, but they also show how summarising the traffic with a single average spam value (it was in fact 73%) hides a much more complex picture.
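As a rough illustration of how much a single average can hide, the daily ranges quoted above can be turned into outer bounds on the daily spam share. This is only a sketch: it assumes the extremes could fall on the same day (which the graph may not bear out), it uses the rounded figures from the text rather than the raw data, and the names `spam_share`, `lowest` and `highest` are made up for the example.

```python
# Daily ranges quoted in the post, in millions of items per day (rounded).
spam_low, spam_high = 5, 13
non_spam_low, non_spam_high = 2, 4  # excludes null-sender items, as in the post


def spam_share(spam, non_spam):
    """Fraction of counted items that were detected as spam."""
    return spam / (spam + non_spam)


# Assumption: a quiet spam day could coincide with a busy non-spam day,
# and vice versa. These are outer bounds, not observed values.
lowest = spam_share(spam_low, non_spam_high)
highest = spam_share(spam_high, non_spam_low)

print(f"daily spam share could range from {lowest:.0%} to {highest:.0%}")
```

Under those assumptions the share on any one day could sit well to either side of the 73% average.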

The picture is also hiding a deeper truth. There’s no “law of large numbers” operating here. That is to say, the incoming spam is not composed of lots of individual spam gangs, each doing their own thing and thereby generating a fairly steady amount of spam from day to day. Instead, it is clear that very significant volumes of spam are being sent by a very small number of gangs, and the totals swing as they switch their destinations around: today it’s .uk, tomorrow it’s aol.com and on Tuesday it will be .de (hmm, perhaps that’s why they hit .demon addresses? a missing $ from their regular expression!).
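The “law of large numbers” argument can be sketched with a toy simulation. Everything in it is an assumption made for illustration, not a model fitted to the Demon data: every gang sends the same daily volume, each gang independently picks one of three destinations per day, and the total of 30 million items a day is invented. The function names are hypothetical too.

```python
import random
import statistics


def daily_totals(num_gangs, total_volume, num_targets, days, rng):
    """Spam arriving at one destination (number 0) on each simulated day.

    Toy assumptions: every gang sends total_volume / num_gangs items a day
    and independently picks one of num_targets destinations each day.
    """
    per_gang = total_volume / num_gangs
    totals = []
    for _ in range(days):
        hits = sum(1 for _ in range(num_gangs) if rng.randrange(num_targets) == 0)
        totals.append(hits * per_gang)
    return totals


def relative_spread(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.mean(values)


rng = random.Random(2007)  # fixed seed so the run is repeatable
few = daily_totals(num_gangs=5, total_volume=30_000_000,
                   num_targets=3, days=500, rng=rng)
many = daily_totals(num_gangs=1000, total_volume=30_000_000,
                    num_targets=3, days=500, rng=rng)

print(f"5 gangs:    relative spread {relative_spread(few):.2f}")
print(f"1000 gangs: relative spread {relative_spread(many):.2f}")
```

In this model the number of gangs hitting one destination is binomial, so the relative spread is sqrt((1 - p) / (n p)) with p = 1/num_targets, and it shrinks as the number of gangs n grows. A handful of gangs gives large day-to-day swings; a thousand small ones gives a nearly flat line.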

If there are only a few large gangs operating (and other people are detecting these huge swings of activity as well) then that’s very significant for public policy. One can have sympathy for police officers and regulators faced with the prospect of dealing with hundreds or thousands of spammers; dealing with them all would take many (rather boring and frustrating) lifetimes. But if there are, say, five big gangs at most, well, that’s suddenly looking like a tractable problem.

Spam is costing us [allegedly] billions (and is a growing problem for the developing world), so there are all sorts of economic and diplomatic reasons for tackling it. So tell your local spam law enforcement officials to have a look at the graph of Demon Internet’s traffic. It tells them that trying to do something about the spammers currently makes a lot of sense, and that by just tracking down a handful of people, they will be capable of making a real difference!

20 thoughts on “There aren’t that many serious spammers any more”

  1. Richard — spot on. We see similar massive swings in our private addresses and at the SpamAssassin spamtraps. I drew *exactly* the same conclusion from this data…

  2. We also see huge variations in the volume of spam to mx.cam.ac.uk, both over the short term (less than an hour) and longer term (months). We quite often see huge pulses of spam that exceed the background volume by a factor of two or more, sometimes with a cyclic behaviour, e.g. on for half an hour, off for an hour…

    I monitor things partly with some fairly arcane high-density graphs at http://canvas.csi.cam.ac.uk/stats/ppsw/

  3. While it’s clear that a small group of spammers is responsible for some huge volume of spam, it’s not clear that *all* or even most spam comes from them. E.g. your numbers are consistent with 5M spam messages a day of “background” spam from thousands of different sources, with the rest of it due to a small gang.

  4. Numbers like these indeed show that a small group of spammers can pump out an enormous number of messages. A small part (say, five) of our ROKSO list has such resources that they can dwarf almost any other spam operation out there. Dents like these can only be made by controlling large numbers of infected machines.

  5. I’m a Demon customer and if my experience is anything to go by, at least 90% of spam is sent with addresses of the form @
    with the random letters arising from the message-ids put on my past postings to Usenet News. Since I changed my mail collection policy to only collect messages sent to the few user-names actually in use at my domain, spam volumes are down from a few hundred per day to maybe 10 per day.

    Of course now anyone emailing us and making a small typo in the user-name won’t get through any more, but that’s only a small drawback. If Demon discarded this stuff it would surely make a huge difference.

  6. It is known that spam is a huge multi-billion dollar problem. As shown here, most of the problem is due to a few gangs. My opinion is that spam is also a critical threat to business (and other) infrastructure.

    I believe that it is not impossible, given enough resources, to find these spammers, since they are in it for the money and financial transactions still leave a lot of traces.

    Therefore, I don’t understand why some governments don’t use the same “covert” methods against these spammers that they use against other organisations that present a threat to critical infrastructures. And since these lowlifes are in it for the money, getting rid of the few big ones might even scare the others.

  7. Why don’t governments go after these spammers? Because their perception is of several hundred small gangs (the ROKSO list grew from an initial 100 to the current 200 or so). Remember that it’s the same amount of police time to track down a spammer who is sending a million items a day as one who is sending a billion.

    That’s why datasets with spikiness matter: they’re showing that this perception is out of date and it’s time to reconsider whether enforcement action targeted at a handful of gangs would make a difference in real terms and not just a PR coup: pour encourager les autres.

  8. Well, in my daily work I track counterfeiters, and I have found that the difficulty is almost never finding the guys. The time- and resource-consuming tasks are building a case and bringing them to justice, very often for a very meagre result in the end (usually a fine that the guys will never pay, since they are often smart enough to have legally no assets).

    I was hinting that in my opinion governments should skip a few steps and use the secret services for physical elimination as they are well-known to do for other threats.

    Of course, some people might find this idea ethically questionable.

  9. “I was hinting that in my opinion governments should skip a few steps and use the secret services for physical elimination as they are well-known to do for other threats.”

    It shouldn’t be necessary to actually eliminate them physically. We can just remove them from the gene pool. Sometimes I feel that castration would be more appropriate.

  10. That’s interesting, and consistent with my experience as the owner of a long-standing Demon account, with an email address which I daresay is known to every spammer in the world by now :-/.

    Almost all of my current and recent spam load (not pre-filtered by Demon for various reasons) is of about five or so different types. Although I haven’t attempted any proper analysis, each type looks to have a rather consistent Modus Operandi. I suspected that each type was coming from a single gang – looks as though I was right!

    One more data point which may be of interest to someone: my particular demon subdomain has a handful of valid email addresses, and one extra which, as far as anyone related to the domain knows, was never a valid address and was never used. Despite this, it started receiving spam in about ’94 or so, and has continued to do so more or less consistently to this day. I have no idea why this should be the case.

  11. Richard, maybe I’ve missed something. How does your graph say anything about how many spammers there are or which countries they are targeting? To draw this kind of conclusion, you would presumably need to identify the separate spam sources and correlate the numbers between multiple countries … which you don’t appear to have done in your study.

    Let me put it another way: I feel your graph clearly shows a correlation with phases of the moon, therefore conclusively proving that all spammers are in fact loonies. 😉

    Kind regards,
    Gary

  12. What you’ve missed is the large day-to-day changes (the moon isn’t bright one day and dark the next). There are two widely different possible reasons for this — one is that there are a few large spammers who target UK ISP customers at Demon one day and US ISP customers at, say, Earthlink the next. The other explanation is that several hundred smaller scale spammers agree that one Saturday they will go for Earthlink, on Monday they will switch to sending to Demon…

    … I feel that the swings are more likely to be caused by a few large-scale spammers, than by some general factor (phase of the moon or whatever) that is causing many small spammers to behave the same way.

    Of course this argument doesn’t count the spammers or identify their targets when they aren’t hitting Demon (maybe they just take days off?) but if you look at your inbox (especially the non-filtered version of it) then you can see a lot of repetition in the spam that arrives and what it advertises… my graph is not the only evidence for my proposition, but it is, in my view, striking corroboration.

    Finally, it’s a mistake to view spammers as foolish or mad — they seem to me to be quite successful and inventive manipulators of the systems they encounter 🙁

  13. Doesn’t this only show that a given ISP is only being spammed by a small number of groups? It does seem to show that there aren’t thousands of high-volume spammers using similar/identical lists of addresses to send to, but couldn’t there be hundreds of groups of high-volume spammers, each of whom focused on a small number of domains and cycled among them? That is, there are high and low volume spammers, and the high volume spammers are (for the most part) getting their lists from different places, so that no more than a few are spamming any given ISP.

  14. The graph might conceivably show that only a small number of groups were targeting Demon, except that other ISPs anecdotally report similar patterns, so it’s a worldwide phenomenon. Also, Demon has quite a lot of customers using their own domains (not just example.demon.co.uk sub-domains of the main ISP) so it’s a bit more complex than the situation at major webmail hubs or more consumer-oriented ISPs.

    In particular there’s no evidence that spammers, in the main anyway, work out where domain MX records point. For if they did, MessageLabs would report very low percentages of spam, as spammers wouldn’t waste their time trying to get material through their filters… but that’s not what they see.

    However, it is indeed possible that there are thousands of spam gangs, each targeting a handful of ISPs. You’d see the same graph at Demon, but you’d see all sorts of other indications that this was the case as well (such as spam filtering systems having to be customised on a per-ISP basis), and I can’t name any such evidence.

  15. Do you really think castration for email spammers would be a deterrent? I mean, with a ready supply of penis enlargement medications, I’m sure they have remedies for castration too.

  16. Huge expenditures to build a legal case is the primary drawback. Finding them is relatively easy. The only way this crap will stop is when the Spamford Wallaces of the world answer a knock on their front doors and get two in the brainpan.

    The “authorities” are too busy sucking on the public tit.

  17. I dunno, shooting them can be messy both from a legal and physical point of view. On the other hand, a very unofficial Narn Bat Squad subjecting a few of these jokers to a close encounter with a blanket and a Louisville Slugger or five…

    After all, we want a few of them to live and tell their tales, for the encouragement of the others…. 🙂

    Oh, and as a calling card? Leave a few dozen FIJA pamphlets, which would make a great poison pen in case Mister Spammer ever decided to track down his assailants…

  18. Richard,

    On the assumption that you are at least partially correct in your view (and I see no real false indicators) then going after the few would be a worthwhile investment in resources.

    However once one Major Spammer has had their head put on a pole and paraded around what is the chance of catching more?

    I am assuming (like you are) that the spammers are not particularly stupid people, so it will be interesting to see if and when the Major few change their tactics to cover their tracks a little more.

  19. Hi

    I got to this page from an old Crypto-Gram I had missed.

    As it happens, the company I work for is a customer of Postini, who at ~2Bn emails a day see quite a bit of traffic. Personally I think that you are incorrect in your assessment. Postini’s Connection Manager heuristics suggest that most spam comes from botnets, which are of course available for hire, and this puts a step between the spammer and the source of their spam.

    Botnets vary in size, and there is now competition between the botnets. So my hypothesis is that there are a relatively small number of large botnets, and it’s the large botnets that send most of the spam. Each botnet has a different spam traffic pattern, and is in essence a wave. So if you get, say, 5 waves at the same time, wave dynamics apply and large variances are inevitable, as are quiet periods.

    Just my 2p worth.

    Cheers

    BP
