Posts tagged with: Spam

Targeted political spam

I've complained about spammers before, but this one is new. I recently received a spam that supports the case of Michael Skelly for Congress, saying negative things about incumbent John Culberson. What's interesting: this is my home precinct. These people are actually competing for my vote. This leads to the question: how on earth did the Skelly people manage to map my work email address to my home mailing address? Is there a database out there that they used? Maybe they just spammed everybody at my employer, since this particular Congressional district includes our campus; all of our students, in our dorms, who are registered locally will be voting in this particular race.

Part of me wants to bias my voting decision against the idiot candidate who thought that email marketing was a good way to efficiently reach voters. Sadly, that decision will have to be based on more substantial issues, like which candidate I think will perform better in Congress. Instead, I'm going to direct my fire at VerticalResponse, the service provider who the Skelly campaign used to send me the spam. According to their anti-spam policy,

VerticalResponse has no tolerance for the sending of spam and unsolicited mail, and we prohibit the use of third-party, purchased, rented, or harvested mailing lists. Any customer found using VerticalResponse to send such mail is banned from the use of our service.

VerticalResponse takes several steps to keep abuse to a minimum. Among other things, we:

- Interview new clients about both the origins of their mailing lists and their marketing practices. Clients who do not meet our standards are not allowed to use the VerticalResponse service.

...

- Read most emails before they can go out the door. Email sent through our system goes to a staging area where it is looked over by a member of the VerticalResponse staff. If we have any concerns, the mailing is stopped and we contact the client.

Really? I find that impossible to believe. In what way could any reasonable human have decided that a blob of partisan political attack messaging being delivered to what we can only presume is a non-trivial mailing list is, in any way, anything other than gratuitous spam? For the record, I have never supported either the Democratic or Republican parties financially. I am not a member of either party. The only possible way my email address could have been used is that it was either harvested in bulk, along with other Rice email addresses, or perhaps more charitably, if somebody thought "ahh, that Prof. Wallach seems like he'd be interested in political propaganda from our party and/or candidate." Neither one would appear to be compatible with VerticalResponse's stated anti-spam policies.

I'll also note that, while VerticalResponse provides a one-click way for me to opt out of this particular spam source, they provide no way for me to opt out of any other future source or otherwise specify any sort of policy from my end. There's no way, short of training my spam filter, for me to say "I never want to receive email from VerticalResponse, ever again." Surely, I figured, I can't be the first person to complain about them, yet a Google search on any of the usual terms didn't find anybody else complaining like this.

Instead, I started digging through my historical email. It appears that there have been a handful of VerticalResponse "campaigns" that I considered to be non-spam and have kept. One series of non-spam messages was from a house builder who I thought I might want to use at one point. Another was an update notice for a web service that I use. Historically, I've reported one other spam to them, via their abuse email address. They stated, in response, that they removed me from that particular mailing list and would investigate the infraction. I received no subsequent email about the resolution of that case.

Of course, that's far from everything. When I get these things, I generally just click the "unsubscribe" link, retrain my spam filter, and move on with life. I haven't kept count of how many such spams I've treated this way.

I did a similar search through my old mail for ConstantContact, one of VerticalResponse's competitors. I found not a single email, from them to me, that I had kept, although several were forwarded to mailing lists that I archive, so those I kept. I have no records of having ever contacted their abuse department.

Does this mean that one vendor is more spammy than the other, does it mean that one vendor just has more market share than the other, or does it mean that my spam filter is removing more of this stuff before I have to look at it? It's hard to say without more data.

Okay, big policy question: given that political campaigns and everybody else on the marketing side of the equation deeply loves the idea of targeted email marketing campaigns, how should we accommodate them? Should they be required to provide better proof to firms like VerticalResponse or ConstantContact that their email addresses were harvested in some proper fashion? How on earth could they actually do such a thing? Short of having users opt-in directly at the email distribution service, everything else boils down to the email service taking the marketer at their word, which seems about as likely to be true as those "no documentation required" mortgages.

Maybe the answer is for "ethical" email distributors to pay fees, per message, perhaps as a government tax. Call it "spam postage", and tweak the fee structure so the sender ends up paying more money when the recipient hits the "unsubscribe" or "abuse" button. First off, by adding a real monetary expense to the process, senders might be incentivized to reduce their mailing lists. The penalties incentivize them to cull their lists down to their true supporters. The only problem with a structure like this is that it tends to push email marketers away from "ethical" email distribution services and toward either do-it-yourself solutions or toward shady vendors who don't charge the postage fees. (And, we all know that the real-money postage costs of physical mail do seemingly little to deter all the paper spam that we receive.)
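
To make the "spam postage" idea concrete, here is a minimal sketch of what such a fee schedule might look like. Every number and name in it (the base fee, the two penalties, campaign_postage) is made up for illustration; nothing here reflects any real pricing.

```python
# Hypothetical "spam postage" schedule; all rates are illustrative assumptions.
BASE_FEE = 0.001             # dollars charged for every message sent
UNSUBSCRIBE_PENALTY = 0.05   # extra dollars per recipient who clicks "unsubscribe"
ABUSE_PENALTY = 0.50         # extra dollars per recipient who clicks "abuse"


def campaign_postage(sent: int, unsubscribes: int, abuse_reports: int) -> float:
    """Total postage owed for one mailing under the hypothetical schedule."""
    return (sent * BASE_FEE
            + unsubscribes * UNSUBSCRIBE_PENALTY
            + abuse_reports * ABUSE_PENALTY)
```

Under these invented rates, a 10,000-message mailing that nobody objects to costs about $10, while the same mailing with 200 unsubscribes and 50 abuse reports costs about $45. The penalties, not the base fee, are what push a sender to trim the list.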

For better or for worse, we'll never get rid of email spam. Maybe we can filter out recurring messages from Nigerian dictators or overseas pharmacies, but no training-based spam filter is going to be able to learn every new thing to come down the block when it's still new. The only thing that will ever truly work is if and when people just stop paying attention.

[Sidebar: so how should a political campaign effectively reach people like me to convey their message? I tend to go out and surf their web sites, read their policy papers, and I pay attention to the endorsements of newspapers, bloggers, and others who I trust. For the "down-ballot" races, I tend to spend some quality time with the non-partisan League of Women Voters guide. The LWV asks candidates to respond to a variety of relevant questions, but space constraints limit the answers. An online version could presumably give the candidates space to really explain their positions (and/or firmly demonstrate their lack of clue). At the end of all that, I make a cheat sheet with my favorite candidates and bring it with me to the polls.]

A curious phone scam

My phone at work rings.  The caller ID has a weird number ("50622961841" – yes, it's got an extra digit in it).  I answer.  It's a recording telling me I can get lower rates on my card (what card?) if I just hit one to connect me to a representative.  Umm, okay.  "1".  Recorded voice: "Just a moment."  Human voice: "Hello, card center."

At this point, I was mostly thinking that this was unsolicited spam, not a phishing attack.  Either way, I knew I had a limited time to ask questions before they'd hang up. "Who is this?  What company is this?"  They hung up.  Damn! I should have played along a little further.  I imagine they would have asked for my credit card number.  I could have then made something up to see how far the interaction would go.  Oh well.

Clearly, this was a variant on a credit card phishing attack, except instead of an email from a Nigerian dictator, it was a phone call.  I'm sure the caller ID is total garbage, although that, along with the demon-dialer, says that the scammer has some non-trivial infrastructure in place to make it happen.

So, the next time one of you receives an unsolicited call offering to get you lower rates on your card, please do play along and feed them random numbers when they ask for data.  At the very least, there's some entertainment value.  If you're lucky, you might be able to learn something that would be useful to mount a criminal investigation.  Maybe half-way through you could suddenly have an important meeting to get to and see if you can get them to give you a callback phone number.

Update: reader "anon" points to an article from The Register that discusses this in more detail.

Cheap CAPTCHA Solving Changes the Security Game

ZDNet's "Zero Day" blog has an interesting post on the gray-market economy in solving CAPTCHAs.

CAPTCHAs are those online tests that ask you to type in a sequence of characters from a hard-to-read image. By doing this, you prove that you're a real person and not an automated bot – the assumption being that bots cannot decipher the CAPTCHA images reliably. The goal of CAPTCHAs is to raise the price of access to a resource, by requiring a small quantum of human attention, in the hope that legitimate human users will be willing to expend a little attention but spammers, password guessers, and other unwanted users will not.

It's no surprise, then, that a gray market in CAPTCHA-solving has developed, and that that market uses technology to deliver CAPTCHAs efficiently to low-wage workers who solve many CAPTCHAs per hour. It's no surprise, either, that there is vigorous competition between CAPTCHA-solving firms in India and elsewhere. The going rate, for high-volume buyers, seems to be about $0.002 per CAPTCHA solved.

I would happily pay that rate to have somebody else solve the CAPTCHAs I encounter. I see two or three CAPTCHAs a week, so this would cost me about twenty-five cents a year. I assume most of you, and most people in the developed world, would happily pay that much to never see CAPTCHAs. There's an obvious business opportunity here, to provide a browser plugin that recognizes CAPTCHAs and outsources them to low-wage solvers – if some entrepreneur can overcome transaction costs and any legal issues.
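
The arithmetic behind that estimate is easy to check. In the sketch below, the $0.002 rate is the figure quoted above; the 2.5 CAPTCHAs a week is an assumed midpoint of "two or three", and the function name is just for illustration.

```python
# Rough yearly cost of outsourcing every CAPTCHA at the quoted gray-market rate.
PRICE_PER_CAPTCHA = 0.002   # dollars, the high-volume rate cited above
CAPTCHAS_PER_WEEK = 2.5     # assumption: midpoint of "two or three a week"
WEEKS_PER_YEAR = 52


def yearly_cost(per_week: float, price: float = PRICE_PER_CAPTCHA) -> float:
    """Return the yearly cost, in dollars, of paying a solver for every CAPTCHA."""
    return per_week * WEEKS_PER_YEAR * price
```

At 2.5 CAPTCHAs a week that comes to roughly $0.26 a year, in line with the "about twenty-five cents" figure; even at three a week it stays near thirty cents.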

Of course, the fact that CAPTCHAs can be solved for a small fee, and even that most users are willing to pay that fee, does not make CAPTCHAs useless. They still do raise the cost of spamming and other undesired behavior. The key question is whether imposing a $0.002 fee on certain kinds of accesses deters enough bad behavior. That's an empirical question that is answerable in principle. We might not have the data to answer it in practice, at least not yet.

Another interesting question is whether it's good public policy to try to stop CAPTCHA-solving services. It's not clear whether governments can actually hinder CAPTCHA-solving services enough to raise the price (or risk) of using them. But even assuming that governments can raise the price of CAPTCHA-solving, the price increase will deter some bad behavior but will also prevent some beneficial transactions such as outsourcing by legitimate customers. Whether the bad behavior deterred outweighs the good behavior deterred is another empirical question we probably can't answer yet.

On the first question – the impact of cheap CAPTCHA-solving – we're starting a real-world experiment, like it or not.

30th Anniversary of First Spam Email; No End in Sight

Today marks the 30th anniversary of (what is reputed to be) the first spam email. Here's the body of the email:

DIGITAL WILL BE GIVING A PRODUCT PRESENTATION OF THE NEWEST MEMBERS OF THE DECSYSTEM-20 FAMILY; THE DECSYSTEM-2020, 2020T, 2060, AND 2060T. THE DECSYSTEM-20 FAMILY OF COMPUTERS HAS EVOLVED FROM THE TENEX OPERATING SYSTEM AND THE DECSYSTEM-10 (PDP-10) COMPUTER ARCHITECTURE. BOTH THE DECSYSTEM-2060T AND 2020T OFFER FULL ARPANET SUPPORT UNDER THE TOPS-20 OPERATING SYSTEM. THE DECSYSTEM-2060 IS AN UPWARD EXTENSION OF THE CURRENT DECSYSTEM 2040 AND 2050 FAMILY. THE DECSYSTEM-2020 IS A NEW LOW END MEMBER OF THE DECSYSTEM-20 FAMILY AND FULLY SOFTWARE COMPATIBLE WITH ALL OF THE OTHER DECSYSTEM-20 MODELS.

WE INVITE YOU TO COME SEE THE 2020 AND HEAR ABOUT THE DECSYSTEM-20 FAMILY AT THE TWO PRODUCT PRESENTATIONS WE WILL BE GIVING IN CALIFORNIA THIS MONTH. THE LOCATIONS WILL BE:

TUESDAY, MAY 9, 1978 - 2 PM
HYATT HOUSE (NEAR THE L.A. AIRPORT)
LOS ANGELES, CA

THURSDAY, MAY 11, 1978 - 2 PM
DUNFEY'S ROYAL COACH
SAN MATEO, CA
(4 MILES SOUTH OF S.F. AIRPORT AT BAYSHORE, RT 101 AND RT 92)

A 2020 WILL BE THERE FOR YOU TO VIEW. ALSO TERMINALS ON-LINE TO OTHER DECSYSTEM-20 SYSTEMS THROUGH THE ARPANET. IF YOU ARE UNABLE TO ATTEND, PLEASE FEEL FREE TO CONTACT THE NEAREST DEC OFFICE FOR MORE INFORMATION ABOUT THE EXCITING DECSYSTEM-20 FAMILY.

This is relatively mild by the standards of today's spam. The message announced legitimate events relating to legitimate products in which the recipients might plausibly be interested. The sender was apparently unaware that this kind of message was against the rules.

Yet this message has much in common with today's spam. The message used ALL CAPS, which was more common in those days but not the universal practice for email. The list of recipients was long. The message was incorrectly formatted – the original had more recipients than the email software of the day could handle, so what was supposed to be the recipient list actually spilled over into the body of the email, apparently unnoticed by the sender.

At the time, the Net's rules forbade commercial activity, so the message was against the rules. Beyond the rule violation, the message's propriety was widely questioned, and people debated what to do about it. (Brad Templeton has posted parts of the debate.)

Thirty years later, there is more spam than ever and no end is in sight. This shouldn't be surprising, because the spam problem is fundamentally driven by economics. If anyone can send to anyone, and the cost of sending is nearly zero, many messages will be sent. Distinguishing unwanted email from wanted email is notoriously difficult – often you have to read a message to decide whether reading it was a waste of time. In this environment, spam will be a fact of life. The surprise, if anything, is that we have done as well as we have in coping with it.

spammers gone wild

I'm sure this sort of behavior is old news, but it's still really annoying.  Starting last night and continuing as I'm writing this, some annoying spammer has been forging my email address as the "From" line of a variety of spams.  This is causing a staggering volume of backscatter, mostly of the "Delivery Status Notification (failure)" variety.  Sampling these messages, I'm seeing several interesting things.

  1. The spammer is using my proper email address (dwallach@...) on each message, but a different "real" name on each one.  The name "Dan Wallach" does not appear anywhere.
  2. I forward everything to Gmail.  Gmail considers all of this backscatter to be spam.  That's probably the correct answer, but I'm not sure I want to train my own DSPAM to do the same thing.  (DSPAM runs locally, and then I save a local copy and forward to Gmail.)  If I send a real message and it legitimately bounces, I want to know about it.  If I train DSPAM that all of these delivery status notifications are spam, it will inevitably throw away anything from "mailer-daemon".  I'm unclear on whether that's good or bad.
  3. You could easily build a bounce-message validator.  Every backscatter seems to have the original message ID in it, somewhere.  If the backscatter mentions a message ID that my system actually generated, then the backscatter is allowed.  Otherwise it's dropped.  (This idea appears to be a variation of VERP; I'd make the message ID be a keyed MAC of a sequence number.)
  4. A large number of these spams have a message body consisting entirely of "Take a look at yourself :)"  and linking to "video.exe" on a variety of different web sites.  Gmail helpfully rewrites those links such that they can track that I clicked on it.  This would also seem to give them an opportunity to give me an anti-virus warning, but they don't do any such thing.  ("video.exe" is one of the common names used by the Storm worm.)
  5. Many spams include links that redirect through Google's PageAd server to yet another server.  I clicked on one of them.  It appears that the PageAd redirector worked, but then Firefox's "badware" detector caught the destination as being bad, ultimately taking me to stopbadware.org.  Go Firefox!
  6. Some legit antispam firewall products (including Barracuda) are helpfully telling me my message "was blocked by our Spam Firewall. The email you sent with the following subject has NOT BEEN DELIVERED".  This is clearly broken behavior.  Just drop it and move on!
  7. Several of the backscatter messages are actually validation messages (sender address verification).  This has been largely discredited due to a variety of practical problems, never mind common-case annoyance to normal users.
  8. One of the spammers seems to be quite keen to sell replicas of expensive wristwatches, and those links take you to some kind of seemingly real online store, albeit with a funky DNS name.  Somehow, even if I did want a fake expensive watch, I'm not sure I'd be comfortable typing my credit card number into a web site whose name is a list of random characters and who (clearly) is closely related to the underworld of lecherous spammers.
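
The bounce-message validator from item 3 could be sketched roughly as follows. This is only an illustration of the keyed-MAC idea, not a tested mail filter: the secret key, the domain, the "<sequence.mac@domain>" layout of the message ID, and all of the function names are assumptions made for the example.

```python
import hashlib
import hmac
import re

SECRET_KEY = b"replace-with-a-long-random-secret"  # hypothetical signing key
DOMAIN = "example.edu"                              # hypothetical mail domain

# Finds IDs of the assumed form <seq.mac@example.edu> anywhere in a bounce.
_ID_PATTERN = re.compile(r"<(\d+)\.([0-9a-f]{32})@" + re.escape(DOMAIN) + r">")


def _mac(seq: int) -> str:
    """Keyed MAC (HMAC-SHA256) of the sequence number, cut to 32 hex characters."""
    digest = hmac.new(SECRET_KEY, str(seq).encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:32]


def make_message_id(seq: int) -> str:
    """Build the Message-ID value for outgoing message number `seq`."""
    return f"<{seq}.{_mac(seq)}@{DOMAIN}>"


def bounce_is_genuine(bounce_text: str) -> bool:
    """Accept a bounce only if it quotes a message ID carrying a valid MAC."""
    for seq, mac in _ID_PATTERN.findall(bounce_text):
        # compare_digest avoids leaking how many characters matched.
        if hmac.compare_digest(mac, _mac(int(seq))):
            return True
    return False
```

A spammer who forges the "From" line but does not know the key cannot produce a message ID that passes the check, so the resulting backscatter would be dropped. One caveat: anyone who has seen a genuine message ID could replay it, so a real deployment would probably also want to expire old sequence numbers.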

EDIT: fixed post that had gone out before it was done.

attack of the context-sensitive blog spam?

I love spammers, really I do. Some of you may recall my earlier post here about freezing your credit report. In the past week, I've deleted two comments that were clearly spam and that made it through Freedom to Tinker's Akismet filter. Both had generic, modestly complimentary language and a link to some kind of credit card application processing site. What's interesting about this? One of two things.

  1. Akismet is letting those spams through because their content is "related" to the post.
  2. Or more ominously, the spammer in question is trolling the blogosphere for "relevant" threads and is then inserting "relevant" comment spam.

If it's the former, then one can certainly imagine that Akismet and other such filters will eventually improve to the point where the problem goes away (i.e., even if it's "relevant" to a thread here, if it's posted widely then it must be spam). If it's the latter, then we're in trouble. How is an automated spam catcher going to detect "relevant" spam that's (statistically) on-topic with the discussion where it's posted and is never posted anywhere else?

Debate: Will Spam Get Worse?

This week I participated in Business Week Online's Debate Room feature, where two people write short essays on opposite sides of a proposition.

The proposition: "Regardless of how hard IT experts work to intercept the trillions of junk e-mails that bombard hapless in-boxes, the spammers will find ways to defeat them." I argued against, concluding that "We'll never be totally free of spam, but in the long run it's a nuisance—not a fundamental threat—to the flourishing of the Internet."

Wikipedia Leads; Will Search Engines NoFollow?

Wikipedia has announced that all of its outgoing hyperlinks will now include the rel="nofollow" attribute, which instructs search engines to disregard the links. Search engines infer a page's importance by seeing who links to it – pages that get many links, especially from important sites, are deemed important and are ranked highly in search results. A link is an implied endorsement: "link love". Adding nofollow withholds Wikipedia's link love – and Wikipedia, being a popular site, has lots of link love to give.

Nofollow is intended as an anti-spam measure. Anybody can edit a Wikipedia page, so spammers can and do insert links to their unwanted sites, thereby leeching off the popularity of Wikipedia. Nofollow will reduce spammers' incentives by depriving them of any link love. Or that's the theory, at least. Bloggers tried using nofollow to attack comment spam, but it didn't reduce spam: the spammers were still eager to put their spammy text in front of readers.

Is nofollow a good idea for Wikipedia? It depends on your general attitude toward Wikipedia. The effect of nofollow is to reduce Wikipedia's influence on search engine rankings (to zero). If you think Wikipedia is mostly good, then you want it to have influence and you'll dislike its use of nofollow. If you think Wikipedia is unreliable and random, then you'll be happy to see its influence reduced.

As with regular love, it's selfish to withhold link love. Sometimes Wikipedia links to a site that competes with it for attention. Without Wikipedia's link love, the other site will rank lower, and it could lose traffic to Wikipedia. Whether intended or not, this is one effect of Wikipedia's action.

There are things Wikipedia could do to restore some of its legitimate link love without helping spammers. It could add nofollow only to links that are suspect – links that are new, or were added by a user without a solid track record on the site, or that have not yet survived several rewrites of a page, or some combination of such factors. Even a simple policy of using nofollow for the first two weeks might work well enough. Wikipedia has the data to make these kinds of distinctions, and it's not too much to ask for a site of its importance to do the necessary programming.
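
The "necessary programming" need not be elaborate. Here is a minimal sketch of such a selective policy. It assumes that surviving several rewrites counts in a link's favor, and every threshold, field name, and function name is invented for illustration; none of this is Wikipedia's actual code or policy.

```python
import html
from dataclasses import dataclass

# All thresholds are illustrative assumptions, not Wikipedia policy.
MIN_LINK_AGE_DAYS = 14        # the "first two weeks" idea from the text
MIN_EDITOR_GOOD_EDITS = 50    # hypothetical bar for a "solid track record"
MIN_REVISIONS_SURVIVED = 3    # hypothetical reading of "several rewrites"


@dataclass
class Link:
    url: str
    age_days: int             # days since the link was added
    editor_good_edits: int    # unreverted edits by the user who added it
    revisions_survived: int   # later page edits that left the link in place


def needs_nofollow(link: Link) -> bool:
    """Return True while the link is still suspect under any of the three tests."""
    return (
        link.age_days < MIN_LINK_AGE_DAYS
        or link.editor_good_edits < MIN_EDITOR_GOOD_EDITS
        or link.revisions_survived < MIN_REVISIONS_SURVIVED
    )


def render(link: Link, text: str) -> str:
    """Emit the anchor tag, adding rel="nofollow" only to suspect links."""
    rel = ' rel="nofollow"' if needs_nofollow(link) else ""
    return f'<a href="{html.escape(link.url, quote=True)}"{rel}>{html.escape(text)}</a>'
```

Combining the tests with "or" is the cautious choice: a link earns its link love only after passing all three. A looser site could require just one, or use the two-week rule alone.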

But the one element missing so far in this discussion is the autonomy of the search engines. Wikipedia is asking search engines not to assign link love, but the search engines don't have to obey. Wikipedia is big enough, and quirky enough, that the search engines' ranking algorithms probably have Wikipedia-specific tweaks already. The search engines have surely studied whether Wikipedia's link love is reliable enough – and if it's not, they are surely compensating, perhaps by ignoring (or reducing the weight of) Wikipedia links, or perhaps by a rule such as ignoring links for the first few weeks.

Whether or not Wikipedia uses nofollow, the search engines are free to do whatever they think will optimize their page ranking accuracy. Wikipedia can lead, but the search engines won't necessarily nofollow.

Spam is Back

A quiet trend broke into the open today, when the New York Times ran a story by Brad Stone on the recent increase in email spam. The story claims that the volume of spam has doubled in recent months, which seems about right. Many spam filters have been overloaded, sending system administrators scrambling to buy more filtering capacity.

Six months ago, the conventional wisdom was that we had gotten the upper hand on spammers by using more advanced filters that relied on textual analysis, and by identifying and blocking the sources of spam. One smart venture capitalist I know declared spam to be a solved problem.

But now the spammers have adopted new tactics: sending spam from botnets (armies of compromised desktop computers), sending images rather than text, adding randomly varying noise to the messages to make them harder to analyze, and providing fewer URLs in messages. The effect of these changes is to neutralize the latest greatest antispam tools; and so the spammers are pulling back ahead, for now.

In the long view, not much has changed. The arms race will continue, with each side deploying new tricks in response to the other side's moves, unless one side is forced out by economics, which looks unlikely.

To win, the good guys must make the cost of sending a spam message exceed the expected payoff from that message. A spammer's per-message cost and payoff are both very small, and probably getting smaller. The per-message payoff is probably decreasing as spammers are forced to new payoff strategies (e.g., switching from selling bogus "medical" products to penny-stock manipulation). But their cost to send a message is also dropping as they start to use other people's computers (without paying) and those computers get more and more capable. Right now the cost is dropping faster, so spam is increasing.
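
The break-even condition can be written down in a few lines. The sketch below is only a toy model, and the function names are chosen here for illustration.

```python
def expected_profit_per_message(response_rate: float,
                                revenue_per_response: float,
                                cost_per_message: float) -> float:
    """Expected payoff of one spam message minus the cost of sending it."""
    return response_rate * revenue_per_response - cost_per_message


def break_even_cost(response_rate: float, revenue_per_response: float) -> float:
    """Per-message sending cost above which spamming stops paying for itself."""
    return response_rate * revenue_per_response
```

With made-up numbers, say one response per million messages and $20 earned per response, the break-even cost is $0.00002 per message. A spammer sending through other people's computers for a hundredth of a cent per thousand messages stays comfortably profitable; the good guys win only by pushing the cost above that line or the payoff below it.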

From the good guys' perspective, the cost of spam filtering is increasing. Organizations are buying new spam-filtering services and deploying more computers to run them. The switch to image-based spam will force filters to use image analysis, which chews up a lot more computing power than the current textual analysis. And the increased volume of spam will make things even worse. Just as the good guys are trying to raise the spammers' costs, the spammers' tactics are raising the good guys' costs.

Spam is a growing problem in other communication media too. Blog comment spam is rampant – this blog gets about eight hundred spam comments a day. At the moment our technology is managing them nicely (thanks to Akismet), but that could change. If the blog spammers get as clever as the email spammers, we'll be in big trouble.

Why So Little Attention to Botnets?

Our collective battle against botnets is going badly, according to Ryan Naraine's recent article in eWeek.

What's that? You didn't know we were battling botnets? You're not alone. Though botnets are a major cause of Internet insecurity problems, few netizens know what they are or how they work.

In this context, a "bot" is a malicious software agent that gets installed on an unsuspecting user's computer. Bots get onto computers by exploiting security flaws. Once there, they set up camp and wait unobtrusively for instructions. Bots work in groups, called "botnets", in which many thousands of bots (hundreds of thousands, sometimes) all over the Net work together at the instruction of a remote bad guy.

Botnets can send spam or carry out coordinated security attacks on targets elsewhere on the Net. Attacks launched by botnets are very hard to stop because they come from so many places all at once, and tracking down the sources just leads to innocent users with infected computers. There is an active marketplace in which botnets are sold and leased.

Estimates vary, but a reasonable guess is that between one and five percent of the computers on the net are infected with bots. Some computers have more than one bot, although bots nowadays often try to kill each other.

Bots exploit the classic economic externality of network security. A well-designed bot on your computer tries to stay out of your way, only attacking other people. An infection on your computer causes harm to others but not to you, so you have little incentive to prevent the harm.

Nowadays, bots often fight over territory, killing other bots that have infected the same machine, or beefing up the machine's defenses against new bot infections. For example, Brian Krebs reports that some bots install legitimate antivirus programs to defend their turf.

If bots fight each other, a rationally selfish computer owner might want his computer to be infected by bots that direct their attacks outward. Such bots would help to defend the computer against other bots that might harm the computer owner, e.g. by spying on him. They'd be the online equivalent of the pilot fish that swim into sharks' mouths with impunity, to clean the sharks' teeth.

Botnets live today on millions of ordinary users' computers, leading to nasty attacks. Some experts think we're losing the war against botnets. Yet there isn't much public discussion of the problem among nonexperts. Why not?
