Posts tagged with: Copyright

Jailbreaking Copyright's Extended Scope

A bit late for the rule's "triennial" cycle, the Librarian of Congress has released the section 1201(a)(1)(C) exceptions from the DMCA prohibitions on circumventing copyright access controls. For the next three years, people will not be "circumventing" if they "jailbreak" or unlock their smartphones, remix short portions of motion pictures on DVD (if they are college and university professors or media students, documentary filmmakers, or non-commercial video-makers), research the security of videogames, get balky obsolete dongled programs to work, or make an ebook read-aloud. (I wrote about the hearings more than a year ago, when the movie studios demoed camcording a movie -- that didn't work to stop the exemption.)

Since I've criticized the DMCA's copyright expansion, I was particularly interested in the inter-agency debate over EFF's proposed jailbreak exemption. Even given the expanded "para-copyright" of anticircumvention, the Register of Copyrights and NTIA disagreed over how far the copyright holder's monopoly should reach. The Register recommended that jailbreaking be exempted from circumvention liability, while NTIA supported Apple's opposition to the jailbreak exemption.

According to the Register (PDF), Apple's "access control [preventing the running of unapproved applications] does not really appear to be protecting any copyright interest." While Apple might have had business reasons for wanting to close its platform, including taking a 30% cut of application sales and curating the iPhone "ecosystem," those weren't copyright reasons to bar the modification of 50 bytes of code.

NTIA saw it differently. In November 2009, after receiving preliminary recommendations from Register Peters, Asst. Secretary Larry Strickling wrote (PDF):

NTIA does not support this proposed exemption [for cell phone jailbreaking].... Proponents argue that jailbreaking will support open communications platforms and the rights of consumers to take maximum advantage of wireless networks and associated hardware and software. Even if permitting cell phone "jailbreaking" could facilitate innovation, better serve consumers, and encourage the market to utilize open platforms, it might just as likely deter innovation by not allowing the developer to recoup its development costs and to be rewarded for its innovation. NTIA shares proponents' enthusiasm for open platforms, but is concerned that the proper forum for consideration of these public policy questions lies before the expert regulatory agencies, the U.S. Department of Justice and the U.S. Congress.

The debate affects what an end-user buys when purchasing a product with embedded software, and how far copyright law can be leveraged to control that experience and the market. Is it, as Apple would have it, only the right to use the phone in the closed "ecosystem" as dictated by Apple, with exit (minus termination fees) as the only recourse if you don't like it there? Or is it a building block, around which the user can choose a range of complements from Apple and elsewhere? In the first case, we see the happenstance of software copyright locking together a vertically integrated or curated platform, forcing new entrants to build the whole stack in order to compete. In the second, we see opportunities for distributed innovation that starts at a smaller scale: someone can build an application without Apple's approval, improving the user's iPhone without starting from scratch.

NTIA would send these "public policy" questions to Congress or the Department of Justice (antitrust), but the Copyright Office and Librarian of Congress properly handled them here. "[T]he task of this rulemaking is to determine whether the availability and use of access control measures has already diminished or is about to diminish the ability of the public to engage in noninfringing uses of copyrighted works similar or analogous to those that the public had traditionally been able to make prior to the enactment of the DMCA," the Register says. Pre-DMCA, copyright left room for reverse engineering for interoperability, for end-users and complementors to bust stacks and add value. Post-DMCA, this exemption helps to restore the balance toward noninfringing uses.

In a related vein, economists have been framing research into proprietary strategies for two-sided markets, in which a platform provider is mediating between two sets of users -- such as iPhone's end-users and its app developers. In their profit-maximizing interests, proprietors may want to adjust both price and other aspects of their platforms, for example selecting fewer app developers than a competitive market would support so each earns a scarcity surplus it can pay to Apple. But just because proprietors want a constrained environment does not mean that the law should support them, nor that end-users are better off when the platform-provider maximizes profits. Copyright protects individual works against unauthorized copying; it should not be an instrument of platform maintenance -- not even when the platform is or includes a copyrighted work.

Census of Files Available via BitTorrent

BitTorrent is popular because it lets anyone distribute large files at low cost. Which kinds of files are available on BitTorrent? Sauhard Sahi, a Princeton senior, decided to find out. Sauhard's independent work last semester, under my supervision, set out to measure what was available on BitTorrent. This post, summarizing his results, was co-written by Sauhard and me.

Sauhard chose a (uniform) random sample of files available via the trackerless variant of BitTorrent, using the Mainline DHT. The sample comprised 1021 files. He classified the files in the sample by file type, language, and apparent copyright status.

Before describing the results, we need to offer two caveats. First, the results apply only to the Mainline trackerless BitTorrent system that we surveyed. Other parts of the BitTorrent ecosystem might be different. Second, all files that were available were equally likely to appear in the sample -- the sample was not weighted by number of downloads, and it probably contains files that were never downloaded at all. So we can't say anything about the characteristics of BitTorrent downloads, or even of files that are downloaded via BitTorrent, only about files that are available on BitTorrent.

With that out of the way, here's what Sauhard found.

File types

46% movies and shows (non-pornographic)
14% games and software
14% pornography
10% music
1% books and guides
1% images
14% could not classify

Movies/Shows

For the movies and shows category, the predominant file format was AVI, and other formats included RMVB (a proprietary format for RealPlayer), MPEG, raw DVD, and some multi-part RAR archives. Interestingly, this section was heavily biased towards recent movies, instead of being spread out evenly over a number of years. In descending order of frequency, we found that 60% of the randomly selected movies and shows were in English, 8% were in Spanish, 7% were in Russian, 5% were in Polish, 5% were in Japanese, 4% were in Chinese, 4% could not be determined, 3% were in French, 1% were in Italian, and other infrequent languages accounted for 2% of the distribution.

Games/Software

For the games and software category, there was no clearly dominant file type, but common file types for software included ISO disc images, multi-part RAR archives, and EXE (Windows executables). The games were targeted for running on different architectures, such as the Xbox 360, Nintendo Wii, and Windows PCs. In descending order, we found that 74% of games and software in the sample were in English, 12% were in Japanese, 5% were in Spanish, 4% were in Chinese, 2% were in Polish, and 1% each were in Russian and French.

Pornography

For the pornography category, the predominant encoding format was AVI, similar to the movies category. However, there were significantly more MPG and WMV (Windows Media Video) files available. Also, most pornography torrents included the full pornographic video, a sample of the video (a 1-5 minute extract of the video), as well as posters or images of the porn stars in JPEG format. Also, as these videos are not typically dated like movies are, it is difficult to make any remarks regarding the recency bias for pornographic torrents. Our assumption would be that demand for pornography is not as time-sensitive as demand for movies, so it is likely that these pornographic videos constitute a broader spectrum of time than the movies do. In descending order, we found that 53% of pornography in our sample was in English, 16% was in Chinese, 15% was in Japanese, 6% was in Russian, 3% was in German, 2% was in French, 2% was unclassifiable, and Italian, Hindi, and Spanish appeared infrequently (1% each).

Music

For the music category, the predominant encoding format was MP3; there were some albums ripped to WMA (Windows Media Audio, a Microsoft codec), and there were also ISO images and multi-part RAR archives. There is still a bias towards recent albums and songs, but it is not as strongly evident as it is for movies, perhaps because people are more willing to continue seeding music even after it is no longer new, so these torrents are able to stay alive longer in the DHT. In descending order, we found that 78% of music torrents in our sample were in English, 6% were in Russian, 4% were in Spanish, 2% each were in Japanese and Chinese, and other infrequent languages appeared 1% each.

Books/Guides

The books/guides and images categories were fairly minor. We classified 15 torrents under books and guides—13 were in English, 1 was in French, and 1 was in Russian. We classified 3 image torrents—one was a set of national park wallpapers, one was a set of pictures of BMW cars (both of these are English), and one was a Japanese comic strip.

Apparent Copyright Infringement

Our final assessment involved determining whether or not each file seemed likely to be copyright-infringing. We classified a file as likely non-infringing if it appeared to be (1) in the public domain, (2) freely available through legitimate channels, or (3) user-generated content. These were judgment calls on our part, based on the contents of the files, together with some external research.

By this definition, all of the 476 movies or TV shows in the sample were found to be likely infringing. We found seven of the 148 files in the games and software category to be likely non-infringing—including two Linux distributions, free plug-in packs for games, as well as free and beta software. In the pornography category, one of the 145 files claimed to be an amateur video, and we gave it the benefit of the doubt as likely non-infringing. All of the 98 music torrents were likely infringing. Two of the fifteen files in the books/guides category seemed to be likely non-infringing.

Overall, we classified ten of the 1021 files, or approximately 1%, as likely non-infringing. This result should be interpreted with caution, as we may have missed some non-infringing files, and our sample is of files available, not files actually downloaded. Still, the result suggests strongly that copyright infringement is widespread among BitTorrent users.
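
As a quick check of that arithmetic, here is a small Python tally that uses only the per-category counts quoted above. The dictionary layout and variable names are ours, chosen for illustration; the images and unclassified files are not itemized because the post gives no infringement counts for them.

```python
# category -> (files in sample, files judged likely non-infringing), as reported above
counts = {
    "movies/shows": (476, 0),
    "games/software": (148, 7),
    "pornography": (145, 1),
    "music": (98, 0),
    "books/guides": (15, 2),
}
SAMPLE_SIZE = 1021  # whole sample, including images and unclassified files

# Add up the likely non-infringing files across the itemized categories.
non_infringing = sum(ok for _, ok in counts.values())
share = non_infringing / SAMPLE_SIZE

print(non_infringing)      # 10
print(f"{share:.1%}")      # 1.0%
```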

Erroneous DMCA notices and copyright enforcement, part deux

A few weeks ago, I wrote about a deluge of DMCA notices and pre-settlement letters that CoralCDN experienced in late August. This article actually received a bit of press, including MediaPost, ArsTechnica, TechDirt, and, very recently, Slashdot. I'm glad that my own experience was able to shed some light on the more insidious practices that are still going on under the umbrella of copyright enforcement. More transparency is especially important at this time, given the current debate over the Anti-Counterfeiting Trade Agreement.

Given this discussion, I wanted to write a short follow-on to my previous post.

The VPA drops Nexicon

First and foremost, I was contacted by the founder of the Video Protection Alliance not long after this story broke. I was informed that the VPA has not actually developed its own technology to discover users who are actively uploading or downloading copyrighted material, but rather contracts out this role to Nexicon. (You can find a comment from Nexicon's CTO to my previous article here.) As I was told, the VPA was contracted by certain content publishers to help reduce copyright infringement of (largely adult) content. The VPA in turn contracted Nexicon to find IP addresses that are participating in BitTorrent swarms of those specified movies. Using the IP addresses given them by Nexicon, the VPA subsequently would send pre-settlement letters to the network providers of those addresses.

The VPA's founder also assured me that their main goal was to reduce infringement, as opposed to collecting pre-settlement money. (And that users had been let off with only a warning, or, in the cases where infringement might have been due to an open wireless network, informed how to secure their wireless network.) He also expressed surprise that there were false positives in the addresses given to them (beyond said open wireless), especially to the extent that appropriate verification was lacking. Given this new knowledge, he stated that the VPA dropped their use of Nexicon's technology.

BitTorrent and Proxies

Second, I should clarify my claims about BitTorrent's usefulness with an on-path proxy. While it is true that the address registered with the BitTorrent tracker is not usable, peers connecting from behind a proxy can still download content from other addresses learned from the tracker. If their requests to those addresses are optimistically unchoked, they have the opportunity to even engage in incentivized bilateral exchange. Furthermore, the use of DHT- and gossip-based discovery with other peers---the latter is termed PEX, for Peer EXchange, in BitTorrent---allows their real address to be learned by others. Thus, through these more modern discovery means, other peers may initiate connections to them, further increasing the opportunity for tit-for-tat exchanges.

Some readers also pointed out that there is good reason why BitTorrent trackers do not just accept any IP address communicated to them via an HTTP query string, but rather use the end-point IP address of the TCP connection. Namely, any HTTP query parameter can be spoofed, so anybody could add another's IP address to the tracker list. That would make the victim susceptible to receiving DMCA complaints, just as we experienced with CoralCDN. From a more technical perspective, the victim's machine would also start receiving unsolicited TCP connection requests from other BitTorrent peers, an easy DoS amplification attack.

That said, there are some additional checks that BitTorrent trackers could do. For example, if the IP query string or X-Forwarded-For HTTP headers are present, only add the network IP address if it matches the query string or X-Forwarded-For headers. Additionally, some BitTorrent tracker operators have mentioned that they have certain IP addresses whitelisted as trusted proxies; in those cases, the X-Forwarded-For address is used already. Otherwise, I don't see a good reason (plausible deniability aside) for recording an IP address that is known to be likely incorrect.
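
To make the first check concrete, here is a minimal Python sketch. The function name `address_to_record` and its parameters are hypothetical and not taken from any real tracker; it assumes the tracker already has the TCP source address, the optional `ip` query parameter, and the raw X-Forwarded-For header value in hand.

```python
def address_to_record(tcp_source_ip, query_ip=None, forwarded_for=None):
    """Return the IP a tracker should record, or None to skip this peer.

    Hypothetical helper: the TCP endpoint is recorded only when every
    address the client (or a proxy) claims agrees with it.
    """
    claimed = []
    if query_ip:
        claimed.append(query_ip.strip())
    if forwarded_for:
        # X-Forwarded-For may list several hops; the first is the original client.
        claimed.append(forwarded_for.split(",")[0].strip())
    if all(ip == tcp_source_ip for ip in claimed):
        return tcp_source_ip  # also covers the case where nothing was claimed
    return None  # mismatch: the TCP endpoint is probably a proxy, not the peer


print(address_to_record("198.51.100.7"))                              # 198.51.100.7
print(address_to_record("192.0.2.10", forwarded_for="198.51.100.7"))  # None
```

Returning `None` on a mismatch never trusts the spoofable claim; it only declines to record an address that is known to be likely wrong. A tracker with whitelisted proxies, as mentioned above, would need an extra branch that this sketch leaves out.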

Best Practices for Online Technical Copyright Enforcement

Finally, my article pointed out a strategy that I clearly thought was insufficient for copyright enforcement: simply crawling a BitTorrent tracker for a list of registered IP addresses, and issuing an infringement notice to each IP address. I'll add to that two other approaches that I think are insufficient, unethical, or illegal---or all three---yet have been bandied about as possible solutions.

  • Wiretapping: It has been suggested that network providers can perform deep-packet inspection (DPI) on their customers' traffic in order to detect copyrighted content. This approach probably breaks a number of laws (either in the U.S. or elsewhere), creates a dangerous precedent and existing infrastructure for far-flung Internet surveillance, and yet is of dubious benefit given the move to encrypted communication by file-sharing software.
  • Spyware: By surreptitiously installing spyware/malware on end-hosts, one could scan a user's local disk in order to detect the existence of potentially copyrighted material. This practice has even worse legal and ethical implications than network-level wiretapping, and yet politicians such as Senator Orrin Hatch (Utah) have gone as far as declaring that infringers' computers should be destroyed. And it opens users up to the real danger that their computers or information could be misused by others; witness, for example, the security weaknesses of China's Green Dam software.

So, if one starts from the position that copyrights are valid and should be enforceable---some dispute this---what would you like to see as best practices for copyright enforcement?

The approach taken by DRM is to try to build a technical framework that restricts users' ability to share content or to consume it in a proscribed manner. DRM has been largely disliked by end-users, mostly because it creates a poor user experience and interferes with expected rights (under fair-use doctrine). In any case, DRM is beside the point here, as copyright infringement notices are needed precisely after "unprotected" content has already flown the coop.

So I'll start with two practices that I would want all enforcement agencies to adopt when issuing DMCA take-down notices. Let's restrict this consideration to complaints about "whole" content (e.g., entire movies), as opposed to those DMCA challenges over sampled or remixed content, which is a separate legal debate.

  • For any end client suspected of file-sharing, one MUST verify that the client was actually uploading or downloading content, AND that the content corresponded to a valid portion of a copyrighted file. In BitTorrent, this might be that the client sends or receives a complete file block, and that the file block hashes to the correct value specified in the .torrent file.
  • When issuing a DMCA take-down notice, the request MUST be accompanied by logged information that shows (a) the client's IP:port network address engaged in content transfer (e.g., a record of a TCP flow); (b) the actual application request/response that was acted upon (e.g., BitTorrent-level logs); and (c) that the transferred content corresponds to a valid file block (e.g., a BitTorrent hash).
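
As a rough illustration of the hash check in the first requirement, here is a minimal Python sketch. It relies on one fact about the .torrent format: the `pieces` field of the info dictionary is a concatenation of 20-byte SHA-1 digests, one per piece ("piece" is the protocol's name for the hashed unit that this post calls a file block). The function name and arguments are hypothetical.

```python
import hashlib

SHA1_LEN = 20  # bytes per SHA-1 digest in the .torrent "pieces" field


def piece_is_valid(piece_index, piece_data, pieces_field):
    """Check one transferred piece against the digests from a .torrent file.

    pieces_field is the raw "pieces" byte string from the info dictionary.
    """
    start = piece_index * SHA1_LEN
    expected = pieces_field[start:start + SHA1_LEN]
    if len(expected) != SHA1_LEN:
        return False  # index is outside the torrent
    return hashlib.sha1(piece_data).digest() == expected


# Tiny self-made "torrent" with two pieces, for demonstration only.
pieces = [b"first piece of the file", b"second piece of the file"]
pieces_field = b"".join(hashlib.sha1(p).digest() for p in pieces)

print(piece_is_valid(1, pieces[1], pieces_field))       # True
print(piece_is_valid(1, b"unrelated data", pieces_field))  # False
```

A notice backed by this kind of check, plus the logged flow and request, would tie the complaint to data that was actually transferred instead of to a tracker listing.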

So my question to the readers: What would you add to or remove from this list? With what other approaches do you think copyright enforcement should be performed or incentivized?

Tech Policy in the SkyMall Catalog

These days tech policy issues seem to pop up everywhere. During a recent flight delay, I was flipping through the SkyMall Catalog ("Holiday 2009" edition), and found tech policy even there.

There were lots of ads for surveillance and recording devices, some of them clearly useful for illegal purposes: the New Agent Cam HD Color Video Spy Camera (p. 14), the Original Agent Cam Color Video Spy Camera (p. 14), the Video Recording Sunglasses (p. 23), the Wireless Remote Controlled Pan and Tilt Surveillance Camera (p. 23), the Spy Pen (with hidden audio and video recorders, p. 42), the Orbitor Electronic Listening Device, and the GPS Tracking Key (p. 224).

There were also plenty of ads for media-copying technologies, of the sort that various copyright owners might find objectionable: the LP and Cassette to CD Recorder (p. 16), the Slide and Negative to Digital Picture Converter (p. 17), the Digital Photo to DVD Converter (p. 20), the Easy Ipod Media Sharer (p. 27), the One Step DVD/CD Duplicator (p. 31), the Photograph to Digital Picture Converter (p. 40), and the Crosley Encoding Turntable (converts LP records to MP3s, p. 179).

Are these things illegal? Probably not, I guess, but there are surely people out there who would want to make them illegal. And some of them are pretty good ways to get a tech policy debate started.

I'm not about to start reading the SkyMall catalog for fun. But it's interesting to know that it offers more than just Slankets and yeti statues.


Inaccurate Copyright Enforcement: Questionable "best" practices and BitTorrent specification flaws

[Today we welcome my Princeton Computer Science colleague Mike Freedman. Mike's research areas include computer systems, network software, and security. He writes a technical blog about these topics at Princeton S* Network Systems -- required reading for serious systems geeks like me. -- Ed Felten]

In the past few weeks, Ed has been writing about targeted and inaccurate copyright enforcement. While it may be difficult to quantify the actual extent of inaccurate claims, we can at least try to understand whether copyright enforcement companies are making a "good faith" best effort to minimize any false positives. My short answer: not really.

Let's start with a typical abuse letter that gets sent to a network provider (in this case, a university) from a "copyright enforcement" company such as the Video Protection Alliance.

This notice is intended solely for the primary Massachusetts Institute of Technology internet service account holder. Someone using this account has engaged in illegal copying or distribution (downloading or uploading) of ...

Evidence:
Infringement Source: BitTorrent
Infringement Timestamp: 2009-08-28 09:33:20 PST
Infringers IP Address: 128.31.1.13
Infringers Port: 40951
...
The information in this notification is accurate. We have a good faith belief that use of the material in the manner complained of herein is not authorized by the copyright owner, its agent, or by operation of law. We swear under penalty of perjury, that we are authorized to act on behalf of DISCOUNT VIDEO CENTER INC..
...
You and everyone using this computer must immediately and permanently cease and desist the unauthorized copying and/or distribution (including, but not limited to, downloading, uploading, file sharing, file 'swapping' or other similar activities) of the videos and/or other content owned by DISCOUNT VIDEO CENTER INC., including, but is not limited to, the copyrighted material listed above.

DISCOUNT VIDEO CENTER INC. is prepared to pursue every available remedy including damages, recovery of attorney's fees, costs and any and all other claims that may be available to it in a lawsuit filed against you.

While DISCOUNT VIDEO CENTER INC. is entitled to monetary damages, attorneys' fees and court costs from the infringing party under 17 U.S.C. 504, DISCOUNT VIDEO CENTER INC. believes that it may be beneficial to settle this matter without the need of costly and time-consuming litigation. We have been authorized to offer a reasonable settlement to resolve the infringement of the works listed above. To access this settlement offer, please follow the directions below.

Settlement Offer: To access your settlement offer please copy and paste the address below into a browser and follow the instructions:

https://www.videoprotectionalliance.com/?n_id=AB-XXXXXX
Password: XXXXXXX

In other words: we have a record of you (supposedly) uploading and downloading BitTorrent content. That content is copyrighted. We could pursue costly and painful litigation, but if you want us to just go away, you can pay us now.

Now, any type of IP-based identification is not going to be perfect, especially given the wide-spread use of Network Address Translation (NAT) boxes and open WiFi at homes. Especially in dense urban areas, unapproved third parties might use their neighbor's wireless network for Internet access, potentially leading to the wrong homeowner being blamed. And IP-based identification relies on accurate ISP mappings from IP addresses to users, as these mappings change over time (although typically slowly) given dynamic address assignment (i.e., DHCP). But one could rightly claim that such sources of false positives are rare in practice and that an enforcement company is still making a best effort to accurately identify IP addresses engaging in copyright-infringing file sharing.

So what's a reasonable strategy to identify such infringing behavior?

Let me first give a high-level overview of how BitTorrent works. To download a particular file on BitTorrent, a client first needs to discover a set of other peers that have the file. Earlier peer-to-peer systems like Napster, Gnutella, and KaZaA had peers connect to one another somewhat randomly (or, in Napster, through a more centralized directory service). These peers would then broadcast search requests for files, downloading the content directly from those peers that responded as having matching files. In the basic BitTorrent architecture, on the other hand, the global ecosystem is split into distinct groups of users that are all trying to download a particular file. Each such group---known as a swarm---is managed by a centralized server called a tracker. The tracker keeps a list of the swarm's peers; the bit-vector of which file blocks each peer already has is exchanged directly between peers once they connect. When a client joins a swarm by announcing itself to the tracker, it gets a list of other peers, and it subsequently attempts to connect to them and download file blocks. How a client discovers a particular swarm is outside the scope of the system, but there are plenty of BitTorrent search engines that allow clients to perform keyword searches. These searches return .torrent files, which include high-level meta-data about a particular swarm, including the URL(s) at which its tracker(s) can be accessed.

So there are three phases to downloading content from BitTorrent:

  1. Finding a .torrent meta-data file
  2. Registering with the .torrent's tracker and getting a list of peer addresses
  3. Connecting to a peer, swapping the bit-vector of which file blocks each has, and potentially downloading or uploading needed blocks

Unfortunately, the verification that copyright enforcement agencies such as the VPA use stops at #2. That is, if some random BitTorrent tracker lists your IP address as being part of a swarm, then the VPA considers this to be sufficient proof to warrant a DMCA takedown notice (such as the one above), with clear instructions on how to pay a monetary settlement. Now, a very reasonable question is whether such information should indeed constitute proof.

Last year, researchers at the University of Washington published a paper with the subtitle Why My Printer Received a DMCA Takedown Notice. Their conclusions were that:

  • Practically any Internet user can be framed for copyright infringement today.
  • Even without being explicitly framed, innocent users may still receive complaints.

The title came from the fact that they "registered" the IP address of a networked printer with BitTorrent trackers, and they subsequently received 9 DMCA takedown notices claiming that their printer was engaging in illegal file sharing. (They did not, however, receive any pre-settlement offers such as the one above, which suggests a possible escalation of enforcement techniques since then.)

I have had my own repeated experiences with such false claims. This September, for instance, a research system I operate called CoralCDN received approximately 100 pre-settlement letters, including the one above. A little background: CoralCDN is an open, free, self-organizing content distribution network (CDN). CDNs are widely used by commercial high-volume websites to scalably deliver their content, such as Hulu's use of Akamai or CNN's use of Level 3. CoralCDN was designed to help solve the Slashdot effect, which is when portals such as slashdot.org link to underprovisioned third-party sites and cause that site to become quickly overwhelmed by the unexpected surge of resulting traffic. CoralCDN's answer was to provide an open CDN that would cache and serve any URL that was requested from it. To use CoralCDN, one simply appends a suffix to a URL's hostname, i.e., http://www.cnn.com/ becomes http://www.cnn.com.nyud.net/. CoralCDN's been running on PlanetLab---a distributed research testbed of virtualized servers, spread over several hundred universities worldwide---since March 2004. It handles requests from about 2 million users per day.
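
The URL rewriting is simple enough to sketch in a few lines of Python. The helper name `coralize` is made up for illustration. Only the hostname-suffix rule is stated in this post; the handling of explicit ports (folding `:port` into the hostname as `.port`) is an assumption inferred from the Coralized tracker URL quoted later, not from CoralCDN documentation.

```python
from urllib.parse import urlsplit, urlunsplit


def coralize(url, suffix="nyud.net"):
    """Rewrite a URL so that it is fetched through CoralCDN (illustrative only)."""
    parts = urlsplit(url)
    host = parts.hostname
    if parts.port is not None:
        # Assumption: an explicit port becomes an extra hostname label.
        host = f"{host}.{parts.port}"
    return urlunsplit(
        (parts.scheme, f"{host}.{suffix}", parts.path, parts.query, parts.fragment)
    )


print(coralize("http://www.cnn.com/"))  # http://www.cnn.com.nyud.net/
```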

Because CoralCDN provides an open platform, one can access any URL through it via an HTTP GET request (with the exception of a small number of blacklisted domains and those for content larger than 50MB). Thus, requests to BitTorrent trackers can also use CoralCDN, as these are simply HTTP GETs with a client's relevant information encoded in the tracker URL's query string, e.g., http://denis.stalker.h3q.com.6969.nyud.net/announce?info_hash=(hash)&peer_id=(name)&port=52864&uploaded=231374848&downloaded=2227372596&left=0&corrupt=0&key=E0591124&numwant=200&compact=1&no_peer_id=1.

Notice that the HTTP request includes a peer's unique name (a long random string) and a port number, but notably does not include an IP address for that client. It's an optional parameter in the specification that many BitTorrent clients don't include. (In fact, even if the request includes this IP parameter, some trackers ignore it.) Instead, the tracker records the network-level IP address from where the HTTP request originated (the other end of the TCP connection), together with the supplied port, as the peer's network address.
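
To see this for yourself, the announce URL quoted above can be pulled apart with Python's standard library. This is only a sketch: the `(hash)` and `(name)` placeholders are kept exactly as they appear in the post, whereas a real request would carry percent-encoded binary values there.

```python
from urllib.parse import parse_qs, urlsplit

announce = (
    "http://denis.stalker.h3q.com.6969.nyud.net/announce"
    "?info_hash=(hash)&peer_id=(name)&port=52864&uploaded=231374848"
    "&downloaded=2227372596&left=0&corrupt=0&key=E0591124"
    "&numwant=200&compact=1&no_peer_id=1"
)

# parse_qs maps each query parameter name to a list of its values.
params = parse_qs(urlsplit(announce).query)

print(sorted(params))   # every parameter the client supplied
print(params["port"])   # ['52864']
print("ip" in params)   # False: the client never states its own address
```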

When this request is via an HTTP proxy, things go wrong. Here, the BitTorrent client is connecting to an HTTP proxy, which in turn is connecting to the tracker. So this practice results in the tracker recording an unusable address: the combination of the proxy's IP and the client's port. Needless to say, the proxy isn't running BitTorrent, let alone on that particular (often randomized) port. Not only does this design damage the client's BitTorrent experience---other clients won't initiate communication with it, leading to fewer opportunities for "tit-for-tat" data exchanges---but this also damages the entire swarm's performance: Others' requests to this hybrid address will all fail (typically with an RST response to the TCP connection request). I was rather surprised to find this flaw in the BitTorrent specification.
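
A toy Python model shows how the hybrid address arises. Everything here is hypothetical: the function name, the addresses (taken from the reserved documentation ranges), and the reduction of a tracker to a single function. The one rule it encodes is the behavior described above: IP from the TCP endpoint, port from the query string.

```python
def tracker_records(tcp_source_ip, announced_port):
    """Peer address as the tracker stores it: TCP endpoint IP + announced port."""
    return (tcp_source_ip, announced_port)


CLIENT_IP = "198.51.100.7"  # hypothetical BitTorrent client
PROXY_IP = "192.0.2.10"     # hypothetical HTTP proxy in front of the tracker
CLIENT_PORT = 52864         # port the client listens on and announces

# Direct announce: the tracker sees the client's own TCP connection.
direct = tracker_records(CLIENT_IP, CLIENT_PORT)

# Proxied announce: the tracker sees the proxy's TCP connection instead.
via_proxy = tracker_records(PROXY_IP, CLIENT_PORT)

print(direct)     # ('198.51.100.7', 52864)  reachable peer
print(via_proxy)  # ('192.0.2.10', 52864)    nothing listens here
```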

So how is this related to CoralCDN and the VPA? For whatever reason, some publisher started including a Coralized URL for the tracker's location, as shown above (http://denis.stalker.h3q.com.6969.nyud.net/). I could only surmise why this was done: perhaps on the (mistaken) assumption that it would reduce load on the server, or perhaps in the hope of offloading abuse complaints to CoralCDN servers. The latter might have been useful if copyright enforcement agencies were going after the trackers, instead of the participating peers. In fact, we initially thought this was the case when these pre-settlement letters from the VPA started rolling in. More careful analysis, however, exposed the above problem: when the BitTorrent URL was Coralized, peers' requests to the tracker were issued via CoralCDN HTTP proxies. Thus, the tracker built up a list of peer addresses of the form (CoralCDN IP : peer port), where these CoralCDN IPs correspond to PlanetLab servers located at various universities.

Hence, when the VPA began sending out pre-settlement letters claiming infringement, they sent them to network operators at tens of universities, who turned around and forwarded them to PlanetLab's central operations and me.

What is particularly striking about this case, however, is that these reports were demonstrably false! There was no BitTorrent client running at the specified address (in the above letter, 128.31.1.13:40951), for precisely the reasons I discuss. Thus, we can fairly definitively conclude that the VPA never actually tested the peer for actual infringement: not even by trying to connect to the client's address, let alone determining whether the client was actually uploading or downloading any data, and let alone valid data corresponding to the copyrighted file in question.

This raises the question of what should be required for a company to issue a DMCA notification and pre-settlement letters that assert:

Someone using this account has engaged in illegal copying or distribution (downloading or uploading)...The information in this notification is accurate. We have a good faith belief that use of the material in the manner complained of herein is not authorized by the copyright owner.

Of course, the incentives for the VPA to actually ensure that "this notification is accurate" are pretty clear. The cost of a false positive is currently nothing, and perhaps some innocent users will even "buy protection" to make this problem and the threat of costly litigation go away.

DISCOUNT VIDEO CENTER INC. believes that it may be beneficial to settle this matter without the need of costly and time-consuming litigation. We have been authorized to offer a reasonable settlement to resolve the infringement of the works listed above.

It appears that the VPA and other such agencies have been rather effective at getting some settlement money. Our personal experience with DMCA takedown notices is that network operators are suitably afraid of litigation. Many will pull network access from machines as soon as a complaint is received, without any further verification or demonstrative network logs. In fact, many operators also sought "proof" that we weren't running BitTorrent or engaging in file sharing before they were willing to restore access. We'll leave the discussion about how we might prove such a negative to another day, but one can point to the chilling effect that such notices have had, when users are immediately considered guilty and must prove their innocence.

I am not arguing that copyright owners should not be able to take reasonable steps to protect their copyrighted material. I am arguing, however, that they should take similarly reasonable steps to ensure that any claimed infringement actually took place. When DMCA notices are accompanied by oaths under "penalty of perjury" and these claims are accepted as writ, as they have de facto become, there should be some downside for agencies that demonstrably do not act in "good faith" to verify infringement. Even a simple TCP connection attempt would have been enough to dispel their flawed assumptions. That currently seems to be too much to ask.
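For a sense of how little such a check involves, here is a minimal sketch using Python's standard `socket` module. The function name `accepts_tcp_connection` is my own, and the sketch only tests whether anything is listening at an address. It says nothing about whether the listener speaks BitTorrent, let alone what it is sharing.

```python
import socket


def accepts_tcp_connection(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections (an RST reply), timeouts,
        # and unreachable hosts.
        return False
```

A call such as `accepts_tcp_connection("128.31.1.13", 40951)` is the kind of probe meant here. By the account above, nothing was listening at that address, so a probe like this would be expected to fail.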

Update (Dec 15): A follow-up post can be found here.

Targeted Copyright Enforcement vs. Inaccurate Enforcement

Let's continue our discussion about copyright enforcement against online infringers. I wrote last time about how targeted enforcement can deter many possible violators even if the enforcer can only punish a few violators. Clever targeting of enforcement can destroy the safety-in-numbers effect that might otherwise shelter a crowd of would-be violators.

In the online copyright context, the implication is that large copyright owners might be able to use lawsuit threats to deter a huge population of would-be infringers, even if they can only manage to sue a few infringers at a time. In my previous post, I floated some ideas for how they might do this.

Today I want to talk about the implications of this. Let's assume, for the sake of argument, that copyright owners have better deterrence strategies available -- strategies that can deter more users, more effectively, than they have managed so far. What would this imply for copyright policy?

The main implication, I think, is to cast doubt on the big copyright owners' current arguments in favor of broader, less accurate enforcement. These proposed enforcement strategies go by various names, such as "three strikes" and "graduated response". What defines them is that they reduce the cost of each enforcement action, while at the same time reducing the assurance that the party being punished is actually guilty.

Typically the main source of cost reduction is the elimination of due process for the accused. For example, "three strikes" policies typically cut off someone's Internet connection if they are accused of infringement three times -- the theory being that making three accusations is much cheaper than proving one.

There's a hidden assumption underlying the case for cheap, inaccurate enforcement: that the only way to deter infringement is to launch a huge number of enforcement actions, so that most of the would-be violators will expect to face enforcement. The main point of my previous post is that this assumption is not necessarily true -- that it's possible, at least in principle, to deter many people with a moderate number of enforcement actions.

Indeed, one of the benefits of an accurate enforcement strategy -- a strategy that enforces only against actual violators -- is that the better it works, the cheaper it gets. If there are few violators, then few enforcement actions will be needed. A high-compliance, low-enforcement equilibrium is the best outcome for everybody.

Cheap, inaccurate enforcement can't reach this happy state.

Let's say there are 100 million users, and you're using an enforcement strategy that punishes 50% of violators, and 1% of non-violators. If half of the people are violators, you'll punish 25 million violators, and you'll punish 500,000 non-violators. That might seem acceptable to you, if the punishments are small. (If you're disconnecting 500,000 people from modern communications technology, that would be a different story.)

But now suppose that user behavior shifts, so that only 1% of users are violating. Then you'll be punishing 500,000 violators (50% of the 1,000,000 violators) along with 990,000 non-violators (1% of the 99,000,000 non-violators). Most of the people you'll be punishing are innocent, which is clearly unacceptable.
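The arithmetic in these two scenarios can be checked with a short Python sketch. The function name `punished` and its parameter names are mine; the 50% and 1% rates and the population of 100 million are the figures used above.

```python
def punished(users, violator_share, hit_rate=0.50, false_positive_rate=0.01):
    """Return (punished violators, punished non-violators)."""
    violators = round(users * violator_share)
    non_violators = users - violators
    guilty = round(violators * hit_rate)
    innocent = round(non_violators * false_positive_rate)
    return guilty, innocent


for share in (0.50, 0.01):
    guilty, innocent = punished(100_000_000, share)
    innocent_fraction = innocent / (guilty + innocent)
    print(f"{share:.0%} violating: {guilty:,} guilty, {innocent:,} innocent "
          f"({innocent_fraction:.0%} of those punished are innocent)")

# 50% violating: 25,000,000 guilty, 500,000 innocent (2% of those punished are innocent)
# 1% violating: 500,000 guilty, 990,000 innocent (66% of those punished are innocent)
```

The error rates never change between the two runs. Only the behavior of the population does, and that alone moves the innocent from a small minority of those punished to roughly two thirds.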

Any cheap, inaccurate enforcement scheme will face this dilemma: it can be accurate, or it can be fair, but it can't be both. The better it works, the more unfair it gets. It can never reach the high-compliance, low-enforcement equilibrium that should be the goal of every enforcement strategy.


Targeted Copyright Enforcement: Deterring Many Users with a Few Lawsuits

One reason the record industry's strategy of suing online infringers ran into trouble is that there are too many infringers to sue. If the industry can only sue a tiny fraction of infringers, then any individual infringer will know that he is very unlikely to be sued, and deterrence will fail.

Or so it might seem -- until you read The Dynamics of Deterrence, a recent paper by Mark Kleiman and Beau Kilmer that explains how to deter a great many violators despite limited enforcement capacity.

Consider the following hypothetical. There are 26 players, whom we'll name A through Z. Each player can choose whether or not to "cheat". Every player who cheats gets a dollar. There's also an enforcer. The enforcer knows exactly who cheated, and can punish one (and only one) cheater by taking $10 from him. We'll assume that players have no moral qualms about cheating -- they'll do whatever maximizes their expected profit.

This situation has two stable outcomes, one in which nobody cheats, and the other in which everybody cheats. The everybody-cheats outcome is stable because each player figures that he has only a 1/26 chance of facing enforcement, and a 1/26 chance of losing $10 is not enough to scare him away from the $1 he can get by cheating.

It might seem that deterrence doesn't work because the cheaters have safety in numbers. It might seem that deterrence can only succeed by raising the penalty to more than $26. But here comes Kleiman and Kilmer's clever trick.
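The numbers behind the everybody-cheats outcome can be checked in a few lines of Python. The function name `expected_gain_random` is mine, and the sketch assumes that when everyone cheats, the enforcer picks his one target uniformly at random.

```python
from fractions import Fraction


def expected_gain_random(players: int, gain: int, penalty: int) -> Fraction:
    """Expected profit per cheater when all players cheat and exactly one,
    chosen uniformly at random, is fined `penalty`."""
    return Fraction(gain) - Fraction(penalty, players)


print(expected_gain_random(26, 1, 10))  # 8/13: cheating pays
print(expected_gain_random(26, 1, 26))  # 0: break-even
print(expected_gain_random(26, 1, 27))  # -1/26: cheating no longer pays
```

With a $10 fine, each cheater expects to come out about 62 cents ahead, so only a fine above $26 would deter under random enforcement.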

The enforcer gets everyone together and says, "Listen up, A through Z. From now on, I'm going to punish the cheater who comes first in the alphabet." Now A will stop cheating, because he knows he'll face certain punishment if he cheats. B, knowing that A won't cheat, will then realize that if he cheats, he'll face certain punishment, so B will stop cheating. Now C, knowing that A and B won't cheat, will reason that he had better stop cheating too. And so on ... with the result that nobody will cheat.

Notice that the trick still works even if punishment is not certain. Suppose each cheater has an 80% chance of avoiding detection. Now A is still deterred, because even a 20% chance of being fined $10 outweighs the $1 benefit of cheating. And if A is deterred, then B is deterred for the same reason, and so on.
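The chain of reasoning can be sketched as a short Python loop. The function name `equilibrium_cheaters` is mine, and the model adds two assumptions that the example leaves implicit: detections are independent, and the enforcer fines the first detected cheater in the announced order.

```python
import string


def equilibrium_cheaters(players, gain, penalty, detect):
    """Walk the announced order and return the players who choose to cheat.

    Assumptions (mine): each cheater is detected independently with
    probability `detect`, and the first detected cheater in the order
    is the one fined.
    """
    cheaters = []
    for player in players:
        # A player is fined only if he is detected and every cheater
        # ahead of him in the order escaped detection.
        p_fined = detect * (1 - detect) ** len(cheaters)
        if gain - p_fined * penalty > 0:
            cheaters.append(player)
    return cheaters


order = string.ascii_uppercase  # players A through Z

print(len(equilibrium_cheaters(order, 1, 10, detect=1.0)))   # 0
print(len(equilibrium_cheaters(order, 1, 10, detect=0.2)))   # 0
print(len(equilibrium_cheaters(order, 1, 10, detect=0.05)))  # 26
```

The last line shows the limit of the trick under these assumptions. Once the chance of detection falls so low that even the first player in line expects to profit (here, 5% of $10 is less than $1), the chain never starts and everybody cheats.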

Notice also that this trick might work even if some of the players don't think things through. Suppose A through J are all smart enough not to cheat, but K is clueless and cheats anyway. K will get punished. If he cheats again, he'll get punished again. K will learn quickly, by experience, that cheating doesn't pay. And once K learns not to cheat, the next clueless player will be exposed and will start learning not to cheat. Eventually, all of the clueless players will learn not to cheat.

Finally, notice that there's nothing special about using alphabetical order. The enforcer could use reverse alphabetical or any other order, and the same logic would apply. Any ordering will do, as long as each player knows where he is in the order.

Now let's apply this trick to copyright deterrence. Suppose the RIAA announces that from now on they're going to sue the violators who have the lowest U.S. IP addresses. Now users with low IP addresses will have a strong incentive to avoid infringing, which will give users with slightly higher IP addresses a stronger incentive to avoid infringing, and so on.

You might object that infringers aren't certain to get caught, or that infringers might be clueless or irrational, or that IP address order is arbitrary. But I explained above why these objections aren't necessarily showstoppers. Players might still be deterred even if detection is a probability rather than a certainty; clueless players might still learn by experience; and an arbitrary ordering can work perfectly well.

Alternatively, the industry could use time as an ordering, by announcing, for example, that starting at 8:00 PM Eastern time tomorrow evening, they will sue the first 1000 U.S. users they see infringing. This would make infringing at 8:00 PM much riskier than normal, which might keep some would-be infringers offline at that hour, which in turn would make infringing at 8:00 PM even riskier, and so on. The resulting media coverage ("I infringed at 8:02 and now I'm facing a lawsuit") could make the tactic even more effective next time.

(While IP address or time ordering might work, many other orderings are infeasible. For example, they can't use alphabetical ordering on the infringers' names, because they don't learn names until later in the process. The ideal ordering is one that can be applied very early in the investigative process, so that only cases at the beginning of the ordering need to be investigated. IP address and time ordering work well in this respect, as they are evident right away and are evident to would-be infringers.)

I'm not claiming that this trick will definitely work. Indeed, it would be silly to claim that it could drive online infringement to zero. But there's a chance that it would deter more infringers, for longer, than the usual approach of seemingly random lawsuits has managed to do.

This approach has some interesting implications for copyright policy, as well. I'll discuss those next time.


Chilling and Warming Effects

For several years, the Chilling Effects Clearinghouse has been cataloging the effects of legal threats on online expression and helping people to understand their rights. Amid all the chilling we continue to see, it's welcome to see rays of sunshine when bloggers stand up to threats, helping to stop the cycle of threat-and-takedown.

The BoingBoing team did this the other day when they got a legal threat from Ralph Lauren's lawyers over an advertisement they mocked on the BoingBoing blog for featuring a stick-thin model. The lawyers claimed copyright infringement, saying "PRL owns all right, title, and interest in the original images that appear in the Advertisements." Other hosts pull content "expeditiously" when they receive these notices (as Google did when notified of the post on Photoshop Disasters), and most bloggers and posters don't counter-notify, even though Chilling Effects offers a handy counter-notification form.

Not BoingBoing: they posted the letter (and the image again) along with copious mockery, including an offer to feed the obviously starved models, and other sources picked up on the fun. The image has now been seen by many more people than would have discovered it in BoingBoing's archives, in a pattern the press has nicknamed the "Streisand Effect."

We use the term "chilling effects" to describe indirect legal restraints, or self-censorship, because most cease-and-desist letters don't go through the courts. The lawyers (and non-lawyers) sending them rely on the in terrorem effects of threatened legal action, and often succeed in silencing speech for the cost of an e-postage stamp.

Actions like BoingBoing's use the court of public opinion to counter this squelching. They fight legalese with public outrage (in support of legal analysis), and at the same time, help other readers to understand they have similar rights. Further, they increase the "cost" of sending cease-and-desists, as they make potential claimants consider the publicity risk of being made to look foolish, bullying, or worse.

For those curious about the underlying legalities here, the Copyright Act makes clear that fair use, including for the purposes of commentary, criticism, and news reporting, is not an infringement of copyright. See Chilling Effects' fair use FAQ. Yet the DMCA notice-and-takedown procedure encourages ISPs to respond to complaints with takedown, not investigation and legal balancing. Providers like BoingBoing's Priority Colo should also get credit for their willingness to back their users' responses.

As a result of the attention, Ralph Lauren apologized for the image: "After further investigation, we have learned that we are responsible for the poor imaging and retouching that resulted in a very distorted image of a woman's body. We have addressed the problem and going forward will take every precaution to ensure that the caliber of our artwork represents our brand appropriately."

May the warming (and proper attention to the health of fashion models) continue!

[cross-posted at Chilling Effects]

Android Open Source Model Has a Short Circuit

Last year, Google entered the mobile phone market with a Linux-based mobile operating system. The company brought together device manufacturers and carriers in the Open Handset Alliance, explaining that, "Together we have developed Android™, the first complete, open, and free mobile platform." There has been considerable engagement from the open source developer community, as well as significant uptake from consumers. Android may have even been instrumental in motivating competing open platforms like LiMo. In addition to the underlying open source operating system, Google chose to package essential (but proprietary) applications with Android-based handsets. These applications include most of the things that make the handsets useful (including basic functions to sync with the data network). This two-tier system of rights has created a minor controversy.

A group of smart open source developers created a modified version of the Android+Apps package, called Cyanogen. It incorporated many useful and performance-enhancing updates to the Android OS, and included unchanged versions of the proprietary Apps. If Cyanogen hadn't included the Apps, the package would have been essentially useless, given that Google doesn't appear to provide a means to install the Apps on a device that has only a basic OS. As Cyanogen gained popularity, Google decided that it could no longer watch the project distribute their copyright-protected works. The lawyers at Google decided that they needed to send a Cease & Desist letter to the Cyanogen developer, which caused him to take the files off of his site and spurred backlash from the developer community.

Android represents a careful balance on the part of Google, in which the company seeks to foster open platforms but maintain control over its proprietary (but free) services. Google has stated as much, in response to the current debate. Android is an exciting alternative to the largely closed-source model that has dominated the mobile market to date. Google closely integrated their Apps with the operating system in a way that makes for a tremendously useful platform, but in doing so hampered the ability of third-party developers to fully contribute to the system. Perhaps the problem is simply that they did not choose the right location to draw the line between open vs. closed source -- or free-to-distribute vs. not.

The latter distinction might offer a way out of the conundrum. Google could certainly grant blanket rights to third-parties to redistribute unchanged versions of their Apps. This might compromise their ability to make certain business arrangements with carriers or handset providers in which they package the software for a fee. That may or may not be worth it from their business perspective, but they could have trouble making the claim that Android is a "complete, open, and free mobile platform" if they don't find a way to make it work for developers.

This all takes place in the context of a larger debate over the extent to which mobile platforms should be open -- voluntarily or via regulatory mandate. Google and Apple have been arguing via letters to the FCC about whether or not Apple should allow the Google Voice application in the iPhone App Store. However, it is yet to be determined whether the Commission has the jurisdiction and political will to do anything about the issue. There is a fascinating sideshow in that particular dispute, in which AT&T has made the very novel claim that Google Voice violates network neutrality (well, either that or common carriage -- they'll take whichever argument they can win). Google has replied. This is a topic for another day, but suffice to say the clear regulatory distinctions between telephone networks, broadband, and devices have become muddied.

(Cross-posted to Managing Miracles)

A Freedom-of-Speech-based Approach To Limiting Filesharing - Part III: Smoke, smoke!

Over the past two days we have seen that filesharing is vulnerable to spamming, and that as a defense, the filesharers have used the IP block list to exclude the spammers from sharing files. Today I discuss how I think lawyers and laypeople should look at the legal issues. Since I am most decidedly not a lawyer, nothing I say here should be considered definitive. Hopefully, it is at least interesting.

An analogy:

Washington Square, in New York City, was for many years a place where drugs were sold. A fellow would stand around quietly saying to passersby "Smoke, smoke!" However, this so-called "steerer" held no drugs. His role was simply to direct the buyer to the "pitcher", who had the drugs somewhere nearby, and who kept silent.

Even the strongest defender of free-speech rights understands that the "steerer's" words are not just speech. His words are not similar to those of this article, though both simply say that someone in the park is selling. He is as legally responsible for the sale as the "pitcher", because they are, according to legal terminology, "acting in concert". He is a drug dealer who may never touch any drugs. Note also that the "steerer" receives payments from the illegal transactions - though it is not in fact legally necessary to be able to prove the payments to establish that he's "acting in concert". All that's required is that the "steerer" and the "pitcher" share "community of purpose" in facilitating the illegal transaction.

In the Napster case, the court held that Napster, even though it did not have any copyrighted data on its servers, was liable for contributory infringement. To use Napster, a downloader would log in to Napster's central server, which connected the user to another user who had the file being searched for. Since it was Napster's role to hook up the parties illegally exchanging files, it is reasonable to see this as analogous to the "steerer" in Washington Square - Napster didn't have the infringing materials, but that really isn't a defense.

The gnutella network is decentralized to solve the legal problem presented by the Napster decision. Nonetheless, there is something still centralized in gnutella: the IP block list. Users of LimeWire get their block list from LimeWire and only from LimeWire. Accordingly, if Napster was like the "steerer" in Washington Square, LimeWire furthers the "community of purpose" in a different way; it is someone who gives negative information rather than affirmative. He's someone paid to stand in the park pointing out who are cheaters selling bad drugs, allowing the purchasers to find the good stuff.

What is a legitimate P2P spam filtering authority versus one that shares "community of purpose" with infringers? The former could legitimately act to keep the network from being flooded by those selling weight loss drugs, without facilitating infringing. There is probably no bright-line rule, but it is reasonably clear that LimeWire is well on the wrong side of any possible grey area.

It's useful to compare gnutella spam cop LimeWire with e-mail spam cop AOL.

LimeWire does not clearly advertise its spam cop role as a feature of its software, and does not discuss its block list. (The LimeWire web site has only the cryptic description "We're always working to protect you from viruses and unwanted sharing.") There is no discussion anywhere about what sorts of sites and files it is blocking and for what reason. No notification is given by LimeWire to a site when it is blocked, nor is there any way given to contact LimeWire to remove yourself from the block list.

In comparison, blocking e-mail spam is, for AOL, a major selling point. AOL does not block bulk e-mailers (many of which are legitimate) on a whim. Every e-mail rejected by AOL is bounced with a notification to the sender, and there are detailed instructions to bulk e-mailers as to what they need to do to avoid running afoul of AOL's filters. There is a way to contact AOL to remove oneself from the block list, if one is legitimate. The whole process is transparent.

It is clear that a legitimate spam cop cannot block spoofers, since any search for a non-infringing file would be unmolested by spoofs, yet it appears that LimeWire does block MediaDefender. In fact, LimeWire appears to be quietly promising to do so, when it says that it protects against "unwanted sharing", whatever that is.

Lastly, it appears that LimeWire's statements in court conceal what it is doing.

As we mentioned in the first post, there is an ongoing case, Arista v Lime Group. In its motion for Summary Judgement, LimeWire states

Likewise, LW does not have the ability to control the manner in which users employ the LimeWire software. Unlike the Napster defendants, LW does not maintain central servers containing files or indices of files. ... LW's system is like that analysed by the Ninth Circuit in Grokster, "truly decentralized". ... LW no more controls the actions of its customers than do any of the thousands of companies that provide hardware or other software used in connection with the internet.

This omits any discussion of LimeWire's centralized block list. LW assuredly does control the manner in which LimeWire users employ the LimeWire software, because if a site is added to the IP block list, it is no longer visible to most LimeWire users. This is very far from the normal situation applying in other software used in connection with the internet.

Moreover, the plaintiffs' attorneys appear to be unaware of the blocking of spoofs, as their reply motion makes no mention of it (nor the other hidden features of LimeWire software discussed yesterday).

While it might be possible to run a legitimate spam-blocking service for P2P networks, it would look rather different from what LimeWire is doing.

Conclusion

The best way to regulate filesharing effectively is to analyze the various players' roles on free-speech grounds. The individual filesharers (when they share infringing material) are certainly violating the law, but in a small way that probably can't be reasonably controlled. The publishers of the software that allows the network to run (including LimeWire) are exercising free speech - the fact that their code can be made to do something illegal should be irrelevant. However, LimeWire is facilitating infringing because of the way it runs its IP block list. If LimeWire were shut down, the gnutella network would become useless for downloading infringing music. Because of their actions to keep the network safe for infringers - their "acting in concert" - LimeWire should be liable for contributory infringement.

This course will avoid free speech restrictions that trouble many. In terms of preventing infringing, it also will be far more productive than trying to target the small fish. It is an effective measure that respects rights.

[This series of posts has been a somewhat shortened version of an article here.]
