In Experian Information Solutions, Inc. v. Nationwide Marketing Services, Inc., No. 16-16987 (9th Cir. 2018), the Ninth Circuit affirmed the limited protection available for databases under copyright.
Plaintiff Experian Information Solutions, Inc., created its ConsumerView Database, containing names and addresses of more than 250 million consumers. This information is valuable because marketers will pay significant fees for accurate pairings of names and addresses. Experian expended significant effort to collect the data from many sources, such as real estate deeds and warranty cards. It also used both human and automated methods to maximize the reliability of the data.
Experian discovered the basis for its lawsuit when a broker tried to sell Experian its own data, at a low price, on behalf of defendant Natimark. Experian sued for copyright infringement and, when that claim was dismissed, for trade secret misappropriation.
Examining whether such data is protectable under copyright, particularly given Experian’s significant effort to ensure the data was accurate, the Ninth Circuit disagreed with the district court and held that the database was copyrightable as a compilation. But it affirmed summary judgment dismissing the copyright claim because Experian did not show “bodily appropriation” of the work, in part because Natimark’s database was “materially smaller” than Experian’s. The court also held that, with proper efforts to keep the information confidential, Experian’s lists could be protected as trade secrets. It remanded on the trade secret issue only.
This case underscores one of the thorny doctrinal difficulties for “open data licensing.” Data that is publicly available (and therefore not protected by any trade secret interest) has very limited copyright protection. Absent a contract binding a recipient to limited use, it is hard to enforce a condition in a copyright license to data, because most uses other than wholesale copying would be non-infringing. Recipients can “engineer around” a thin copyright by supersetting, subsetting, or changing the data. Accordingly, licenses that attempt to apply a “copyleft” condition to “derivative works” of data face an even bigger challenge than corresponding software licenses. Such conditions are premised on the power of copyright, yet it is extremely difficult to tell whether one data set is “derivative” of another, and very difficult to preserve any copyright interest in the face of downstream modifications.