OpenSSL Moves to Apache 2.0 Software License

OpenSSL has completed a re-licensing effort, resulting in adoption of Apache 2.0.  The project announced this effort in 2015 and obtained permission from contributors via a CLA.

The OpenSSL/SSLeay license was a non-standard permissive license, which included attribution clauses of the kind deprecated in Apache 1.0, such as:

All advertising materials mentioning features or use of this software must display the following acknowledgment: "This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit."

and the mysterious statement:

The licence and distribution terms for any publically available version or derivative of this code cannot be changed.  i.e. this code cannot simply be copied and put under another distribution licence [including the GNU Public Licence.]

This caused many to wonder whether the license was truly permissive.  Over the years, users (and reluctantly, their lawyers) accepted it as permissive, but not without some angst.

Kudos to the project for clarifying and harmonizing the license for this ubiquitous bit of software.

How the Penguin Became a Benedictine Monk: Open Source and Codes of Community Conduct

Recently, a popular open source database project, SQLite, adopted a code of conduct for its participants.  Codes of conduct are not news — they have become increasingly common in open source communities in recent years.  But this one was different: the project adopted Rule IV of the Rule of Saint Benedict, written in the 6th century by Saint Benedict of Nursia, and still used as ethical precepts for the Benedictine order of monks.  Rule IV, Instruments of Good Works, contains such elements as “To do as one would be done by” — which is hard to argue with.  But it also contains elements like “To deny oneself that one may follow Christ” and “To be frequently occupied in prayer” — which were more controversial, and probably not applicable to code development.  Well, maybe prayer helps to find some of the more elusive bugs.

The actual reach of the code of conduct was limited.  The project consisted of only a few developers who all agreed on it, and it was in place for months before anyone objected.  The SQLite project said of the Rule:

This code of ethics has proven its mettle in thousands of diverse communities for over 1,500 years, and has served as a baseline for many civil law codes since the time of Charlemagne.

When the adoption of the Rule was publicized, the open source world wondered whether it was a prank, or perhaps merely a performance-art commentary on the recent spate of codes of conduct adopted by open source projects.  After community complaints, the Rule was retracted and replaced with the more conventional Mozilla Community Guidelines. Although the SQLite developers’ hearts were probably in the right place, the Christian underpinnings of the code, and its original formulation for an exclusively male community, were a bridge too far. The project now styles it as a voluntary code of ethics for the founder only.

How did this happen?

#METOO and Volunteer Communities

In the United States, the #METOO movement recently shined a spotlight on tacit approval of sexual harassment across many sectors.  Private businesses have long employed codes of conduct to avoid harassment claims, but now the spotlight is on all organizations.  The open source development community, self-organized and ideologically resistant to central control, has slowly begun to adopt similar codes of conduct — but the road has been a zigzag path.  If managing an open source community is like herding cats, applying a code of conduct is like making cats take a loyalty oath.

Adopting codes of conduct has two related goals: elimination of bias and inclusiveness.  Elimination of bias is remediative — it seeks to give participants the opportunity to participate on a level playing field, by avoiding conduct that discourages participation.  Codes of conduct prohibit racist, sexist, or otherwise hateful speech or actions. Inclusiveness is proactive — it seeks to attract new participation in the community by persons in underrepresented groups.  The tools to achieve inclusiveness are, at this time, less consistently used.

In tackling these goals, the open source community slipstreams on best practices in private employment, which have been developing for quite some time.  Most private businesses today, for example, have express anti-harassment policies and a process for reporting and assessing harassment claims. These policies were developed mostly as a response to employment-related legal claims under state and federal law arising from the existence of a “hostile work environment.”  Written policies are helpful to avoid liability and communicate business culture.

Some businesses have taken proactive steps to promote inclusiveness.  An example is the “Rooney Rule” adopted by the NFL.   Businesses that seek to promote hiring or advancement of minorities or women tread a fine line, though — establishing quotas or preferences can expose them to claims of reverse discrimination.  See, for example, City of Richmond v. J. A. Croson Co., 488 U.S. 469 (1989).  Thus, most tools for inclusiveness avoid strict quotas or preferences.

Settling the Wild West

Those trying to civilize behavior in open source communities probably feel like Ransom Stoddard in The Man Who Shot Liberty Valance — an optimistic and naive lawyer treated with derision by the outlaws and heroes alike.  Behavior in an open source community is often analogized to the “wild west.” But in a sense, this wildness is the quintessence of open source: a private business is a cathedral and the open source community is a bazaar.  Open source communities usually develop without central control, and many participants chafe under even the most lightweight management.  Moreover, while private businesses can enforce their anti-harassment policies with the economic hammer of the threat of termination of employment, kicking participants out of voluntary communities is both rare and tricky to accomplish.

Some features (or, depending on your point of view, bugs) of the open source community tend to discourage newcomers.  For example, the anonymity and asynchronous nature of online communication can lead to aggressive behavior that participants might not undertake in face-to-face or real-time interactions.  Lack of inclusion can also arise from the mechanics of open source development. In open source projects, one or a few committers control what contributions are approved for inclusion in the software, and becoming one of these committers is both a matter of high prestige and dependent on personal reputation. New participants can have a hard time breaking into a community that can seem cliquish and insular. Much open source work is voluntary and unpaid, and this introduces a Room of One’s Own factor: women and minorities with heavy workloads and child-rearing commitments may not have the resources or time to participate without pay in order to earn their stripes as contributors.

Sometimes, successful projects are unprepared for the need to manage the behavior of a large and disparate community.  One of the key functions of open source project management is to coordinate user and developer conferences. In geographically dispersed open source projects, participation in developer conferences, presentations and meetings is essential, not optional.  As with any conference that brings many people together, open source conferences can become a venue for harassment and hostile behavior.

Bias in the Open Source Community

Reported incidents of overt bias in the open source community have mostly focused on bias against women: Geekfeminism maintains a list of incidents.  

Blue Content.  Male open source developers have used sexualized or nude images of women in presentations, or even code or sample content.  While these incidents are not directed at a particular individual, they can create a culture that discourages participation by women, LGBT participants, or others who simply find it inappropriate.

  • Lena.  Use of a Playboy centerfold image in the development of image processing.
  • Perl playmate API lightning talk — a TED talk on a joke Perl module designed to download the measurements of Playboy magazine Playmates from the Playboy website.
  • Upskirt blogging.  A candid “upskirt” photo was posted on a personal blog and syndicated to the Planet Fedora website.
  • Mark Pesce’s use of sexual images at an official presentation at Linux Australia, which led to implementation of a code of conduct.

Harassment.  These incidents are directed at particular individuals.  They take place not only at in-person official community group meetings, but at associated social activities, or online.

  • Particular problems occur when the harasser is a core member of the community.  For example, Morgan Marquis-Boire, a cybersecurity expert, was alleged to have sexually assaulted several women.  Several organizations disassociated themselves from him as a result.

I’m Taking my Ball and Going Home.

In one of the more unusual developments arising from negative reaction to a Code of Conduct, a Linux kernel developer who claimed to be a lawyer said that those objecting to the new CoC could rescind the license to their contributions in protest. That claim was not correct as a matter of law.

A Culture of Brutal Honesty, and Great Code

Codes of conduct sometimes seek to address baseline notions of civility, rather than overt bias.  Linus Torvalds is the original developer of the Linux kernel, and he remains the gatekeeper for the Linux kernel (making him arguably one of the most powerful human beings in the world, in practical terms).  But Torvalds is notoriously and cuttingly honest. He is well known for his profanity-laced posts to discussion groups and withering reactions to pull requests, but to be fair, most are in service of code quality and he is not known for sex or race bias.   Recently Torvalds took a hiatus and promised to get counseling.   He wrote:  

I am not an emotionally empathetic kind of person and that probably doesn’t come as a big surprise to anybody. Least of all me. The fact that I then misread people and don’t realize (for years) how badly I’ve judged a situation and contributed to an unprofessional environment is not good.  This week people in our community confronted me about my lifetime of not understanding emotions. My flippant attacks in emails have been both unprofessional and uncalled for. …I know now this was not OK and I am truly sorry….I want to apologize to the people that my personal behavior hurt and possibly drove away from kernel development entirely….I am going to take time off and get some assistance on how to understand people’s emotions and respond appropriately.

But Torvalds is not alone.  At the risk of engaging in a cultural stereotype, coding can attract those for whom emotional intelligence is not a strong suit, at the least, or even those on the autism spectrum.  For some, this is not a bug but a feature: those who struggle with interpersonal skills can find a haven in the pseudonymous, technical world of open source.  (This author included.)  But having poor social skills causes problems in a model of development that needs a robust community to thrive.

Best Practices

There is extensive legal writing on best practices to implement codes of conduct in private business: Have a written policy that is as concrete as possible, enforce it consistently, establish protocols for confidential reporting and assessment of claims, and try to do all this without stifling First Amendment protected expression.  Here are a few best practices that are more specific to open source communities.

    • Localization.  Open source communities are often international, so policies need to take into account local language and culture. A private business may be able to mandate communications in its home language, but volunteer communities cannot.  What should or should not be mentioned in a code of conduct may also be limited by local norms.
    • Self-Governance. Codes of conduct for open source communities need to be created by, or at least approved by, the members of the community. Some are also self-administered, but this can make confidential assessment of complaints challenging.
    • Conferences.  Community participants may interact with the project only, or for the first time, at conferences.  Codes of conduct should be promulgated at each conference, in addition to the more persistent public facing materials for the project.  Those submitting presentations should confirm adherence to the code of conduct. The code of conduct should cover all official and unofficial conference events and related online interactions.
    • Fighting the Echo Chamber.  Self-organized communities can tend to be insular.  Projects may wish to build in 360-degree accountability by ensuring important decisions are made by multiple stakeholders.
    • Using AI.  There are some technology tools to improve quality of conversations online.  AI can screen for both positive and negative communication and alert those running the project to problems before they get worse.

As the open source world struggles to get along — as the larger world does — codes of conduct will inevitably become more common.  But they will continue to be more difficult to enforce in open source communities than in private enterprise. 

Below are some resources for additional reading on bias, codes of conduct, and related topics.

Useful Links:

  • Sexism field guide.  This is a collection of tactics to deal with sexism on an immediate and daily basis.  They range greatly and therefore offer options that may suit different preferences for the level and nature of response.
  • Airbnb’s bias and discrimination toolkit

Some Codes of Conduct:

Thanks to Luis Villa of Tidelift and Katie Gosewehr of O’Melveny for their work preparing the analysis and research that formed the basis for this article. Credit only to them; any errors are mine alone.

 A presentation on this topic will be given at the PLI conference in San Francisco, Open Source Software 2018 — from Compliance to Cooperation, November 28, 2018, and will be eligible for MCLE credit on Elimination of Bias.  

Microsoft Joins OIN

In a move that represents what may be the swan song of its formerly anti-Linux position, Microsoft announced on October 10, 2018 that it had joined the Open Invention Network.  The OIN Announcement is here.

Microsoft had previously taken a few steps to align itself more with open source communities, including joining the Linux Foundation, joining the License on Transfer Network, and joining the Red Hat-led GPL pledge (“We doubled down on this new approach when we stood with Red Hat and others to apply GPL v. 3 “cure” principles to GPL v. 2 code.”)

OIN is a patent pool relating to Linux — broadly defined to include many elements in the Linux stack.

Microsoft has historically been one of the few non-NPEs to exact patent royalties for Linux, and famously licenses patents for Android devices (which use the Linux kernel).  The prior linked article explains some of the history of Microsoft’s enforcement efforts, which were focused in part on the FAT (File Allocation Table) patents.  However, news reports have been unclear on the exact effect that joining OIN will have on Microsoft’s Android patent license program.



Snippets and Stack Overflow

I recently came across an online discussion that mentioned a very interesting article, Usage and Attribution of Stack Overflow Code Snippets in GitHub Projects, by Sebastian Baltes and Stephan Diehl.  It is a study of certain licensing issues in Stack Overflow, a discussion site for software developers.  Stack Overflow applies the CC BY-SA 3.0 license, a copyleft license for content, to contributions, and there is an ongoing debate as to the suitability of those license terms.

The study analyzes the attribution of “non-trivial” Java code snippets to estimate the rate of usage that did not comply with CC BY-SA notice requirements. The study found that “at most 1.8% of all analyzed repositories containing code from SO used the code in a way compatible with CC BY-SA 3.0. Moreover, we estimate that at most a quarter of the copied code snippets from SO are attributed as required.”
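To make the notice requirement concrete, here is a minimal sketch of the kind of attribution comment a developer might place above a copied snippet.  It assumes that attribution means naming the author, linking to the source post, and identifying the license; the function name `attribution_comment`, the author, and the URL are hypothetical placeholders, and the elements a license actually requires should be checked against its text.

```python
def attribution_comment(author: str, source_url: str,
                        license_name: str = "CC BY-SA 3.0") -> str:
    """Build a Java-style comment block crediting a copied snippet.

    Assumption: author, source link, and license name are the elements
    being recorded. This is an illustration only, not a statement of
    what the license requires.
    """
    lines = [
        f"// Adapted from a Stack Overflow answer by {author}",
        f"// Source: {source_url}",
        f"// License: {license_name}",
    ]
    return "\n".join(lines)


# Hypothetical author and URL, for illustration only.
print(attribution_comment("example_user", "https://stackoverflow.com/a/0000000"))
```

A header like this takes three lines per snippet, which suggests that the low compliance rate the study reports reflects lack of awareness more than cost.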

It is a fascinating topic, and it is refreshing to find a practical and empirical analysis of a licensing issue.  (This article by Chaiyong Ragkhitwetsagul, Jens Krinke, and Rocco Oliveto also reports the results of surveys of Stack Overflow answerers and visitors to assess awareness of outdated code and software licenses.)

Those who do M&A deals and other open source compliance efforts know that the average code audit usually turns up a handful of these items.  While many so-called snippets are short and may not enjoy copyright protection, that legal conclusion can be challenging to make, and unsatisfying to the risk-averse.  To avoid uncertainty, buyers often want such snippets removed, resulting in additional engineering costs that are expended to manage small but non-zero legal risks.  It is in economic terms a tax on development activity.

A project to convert Stack Overflow code contributions to a permissive license, MIT, died on the vine in 2016.  That is unfortunate.  Discussion boards would better serve their community by requiring contributors to apply permissive licenses (or even public domain dedications or licenses with no attribution requirements) to small code examples.  At least that should be the default choice.  It seems doubtful that most contributors care enough about any copyright they may have in code snippets to apply, or enforce, significant conditions on what they contribute.  Given the choice, they would probably be happy with permissive terms.  Moreover, many of the contributions are taken from other sources and contributed without attribution of upstream license terms, which may or may not be compatible with CC BY-SA.

OSS Capital

I am thrilled to announce the launch of OSS Capital, a new venture capital fund focusing on commercial open source companies.  I will be acting as a Portfolio Partner.  OSS Capital invests in OSS startup companies.  Open source software is the future, and I am honored to be a part of OSSC, helping companies with open source business and licensing strategy.

And for those of you who are wondering…I will be continuing my law practice as well.


Ninth Circuit Affirms Thin Protection for Databases under Copyright

In Experian Information Solutions, Inc. v. Nationwide Marketing Services, Inc., No. 16-16987 (9th Cir. 2018), the Ninth Circuit affirmed the limited protection available for databases under copyright.

Plaintiff Experian Information Solutions, Inc., created its ConsumerView Database, containing names and addresses of more than 250 million consumers.  This information was valuable, because marketers will pay significant fees for accurate pairings of names and addresses.  Experian expended significant efforts to collect the data from many sources, such as real estate deeds and warranty cards.  It also used both human and automated methods to maximize the reliability of the data.

Experian discovered the basis for its lawsuit when a broker tried to sell Experian its own data — at a low price — on behalf of defendant Natimark.  Experian sued for copyright infringement, and when that claim was dismissed, trade secret misappropriation.

Examining the issue of whether such data is protectable under copyright, particularly given Experian’s significant effort to ensure the data was accurate, the Ninth Circuit held that the database was copyrightable as a compilation (disagreeing with the district court).  But it affirmed summary judgment dismissing the copyright claim because Experian did not show “bodily appropriation” of the work, in part because Natimark’s database was “materially smaller” than Experian’s.  However, it held that with proper efforts to keep the information confidential, Experian’s lists could be protected as trade secrets.  The court remanded on the trade secret issue only.

This case underscores one of the thorny doctrinal difficulties for “open data licensing.”  Data that is publicly available (and therefore not protected by any trade secret interest) has very limited copyright protection.  Absent a contract binding a recipient to limited use, it is hard to enforce a condition in a copyright license to data, because most uses other than wholesale copying would be non-infringing.  Recipients can “engineer around” a thin copyright by supersetting, subsetting, or changing the data. Accordingly, licenses that attempt to apply a “copyleft” condition to “derivative works” of data have an even bigger challenge than corresponding software licenses.  Such conditions are premised on the power of copyright.  It is extremely difficult to tell whether one data set is “derivative” of another, and very difficult to preserve any copyright interest in the face of downstream modifications.
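The subsetting point can be illustrated with a small sketch.  It compares two hypothetical lists of name-and-address records and reports how much of the original appears in the copy, and how much of the copy comes from the original.  The records and the function name `overlap_shares` are invented for illustration, and these ratios are not a legal test for infringement; they only show how a copy can draw entirely on the original while still taking a small fraction of it.

```python
def overlap_shares(original, copy):
    """Return (share of original found in copy, share of copy found in original)."""
    original_set, copy_set = set(original), set(copy)
    shared = original_set & copy_set
    # Guard against empty inputs to avoid dividing by zero.
    taken = len(shared) / len(original_set) if original_set else 0.0
    sourced = len(shared) / len(copy_set) if copy_set else 0.0
    return taken, sourced


# Hypothetical records, for illustration only.
original = [
    ("Ann Lee", "1 Elm St"),
    ("Bo Diaz", "2 Oak Ave"),
    ("Cy Park", "3 Pine Rd"),
    ("Di Wu", "4 Ash Ln"),
]
subset = original[:1]  # a recipient keeps one record out of four

taken, sourced = overlap_shares(original, subset)
print(taken, sourced)  # 0.25 1.0
```

Here every record in the copy comes from the original, yet only a quarter of the original was taken, which is the kind of gap that makes "derivative" status of a data set hard to pin down.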