How CHAOSS D&I Can Help Diversity in the Open Source Community  

30 Apr 2019 6:00am, by Georg J.P. Link and Sarah Conway

The Linux Foundation sponsored this post.

Georg J.P. Link
Georg J.P. Link is an open source community strategist. Link co-founded the Linux Foundation CHAOSS Project to advance analytics and metrics for open source project health. Link has been an active contributor to several open source projects for over 13 years and has presented open source topics at over a dozen industry and academic conferences. Link has an MBA and is pursuing a Ph.D. in information technology. In his spare time, he enjoys reading fiction and hot-air ballooning.

Recent research increasingly shows that organizations that embrace diversity at all levels achieve better business results. Open source ecosystems, of course, are by no means an exception, as diversity and inclusivity are also vital to their future.

However, biases related to gender, race and sexual orientation are prevalent in today’s open source development world, as well as throughout the industry. Beyond correcting this imbalance, boosting diversity helps projects grow and thrive, and perform better technically and economically, when a diverse range of ideas, cultures and viewpoints is represented.

While open source communities don’t have human resources departments, projects can introduce structures and practices to promote inclusiveness and diversity. However, open source communities also need analytics, metrics and best practices information to measure their success in these areas. Enter the CHAOSS Project.

History of the CHAOSS Project

Sarah Conway
Sarah Conway is vice president of communications at the Linux Foundation where she works with some of the largest and fastest growing technologies in the history of open source. Conway has been involved in open source since 2004 and joined the Linux Foundation five years ago. She leads marketing communications for projects such as Kubernetes, Node.js, Linux and Hyperledger. She enjoys working with developers to tell stories about open source upending new industries and becoming more valuable, secure and diverse.

Founded in 2017 by open source practitioners and academics, CHAOSS is a Linux Foundation project that helps communities measure and analyze their progress to ensure healthier, more sustainable communities. The CHAOSS project aims to establish implementation-agnostic metrics for measuring community activity, contributions and health; and optionally produce standardized metric exchange formats, detailed use cases, models or recommendations to analyze specific issues in the industry/OSS world.

Today, the CHAOSS project is structured through working groups. One working group is focused on Diversity and Inclusion (D&I) metrics and analytics to help define open source community health. Other working groups address Common Metrics, Evolution Metrics, Risk Metrics and Value Metrics.

This was not always the case. In the beginning, CHAOSS had no working groups to divide up work. The CHAOSS project used a wiki to organize and document its work on metrics. Nevertheless, diversity and inclusion (D&I) emerged as a key category of metrics for CHAOSS community members. Figure 1 shows the first record of the D&I metric category from May 15, 2017. D&I metrics were further grouped into sub-categories for Organizational Diversity and Geographic Diversity. Within each category are the actual metrics, called “activity metrics.”

Screenshot of CHAOSS wiki, showing the first framing of D&I metrics as of May 15, 2017. Metrics (i.e., “activity metrics”) in green link to a detail page that contains information about how to display the metric from data.

Before the D&I category was created, members of the CHAOSS project collected a “long list of metrics.” Some metrics in Figure 1 appear in green, indicating a link to a metric detail page. A metric detail page contained more information about the metric, including how to implement it (Figure 2). Such an implementation could be, for example, a SQL statement that calculated a numeric value. This numeric value could then be displayed as the metric.

Screenshot of CHAOSS wiki, showing a metric detail page that belonged in the D&I category.
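
To make the idea of a metric implementation concrete, here is a minimal sketch in Python using an in-memory SQLite database. The `commits` table, its columns and the sample rows are hypothetical and do not come from the CHAOSS wiki. The query only illustrates how a single SQL statement can reduce raw data to one number that could be displayed as a metric, in this case the number of distinct organizations behind a set of commits.

```python
import sqlite3

# Hypothetical sample rows (author email, organization), made up for illustration.
SAMPLE_COMMITS = [
    ("alice@example.org", "Org A"),
    ("bob@example.org", "Org B"),
    ("carol@example.org", "Org A"),
    ("dave@example.org", "Org C"),
]

# One SQL statement that reduces the raw data to a single numeric value.
METRIC_SQL = "SELECT COUNT(DISTINCT organization) FROM commits"


def organization_count(rows):
    """Return the number of distinct organizations among commit authors."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema; a real project would query its own data store.
        conn.execute("CREATE TABLE commits (author_email TEXT, organization TEXT)")
        conn.executemany("INSERT INTO commits VALUES (?, ?)", rows)
        (value,) = conn.execute(METRIC_SQL).fetchone()
        return value
    finally:
        conn.close()


if __name__ == "__main__":
    print(organization_count(SAMPLE_COMMITS))
```

The returned number is the value a dashboard or report would display. What it means for a given community still depends on the question being asked.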

Rise of the CHAOSS D&I Working Group

The CHAOSS project was officially announced at the Open Source Summit North America (OSSNA) 2017 in Los Angeles. In preparation for a larger number of contributors that might join the project, the CHAOSS project copied all wiki pages into a GitHub repository. This repository is still in use today.

The next major event leading up to the formation of the D&I working group was a birds-of-a-feather working session at OSSNA 2017. In planning the session, CHAOSS members advanced a new way of thinking about metrics. Metrics were discussed as informing signals. Further, signals were thought to be useful only if they informed some action. For example, “gender diversity” led to the metric “ratio of women to men in a project.”[1] The resulting signal could then inform whether to launch an initiative with the aim of inviting more women to a project. At the birds-of-a-feather session, participants were asked to self-select into smaller groups that talked about different categories of metrics, one category being D&I metrics.

The discussion at the birds-of-a-feather session was not bound by the metrics previously identified and documented in GitHub. Rather, the participants were encouraged to introduce new ideas for metrics. Three D&I metrics emerged from the discussion: contributor demographics, onboarding and retention (Figure 3). The first metric is about people in a project; the latter two are metrics about interactions between community members.

Screenshot of notes from the group discussing D&I metrics at the Open Source Summit North America 2017 birds-of-a-feather session.

After the announcement at OSSNA, a variety of people showed interest in the CHAOSS project, including the authors of the OpenStack Gender Report, which presented the results of an OpenStack survey. The authors wanted to advance the metrics in the report and joined the CHAOSS project conversation. Another connection made during OSSNA was with the diversity and inclusion efforts at Mozilla. Mozilla had just presented the results of a large-scale open source survey and had developed an agenda to address diversity and inclusion issues. One of the issues Mozilla identified was the need to measure how well an open source project is doing with regard to diversity and inclusion. Mozilla joined the CHAOSS project to combine efforts.

Up to this point, the CHAOSS community had one weekly conference call. However, the community members advancing D&I metrics wanted to focus more on their category of metrics. The CHAOSS Diversity & Inclusion Working Group, or D&I Working Group, took shape as these members began organizing themselves, setting up communication channels, aligning their goals and building momentum. During the first half of 2018, the working group decided to work on D&I metrics in its own GitHub repository, to set up its own mailing list and to have its own weekly conference call.

Creating D&I Metrics

During each conference call, community members took minutes in a shared Google document and assigned “action items” to specific people, who would report on them during subsequent meetings. These meeting minutes were shared with the mailing list so that community members could stay up to date without having to attend every conference call. When a discussion started during a call, for example about designing a template for documenting a D&I metric, it was continued on the mailing list to include more community members. Further, conference calls were recorded and published on YouTube so that discussions could be revisited and anyone who could not join was still included (Figure 4).

Screenshot of a CHAOSS D&I Working Group conference call recording. The screenshot shows the collaborative process of working in a shared Google document to design a template for metric detail pages.

New D&I metrics were first created in a Google document, where everyone had edit rights. Everyone was encouraged to contribute ideas to this document and comment on the ideas of others. These documents were tracked at the project level in GitHub issues (Figure 5). Each document had its own issue, which linked to the document. Stemming from this engagement, community members determined when a D&I metric was defined well enough and created a pull request, the GitHub workflow for proposing changes to a repository. Review and revision of the D&I metric continued during the pull request workflow. When the D&I metric was accepted, the pull request was merged to add the new D&I metric to the existing D&I metrics. The issue that had linked to the original Google document was closed. D&I metrics in the repository were further revised through pull requests when community members wanted to add or correct something.

Screenshot of a GitHub issue for tracking a proposed metric and the associated Google document. Community members work in the Google document and coordinate on the issue to advance the metric.

Proposed D&I metrics were identified by a variety of community members who had different backgrounds and goals for these metrics. Over time, more metrics were identified. Ideas for how to display metrics needed to be collected and documented. Early conversations centered around how to document metrics so that they could be shared with other community members. As community members advanced their understanding of D&I metrics, they started to evaluate the metrics.

Over time, the CHAOSS D&I Working Group evaluated proposed metrics and advanced a standard way to document them. By standardizing how metrics were documented, the CHAOSS D&I Working Group identified what information was important for displaying or using a metric. The template for a metric detail page (i.e., “Resource Page”) started with a heading that stated the name of a metric, with a high-level question underneath that the metric should answer (Figure 6). While a metric was under development, a disclaimer warned potential users of the unfinished state. Five sections followed. First, “Description” provided a rationale of why a metric may be important to display and use. Second, “Sample Objectives” was a list of reasons why someone might want to display or use a metric. Third, “Sample Strategies” listed methods for displaying a metric, leaving specific steps to the subsequent section. Fourth, “Sample Success Metrics” gave instructions for how to execute a method for displaying a metric, from data collection, to display, to interpretation. Fifth, “Resources” listed references and related work that provided additional background information or supported claims about a metric. After the template was created, D&I metrics were advanced by filling it in for each metric, which required evaluating what information went into each template section.

Screenshot of the D&I metric detail page (Resource Page) with its six areas: 0) Heading and question, 1) Description, 2) Objectives, 3) Strategies, 4) Success Metrics and 5) Resources.
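
The template can also be pictured as a small data structure. The following Python sketch is illustrative only: the section names come from the template described above, while the `ResourcePage` class, the disclaimer wording, the placeholder text and the plain-text rendering are assumptions made for this example rather than the format the working group actually used.

```python
from dataclasses import dataclass, field

# Section names follow the template described in the article.
SECTIONS = (
    "Description",
    "Sample Objectives",
    "Sample Strategies",
    "Sample Success Metrics",
    "Resources",
)

# Hypothetical wording; the real disclaimer text is not given in the article.
DISCLAIMER = "Note: this metric is under development."


@dataclass
class ResourcePage:
    """Hypothetical model of one metric detail page (Resource Page)."""

    name: str
    question: str
    under_development: bool = True
    sections: dict = field(default_factory=dict)  # section name -> text

    def missing_sections(self) -> list:
        """List template sections that have not been filled in yet."""
        return [s for s in SECTIONS if not self.sections.get(s)]

    def render(self) -> str:
        """Render heading, question, optional disclaimer and all five sections."""
        lines = [self.name, f"Question: {self.question}"]
        if self.under_development:
            lines.append(DISCLAIMER)
        for section in SECTIONS:
            lines.append("")
            lines.append(section)
            lines.append(self.sections.get(section, "(to be filled in)"))
        return "\n".join(lines)


if __name__ == "__main__":
    # The question text below is a made-up example.
    page = ResourcePage("Attendee Demographics", "Who attends our events?")
    page.sections["Description"] = "Why this metric may matter to a project."
    print(page.render())
    print(page.missing_sections())
```

A helper such as `missing_sections` mirrors the evaluation step described above: filling in the template makes it visible which information about a metric is still missing.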

As the number of proposals for metrics grew, CHAOSS D&I Working Group members decided to group metrics into focus areas. The creation of seven focus areas was informed by Mozilla’s 2017 research recommendations, providing a stronger foundation for the metrics. Figure 7 shows the seven focus areas as they exist today in the CHAOSS D&I Working Group repository.

Screenshot of focus areas in the README on the CHAOSS D&I Working Group GitHub repository.

The CHAOSS D&I Working Group, like the rest of the CHAOSS project, adopted a Goal-Question-Metric approach. The logic behind this approach was that metrics were only useful if it was known how to use them in answering specific questions. The Goal-Question-Metric approach challenged the group to evaluate the utility of metrics. Within the GitHub repository structure, the focus areas were connected to a high-level goal that someone who was looking for metrics might have. Within each focus area were a set of questions that further narrowed down the choice of metrics (Figure 8).

Screenshot of “Event Diversity” focus area, showing the goal, a list of questions and subsequent metrics (Name).
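
The Goal-Question-Metric structure maps naturally onto nested data: a focus area carries a goal, each goal is narrowed by questions, and each question points to metrics. The sketch below is a hedged illustration in Python. The class names, the `metrics_for` helper, and the goal and question wording are hypothetical; only the names “Event Diversity” and “Attendee Demographics” are taken from the article.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    text: str
    metrics: list = field(default_factory=list)  # names of related metrics


@dataclass
class FocusArea:
    name: str
    goal: str
    questions: list = field(default_factory=list)  # list of Question


def metrics_for(area: FocusArea, keyword: str) -> list:
    """Return metric names under questions whose text mentions the keyword."""
    found = []
    for question in area.questions:
        if keyword.lower() in question.text.lower():
            found.extend(question.metrics)
    return found


# Goal and question wording are invented for this example.
EVENT_DIVERSITY = FocusArea(
    name="Event Diversity",
    goal="Understand how diverse and inclusive a project's events are.",
    questions=[
        Question(
            text="How diverse are the attendees of our events?",
            metrics=["Attendee Demographics"],
        ),
    ],
)

if __name__ == "__main__":
    print(metrics_for(EVENT_DIVERSITY, "attendees"))
```

Starting from the goal and question rather than from the data keeps the focus on why a metric is collected, which is the point of the approach.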

Each question had its own metric detail page (Figure 9), based on the template. The metric detail page provided qualitative and quantitative methods for capturing data as numeric values that could be displayed as metrics. Adding methods here was an integral part of evaluating metrics and included discovering issues with methods and overcoming those issues. Some community members shared methods and experiences from their own open source projects. Some community members were interested in applying a method in their open source project and asked for feedback and insights from others.

Screenshot of “Attendee Demographics” question detail page from the “Event Diversity” focus area, showing a selection of methods for capturing data that can be displayed as metrics.

Issues that were discussed included data sources, data management, visual representation and language. For example, when looking at the diversity of contributions, using the phrase “technical versus non-technical contributions” when looking at commits in a repository discounted the technical nature of many contributions, such as documentation writing. As a solution, CHAOSS D&I Working Group members concluded that “code versus non-code contributions” was a better way to frame diversity of contributions because it was a better fit with the data the metric was created from and because it did not discount anyone’s contributions.
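
As an illustration of the “code versus non-code” framing, the following Python sketch labels the files touched by commits according to their extension. The extension list and the function names are assumptions made for this example, not a classification defined by the working group, and a real project would need to adapt the list to its own repository.

```python
from pathlib import PurePosixPath

# Assumed set of extensions treated as code; adjust per project.
CODE_EXTENSIONS = {".py", ".c", ".go", ".js", ".java", ".rs"}


def classify_path(path: str) -> str:
    """Label a changed file as 'code' or 'non-code' based on its extension."""
    suffix = PurePosixPath(path).suffix.lower()
    return "code" if suffix in CODE_EXTENSIONS else "non-code"


def contribution_counts(changed_paths) -> dict:
    """Count code and non-code files among the paths touched by commits."""
    counts = {"code": 0, "non-code": 0}
    for path in changed_paths:
        counts[classify_path(path)] += 1
    return counts


if __name__ == "__main__":
    print(contribution_counts(["src/app.py", "docs/guide.md", "README"]))
```

The labels describe the files that were changed, not the skill of the contributor, which is why this framing does not discount documentation or other non-code work.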

What’s Next for CHAOSS D&I WG

Goals for the near future include documenting three high-quality and compelling use cases for D&I metrics, partnering with projects to pilot display of D&I metrics, establishing ethical guidelines around displaying D&I metrics and establishing a more sophisticated workflow for advancing D&I metrics.

CHAOSS members Sarah Conway, Daniel Izquierdo and Nicole Huesman will also be presenting “Metrics That Matter” at KubeCon+CloudNativeCon Europe 2019, covering the group’s progress on creating a set of community-curated metrics to track diversity. For more background, here is a list of recent conference sessions held by D&I Working Group community members:

  • August 28, 2018: “Establishing Metrics that Matter for Diversity & Inclusion” (CHAOSScon[2] North America 2018, Vancouver, CA);
  • August 29, 2018: “D&I Metrics Hack-a-thon” (Open Source Summit North America 2018, Vancouver, CA);
  • October 24, 2018: “Tutorial: How to Prepare a Diversity and Inclusion Report for your Community” (Open Source Summit Europe 2018, Edinburgh, UK);
  • February 1, 2019: “Diversity & Inclusion WG Tutorial” (CHAOSScon Europe 2019, Brussels, BE);
  • March 14, 2019: “Panel Discussion: Metrics that Matter: Forging a Path to More Diverse, Inclusive Communities” (Open Source Leadership Summit 2019, Half Moon Bay, CA, USA).

Today, D&I metrics from the D&I Working Group are being piloted in several projects. The OpenStack Gender Report includes new metrics on mentorship programs shaped by CHAOSS. Mozilla has adopted some CHAOSS D&I metrics in its MOSS grant for open source projects. Further, the D&I Working Group is in conversations with more open source projects that are interested in displaying D&I metrics. Any project interested in joining the working group or being a pilot can join the D&I WG mailing list.

[1] CHAOSS members quickly moved past binary genders to recognize that gender exists in a spectrum.

[2] CHAOSScon is a conference series organized by CHAOSS project members to bring together anyone interested in displaying or using metrics for open source project health. Every year, CHAOSScon North America is co-located with the Open Source Summit North America and CHAOSScon Europe is co-located with FOSDEM.

