This Week in Programming: DigitalOcean Hacktoberfest Creates Spam for Open Source Projects

3 Oct 2020 6:00am, by

One of the great things about open source software is that anyone, with a little bit of education and some amount of desire to do so, can be a contributor. So obviously, the right thing to do would be to give those would-be contributors a little push, perhaps in the form of a prize of some sort, to get them over that initial threshold, right?

That has been the basic idea behind DigitalOcean’s Hacktoberfest, where developers are urged to make four pull requests to any open source project during the month of October to receive a prize. While the event has been going on for seven years, this year’s event has already caused quite a stir (and it’s only been a day since it started) with open source project maintainers complaining that, rather than helping, DigitalOcean’s Hacktoberfest is hurting open source.

In a blog post, Domenic Denicola, a developer and maintainer of the whatwg/html repository, accuses Hacktoberfest of being “a corporate-sponsored distributed denial of service attack against the open source maintainer community,” citing nearly a dozen spam pull requests in the first hours of the event.

“My most fervent hope is that DigitalOcean will see the harm they are doing to the open source community, and put an end to Hacktoberfest. I hope they can do it as soon as possible, before October becomes another low point in the hell-year that is 2020,” writes Denicola. “In 2021, they could consider relaunching it as an opt-in project, where maintainers consent on a per-repository basis to deal with such t-shirt–incentivized contributors.”

Meanwhile, blogger Joel Thoms points to a YouTuber as the cause of this year’s severe uptick in spam, noting that “this flood of low-quality PR spam appears to come from a YouTuber with an audience of 672K where he demonstrates how easy it is to make a Pull Request to a repo.” The pull request demonstrated in the video, which calls a project an “amazing project,” is now showing up in 21,177 issues.

DigitalOcean has responded to the brouhaha around Hacktoberfest, admitting that “at least 4% of pull requests from Hacktoberfest participants have been marked ‘invalid’ or ‘spam.’” In response, the company has announced a number of changes for this year’s event: projects can now opt out, and users found to be spamming projects with fraudulent pull requests will be banned from this and other DigitalOcean events. It also promises changes for future events. To the maintainers affected, the company writes, “We’re sorry that these unintended consequences of Hacktoberfest have made more work for many of you. We know there is more work to do, which is why we ask that you please join us for a community roundtable discussion where we promise to listen and take actions based on your ideas.”

As for things you can do right now, GitHub has also joined in on finding a solution, enabling projects to limit interactions for a time.

This Week in Programming

  • GitHub Gets Integrated Code Scanning: Last year, GitHub acquired Semmle, a semantic code analysis engine that works to identify code patterns in large codebases and search for vulnerabilities and their variants. Now, the company has completed its work bringing those capabilities into GitHub as a native tool and says that users can enable it on their public repositories today. The tool was first released in beta earlier this year but is now generally available. It works with GitHub Actions, or your existing CI/CD pipeline, and scans code as it’s created, using CodeQL, GitHub’s code analysis engine. Developers can use any of the more than 2,000 existing CodeQL queries to scan their code or write their own, and the code scanning capabilities are extensible using the open SARIF standard. So far, GitHub says that more than 12,000 repositories have been scanned, finding more than 20,000 security issues. As for CodeQL queries, 132 community contributions have been made, and GitHub says it has “partnered with more than a dozen open source and commercial security vendors to allow developers to run CodeQL and industry-leading solutions for SAST, container scanning, and infrastructure as code validation side-by-side in GitHub’s native code scanning experience.” The code scanning features are free for public repositories and you can learn more in the docs. If you’re interested, contributions to the list of CodeQL queries are being accepted.
  • Red Hat Updates Quarkus, Offers Build of OpenJDK: Red Hat has offered an update on some new features in Red Hat Runtimes, this time with a focus on Java. For Quarkus, Red Hat’s Kubernetes-native Java framework, the company says it has added a Quarkus native compilation feature, which “works to allow all users to run Quarkus in native mode, rather than in a traditional Java Virtual Machine (JVM).” Alongside this, it has also added full support for the Mandrel project, a downstream distribution of GraalVM that gives native compilation support and provides a place for Quarkus to land new features in GraalVM. “With Mandrel,” they write, “we are able to have GraalVM bundled on top of OpenJDK 11,” which “is important for Quarkus users because developers can use GraalVM to compile their Quarkus apps down to native binaries, to further optimize for the cloud and Kubernetes.” The company is also introducing a number of new features to Red Hat Data Grid, highlighting the addition of cross-site replication support for Data Grid clusters that run within Red Hat OpenShift. Finally, the Red Hat build of OpenJDK 8 adds support for the Java Flight Recorder, which “enables developers and operations teams to observe and produce reports for in-production Java applications, effectively doing the job of numerous other smaller utilities like jhat, jmap, or jps across your Java application landscape.”

  • OpenJDK Migrates to GitHub: Speaking of Java and OpenJDK, the OpenJDK has completed its move to GitHub as part of its migration effort, codenamed Project “Skara,” which brings JDK 16 main-line development into GitHub. Previously, the project had been self-hosted on Mercurial servers, and the migration to GitHub was implemented by the Java Platform Group at Oracle. Now, GitHub offers a peek at how the move was done, writing that “this was much more complex than just doing a hg-fast-export and then a git push,” with the team building “custom CLI Tooling, bi-directional bridging with the OpenJDK mailing lists and integration with the OpenJDK bug tracking system” and working toward adopting GitHub Actions for their CI builds, all of which is open source and available in the OpenJDK GitHub organization.
  • End-to-End Testing With Playwright for Python Preview: Microsoft has announced a preview of Playwright for Python, a tool for writing end-to-end tests in Python that automate UI interactions and can validate the functionality of your applications across all modern web browsers. The Playwright for Python API follows a release earlier this year of Playwright for JavaScript. Beyond running these tests against deployed applications, you can also run them in your CI/CD pipelines with the Playwright GitHub Action or with tools for other CI/CD providers. You can find Playwright for Python on GitHub, and share feedback or feature requests on GitHub issues or in the Playwright Slack community.

Red Hat is a sponsor of The New Stack.

Feature image via TeeFantastic.
