
How lgtm Discovered the Spring Framework Vulnerability

30 Mar 2018 6:00am, by

Security researchers at lgtm are urging users of the Pivotal Spring framework to upgrade to the latest version due to a critical remote code execution vulnerability.

The vulnerability allows attackers to execute arbitrary commands on any machine that runs an application built using Spring Data REST. The company characterizes the upgrade as “a matter of urgency.” The Spring Data REST component is distributed as part of various other Spring projects, including the Spring Boot framework.

Spring Framework is the most popular platform for building web applications, according to analyst firm RedMonk.

“This vulnerability in Spring Data REST is unfortunately very easy to exploit. As it is common for RESTful APIs to be publicly accessible, it potentially allows bad actors to easily gain control over production servers and obtain sensitive user data,” said researcher Man Yue Mo at parent company Semmle.

Mo also discovered the Apache Struts vulnerability last fall.

In a blog post about how he found the Spring vulnerability using lgtm tools, Mo explained that the flaw lets an attacker send a PATCH request with maliciously crafted JSON data to run arbitrary code on the server. When the flaw was reported to Pivotal, the company responded quickly with a method to thwart the remote input, he said.

Semmle CEO Oege de Moor called the discovery of the Pivotal Spring vulnerability, using variant analysis, an example of how lgtm is supposed to be used.

Built on research in compilers and data analysis at the University of Oxford, Semmle offers software engineering productivity analytics and its code exploration tool lgtm, which takes its name from the code review signoff “Looks good to me.”

Its technology turns a code base into text that can be queried to find bugs and errors.

“Whenever you use data from an unchecked source, you have to be very careful to protect yourself from all sorts of vulnerabilities. An example would be data that flows from an untrusted user to a deserialization method in Java. So we track the data all the way from where it comes into the application to where it is used in a dangerous fashion, and we flag where a problem exists,” said product manager Bas van Schalk.
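The source-to-sink tracking van Schalk describes can be illustrated with a small sketch. This is not Semmle's implementation: the flow graph, node names, and the lists of sources and sinks below are all hypothetical, and a real analysis derives them from the code itself. The idea shown is only the core one, which is to follow data from an untrusted entry point and flag every path that reaches a dangerous use.

```python
from collections import deque

# Hypothetical data-flow edges: the value at each key flows into every node listed.
FLOWS = {
    "http_request_body": ["parse_json"],
    "parse_json": ["patch_path"],
    "patch_path": ["evaluate_expression"],
    "config_file": ["load_settings"],
    "load_settings": ["log_message"],
}

# Assumed for illustration: where untrusted data enters and where it is dangerous.
SOURCES = {"http_request_body"}
SINKS = {"evaluate_expression", "deserialize_object"}


def find_tainted_paths(flows, sources, sinks):
    """Return every path that leads from an untrusted source to a dangerous sink."""
    paths = []
    for source in sorted(sources):
        queue = deque([[source]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                paths.append(path)  # flag the full path so a reviewer can follow it
                continue
            for nxt in flows.get(node, []):
                if nxt not in path:  # skip nodes already on this path to avoid cycles
                    queue.append(path + [nxt])
    return paths


if __name__ == "__main__":
    for path in find_tainted_paths(FLOWS, SOURCES, SINKS):
        print(" -> ".join(path))
```

In this toy graph only the request body reaches a sink; the configuration file flows into logging, which is not on the assumed sink list, so it is not flagged.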

While security is the focus for many of its customers, the technology has other uses. Data scientist Albert Ziegler wrote about its use in reducing alerts in testing.

“You may already know a bug that exists, but you want to find all the places where the same logical mistake has been made. Our technology allows you to write a query across your code base and find them; then it looks continuously to ensure the same mistake is never made again,” de Moor said.

An investment company used it to track dependencies in libraries when modernizing a 30-year-old code base and to onboard new hires with real work when its staff was thin during vacation season.

Headquartered in San Francisco, Semmle also has offices in Oxford, New York City, and Copenhagen. Customers include Citi, Credit Suisse, NASA and Dell.

It treats code as a database.

“If you’ve got a compiled language like Java or C++, we observe exactly what files are being processed by the compiler and we store information from the source code in text files in relational form. So you get a table of functions, a table of expressions, a table of variables. So the analysis [involves] queries against a relational database,” de Moor said of its custom-built technology.
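To make the "code as a database" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema, the sample rows, and the query are assumptions made for illustration and do not reflect Semmle's actual storage format; the table of variables de Moor mentions is left out to keep the example short. It shows how, once functions and expressions sit in tables, finding every instance of a risky pattern becomes an ordinary query.

```python
import sqlite3


def build_code_db():
    """Create an in-memory database holding a tiny, made-up code base."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE functions (id INTEGER PRIMARY KEY, name TEXT, file TEXT);
        CREATE TABLE expressions (
            id INTEGER PRIMARY KEY,
            kind TEXT,            -- e.g. 'call' or 'assign'
            text TEXT,            -- source text of the expression
            function_id INTEGER,  -- enclosing function
            line INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO functions VALUES (?, ?, ?)",
        [
            (1, "loadUser", "UserStore.java"),
            (2, "saveUser", "UserStore.java"),
            (3, "importSession", "SessionImport.java"),
        ],
    )
    conn.executemany(
        "INSERT INTO expressions VALUES (?, ?, ?, ?, ?)",
        [
            (1, "call", "in.readObject()", 1, 42),
            (2, "call", "out.writeObject(user)", 2, 57),
            (3, "call", "stream.readObject()", 3, 18),
            (4, "assign", "count = 0", 3, 12),
        ],
    )
    return conn


# Find every function that contains a call to readObject (a deserialization call).
FIND_DESERIALIZATION = """
    SELECT f.file, f.name, e.line
    FROM expressions AS e
    JOIN functions AS f ON f.id = e.function_id
    WHERE e.kind = 'call' AND e.text LIKE '%readObject(%'
    ORDER BY f.file, e.line
"""


def find_deserialization_calls(conn):
    return conn.execute(FIND_DESERIALIZATION).fetchall()


if __name__ == "__main__":
    for file, name, line in find_deserialization_calls(build_code_db()):
        print(f"{file}:{line} in {name}")
```

Matching on expression text is a deliberate simplification; a production analysis would match on resolved method identities rather than strings.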

Semmle also developed the QL query language. Its syntax is similar to SQL, but its semantics are based on Datalog, a declarative logic programming language with a long history in academia. All operations in QL are logical operations. It inherits recursive predicates from Datalog and adds support for aggregates, which simplifies even complex queries.
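A recursive predicate is easiest to see with an example. The sketch below is plain Python, not QL, and the call graph in it is invented for illustration. It computes which functions can reach which others, the way a Datalog engine would evaluate a recursive rule: keep applying the rule until no new facts appear. A simple count per caller then stands in for an aggregate.

```python
from collections import Counter

# Hypothetical call graph: (caller, callee) pairs.
CALLS = {
    ("handlePatch", "applyPatch"),
    ("applyPatch", "evaluate"),
    ("evaluate", "invokeMethod"),
    ("healthCheck", "ping"),
}


def reachable(calls):
    """Least fixpoint of the Datalog-style rules:

        reaches(a, b) :- calls(a, b).
        reaches(a, c) :- reaches(a, b), calls(b, c).
    """
    reaches = set(calls)
    while True:
        derived = {
            (a, c)
            for (a, b) in reaches
            for (b2, c) in calls
            if b == b2
        }
        if derived <= reaches:  # nothing new was derived, so the fixpoint is reached
            return reaches
        reaches |= derived


def reach_counts(calls):
    """Aggregate: how many functions each caller can reach, directly or indirectly."""
    return Counter(caller for caller, _ in reachable(calls))


if __name__ == "__main__":
    for caller, count in sorted(reach_counts(CALLS).items()):
        print(f"{caller} reaches {count} function(s)")
```

The rule never mentions how deep the call chain is; the recursion handles chains of any length, which is what makes this style of query compact.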

Competitors in code analysis include Coverity, now part of Synopsys; Fortify, now part of Micro Focus; and Checkmarx, though Semmle touts that lgtm makes it extremely easy to create new analyses.

The NASA Jet Propulsion Laboratory reported that a competing code review tool missed a defect in the critical entry, descent and landing software of its Mars rover Curiosity while it was en route. Worries about other potential bugs prompted a review from lgtm. In 20 minutes, it defined a new rule to query for other instances of the defect and found 30 related cases elsewhere in the code. NASA engineers were able to fix them in time to ensure a safe landing on Mars.

In addition to its enterprise customers, lgtm works with open source projects to improve their code. It processes code stored in public Git repositories hosted on GitHub or Bitbucket Cloud.

It supports projects in C and C++ (currently in beta testing), Java, JavaScript/TypeScript and Python and analyzes every revision to a project committed to the default branch. It alerts on problems and compares them with a list of new or fixed alerts.

In a different take on code analysis, Israeli security startup Intezer looks to the body’s immune system with what it calls “DNA mapping for software,” and it aims to be the “Google of binary code” by providing a search engine for all the world’s code.

Another interesting startup building a source-code database, though not specifically security-related, source{d} mines a data set based on 57 million public Git repositories — hundreds of terabytes of source code — to train machine-learning models to understand natural language, intent and similarity.

