AWS’ CodeGuru Will Point Out What’s Wrong with Your Code

Amazon Web Services wants to bring the full power of machine learning (ML) to code development. At its annual Re:Invent user conference, the company launched a new service, called CodeGuru, that aims to automate code reviews and provide application performance recommendations.
Manual code reviews are a pain, admitted AWS CEO Andy Jassy in a keynote at the conference, being held in Las Vegas this week. Company developers have to set aside time to review the code, and even the sharpest scrutiny can miss some elements. CodeGuru can find many of the common, though tricky-to-notice, errors, and even highlight the inefficient (i.e. costly) lines of code in an application.
The service is notable because it is one of the first ML-based code reviewing and profiling tools, said analyst Janakiram MSV, adding that it has potential appeal to developers when it is integrated with mainstream IDEs such as Visual Studio, VS Code, and IntelliJ (check back at The New Stack, as Janakiram will take a deeper dive into this service for us when he returns from the event).
CodeGuru can be easily dropped into the development process: Add CodeGuru as one of the recipients of a pull request. When the dev commits the code, CodeGuru consults its models and algorithms and provides an assessment of the code in a human-readable form that pinpoints the problematic lines.
The software itself grew from Amazon’s internal code review process, and draws on 10,000 application profiles of open source projects hosted on GitHub. It can flag resource leaks, atomicity violations, potential concurrency race conditions, unsanitized inputs, wasted CPU cycles, and difficult-to-pinpoint problems with thread-safe classes, among other gotchas.
It also was designed to pinpoint “your most inefficient, unproductive, most expensive lines of code,” Jassy said. The service does this through a machine-learning-powered profiler, one that requires a small agent to be embedded in the application. “CodeGuru observes your application, and every five minutes it creates a profile. It tells you things like latency and CPU utilization, and it helps you identify the most expensive lines of code in your application,” Jassy said.
Amazon itself has already been using this profiler for 80,000 of its own applications, which has led to “tens of millions of dollars in savings for us,” Jassy said. Amazon’s consumer payments team repeatedly used the tool over the course of a year, and was able to improve CPU utilization by 325% and save 39% in operating costs, even while demand for their services continued to grow. Likewise, the company’s catalog management service enjoyed up to 67% reduction of CPU usage after following the profiler’s recommendations, the company claims.
Initially, CodeGuru will support Java, though additional languages will be supported in the future. It can plug into GitHub and CodeCommit by AWS, with more repositories to be added in the future. While the service is now only in preview, AWS is promising low, on-demand pricing when it goes live, so it can be used for all code reviews within an organization: CodeGuru will cost $0.005 per sampling hour per application profile and $0.75 per 100 lines of code per month.
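To get a feel for what those two rates add up to, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and it assumes that partial blocks of 100 lines are prorated and that a profile samples around the clock for a 30-day month (720 hours). AWS has not confirmed either detail in the announcement, so treat the output as an estimate rather than a quote.

```python
from decimal import Decimal

# Rates quoted in the announcement (preview pricing, subject to change).
PROFILER_RATE_PER_SAMPLING_HOUR = Decimal("0.005")  # USD per hour per profile
REVIEWER_RATE_PER_100_LINES = Decimal("0.75")       # USD per 100 lines per month


def profiler_cost(sampling_hours, profiles=1):
    """Profiler charge: sampling hours x profiles x hourly rate."""
    return Decimal(sampling_hours) * Decimal(profiles) * PROFILER_RATE_PER_SAMPLING_HOUR


def reviewer_cost(lines_of_code):
    """Code review charge; ASSUMES partial blocks of 100 lines are prorated."""
    return Decimal(lines_of_code) / Decimal(100) * REVIEWER_RATE_PER_100_LINES


def monthly_estimate(sampling_hours, lines_of_code, profiles=1):
    """Hypothetical helper: combined monthly bill for both parts of the service."""
    return profiler_cost(sampling_hours, profiles) + reviewer_cost(lines_of_code)


if __name__ == "__main__":
    # One profile sampled for 720 hours, plus 20,000 lines of code reviewed.
    total = monthly_estimate(sampling_hours=720, lines_of_code=20_000)
    print(f"${total:.2f}")  # prints $153.60
```

Under those assumptions the profiler is the cheap part ($3.60 a month for one continuously sampled profile), while reviewing a 20,000-line codebase accounts for the other $150.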
AWS is not alone in thinking there is analytic value in scanning huge online repositories of code to derive a set of best practices and security fixes. GitHub itself is doing similar work with its CodeQL semantic code analysis engine, which it acquired when it bought Semmle in September. The Microsoft-owned company is using the technology to generate a semantic code graph of all the public repos, which should offer enormous opportunities to understand and improve coding patterns, quality and security, Mary Branscombe reported for The New Stack.