
Codota Offers Pair Programming with Artificial Intelligence

25 Jan 2018 6:00am

For the truly agile, pair programming is a tried and true method for encouraging skills exchange, code quality checks, and social interactions between developers. The practice teams two developers up at a single keyboard and monitor, allowing the coupled brains to be brought to bear upon the code. Consultants also use pair programming to get in-house software development teams up to speed with the latest practices, unit testing skills, and code hygiene.

The hazards of pair programming are many, however. Tales of woe include being paired with a violent flosser whose lunch leavings careened around the cube with a resounding plunk from the offending tooth string. Another problem with the practice is that it halves your productivity and doubles your costs: each developer is now a pair and costs twice as much. Atlassian did a great send-up for April Fools' Day in 2012 that pokes fun at this agile development practice, and while their take is decidedly Tim and Eric, the underlying message remains true: pair programming can be awkward.

This is why Eran Yahav and Dror Weiss founded Codota. The Israel-based software shop has constructed a machine-learning-driven cloud-based pair programming partner. In practice, this AI assistant eliminates the need for developers to go rooting around Google and Stack Overflow to solve their problems.

Yahav, who is Codota’s chief technology officer, comes from academia, but also worked at the IBM T.J. Watson Research Center. CEO Weiss, on the other hand, comes from Symantec and Panaya. The pair intimated that it was the high cost of pair programming which drove them to develop Codota.

“Pair programming with a human programmer is often costly because you spend the time of two expensive people on one task. The idea of using machine learning to give a lot of the same functionality and help to augment and complement the abilities of the developer comes from there,” Weiss said. “AI and machine learning give us the best opportunity to augment the fundamental capacity of every programmer with the practically infinite knowledge of the internet.”

In practice, said Weiss, that means offering up information gleaned from outside sources, like Google, GitHub, and Stack Overflow. Developers are already using these tools, but their current workflows require them to move over to a browser and search diligently to find the code they need to use.

“For a developer,” said Weiss, “the primary source of information is analyzing existing code from GitHub, Stack Overflow, and other repositories. This code was created by developers, so we use the collective knowledge of what developers write to help other developers write better code, faster. It saves you the manual process developers normally do otherwise. Codota makes this process much less painful: developers don’t need to context switch. More importantly, it pulls code that is the most relevant to their context, to the libraries they’re using in their code, and what they’ve done before when invoking Codota.”

In practice, said Yahav, Codota goes significantly further than code completion tools like IntelliSense. Instead of simply completing lines of code almost at random, Codota is aware of the context in which it is being invoked, allowing it to search only for snippets that can be useful in the present moment.

“There is a continuum: on one side you have IntelliSense that gives you the next word. On the other end of that spectrum is something where you describe what you want in a natural language and it generates the right program. The vision of synthesizing the code you need is years ahead, and some aspects will never materialize because describing intent is as hard as writing the program,” said Yahav.

Codota lives somewhere in the middle of this spectrum, he said, providing insight and suggestions further than simply syntax completion. “This is what deep learning gives you. The programmer is not only writing the code, but making design choices, and trying to figure out what the code should actually do. The magic is finding the sweet spot where the intent is clear enough that Codota can predict your next step,” said Yahav.

Behind the scenes, Codota’s suggested code completions and related content are generated by predictive models of code that draw on the current context in the user’s IDE. The system combines techniques from program analysis, natural language processing, and machine learning. According to the company, only public repositories of code are used for training data; no customer data is used.
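To make the idea of context-driven suggestions concrete, here is a deliberately simplified sketch in Java. It is not Codota's method, which the company describes only as predictive models built with program analysis and machine learning. This toy version just ranks candidate snippets by how many API names they share with the code already open in the editor. The class `ContextRankSketch`, the `Snippet` type, and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: a stand-in for "context awareness", not Codota's model.
class ContextRankSketch {

    // A candidate snippet and the API names it uses (hypothetical data shape).
    static final class Snippet {
        final String title;
        final Set<String> apis;

        Snippet(String title, String... apis) {
            this.title = title;
            this.apis = new HashSet<>(Arrays.asList(apis));
        }
    }

    // Counts how many of the snippet's APIs already appear in the editor context.
    static int overlap(Snippet snippet, Set<String> context) {
        int count = 0;
        for (String api : snippet.apis) {
            if (context.contains(api)) {
                count++;
            }
        }
        return count;
    }

    // Returns a new list with the highest-overlap snippets first.
    static List<Snippet> rank(List<Snippet> candidates, Set<String> context) {
        List<Snippet> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((Snippet s) -> overlap(s, context)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Pretend the developer has already imported these JDBC types.
        Set<String> context = new HashSet<>(
                Arrays.asList("java.sql.Connection", "java.sql.DriverManager"));

        List<Snippet> candidates = Arrays.asList(
                new Snippet("Read a file line by line", "java.io.BufferedReader"),
                new Snippet("Run a JDBC query",
                        "java.sql.Connection", "java.sql.DriverManager", "java.sql.Statement"));

        for (Snippet s : rank(candidates, context)) {
            System.out.println(overlap(s, context) + "  " + s.title);
        }
    }
}
```

A real system would learn relevance from large amounts of code rather than count shared names, but the sketch shows why a suggestion made with knowledge of the surrounding code can beat a plain keyword search.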

Thus, Codota pulls entire snippets of relevant code to offer the developer. Codota currently supports only Java, but JavaScript is the next language on the roadmap. For a developer building a Java application using Java Database Connectivity (JDBC), Yahav said Codota would provide “full snippets of code that show you various ways to connect your database with various choices of configuration parameters that are common in the wild. You will get representative code snippets related to the context in which you are operating. If you do initial work with JDBC, Codota will suggest your next step. It knows how to give you representative suggestions.”
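For readers who have not worked with JDBC, the kind of snippet Yahav describes might look roughly like the following. This is an illustrative sketch, not output captured from Codota. The class name `JdbcConnectSketch`, the helper methods, the connection URL, and the credentials are all placeholders, and the code assumes a matching JDBC driver is on the classpath when it actually runs.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Illustrative only: one common way to open a JDBC connection.
class JdbcConnectSketch {

    // Hypothetical helper that gathers connection settings in one place.
    // "user" and "password" are the standard JDBC property keys.
    static Properties connectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    // Opens a connection; the caller is responsible for closing it.
    static Connection open(String url, Properties props) throws SQLException {
        return DriverManager.getConnection(url, props);
    }

    public static void main(String[] args) {
        // Placeholder URL and credentials; replace with real values.
        String url = "jdbc:postgresql://localhost:5432/exampledb";
        Properties props = connectionProperties("example_user", "example_password");

        // try-with-resources closes the connection even if an error occurs.
        try (Connection conn = open(url, props)) {
            System.out.println("Connection open: " + !conn.isClosed());
        } catch (SQLException e) {
            // Reached when no driver or database is available.
            System.out.println("Connection failed: " + e.getMessage());
        }
    }
}
```

The "various choices of configuration parameters" in Yahav's description would correspond to the URL format and the entries in the `Properties` object, both of which vary by database vendor.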

The idea of pushing helpers into the IDE is not new. Existing products have offered similar services, but never focused on general help. Systems like Klocwork, for example, enforce secure software development practices at coding time, locking the developer into company style guides via warnings and pop-ups alerting the user to their mistakes. These types of in-IDE static analysis and policy enforcement, however, have traditionally evoked the ire of developers, angry that they’ve been locked into a specific way of writing software.

“I worked myself on these kinds of tools in the past,” said Yahav. “I think a lot of [the hate] comes from the fact that the rules there are hard-coded. What happens with hard-coded rules: because they don’t know anything about your project, they’re aimed at the lowest common denominator. A lot are quite generic. They could say you’re using string concatenation and you should use string buffer. Either I know it and I don’t want you to bug me about it, or I don’t know it and I don’t want you to bug me about it. The problem with a lot of these tools is the fact that the rules are hard-coded and generic.”

He went on, “The point is, rather than enforcement telling you what you’ve done wrong, we focus on prediction on what should be your next step: We catch you before you commit the crime, so to speak.”
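The string-concatenation rule Yahav cites is a good example of how generic these warnings can be. The sketch below shows the two styles such a rule compares; the class and method names are hypothetical. Both methods return the same text. The difference is that `+=` in a loop allocates a new `String` on every pass, while `StringBuffer` appends in place. In modern Java, the unsynchronized `StringBuilder` is usually preferred over `StringBuffer` for single-threaded code.

```java
// Illustrative only: the pattern a generic "use a buffer" rule flags.
class ConcatSketch {

    // Style the rule warns about: each += creates a new String object.
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // Style the rule recommends: append into one growable buffer.
    static String joinWithBuffer(String[] parts) {
        StringBuffer buffer = new StringBuffer();
        for (String part : parts) {
            buffer.append(part);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"pair", "-", "programming"};
        // Both styles build the same string; only the allocation pattern differs.
        System.out.println(joinWithConcat(parts).equals(joinWithBuffer(parts)));
    }
}
```

For a three-element array the difference is negligible, which is Yahav's point: a rule that fires without knowing anything about the project cannot tell when the advice matters.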

The future of Codota holds JavaScript support, and eventually support for C#. Currently, only Eclipse and IntelliJ are supported, but additional IDEs are coming soon. Weiss said that the team also plans on adding the ability to analyze code in private repositories, allowing enterprises to train Codota on their internal software.

Feature image via Pixabay.
