How GitHub Copilot Improved its Code Completion Capabilities


On February 14th, 2023, Shuyin Zhao, Senior Director of Product Management at GitHub, published a blog post announcing that GitHub Copilot has been upgraded with major improvements to its Codex model, along with real-time checks for insecure code patterns.

Improvements in Codex Model Precision

According to Zhao, when GitHub Copilot became generally available in June 2022, it generated around 27% of the code written by developers who used it. Today, Copilot is responsible for generating an average of 46% of a developer's code across all programming languages – and a whopping 61% for Java codebases. This results in two phenomena for Copilot-assisted codebases:

  1. Developers are leaning more and more toward Copilot suggestions, which may or may not lead to fewer errors in the codebase, depending on how well the Codex model handles the programming languages and frameworks in use.

  2. Since big chunks of code are generated by Copilot, it may introduce vulnerable code that harms the software and the companies using it, adding to the vulnerability-checking workload during development and QA.

Key Technical Improvements in Codex Model

  • An upgraded Codex model core that delivers more precise, personalized results for code synthesis.

  • Better code context understanding through the novel Fill-In-The-Middle (FIM) paradigm, which gives the model richer prompts for code suggestions. Rather than relying on the code prefix alone, Copilot now considers both the prefix and the known suffix and fills in the middle of the code in a way that aligns better with the surrounding context.

  • A lightweight client-side model for VS Code that uses user context – such as whether the user accepted the last suggestion – to tweak its code synthesis and prompts without hindering the overall developer experience or Copilot's performance. According to the blog, this model has reduced the rate of unwanted suggestions by 4.5%.
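To make the FIM idea concrete, here is a minimal sketch of how a fill-in-the-middle prompt can be assembled from the code before and after the cursor. The sentinel tokens and the function name are illustrative assumptions, not GitHub's actual implementation; FIM-trained models use their own special tokens.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a Fill-In-The-Middle prompt from the code surrounding
    the cursor. The sentinel tokens below are placeholders; each
    FIM-trained model defines its own special tokens."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# The cursor sits inside a function body, so both the code before it
# (prefix) and after it (suffix) are sent to the model, which then
# generates the missing middle so it fits both sides.
prefix = "def average(values):\n    total = "
suffix = "\n    return total / len(values)"
prompt = build_fim_prompt(prefix, suffix)
```

The key design point is that the model's completion is constrained on both ends, so a suggestion like `sum(values)` is scored against the suffix that follows it, not just the prefix that precedes it.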

The New Security Vulnerabilities Detector

As stated above, developers who use Copilot in their IDE depend heavily on its AI-based code suggestions. This raises the possibility of introducing vulnerable code, since Codex was not trained to detect it. This has changed, however, with the February update: GitHub has launched an AI-based vulnerability filter that blocks insecure coding patterns in real time. Currently, the model targets the most common vulnerabilities, such as SQL injections, hardcoded sensitive information, and path injections.
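To illustrate the kind of pattern such a filter targets, here is a self-contained Python sketch (not GitHub's implementation) contrasting an injectable SQL string with its parameterized equivalent, using the standard-library `sqlite3` module:

```python
import sqlite3

# Set up a throwaway in-memory database with one user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Attacker-controlled input carrying a classic injection payload.
user_input = "alice' OR '1'='1"

# Insecure pattern (what a vulnerability filter should block):
#   query = f"SELECT role FROM users WHERE name = '{user_input}'"
#   conn.execute(query)  # the payload rewrites the WHERE clause
#
# Safer equivalent: a parameterized query keeps input out of the SQL
# text, so the payload is treated as a literal name.
rows = conn.execute(
    "SELECT role FROM users WHERE name = ?", (user_input,)
).fetchall()
# The payload matches no user, so the query returns no rows.
```

The insecure version would return every row because `'1'='1'` is always true; the parameterized version compares the whole payload as a single string and matches nothing.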

To approximate the behavior of well-established static code analysis tools, GitHub combines Large Language Models (LLMs) with the Codex model and the powerful compute resources allocated to Copilot. This approach promises to be fast and to detect vulnerable patterns in incomplete code fragments, a capability that traditional analysis tools lack. It helps prevent insecure code at the earliest stages of development (and saves the QA team some headaches).

Conclusion

GitHub Copilot's upgraded Codex model and new security vulnerability filter represent a significant step forward in the realm of AI-assisted code completion. With its improved precision and responsiveness, Copilot is generating a higher percentage of developers' code across multiple programming languages, while the new AI-based vulnerability filter helps prevent insecure code from entering the codebase. These technical improvements promise to streamline the development process, improve code quality, and reduce vulnerabilities, ultimately benefiting software development teams and their end-users. As with any new technology, there may be some challenges and concerns to address, but the potential benefits of GitHub Copilot's latest upgrades make it a compelling tool for developers to consider.
