GPTZero: An App to Detect AI Authorship

In an era where academic misconduct and cheating are increasingly rampant, educators who are understandably concerned about the impact of large AI models like the text-generating ChatGPT now have an AI-detection tool of their own.
Edward Tian, a 22-year-old undergraduate studying computer science and journalism at Princeton University, is the creator of GPTZero, a free and publicly available tool that can detect whether a piece of text was written by a human or a machine.
Tian wrote the code behind the app earlier in January, after going home to Toronto, Canada, during the holiday break. He initially believed that only a few dozen people would try the tool. But GPTZero has since received considerable attention from major news outlets, and the app’s website has garnered over 7 million visits since its launch.
Tian, who is currently doing his thesis research on AI detection, wanted to create something that would help a wide range of people, such as teachers, college admissions administrators, or someone reading an online article, figure out whether the text in front of them was written by a machine.
“GPTZero is about preserving the qualities that make us uniquely human,” as Tian explained on the GPTZero Substack. “In a world where AI text generation technology is rampant, we need to build the safeguards so that these technologies are adopted responsibly. This means knowing when and where AI is being applied — not to prevent ChatGPT from being used, but to prevent AI from being abused by bad actors — think bots, misinformation, election interference campaigns, trust and safety violations, differential access to generative AI among low-income students, fake news, and articles being misrepresented as human.”
The appearance of AI-detection tools like GPTZero comes at a time when AI is making controversial waves in almost every field, whether that’s academia, tech or the art world. Models like ChatGPT, a conversational language model developed by OpenAI, are trained on massive amounts of text scraped from the Internet. ChatGPT is built on top of the company’s family of powerful GPT-3 models; GPT-3 was pegged at 175 billion parameters, an unprecedented number at the time of its initial release in 2020.
Perplexity and ‘Burstiness’
GPTZero works by analyzing text for certain indicators, such as what Tian calls “perplexity” and “burstiness.” Perplexity is a measure of how unpredictable, or random, a text appears to a language model. If GPTZero is “perplexed” by the text it is evaluating, the text is less predictable and therefore more likely to have been written by a human. Burstiness refers to how much sentence structure varies across a text; according to Tian, human writing tends to vary more than machine-generated writing.
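To make the idea of perplexity concrete, here is a minimal sketch in Python. It is not GPTZero's code: the article does not say which model the app uses, so this example assumes a toy word-frequency (unigram) model with add-one smoothing, and the names `train_unigram` and `perplexity` are hypothetical. Perplexity is computed in the standard way, as the exponential of the average negative log-probability per word.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_unigram(corpus):
    """Count word frequencies in a reference corpus (assumed toy model, not GPTZero's)."""
    return Counter(tokenize(corpus))


def perplexity(text, counts):
    """Perplexity of `text` under an add-one-smoothed unigram model.

    Higher values mean the model found the text harder to predict.
    """
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text must contain at least one token")
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words
    log_prob = 0.0
    for tok in tokens:
        # Add-one smoothing keeps unseen words from having zero probability.
        log_prob += math.log((counts[tok] + 1) / (total + vocab))
    return math.exp(-log_prob / len(tokens))


reference = "the cat sat on the mat the dog sat on the rug"
model = train_unigram(reference)
familiar = perplexity("the cat sat on the mat", model)
unfamiliar = perplexity("quantum zebras juggle marmalade", model)
print(familiar < unfamiliar)
```

Text that resembles what the model has seen scores lower than text full of unfamiliar words. A real detector would presumably rely on a large neural language model rather than word counts, but the calculation has the same shape.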
“Burstiness, in a sense, is variance in writing — you can think of it as human creativity, and because of our short-term memory, we have sudden bursts and creativity and differences in our writing, while machines have pretty ubiquitous and constant writing over time, especially if these machines are as powerful as ChatGPT,” as Tian explained in an interview with Yahoo! Finance Live.
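The "variance in writing" that Tian describes can be approximated in a few lines of Python. This is only a rough proxy, not GPTZero's method: the article does not specify how the app measures burstiness, so the sketch assumes it can be stood in for by the standard deviation of sentence lengths in words. The function names `split_sentences` and `burstiness` are hypothetical.

```python
import re
import statistics


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?' (a crude sentence splitter)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def burstiness(text):
    """Population standard deviation of sentence lengths, measured in words.

    Assumed proxy: 0.0 means every sentence has the same length;
    larger values mean more variation from sentence to sentence.
    """
    lengths = [len(sentence.split()) for sentence in split_sentences(text)]
    if len(lengths) < 2:
        return 0.0  # variation is undefined for fewer than two sentences
    return statistics.pstdev(lengths)


uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = "Stop. I never expected the old bridge to hold that much weight in a storm. It did."
print(burstiness(uniform), burstiness(varied))
```

Under this proxy, the three equal-length sentences score zero, while the passage that mixes very short and long sentences scores much higher.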
Tian is now working to improve the app’s accuracy. He is also in discussions with various school boards and scholarship programs about using a newer, institutional version of the model, GPTZeroX. Even so, he still plans to keep the online copy-and-paste version of the app free for public use.
While Tian’s development of GPTZero stems from his interest in the responsible use of AI, he asserts that he doesn’t want AI to be banned indiscriminately. Rather, he hopes that AI adoption can be shaped in a way that promotes equity and transparency, especially as AI will likely become more ubiquitous in the future.
“These technologies are here. And I absolutely believe AI is here to stay. It is the future. And students shouldn’t be taken away [from] the opportunity to interact. It should be adopted responsibly, [because] soon [using] ChatGPT will cost money. So it’s no longer just a responsible AI issue, but also an equity issue because students in low-income neighborhoods might just never have access to this technology when students in higher-income neighborhoods might be paying for it. But if there’s a blanket ban, then we actually don’t know where it’s being used as well.”