
Here’s how ChatGPT could solve its major plagiarism problem

ChatGPT is a wonderful tool, but there’s a dark side to this advanced AI service that can write like an expert on almost any topic — plagiarism. When students who are supposed to be demonstrating their knowledge and understanding of a topic cheat by secretly using ChatGPT, it invalidates testing and grading. AI skills are valuable, but they aren’t the only thing students should learn.

Policing this problem has proven difficult. Since ChatGPT has been trained on a vast dataset of human writing, it’s nearly impossible for an instructor to tell whether an essay was written by a student or a machine. Several tools that attempt to recognize AI-generated writing have been created, but their accuracy has been too low to be useful.

Amid rising concerns from educators and bans on students using ChatGPT, Business Insider reports that OpenAI is working on a solution to this problem. A recent tweet from Tom Goldstein, an associate professor of machine learning at the University of Maryland, explained how accurate such a system might be at detecting watermarked text written by ChatGPT.

#OpenAI is planning to stop #ChatGPT users from making social media bots and cheating on homework by "watermarking" outputs. How well could this really work? Here's just 23 words from a 1.3B parameter watermarked LLM. We detected it with 99.999999999994% confidence. Here's how 🧵 pic.twitter.com/pVC9M3qPyQ

— Tom Goldstein (@tomgoldsteincs) January 25, 2023

Any tool that could identify AI-generated text with near-perfect accuracy would settle this discussion quickly and alleviate those concerns. According to Goldstein, one solution is to make the large language model (LLM) pick from a limited vocabulary: a whitelist of words the AI is allowed to use and a blacklist of words it must avoid. If an unnaturally large share of whitelist words shows up in a writing sample, that suggests it was generated by the AI.
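The detection step described above is a simple statistical test: count how many words in a sample land on the whitelist, then ask how unlikely that count would be in ordinary human writing. Here is a minimal sketch in Python with a made-up vocabulary and a fixed toy whitelist — the names and numbers are illustrative assumptions, and a real scheme would reseed the whitelist pseudorandomly for each token rather than fix it once.

```python
import math

def whitelist_test(tokens, whitelist, gamma=0.5):
    """Count whitelist hits and compute the probability that ordinary,
    unwatermarked text (where each word lands on the whitelist with
    probability gamma) would contain at least that many hits by chance."""
    hits = sum(1 for t in tokens if t in whitelist)
    n = len(tokens)
    # Binomial tail probability: P(X >= hits) for X ~ Binomial(n, gamma)
    p_value = sum(math.comb(n, k) * gamma**k * (1 - gamma)**(n - k)
                  for k in range(hits, n + 1))
    return hits, p_value

# Toy setup: a 1,000-word vocabulary whose even-indexed words form the
# whitelist, so about half of all words (gamma = 0.5) are "allowed."
vocab = [f"w{i}" for i in range(1000)]
whitelist = {w for i, w in enumerate(vocab) if i % 2 == 0}

watermarked = [f"w{2 * i}" for i in range(23)]  # all 23 words whitelisted
ordinary = [f"w{i}" for i in range(23)]         # roughly half whitelisted

hits_w, p_w = whitelist_test(watermarked, whitelist)
hits_o, p_o = whitelist_test(ordinary, whitelist)
print(f"watermarked: {hits_w}/23 hits, p = {p_w:.2e}")
print(f"ordinary:    {hits_o}/23 hits, p = {p_o:.2f}")
```

Even this crude version shows the idea behind Goldstein’s 23-word example: a passage where every word happens to be whitelisted has a chance probability of only 0.5²³, or about one in eight million, while ordinary text sits near a coin-flip p-value and raises no alarm.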

This simplistic approach would be too restrictive on its own: working one word at a time, as most LLMs do, it’s hard to predict which words a discussion will need. Goldstein suggests that ChatGPT could instead be given the ability to look ahead further than one word, so it can plan a sentence that is filled with whitelisted words while still making sense.
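The lookahead idea can be sketched as scoring whole candidate continuations rather than single words: generate several multi-word options, then prefer the one that packs in the most whitelisted words while remaining a sensible phrase. Everything below is a hypothetical toy — the word list, candidate phrases, and scoring rule are stand-ins, not how OpenAI's system works.

```python
def pick_continuation(candidates, whitelist):
    """Score each candidate multi-word continuation by the fraction of
    its words that are whitelisted, and pick the best-scoring one — a
    toy stand-in for giving the model lookahead beyond a single token."""
    def score(phrase):
        words = phrase.split()
        return sum(w in whitelist for w in words) / len(words)
    return max(candidates, key=score)

# Hypothetical whitelist and candidate continuations
whitelist = {"quick", "brown", "fox", "jumps"}
options = [
    "the quick brown fox",   # 3 of 4 words whitelisted
    "a lazy sleeping dog",   # 0 of 4
    "one quick grey fox",    # 2 of 4
]
best = pick_continuation(options, whitelist)
print(best)  # prints "the quick brown fox"
```

Because each candidate is judged as a whole phrase, the model can "spend" a non-whitelisted word like "the" where grammar demands it and still end up with a detectably high whitelist fraction overall.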

ChatGPT made a big splash when it entered the community writing pool and can be a great teaching aide as well. It’s important to introduce artificial intelligence in schools since it will clearly be an important technology to understand in the future, but it will continue to be controversial until the issue of plagiarism is addressed.

Alan Truly
Computing Writer
Alan is a Computing Writer living in Nova Scotia, Canada. A tech-enthusiast since his youth, Alan stays current on what is…