Series "Copyright and AI": ChatGPT and GPT detectors

GPT detector

For many years, the fair use principle has been the main legal foundation for building and extending the training databases of ChatGPT and other neural networks, at least from the perspective of the companies developing these tools. From the perspective of users of AI-assisted writing tools, fair use serves instead as a legal defence against claims of copyright infringement for using copyrighted works without permission. Many of these users are not only researchers but also students who rely on AI writing tools in their work. To mitigate the risks associated with AI-generated content (fake content, cheating in exams, plagiarism, and others), many companies and organisations have developed and deployed GPT detectors, which, however, show bias against non-native English speakers.

Copyright and ChatGPT

Despite the booming effect of ChatGPT, legislators have not been particularly active in providing a legal framework for these practices. They appear to be waiting for judicial interpretation of existing legislation in the new context of AI-assisted writing tools and AI-generated works. According to recent research (B. Tomlinson, A. W. Torrance, R. W. Black, 2023), the definition of "derivative work" likely cannot cover text generated by ChatGPT. The research also highlighted unsettled questions about the integration of copyrighted works into ChatGPT's training dataset and about the authorship of AI-generated text. Nevertheless, the authors revisited the "fair use" doctrine and used it as an argument for employing AI-assisted writing tools.

ChatGPT and Works Scholarly: Best Practices and Legal Pitfalls in Writing with AI (arxiv.org)

GPT detectors’ failure

The abuse of AI-generated content may lead to misrepresentation, exaggeration, data forgery, and plagiarism at schools and academic institutions. As a result, GPT detectors have become promising tools for distinguishing authentic human works from GPT-generated ones. However, another recent study pointed out that these GPT detectors exhibit bias against non-native English speakers. According to a group of Stanford researchers, "over half (61.3%) of non-native writing samples were misclassified as AI generated", in contrast to the high accuracy on native samples, due to the limited linguistic variability and low perplexity of non-native English writers' texts.

GPT detectors are biased against non-native English writers: Patterns (cell.com)

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature, arXiv:2301.11305 (arxiv.org)
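Perplexity, the signal mentioned above, measures how "surprised" a language model is by a text: predictable, formulaic writing scores low, which is why non-native prose can resemble model output to a detector. Real detectors compute it with a full neural language model; as a toy illustration only (not any detector's actual implementation), here is a sketch using a smoothed unigram model:

```python
import math
from collections import Counter

def unigram_perplexity(train_text: str, test_text: str) -> float:
    """Perplexity of test_text under a unigram model fitted on train_text,
    using add-one (Laplace) smoothing so unseen words get nonzero probability."""
    train_tokens = train_text.lower().split()
    test_tokens = test_text.lower().split()
    counts = Counter(train_tokens)
    vocab_size = len(counts) + 1  # +1 slot for unseen words
    total = len(train_tokens)

    # Average negative log-probability of the test tokens, then exponentiate.
    log_prob = 0.0
    for tok in test_tokens:
        p = (counts.get(tok, 0) + 1) / (total + vocab_size)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(test_tokens))

corpus = "the cat sat on the mat the dog sat on the rug"
# Text made of frequent corpus words is less "surprising" (lower perplexity)
# than text the model has never seen.
print(unigram_perplexity(corpus, "the cat sat"))
print(unigram_perplexity(corpus, "quantum flux capacitor"))
```

A detector built on this idea would flag texts whose perplexity falls below some threshold as likely machine-generated, which is exactly why writers with a smaller, more predictable vocabulary are disproportionately flagged.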

Conclusion

This is a striking example of the paradox of using GPT and other AI-assisted writing tools (of which there are many; Google Docs, for example, has integrated AI features to improve its users' experience). These tools help non-native English speakers enhance their language skills, improve their communication, and gain greater accessibility. Unfortunately, if non-native English users rely on these tools and absorb grammatical structures typical of GPT models, their writing may come to match the very patterns GPT detectors flag, which will only worsen the detectors' bias against non-native English speakers. Amongst the proposed good practices, academic and professional integrity, transparency, and the ethical use of AI-assisted writing tools are likely the most essential.

Sources: 

ChatGPT and Works Scholarly: Best Practices and Legal Pitfalls in Writing with AI (arxiv.org)

GPT detectors are biased against non-native English writers: Patterns (cell.com)

DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature, arXiv:2301.11305 (arxiv.org)

More news about AI on AstraIA Gear: https://www.astraiagear.com/category/ai/. For more short news, follow our LinkedIn page, and for further discussion, connect with me on LinkedIn. Cheers!
