Sunday, September 15, 2024

Fake it till you make it - how good are AI detectors really?

The fact is: you have to make it clear when something was created by AI. But who is supposed to check that, and how?

It's the same age-old game as counterfeiting banknotes, counterfeiting products and so on. As soon as one side reaches a new level, the other side has to follow suit or at least pretend to have caught up.

In the case of AI, the EU AI Act, or in Germany the KI-Verordnung, sets out the legal framework. Article 50 states, quote: "...shall disclose that the content has been artificially generated or manipulated" (https://www.euaiact.com/article/50).

ATTENTION: This is in no way legal advice!

If the person publishing the content does not disclose this, however, numerous app providers now advertise that they can detect it for you:


How do these checking apps work?

AI checkers identify various characteristics, including recurring phrases, consistent sentence structures and the absence of personal aspects in the texts. By examining these patterns, an AI detector can recognize whether content was created by humans or by an AI.
Research currently distinguishes three approaches:
  • Machine learning: Classifiers learn from sample texts, but need to be trained for many text types, which is expensive.
  • Digital watermarks: Invisible watermarks in text that can be recognized by algorithms. Providers of AI tools would have to insert these watermarks, which is very unlikely.
  • Statistical parameters: Requires access to the probability values of the texts, which is difficult without API access (a sketch of this approach follows this list).
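
To make the "statistical parameters" idea more concrete, here is a minimal sketch of a perplexity-based check. It is my own illustration, not one of the tools discussed in this article, and it assumes the Hugging Face transformers and torch packages with GPT-2 purely as a stand-in scoring model:

```python
# Minimal sketch of the "statistical parameters" approach: score a text by the
# perplexity a reference language model assigns to it. Low perplexity (the model
# finds the text very predictable) is often treated as a weak hint of AI generation.
# Assumes the Hugging Face transformers + torch packages; GPT-2 is only a stand-in.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average per-token perplexity of `text` under the reference model."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)  # cross-entropy loss
    return float(torch.exp(out.loss))

sample = "The solar system consists of the Sun and the objects that orbit it."
print(f"Perplexity: {perplexity(sample):.1f}")  # lower = more 'predictable' text
```

Such a score is only a weak signal: a reference model can also find well-written human text highly predictable, and the signal ages quickly as models improve.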
IT journalist Melissa Heikkilä says: “The enormous speed of development in this sector means that any methods of recognizing AI-generated texts will look very old very quickly.” Source: https://www.heise.de/hintergrund/Wie-man-KI-generierte-Texte-erkennen-kann-7434812.html

OpenAI released the AI Text Classifier in early 2023 to recognize AI-generated texts. However, the recognition rate was very low at just 26%, and 9% of human texts were incorrectly classified as AI texts. Due to this insufficient accuracy, OpenAI removed the tool from the market in mid-2023. A new version of the tool is not yet available. This example shows that we should not have too high expectations of AI recognition tools.
In general, anyone can use the following aspects to decide for themselves whether a text was created by a human or by an AI:
  • AI-generated texts are often not very original or varied and contain many repetitions, while human authors vary more when writing (a rough check for this is sketched after this list).
  • A style with many keywords strung together could also indicate an AI as the author.
  • AI tools often make mistakes with acronyms, technical terms and conjunctions.
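
As a rough illustration of the repetition aspect mentioned above (my own sketch, not a method from any of the tools discussed here), the following snippet measures vocabulary variety and verbatim phrase repeats:

```python
# Rough heuristic sketch: measure lexical variety and repeated phrases.
# Human text tends to score higher on variety; heavy verbatim repetition
# can hint at generated text. This is an illustration, not a real detector.
import re
from collections import Counter

def repetition_stats(text: str, n: int = 3) -> dict:
    words = re.findall(r"[a-zäöüß]+", text.lower())
    ngrams = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    repeated = sum(c - 1 for c in Counter(ngrams).values() if c > 1)
    return {
        "type_token_ratio": len(set(words)) / max(len(words), 1),  # vocabulary variety
        "repeated_trigrams": repeated,                              # verbatim repeats
    }

print(repetition_stats("The solar system is large. The solar system is old."))
```

Heuristics like these are obviously easy to fool, which fits the test results in the next section.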

How well do the AI detectors work?

There are now quite a few of these apps; everything from free to commercially financed to paid is available. Here are two overviews:
I did my tests with noplagiat. The tool is rated as good to very good and correctly recognized texts that I had Microsoft Copilot in Edge create for this article.

Test 1:
My prompt: “List the planets in our solar system. Tell which is the largest and which is the smallest. Name the largest moons. How old is our solar system? Explain how our solar system was formed. Formulate the answer as a scientific essay.”

Result and rating:


Test 2:
BUT a simple and small adjustment to the prompt caused noplagiat to stumble:
PROMPT: "List the planets in our solar system. Tell which is the largest and which is the smallest. Name the largest moons. How old is our solar system? Explain how our solar system was formed. Formulate the answer as a scientific essay. Use the writing style of Stephen King."

Result and rating:


This small addition to the prompt alone reduced the detection rate from 47% to 6%. Admittedly, you might not want to write an essay about our solar system in the style of Stephen King, but the example clearly shows where the weaknesses of the current solutions lie.

Test 3:
It also correctly recognized texts that were 100% not created by an AI. This is the abstract of my new book, which is just about to be finalized:

The next version of AI apps

What sets the next generation of AI apps apart is that they enrich the AI-generated text with additional content, review the text and add passages and facts from other sources. With these tools you do not receive the answer to your prompt immediately; it can sometimes take a few hours, with Studytexter.de, for example, up to 4 hours. (A conceptual sketch of this workflow follows the list of examples below.)
Here are three examples of such solutions:
  • https://studytexter.de/: Quote from the homepage - Your entire term paper at the touch of a button in under 4 hours. Innovative AI text synthesis - especially for German academic papers. 1000x better than ChatGPT.
  • https://neuroflash.com/de/: Quote from the homepage - Strengthen your marketing with personalized AI content. The all-in-one solution for brand-compliant content with AI, from ideation to content creation and optimization. neuroflash helps marketing teams save time, ensure a consistent message and improve creative processes.
  • https://thezimmwriter.com/: ZimmWriter is the world's first AI content writing software for Microsoft Windows. It allows you to use the AI provided by OpenAI directly on your desktop! -> Note: The app advertises 10 features for entering input and many details that are supposed to make the generated text unique.
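
Conceptually, these tools draft a text first and then enrich and review it with material from other sources before returning a result, which is part of why the answer takes longer. The following sketch is only my own illustration of that multi-step idea; generate(), enrich() and review() are hypothetical placeholders, not the API of any of the tools listed above:

```python
# Conceptual sketch of a "draft, then enrich, then review" pipeline.
# All functions are hypothetical placeholders for illustration only.
from dataclasses import dataclass

@dataclass
class Source:
    title: str
    excerpt: str

def generate(prompt: str) -> str:
    # Placeholder: in a real tool this would call an LLM API.
    return f"[draft text for: {prompt}]"

def enrich(draft: str, sources: list[Source]) -> str:
    # Add facts/passages from external sources and note where they came from.
    additions = "\n".join(f"- {s.excerpt} (Source: {s.title})" for s in sources)
    return f"{draft}\n\nSupporting material:\n{additions}"

def review(text: str) -> str:
    # Placeholder for a second pass (fact check, style check, plagiarism check).
    return text

sources = [Source("Example paper", "The solar system formed about 4.6 billion years ago.")]
print(review(enrich(generate("Essay on the solar system"), sources)))
```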

Conclusion

Considering that humans are well advised to check the output and adapt or reformulate it where necessary, it is currently not possible to reliably verify whether a text was created by an AI or not.


