Am I missing something? The article seems to suggest it works via hidden text characters. Has OpenAI never heard of pasting text into a utf8 notepad before?
They can cycle through some candidate biases (dozens?) and test them all. Tokenizing the text and checking each one is super cheap to run; it's not AI or anything.
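Something like this rough sketch is what I have in mind; the seed list, the hash-based membership rule, and the threshold are all my own assumptions, not anything OpenAI has published:

```python
import hashlib

def bias_score(token_ids, seed):
    # Fraction of adjacent token pairs that land in the "biased" class
    # implied by a candidate seed. Unwatermarked text should hover near 0.5;
    # text generated with a matching bias should skew noticeably higher.
    hits = 0
    for prev, cur in zip(token_ids, token_ids[1:]):
        digest = hashlib.sha256(f"{seed}:{prev}:{cur}".encode()).digest()
        hits += digest[0] % 2 == 0  # hypothetical membership rule
    return hits / max(len(token_ids) - 1, 1)

def detect(token_ids, candidate_seeds, threshold=0.57):
    # Cycle through the dozens of candidate biases and test them all.
    return [s for s in candidate_seeds if bias_score(token_ids, s) > threshold]
```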
I’m trying to think of a good analogy for how this would work, and I kinda came up with one. It would be like an image encoder that biases itself toward encoding RGB values (0-255) as even numbers. Subtly, say 30% odd, 70% even.
That’s totally imperceptible to humans. And even a “small” crop of the image would carry this bias if pasted into a larger image verbatim, since the sample size is still large (just as the sample size for a bunch of text tokens is pretty big).
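To make the analogy concrete, here’s a minimal sketch of how you’d spot that parity skew in even a small crop; the 70/30 figure is just the made-up bias from above, and it’s only a binomial test:

```python
import numpy as np
from scipy.stats import binomtest

def parity_bias(crop):
    # crop: any uint8 image region, shape (H, W, 3).
    # Returns the observed even-value fraction and a p-value against the
    # unbiased 50/50 expectation.
    values = np.asarray(crop, dtype=np.uint8).ravel()
    even = int(np.count_nonzero(values % 2 == 0))
    test = binomtest(even, n=values.size, p=0.5)
    return even / values.size, test.pvalue

# Even a 32x32 crop gives 3072 channel samples, so a 70% even bias is
# statistically unmistakable while remaining invisible to the eye.
```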
And I’m not saying it’s foolproof… but if that’s indeed what they’re doing, I think it’s a decent way to detect “lazy” OpenAI abusers who aren’t working very hard to scramble and defeat it.