If we can work out which data conduits are patrolled more often by AI than by humans, we could intentionally flood those channels with AI content and push model collapse along even further. Get the AI companies to not only vet for “true human content”, but also pay licensing fees for the use of that content. And then, hopefully, give the fuck up on their whole endeavor.
This sounds like AI is literally biting its own tail
ChatGPT, what is an ouroboros?
Of course! An ChatGPT is an ouroboros, ChatGPT what is an ouroboros.
Oh no
Anyway
is it not relatively trivial to pre-vet content before they train on it? at least with AI-generated text it should be.
It depends on what you are looking for. Identifying AI-generated data is generally hard, though it can be done in specific cases. There is no mathematical difference between the 1s and 0s that encode AI-generated data and any other data. Which is why these model collapse ideas are just fantasy. There is nothing magical about any data that makes it “poisonous” to AI. The kernel of truth behind these ideas is not likely to matter in practice.
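For a sense of why it’s hard: the usual heuristic is to score text with a language model and flag anything suspiciously predictable. A minimal sketch of that idea (my own, assuming the Hugging Face transformers library with GPT-2 as the scorer; the threshold is made up):

```python
# Minimal sketch of the perplexity heuristic some detectors use.
# Assumes `pip install torch transformers`; GPT-2 is just an example scorer.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Score how 'expected' the text is under the language model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Using the tokens as their own labels yields the average
        # negative log-likelihood; exp() turns that into perplexity.
        loss = model(enc.input_ids, labels=enc.input_ids).loss
    return float(torch.exp(loss))

# Hypothetical cutoff: "too predictable" gets flagged as machine text.
# Plain human prose trips it, and high-temperature sampling slips past
# it, which is exactly why this is a guess and not a proof.
AI_SUSPECT_THRESHOLD = 25.0
text = "The quick brown fox jumps over the lazy dog."
print(perplexity(text) < AI_SUSPECT_THRESHOLD)
```

It’s a statistical guess about style, not a property of the bits, so both false positives and false negatives are baked in.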
The problem is these AI companies currently exist on the business model of not paying for information, and that generally includes not wanting to pay content curators.
Google is probably the only one in a position to potentially outsource it by making everyone solve a “does this hand look normal to you” CAPTCHA
They can try and train AI to detect AI, but that’s also difficult.
So it’s not a problem with AI. It’s just a problem for some mayfly companies that try to profit from the latest trend?
As always.
The model isn’t dying, it’s the way these parasites want it to work that is dying.
remember how NFTs fell off (due to how they lost their value)? I have a theory that AIs will come to the same fate, because they can’t keep training (at least according to the article?)
Model collapse is just a euphemism for “we ran out of stuff to steal”
or “we’ve hit a limit on what our new toy can do and here’s our excuse why it won’t get any better and AGI will never happen”
It’s more “we are so focused on stealing and eating content, we’re accidentally eating the content we or other AIs made, which is basically like incest for AI, and they’re all inbred to the point they don’t even know people have more than two thumb-shaped fingers anymore.”
All this news makes me want to live to see the time when our world is interesting again. Real AI research, something new instead of the Web we have, something new instead of the governments we have. It’s just that I’m scared of what’s between now and then. Parasites die hard.
I’ve been assuming this was going to happen since it’s been haphazardly implemented across the web. Are people just now realizing it?
People are just now acknowledging it. Execs tend to have a disdain for the minutiae. They’re like kids that only want to do the exciting bits. As a result things get fucked because they don’t really understand what they’re doing. As Muskrat would say “move fast and break things.” It’s a terrible mindset.
“Move Fast and Break Things” is Zuckerberg’s/Facebook’s motto, not Musk’s, just to note.
Oh, I stand corrected
It is very much the motto this idiot lives by. He just wasn’t the first to coin that phrase.
No, researchers in the field knew about this potential problem ages ago. It’s easy enough to work around and prevent.
People who are just on the lookout for the latest “aha, AI bad!” headline, on the other hand, discover this every couple of months.
No it doesn’t.
All this doomer stuff is contradicted by how fast the models are improving.
Usually we get an AI winter, until somebody develops a model that can overcome that limitation of needing more and more data. In this case, for example, by having some basic understanding instead of just being a regurgitation engine. Of course, that model then runs into the limit of only having basic understanding, not advanced understanding, and again there is an AI winter.
Have you seen the newest model from OpenAI? They managed to get some logic into the system, so that it is now better at math and programming 😄 It is called “o1” and comes in 3 sizes, where the largest is not released yet.
The downside is that generating answers takes more time again.
So AI:
- Scraped the entire internet without consent
- Trained on it
- Polluted it with AI generated rubbish
- Trained on that rubbish without consent
- Are now in need of a lobotomy
Anyone who has made copies of videotapes knows what happens to the quality of each successive copy. You’re not making a “treasure trove.” You’re making trash.
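The tape analogy maps onto the math pretty directly. A toy illustration of my own (plain numpy, not anything from the article): fit a Gaussian to some data, sample from the fit, fit the samples again, and watch the copies degrade:

```python
# Toy generational-copy experiment: each "model" is just a Gaussian
# fitted to samples drawn from the previous generation's output.
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human" data, mean 0, standard deviation 1.
data = rng.normal(loc=0.0, scale=1.0, size=50)

for gen in range(25):
    mu, sigma = data.mean(), data.std()
    if gen % 5 == 0:
        print(f"gen {gen:2d}: mean={mu:+.3f}, std={sigma:.3f}")
    # The next generation trains only on the previous generation's output.
    data = rng.normal(loc=mu, scale=sigma, size=50)

# On average the mean wanders and the std drifts downward: every fit
# loses a little tail information, like every tape copy losing signal.
# Real model collapse is the same effect in a vastly bigger space.
```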
It’s like a human centipede where only the first person is a human and everyone else is an AI. It’s all shit, but it gets a bit worse every step.
Ah, the Hapsburg of AI!
Oh, the artificial humanity!
Are you confusing the Habsburg Dynasty with the Hindenburg?
No, I just thought they were vaguely similar enough words to make a dumb internet joke.
You’re right, that’s a good dumb internet joke. I’m just being needlessly pedantic today.
I see your needless pedantry and raise you abrasive grammarian.
Perhapsburg they are
If only the generated output also looked more and more the way inbred humans do.
Like insane rambling from LLMs, and the humans generated by AI having various developmental disorders and the Habsburg jaw.
I like to think of it like Mad Cow or Kuru: you can’t eat your own species’s brains or you could get a super lethal, contagious prion disease.
Prion diseases aren’t contagious.
Edit: for the uninformed people that downvoted, it’s clearly spelled out here: https://www.merckmanuals.com/professional/neurologic-disorders/prion-diseases/overview-of-prion-diseases
You can acquire it through direct contact, i.e. consuming prion-disease-contaminated meat. What would you call it?
Also, that’s not what direct contact means when discussing contagion:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7150340/
Ingestion is not ‘direct contact’.
Contagious means you can get it from direct or indirect contact with another person or organism that is infected. Not from eating them.
That is not possible with prion disease.
Ingesting a Petri dish full of flu virus doesn’t make the Petri dish ‘contagious’.
oh no are we gonna have to appreciate the art of human beings? ew. what if they want compensation‽
So they made garbage AI content, without any filtering for errors, and fed that garbage to the new model, which turned out to produce more garbage. Incredible discovery!
Indeed. They discovered that:
shit in = shit out.
A fifty-year-old maxim, to be clear. They “just now” “found that out”.
Biggest. Scam. Evar.
Who just found that out?
Yeah, in practice feeding AI its own outputs is totally fine as long as it’s only the outputs that are approved by users.
I don’t know if assuming that training data isn’t going to be more and more poisoned by unapproved AI output from this point on counts as “in practice”
I would expect some kind of small artifacting to get reinforced in the process, if the approved output images aren’t perfect.
Only up to the point where humans notice it. It’ll make AI images easier to detect, but they’ll still look pretty to humans. Probably a win-win.
Didn’t think of that, good point.
The inbreeding could also affect larger decisions in sneaky ways, like how it wants to compose the image. It would be bad if the generator started to exaggerate and repeat some weird AI tropes.
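That ratchet is easy to sketch. A toy simulation of my own (the drift rate and threshold are made up, not from the thread): each retraining round nudges an artifact upward, and approval only rejects what humans actually notice:

```python
# Toy sketch of artifact reinforcement under human approval.
# Assumption (mine, not the thread's): retraining on approved output
# adds a tiny systematic artifact each round.
import random

random.seed(1)

NOTICE_THRESHOLD = 1.0   # above this, humans reject the image
ARTIFACT_DRIFT = 0.05    # small bias added by each retraining round

artifact = 0.0
for gen in range(1, 31):
    # Each round nudges the artifact upward, plus a little noise.
    candidate = artifact + ARTIFACT_DRIFT + random.gauss(0, 0.02)
    # Human approval only rejects artifacts past the noticing threshold,
    # so the level ratchets up until it sits just under the line.
    if candidate < NOTICE_THRESHOLD:
        artifact = candidate
    print(f"gen {gen:2d}: artifact level {artifact:.3f}")

# The artifact climbs until it hovers just below what humans notice:
# easy for a detector to key on, still "pretty" to us.
```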