• Fubarberry
    6 points · 2 months ago

    I asked mistral/brave AI and got this response:

    How Many Rs in Strawberry

    The word “strawberry” contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.
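
    To make the tokenization point concrete, here is a minimal sketch, assuming Python with the tiktoken library and OpenAI’s cl100k_base encoding (other models use different tokenizers, so the exact split will vary). The point is that the model receives integer token IDs for sub-word chunks, not individual letters:

        # Minimal sketch: how a tokenizer splits "strawberry" into sub-word chunks.
        # Assumes the tiktoken library (pip install tiktoken); other tokenizers differ.
        import tiktoken

        enc = tiktoken.get_encoding("cl100k_base")

        token_ids = enc.encode("strawberry")
        chunks = [enc.decode([t]) for t in token_ids]

        print(token_ids)  # integer IDs -- this is what the model actually sees
        print(chunks)     # the sub-word pieces those IDs stand for, not single letters

    Because the model only ever sees those IDs, “how many r’s” asks about structure it never directly observes.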

    • @jj4211@lemmy.world
      2 points · 2 months ago

      Yes, at some point the meme becomes the training data and the LLM doesn’t need to answer because it sees the answer all over the damn place.

  • @otp@sh.itjust.works
    1 point · 2 months ago

    From a linguistic perspective, this is why I am impressed by (or at least, astonished by) LLMs!

  • @Grabthar@lemmy.world
    16 points · 2 months ago

    Doc: That’s an interesting name, Mr…

    Fletch: Babar.

    Doc: Is that with one B or two?

    Fletch: One. B-A-B-A-R.

    Doc: That’s two.

    Fletch: Yeah, but not right next to each other, that’s what I thought you meant.

    Doc: Isn’t there a children’s book about an elephant named Babar?

    Fletch: Ha, ha, ha. I wouldn’t know. I don’t have any.

    Doc: No children?

    Fletch: No elephant books.

  • @daniskarma@lemmy.dbzer0.com
    21 points · edited · 2 months ago

    That happens when you don’t understand what an LLM is, or what its use cases are.

    This is like not being impressed by a calculator because it cannot give a word synonym.

    • @Strykker@programming.dev
      5 points · 2 months ago

      But everyone selling LLMs sells them as being able to solve any problem, making it hard to know when they’re going to fail and give you junk.

    • xigoi
      5 points · 2 months ago

      Sure, maybe it’s not capable of producing the correct answer, which is fine. But it should say “As an LLM, I cannot answer questions like this” instead of just making up an answer.

      • @daniskarma@lemmy.dbzer0.com
        6 points · 2 months ago

        I have thought a lot about it. The LLM per se would not know whether a question is answerable, as it doesn’t know whether its own output is good or bad.

        So there are various approaches to this issue:

        1. The classic approach, and the one used for censoring: keywords. When the LLM receives a certain keyword (or can derive one by digesting the text input), it gives back a hard-coded answer. Problem: censoring cases are limited, but hard-to-answer questions are unlimited, so it’s hard to hard-code them all.

        2. Self-checking answers. For every question, the LLM could process it 10 times with different seeds, then analyze the results and see whether they are equivalent. If they are not, it would just answer that it’s unsure about the answer (a minimal sketch of this idea follows below). Problem: multiplied resource usage. And for some questions, like the one in the post, it’s possible that the multiple randomized answers give equivalent results anyway, so it would still have a decent failure rate.
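
        A minimal sketch of approach 2, assuming a hypothetical ask_llm(prompt, seed) helper that wraps whatever model API is in use; the helper name, the normalization step, and the 80% agreement threshold are illustrative choices, not part of any particular library:

            # Self-check sketch: ask the same question several times and only answer
            # when the sampled responses agree; otherwise admit uncertainty.
            from collections import Counter

            def ask_llm(prompt: str, seed: int) -> str:
                """Hypothetical wrapper around an LLM API call; returns the answer text."""
                raise NotImplementedError("plug in the model/API of your choice")

            def self_checked_answer(prompt: str, n_samples: int = 10) -> str:
                # Sample the model several times with different seeds (or temperatures).
                answers = [ask_llm(prompt, seed=i) for i in range(n_samples)]

                # Crude equivalence check: normalize and look for a clear majority.
                counts = Counter(a.strip().lower() for a in answers)
                best, votes = counts.most_common(1)[0]

                if votes >= 0.8 * n_samples:  # arbitrary agreement threshold
                    return best
                return "I'm not sure about the answer."

        As noted above, the obvious cost is roughly n_samples times the compute per question, and consistently wrong answers (like the tokenization failure in the post) would still pass the check.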

        • xigoi
          -1 point · 2 months ago

          Why would it not know? It certainly “knows” that it’s an LLM and it presumably “knows” how LLMs work, so it could piece this together if it was capable of self-reflection.

  • @dan1101@lemm.ee
    13 points · 2 months ago

    It’s like someone who has no formal education but has a high level of confidence and eavesdrops on a lot of random conversations.

  • @humorlessrepost@lemmy.world
    8 points · edited · 2 months ago

    Works fine for me in o3-mini-high:

    Counting letters in “strawberry”

    Alright, I’m checking: the word “strawberry” is spelled S T R A W B E R R Y. Let me count the letters: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). There are three R’s: in positions 3, 8, and 9. So, the answer is 3. Even if we ignore case, the count still holds. Therefore, there are 3 r’s in “strawberry.”
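
    The enumeration above is easy to verify mechanically outside the model; a one-off check in Python (the variable names are just illustrative):

        # Count the r's in "strawberry" and report their 1-based positions.
        word = "strawberry"
        positions = [i + 1 for i, ch in enumerate(word) if ch.lower() == "r"]
        print(len(positions), positions)  # 3 [3, 8, 9]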

  • @Tgo_up@lemm.ee
    15 points · 2 months ago

    This is a bad example… If I ask a friend “is strawberry spelled with one or two r’s?” they would think I’m asking about the last part of the word.

    The question seems to be specifically made to trip up LLMs. I’ve never heard anyone ask how many of a certain letter are in a word. I’ve heard people ask how you spell a word and whether it’s with one or two of a specific letter, though.

    If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.

    • @Grandwolf319@sh.itjust.works
      4 points · 2 months ago

      If you think of LLMs as something with actual intelligence you’re going to be very unimpressed

      Artificial sugar is still sugar.

      Artificial intelligence implies there is intelligence in some shape or form.

      • @Scubus@sh.itjust.works
        0 points · 2 months ago

        That’s because it wasn’t originally called AI. It was called an LLM. Tech bros trying to sell it and articles wanting to fan the flames started calling it AI, and eventually it became common parlance. No one in the field seriously calls it AI; they generally reserve that term for general AI, or at least narrow AI, of which an LLM is neither.

      • @corsicanguppy@lemmy.ca
        3 points · 2 months ago

        Artificial sugar is still sugar.

        Because it contains sucrose, fructose or glucose? Because it metabolises the same and matches the glycemic index of sugar?

        Because those are all wrong. What’s your criteria?

        • @Grandwolf319@sh.itjust.works
          2 points · 2 months ago

          In this example a sugar is something that is sweet.

          Another example is artificial flavours still being a flavour.

          Or like artificial light being in fact light.

      • JohnEdwa
        3 points · edited · 2 months ago

        Something that pretends to be or looks like intelligence, but actually isn’t, is a perfectly valid interpretation of the word artificial: fake intelligence.

      • @Tgo_up@lemm.ee
        1 point · 2 months ago

        Exactly. The naming of the technology would make you assume it’s intelligent. It’s not.

    • @renegadespork@lemmy.jelliefrontier.net
      26 points · 2 months ago

      If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.

      This is exactly the problem, though. They don’t have “intelligence” or any actual reasoning, yet they are constantly being used in situations that require reasoning.

      • @sugar_in_your_tea@sh.itjust.works
        5 points · 2 months ago

        Maybe if you focus on pro- or anti-AI sources, but if you talk to actual professionals or hobbyists solving actual problems, you’ll see very different applications. If you go into it looking for problems, you’ll find them; likewise, if you go into it looking for use cases, you’ll find them.

        • @renegadespork@lemmy.jelliefrontier.net
          1 point · 2 months ago

          Personally I have yet to find a use case. Every single time I try to use an LLM for a task (even ones they are supposedly good at), I find the results so lacking that I spend more time fixing its mistakes than I would have just doing it myself.

          • @Scubus@sh.itjust.works
            2 points · 2 months ago

            So you’ve never used it as a starting point to learn about a new topic? You’ve never used it to look up a song when you can only remember a small section of the lyrics? What about when you want to write a block of code that is simple but monotonous to write yourself? Or to suggest plans for how to create simple structures/inventions?

            Anything with a verifiable answer that you’d ask on a forum can generally be answered by an LLM, because they’re largely trained on forums, and there’s a decent chance the training data included someone asking the question you are currently asking.

            Hell, ask ChatGPT what use cases it would recommend for itself; I’m sure it’ll have something interesting.

      • @Tgo_up@lemm.ee
        1 point · 2 months ago

        What situations are you thinking of that require reasoning?

        I’ve used LLMs to create software I needed but couldn’t find online.

        • @renegadespork@lemmy.jelliefrontier.net
          1 point · 2 months ago

          Creating software is a great example, actually. Coding absolutely requires reasoning. I’ve tried using code-focused LLMs to write blocks of code, or even some basic YAML files, but the output is often unusable.

          It rarely makes syntax errors, but it will do things like reference libraries that haven’t been imported or hallucinate functions that don’t exist. It also constantly misunderstands the assignment and creates something that technically works but doesn’t accomplish the intended task.

          • @Tgo_up@lemm.ee
            1 point · 2 months ago

            I think coding is one of the areas where LLMs are most useful for private individuals at this point in time.

            It’s not yet at the point where you just give it a prompt and it spits out flawless code.

            For someone like me who is decent with computers but has little to no coding experience, it’s an absolutely amazing tool/teacher.

  • @HoofHearted@lemmy.world
    0 points · 2 months ago

    The terrifying thing is that everyone is criticising the LLM as being poor; however, it excelled at the task.

    The question asked was how many R’s are in “strawbery”, and it answered: 2.

    It also detected the typo and offered the correct spelling.

    What’s the issue I’m missing?

    • Tywèle [she|her]
      20 points · 2 months ago

      The issue that you are missing is that the AI answered that there is 1 ‘r’ in ‘strawbery’ even though there are 2 'r’s in the misspelled word. And the AI corrected the user with the correct spelling of the word ‘strawberry’ only to tell the user that there are 2 'r’s in that word even though there are 3.

      • TomAwsm
        -1 point · 2 months ago

        Sure, but for what purpose would you ever ask about the total number of a specific letter in a word? This isn’t the gotcha that so many think it is. The LLM answers like it does because it makes perfect sense for someone to ask if a word is spelled with a single or double “r”.

        • snooggums
          1 point · 2 months ago

          It makes perfect sense if you do mental acrobatics to explain why a wrong answer is actually correct.

          • TomAwsm
            0 points · 2 months ago

            Not mental acrobatics, just common sense.

        • @jj4211@lemmy.world
          1 point · 2 months ago

          Except many many experts have said this is not why it happens. It cannot count letters in the incoming words. It doesn’t even know what “words” are. It has abstracted tokens by the time it’s being run through the model.

          It’s more like you don’t know the word strawberry, and instead you see: How many 'r’s in 🍓?

          And you respond with nonsense, because the relation between ‘r’ and 🍓 is nonsensical.

    • Fubarberry
      3 points · 2 months ago

      There’s also an “r” in the first half of the word, “straw”, so it was completely skipping over that r and just focusing on the r’s in “berry”.

      • @jj4211@lemmy.world
        2 points · 2 months ago

        It doesn’t see “strawberry” or “straw” or “berry”. It’s closer to the truth to think of it as seeing 🍓, an abstract token representing the same concept that the training data associated with the word.

      • @catloaf@lemm.ee
        3 points · 2 months ago

        It wasn’t focusing on anything. It was generating text per its training data. There’s no logical thought process whatsoever.

    • @TeamAssimilation@infosec.pub
      10 points · 2 months ago

      Still, it’s kinda insane how two years ago we didn’t imagine we would be instructing programs like “be helpful but avoid sensitive topics”.

      That was definitely a big step in AI.

  • @Lazycog@sopuli.xyz
    3 points · 2 months ago

    I can already see it…

    Ad: CAN YOU SOLVE THIS IMPOSSIBLE RIDDLE THAT AI CAN’T SOLVE?!

    With OP’s image. And then it will have the following once you solve it: “congratz, send us your personal details and you’ll be added to the hall of fame at CERN Headquarters”

  • Aatube
    4 points · 2 months ago

    I mean, that’s how I would think about it…

      • Aatube
        3 points · 2 months ago

        The typo in “strawbery” leads to a conversation like “hey you spelt this wrong there’s two r’s (after the e) not one”

        • @shrodes@lemmy.world
          1 point · edited · 2 months ago

          It happens even if you ask how many “r”s are in “strawberry”. It’s a well-known AI gotcha that happens on most if not all current models. The typo in the original post is a little misleading and not that relevant.