I asked Mistral/Brave AI and got this response:
How Many Rs in Strawberry
The word “strawberry” contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.
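To see what that means in practice, here’s a quick sketch in Python. It assumes the `tiktoken` package (OpenAI’s open-source BPE tokenizer) is installed; the exact token split varies by encoding and model, but the point is the same: the string method counts letters, while the model only ever sees sub-word chunks.

```python
# Quick illustration of why letter counting trips up LLMs.
# Assumes the `tiktoken` package is installed:  pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-4-era models

word = "strawberry"

# What a human sees: individual letters, trivially countable.
print(word.count("r"))  # 3

# What the model sees: integer token IDs, not letters.
tokens = enc.encode(word)
print(tokens)                             # a short list of token IDs
print([enc.decode([t]) for t in tokens])  # sub-word chunks, not individual letters
```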
Yes, at some point the meme becomes the training data and the LLM doesn’t need to answer because it sees the answer all over the damn place.
From a linguistic perspective, this is why I am impressed by (or at least, astonished by) LLMs!
deleted by creator
It is wrong. Strawberry has 3 r’s
deleted by creator
Uh, no, that is not common parlance. If any human tells you that strawberry has two r’s, they are also wrong.
there are two 'r’s in ‘strawbery’
Skill issue
I think I have seen this exact post word for word fifty times in the last year.
Has the number of "r"s changed over that time?
y do you ask?
Just playing, friend.
Same, I was making a pun
Oh, I see! Apologies.
No apologies needed. Enjoy your day and keep the good vibes up!
Yes
And yet they apparently still can’t get an accurate result with such a basic query.
Meanwhile… https://futurism.com/openai-signs-deal-us-government-nuclear-weapon-security
Doc: That’s an interesting name, Mr…
Fletch: Babar.
Doc: Is that with one B or two?
Fletch: One. B-A-B-A-R.
Doc: That’s two.
Fletch: Yeah, but not right next to each other, that’s what I thought you meant.
Doc: Isn’t there a children’s book about an elephant named Babar?
Fletch: Ha, ha, ha. I wouldn’t know. I don’t have any.
Doc: No children?
Fletch: No elephant books.
That happens when you don’t understand what an LLM is, or what its use cases are.
This is like not being impressed by a calculator because it cannot give a word synonym.
But everyone selling LLMs sells them as being able to solve any problem, making it hard to know when it’s going to fail and give you junk.
And Red Bull gives you wings.
Marketing within a capitalist market be like that for every product.
Is anyone really pitching AI as being able to solve every problem though?
Sure, maybe it’s not capable of producing the correct answer, which is fine. But it should say “As an LLM, I cannot answer questions like this” instead of just making up an answer.
I have thought a lot about this. The LLM per se would not know whether the question is answerable, as it doesn’t know if its own output is good or bad.
So there are various approaches to this issue:
- The classic approach, and the one used for censoring: keywords. When the LLM receives a certain keyword, or can derive one by digesting the input text, it gives back a hard-coded answer. Problem: censoring cases are limited, but hard-to-answer questions are unlimited, so you can’t hard-code them all.
- Self-check answers. For every question, the LLM could process it 10 times with different seeds, then analyze the results and see if they are equivalent. If they are not, it just answers that it’s unsure. Problem: multiplied resource usage. And for some questions, like the one in the post, it’s possible that the multiple randomized answers agree with each other, so it would still have a decent failure rate. (Rough sketch of both ideas below.)
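Not that I’ve built this, but a rough sketch of what I mean, in Python. `ask_llm` is a made-up placeholder for whatever model API you’d actually call, and the numbers (10 samples, 80% agreement) are arbitrary:

```python
import random
from collections import Counter

# Hypothetical stand-in for a real LLM call; the real thing would hit whatever
# API/model you use, with the seed (or temperature) varying the sampling.
def ask_llm(prompt: str, seed: int) -> str:
    random.seed(seed)
    return random.choice(["2", "3", "3"])  # placeholder nondeterministic output

# Approach 1: hard-coded keyword answers. Works for a small, known list of
# questions; doesn't scale to every hard question.
HARDCODED = {
    "how many r in strawberry": "3",
}

# Approach 2: self-check by sampling the model several times and only answering
# when the samples mostly agree.
def answer_with_self_check(prompt: str, n_samples: int = 10, threshold: float = 0.8) -> str:
    key = prompt.lower().strip("?! .")
    if key in HARDCODED:
        return HARDCODED[key]

    answers = [ask_llm(prompt, seed=i) for i in range(n_samples)]
    best, count = Counter(answers).most_common(1)[0]
    if count / n_samples >= threshold:
        return best
    return "I'm not sure about this one."  # the samples disagreed too much

print(answer_with_self_check("How many r in strawberry?"))  # hits the hard-coded path
print(answer_with_self_check("How many r in raspberry?"))   # falls through to self-check
```

The hard-coded dictionary shows why the first approach doesn’t scale: someone has to anticipate every question. The self-check at least degrades to “unsure” on its own, but at 10x the compute.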
Why would it not know? It certainly “knows” that it’s an LLM and it presumably “knows” how LLMs work, so it could piece this together if it was capable of self-reflection.
It doesn’t know shit. It’s not a thinking entity.
Precisely, it’s not capable of self-reflection, thinking, or anything of the sort. It doesn’t even understand the meaning of words.
It’s like someone who has no formal education but has a high level of confidence and eavesdrops on a lot of random conversations.
You rang?
I know right? It’s not a fruit it’s a vegetable!
Works fine for me in o3-mini-high:
Counting letters in “strawberry”
Alright, I’m checking: the word “strawberry” is spelled S T R A W B E R R Y. Let me count the letters: S (1), T (2), R (3), A (4), W (5), B (6), E (7), R (8), R (9), Y (10). There are three R’s: in positions 3, 8, and 9. So, the answer is 3. Even if we ignore case, the count still holds. Therefore, there are 3 r’s in “strawberry.”
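(For comparison, the deterministic version of that whole chain of reasoning is a one-liner in plain Python:)

```python
>>> "strawberry".count("r")
3
```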
A normal person would say ‘strawberry with two "r"s’
Finally! With one household’s energy consumption for a day, we can count how many Rs are in strawberry.
This is a bad example… If I ask a friend “is strawberry spelled with one or two r’s?” they would think I’m asking about the last part of the word.
The question seems to be specifically made to trip up LLMs. I’ve never heard anyone ask how many of a certain letter is in a word. I’ve heard people ask how you spell a word and if it’s with one or two of a specific letter though.
If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.
If you think of LLMs as something with actual intelligence you’re going to be very unimpressed
Artificial sugar is still sugar.
Artificial intelligence implies there is intelligence in some shape or form.
That’s because it wasn’t originally called AI. It was called an LLM. Techbros trying to sell it and articles wanting to fan the flames started calling it AI, and eventually it became common parlance. No one in the field seriously calls it AI; they generally save that term for general AI, or at least narrow AI, of which an LLM is neither.
LLM is a type of a machine learning model, which is a type of artificial intelligence.
Saying LLMs aren’t AI is just the AI Effect in action.
Artificial sugar is still sugar.
Because it contains sucrose, fructose or glucose? Because it metabolises the same and matches the glycemic index of sugar?
Because those are all wrong. What’s your criteria?
In this example a sugar is something that is sweet.
Another example is artificial flavours still being a flavour.
Or like artificial light being in fact light.
Something that pretends to be or looks like intelligence, but actually isn’t at all, is a perfectly valid interpretation of the word artificial: fake intelligence.
Exactly. The naming of the technology would make you assume it’s intelligent. It’s not.
If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.
This is exactly the problem, though. They don’t have “intelligence” or any actual reasoning, yet they are constantly being used in situations that require reasoning.
Maybe if you focus on pro- or anti-AI sources, but if you talk to actual professionals or hobbyists solving actual problems, you’ll see very different applications. If you go into it looking for problems, you’ll find them, likewise if you go into it for use cases, you’ll find them.
Personally I have yet to find a use case. Every single time I try to use an LLM for a task (even ones they are supposedly good at), I find the results so lacking that I spend more time fixing its mistakes than I would have just doing it myself.
So you’ve never used it as a starting point to learn about a new topic? You’ve never used it to look up a song when you can only remember a small section of lyrics? What about when you want a block of code that is simple but monotonous to write yourself? Or to suggest plans for how to create simple structures/inventions?
Anything with a verifiable answer that you’d ask on a forum can generally be answered by an LLM, because they’re largely trained on forums, and there’s a decent chance the training data included someone asking the question you are currently asking.
Hell, ask ChatGPT what use cases it would recommend for itself; I’m sure it’ll have something interesting.
What situations are you thinking of that requires reasoning?
I’ve used LLMs to create software I needed but couldn’t find online.
Creating software is a great example, actually. Coding absolutely requires reasoning. I’ve tried using code-focused LLMs to write blocks of code, or even some basic YAML files, but the output is often unusable.
It rarely makes syntax errors, but it will do things like reference libraries that haven’t been imported or hallucinate functions that don’t exist. It also constantly misunderstands the assignment and creates something that technically works but doesn’t accomplish the intended task.
I think coding is one of the areas where LLMs are most useful for private individuals at this point in time.
It’s not yet at the point where you just give it a prompt and it spits out flawless code.
For someone like me, who is decent with computers but has little to no coding experience, it’s an absolutely amazing tool/teacher.
The terrifying thing is everyone criticising the LLM as being poor, when it actually excelled at the task.
The question asked was how many Rs are in “strawbery”, and it answered: 2.
It also detected the typo and offered the correct spelling.
What’s the issue I’m missing?
The issue that you are missing is that the AI answered that there is 1 ‘r’ in ‘strawbery’ even though there are 2 'r’s in the misspelled word. And the AI corrected the user with the correct spelling of the word ‘strawberry’ only to tell the user that there are 2 'r’s in that word even though there are 3.
Sure, but for what purpose would you ever ask about the total number of a specific letter in a word? This isn’t the gotcha that so many think it is. The LLM answers like it does because it makes perfect sense for someone to ask if a word is spelled with a single or double “r”.
It makes perfect sense if you do mental acrobatics to explain why a wrong answer is actually correct.
Not mental acrobatics, just common sense.
Except many many experts have said this is not why it happens. It cannot count letters in the incoming words. It doesn’t even know what “words” are. It has abstracted tokens by the time it’s being run through the model.
It’s more like you don’t know the word strawberry, and instead you see: How many 'r’s in 🍓?
And you respond with nonsense, because the relation between ‘r’ and 🍓 is nonsensical.
Uh oh, you’ve blown your cover, robot sir.
There’s also an “r” in the first half of the word, “straw”, so it was completely skipping over that r and just focusing on the r’s in “berry”.
It doesn’t see “strawberry” or “straw” or “berry”. It’s closer to think of it as seeing 🍓, an abstract token representing the same concept that the training data associated with the word.
It wasn’t focusing on anything. It was generating text per its training data. There’s no logical thought process whatsoever.
It’s predictive text on speed. The LLMs currently in vogue hardly qualify as A.I. tbh…
Still, it’s kinda insane how two years ago we didn’t imagine we would be instructing programs like “be helpful but avoid sensitive topics”.
That was definitely a big step in AI.
I can already see it…
Ad: CAN YOU SOLVE THIS IMPOSSIBLE RIDDLE THAT AI CAN’T SOLVE?!
With OP’s image. And then it will have the following once you solve it: “congratz, send us your personal details and you’ll be added to the hall of fame at CERN Headquarters”
I mean, that’s how I would think about it…
Why?
The typo in “strawbery” leads to a conversation like “hey you spelt this wrong there’s two r’s (after the e) not one”
It happens even if you ask how many “r”s are in “strawberry”. It’s a well-known AI gotcha that happens on most if not all current models. The typo in the original post is a little misleading and not that relevant.
Huh. It’s just simply a wrong answer though.