For years, asking ChatGPT how many R's are in "strawberry" was a reliable way to catch it off guard. The answer is three. The model would often say two. It became one of the most widely shared examples of how large language models, for all their capabilities, can fumble something a child could count on their fingers.
So when the official ChatGPT account on X posted "at long last" Tuesday with nothing else attached, the tech community connected the dots fast. The strawberry problem, apparently, had been fixed.
at long last pic.twitter.com/pu9wyAY6sN
— ChatGPT (@ChatGPTapp) April 28, 2026
The replies did not exactly erupt in applause.
Android Police founder Artem Russakovskii quote-tweeted the post, claiming OpenAI had simply hardcoded the fix and noting that the replies were already filling up with proof of similar failures still present in GPT-5.5. The claim, in essence: "strawberry" now returns the right answer not because the model genuinely learned to reason over characters, but because someone made sure that one specific answer was baked in.
The screenshots shared by users under the official post support the skepticism. One shows ChatGPT correctly counting three R's in "strawberry," laying out each letter, but when the follow-up question asked the same thing about "cranberry," it came back with the familiar mistake: "cranberry" has two R's.
There is also a screenshot of ChatGPT being asked to "count to 10 starting from 11." It responds with 11 through 20, which counts ten numbers but completely sidesteps the contradiction sitting right in the prompt. You cannot count to 10 if you start from 11. ChatGPT just answered a slightly different question and moved on.
None of this surprises researchers who have been tracking these issues. LLMs process text as tokens, not individual characters, which is why letter-counting has always been an awkward edge case. Fixing the "strawberry" answer does not fix that underlying architecture, and critics are right to ask whether the correction travels beyond that one word.
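The gap between character-level counting and token-level processing is easy to see in code. The sketch below counts letters the way any program trivially can, then contrasts it with an illustrative (not actual) subword split of the kind BPE tokenizers produce; the specific token boundaries are an assumption for demonstration, not the output of any real tokenizer.

```python
# Character-level counting: trivial for ordinary code.
word = "strawberry"
r_count = word.count("r")
print(r_count)  # prints 3

# An LLM, by contrast, sees subword tokens, not characters.
# Illustrative split only; real tokenizers may segment differently:
tokens = ["str", "aw", "berry"]

# The R's are scattered across token boundaries, so the model never
# directly observes individual letters. Counting across tokens still
# works in code, but the model has no such explicit view:
r_across_tokens = sum(t.count("r") for t in tokens)
print(r_across_tokens)  # prints 3, matching the character count
```

The point is that the correct count is recoverable only if the system actually inspects characters, which is precisely what a patched lookup for one famous word does not demonstrate.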
OpenAI has not published any changelog entry or formal acknowledgment about what changed or in which model. The celebratory tweet is the only signal, and it does not say much.
Whether this holds up across variations, adjacent words, or paraphrased prompts is something users are actively testing right now, and the early results are not uniformly reassuring.