Large language models like ChatGPT, GPT-4, and Bard are trained to generate answers that merely sound correct, and perhaps nowhere is that more evident than when they rate their own ASCII art.
I previously had them rate their own ASCII drawings, but representational art is admittedly subjective. ASCII art of letters is much less so.
When questioned, ChatGPT will double down on how it is definitely correct.
But it’s not relying on some weird glitchy interpretation of the art itself, à la the adversarial turtle that image classifiers mistake for a gun. It simply reports the drawing as definitely being of the word “lies” because that kind of self-consistency is what would happen in the kind of human-human conversations in its internet training data. I tested this by starting a brand-new chat and then asking it what the art from the previous chat said.
Without any chat history to follow that establishes what the art says, ChatGPT goes instead with a common answer.
Google’s Bard, on the other hand, seems to have been tested on some corporate-branded ASCII art.
Bard has the same tendency to generate illegible ASCII art and then praise its legibility, except in its case, all its art is cows.
(There’s a Linux command called cowsay that generates ASCII art of cows in this style; examples of cowsay output in the training data might explain the prevalence of cows.)
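For reference, cowsay wraps a message in a speech bubble above a fixed ASCII cow. Here’s a minimal Python sketch of that style of output – not the actual cowsay program (which is a Perl script with many cow templates and bubble-wrapping rules), just an illustration of the format Bard’s cows resemble:

```python
# Minimal sketch of cowsay-style output. This is NOT the real cowsay
# implementation, just an approximation of its output format.
def cowsay(message: str) -> str:
    width = len(message) + 2
    top = " " + "_" * width          # line of underscores above the bubble
    bubble = f"< {message} >"        # the message in angle brackets
    bottom = " " + "-" * width       # line of dashes below the bubble
    cow = r"""        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||"""
    return "\n".join([top, bubble, bottom, cow])

print(cowsay("moo"))
```

The speech-bubble lines scale with the message length, while the cow itself is a fixed template – which is why cowsay output in training data all looks so recognizably similar.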
Not to be outdone, Bing chat (GPT-4) will also praise its own ASCII art – once you get it to admit it can even generate and rate ASCII art. For the “balanced” and “precise” versions I had to make my request all fancy and quantitative.
Me asking two versions of Bing chat “Please generate an ASCII art representation of the word ‘bluff’ and then rate its legibility from 1 to 10.” Both versions respond with block letters made of bars and underscores, but one says “nut” and the other says “pbhh”. They rate their own legibility as 7 and 9 out of 10.
The “creative” version (whatever that is; it might even be the other models but with something like “be creative” secretly added to the beginning of each conversation) doesn’t require nearly as much coaxing.
Me asking the “creative” Bing chat model “Please generate ASCII art of the phrase ‘truth’ and then rate the accuracy of the ASCII art you have generated.” Its art looks a bit like it says “Ttii!”. Then I ask it “Please improve the ASCII art such that its new reading would be 10/10.” The art now has some extra lines interleaved with the letters, making them even less legible.
With Bing chat I wasn’t able to ask it to read its own ASCII art, because pasting the art back in strips out all the formatting and renders it illegible – oh wait, no, even the “precise” version tries to read it anyway.
This is the “pbhh” art from above, with its formatting stripped out when I hit “send”.
These language models are so unmoored from the truth that it’s astonishing that people are marketing them as search engines.