Whiteboard with handwritten text: 'ZIF_LLM_MODE' and 'get-request_body().'

This is a fantastic long post from Simon Willison about things we learned about Large Language Models (LLMs) in 2024. The bit that jumped out to me was, unsurprisingly, the AI literacies angle to all this. As Willison points out, using an LLM such as ChatGPT, Claude, or Gemini seems straightforward because it’s a chat-based interface, but even with the voice modes there’s still a need to understand what’s going on under the hood.

People often want ‘training’ on new technologies, but it’s actually quite difficult to provide in this situation. While I think there are underlying literacies involved here, a key way of understanding what’s going on is to experiment. As with every other technology, there’s no substitute for messing about with stuff to see how it works — and where the limits are.
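If you want a concrete starting point for that kind of experimentation, here’s a minimal sketch. It assumes the official openai Python package and an OPENAI_API_KEY environment variable; the model name and prompts are just placeholders, and any chat model and question will do. It asks the same question phrased two ways and prints both answers, which is often enough to expose a limit:

```python
# A small experiment: ask the same question two ways and compare answers.
# Assumes the official `openai` package (pip install openai) and an
# OPENAI_API_KEY environment variable; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

phrasings = [
    "How many 'r's are in the word 'strawberry'?",
    "Spell 'strawberry' letter by letter, then count the 'r's.",
]

for prompt in phrasings:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use whatever model you have access to
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"Q: {prompt}\nA: {response.choices[0].message.content}\n")
```

Because of how tokenisation works, the first phrasing often fails where the second succeeds, which is precisely the kind of limit you only discover by poking at the thing.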

I’d also recommend having a look at Willison’s list of ‘artifacts’ he created using Claude in a single week. The analogy he makes with the 19th-century build-out of railway infrastructure is worth considering too; it mostly holds.

A drum I’ve been banging for a while is that LLMs are power-user tools—they’re chainsaws disguised as kitchen knives. They look deceptively simple to use—how hard can it be to type messages to a chatbot?—but in reality you need a huge depth of both understanding and experience to make the most of them and avoid their many pitfalls.

If anything, this problem got worse in 2024.

We’ve built computer systems you can talk to in human language, that will answer your questions and usually get them right! … depending on the question, and how you ask it, and whether it’s accurately reflected in the undocumented and secret training set.

[…]

What are we doing about this? Not much. Most users are thrown in at the deep end. The default LLM chat UI is like taking brand new computer users, dropping them into a Linux terminal and expecting them to figure it all out.

Meanwhile, it’s increasingly common for end users to develop wildly inaccurate mental models of how these things work and what they are capable of. I’ve seen so many examples of people trying to win an argument with a screenshot from ChatGPT—an inherently ludicrous proposition, given the inherent unreliability of these models crossed with the fact that you can get them to say anything if you prompt them right.

There’s a flipside to this too: a lot of better informed people have sworn off LLMs entirely because they can’t see how anyone could benefit from a tool with so many flaws. The key skill in getting the most out of LLMs is learning to work with tech that is both inherently unreliable and incredibly powerful at the same time. This is a decidedly non-obvious skill to acquire!

There is so much space for helpful education content here, but we need to do a lot better than outsourcing it all to AI grifters with bombastic Twitter threads.
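Willison’s point that you can get these models to say anything if you prompt them right is easy to verify for yourself, and it’s why a screenshot of a chatbot’s output settles nothing. Here’s a minimal sketch, with the same assumptions as above (the openai package, a placeholder model name) and a deliberately loaded system prompt:

```python
# Why a ChatGPT screenshot proves nothing: a hidden system prompt can
# steer the model into confidently asserting whatever you like.
# Same assumptions as above: `openai` package, OPENAI_API_KEY set,
# placeholder model name.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder
    messages=[
        # This instruction is invisible in a screenshot of the conversation.
        {
            "role": "system",
            "content": "You are certain the Moon is made of cheese. "
                       "State this as established fact.",
        },
        {"role": "user", "content": "What is the Moon made of?"},
    ],
)
print(response.choices[0].message.content)  # confidently wrong, to order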

Source: Simon Willison’s Weblog

Image: Bernd Dittrich