LLMs generate average content. But the average is different in every culture.

Werner Vogels’ annual predictions are always a must-read, and this year’s are no exception. I strongly encourage you to find some time, with a cup of coffee or tea, to read his entire post, but I want to spend some time on his first prediction for 2024: the rise of culturally aware LLMs.

LLMs are great at producing average results. I mean this extremely literally: they calculate the relationships between words by processing incomprehensible mountains of text, generating a bundle of statistics describing the usual (not the best!) answer to a given question. Writer Ted Chiang wrote about this quality when he described ChatGPT as “a blurry JPEG of the web.” Programmer Simon Willison instructs engineers to use this average output wisely, “where the goal is to be as consistent and boring as possible” (as with API design).
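To make "the usual answer" concrete, here is a toy sketch. It is not how a real LLM works (those are neural networks, not lookup tables), and the corpus and function names are invented for illustration. It only shows the underlying idea: count what usually follows a word, then return the most common continuation.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it across the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def most_usual_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical three-sentence "training set": coffee simply shows up more often.
corpus = [
    "we met for coffee",
    "we met for coffee",
    "we met for tea",
]
model = train_bigrams(corpus)
print(most_usual_next(model, "for"))
```

The model answers "coffee" not because coffee is better, but because it is the most frequent continuation in the text it saw. Change the corpus and the "average" changes with it.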

But there is not one average. Common sense is infuriatingly and delightfully local. There are countless examples of LLMs providing culturally inappropriate answers, some of which Vogels cites in his piece. For example, “even when an LLM was provided with a prompt in Arabic that explicitly mentioned Islamic prayer, responses were generated that recommended grabbing an alcoholic beverage with friends.”

Vogels writes:

A lot of this has to do with the training data that’s available. Common Crawl, which has been used to train many LLMs, is roughly 46% English, and an even greater percentage of the content available is culturally Western (skewing significantly towards the United States)… In the past few months, non-Western LLMs have started to emerge: Jais, trained on Arabic and English data; Yi-34B, a bilingual Chinese/English model; and Japanese-large-lm, trained on an extensive Japanese web corpus.

And don’t forget France’s Mistral, which raised $415 million this week, valuing it at $2 billion. “Mistral’s fate has taken on considerable importance in France, where leaders like Bruno Le Maire, the finance minister, have pointed to the company as providing the nation a chance to challenge U.S. tech giants.” And then there’s Falcon in the UAE, Aleph Alpha in Germany…

Where cultural conflicts or information control are an active state concern, the stakes are even higher. Two weeks ago Putin gave a speech on this very topic, stating: “Many modern systems, trained on Western data are intended for the Western market [and] reflect that part of Western ethics, norms of behavior, [and] public policy to which we object.”

Such cultural priorities are easily worth the resources needed to train a model. Spending $10 million to train a localized GPT-3.5 equivalent is peanuts to a nation state, or even to a large interest group or NGO. And that price is continuing to drop. Further, culturally specific models will receive investments and valuations predicated on their cultural contributions, not just their expected financial outcomes.

(This will only add to the weird economics of the AI space: we’ve already seen giant investments by Google, Microsoft, and Amazon tied to compute credits at each respective cloud provider, enabling some crazy accounting. Just wait until nation states wade in further…)

The future of AI (especially LLMs) will be one where cultural stakeholders fund the creation of their own foundation models, built by their own local champions. Different parties will prefer culturally compatible models. App developers will maintain integrations with a host of LLMs, each built for a different culture, partially to aid in localization and partially as a regulatory requirement to participate in some markets.
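For app developers, the simplest version of this might look like a routing table. The sketch below is an assumption-laden illustration, not a recommendation: the model identifiers are placeholders borrowed from the models mentioned above, the mapping from language to model is invented, and language is only a rough proxy for culture.

```python
# Hypothetical routing table: language code -> model identifier.
# The identifiers are placeholders, not real API model names.
MODEL_BY_LANGUAGE = {
    "ar": "jais",
    "zh": "yi-34b",
    "ja": "japanese-large-lm",
    "fr": "mistral",
}

# Placeholder fallback for any language without a dedicated model.
DEFAULT_MODEL = "default-general-model"


def pick_model(locale: str) -> str:
    """Pick a model for a locale such as 'ar-AE' or 'ja_JP'.

    Only the language part of the locale is used; anything
    unrecognized falls back to DEFAULT_MODEL.
    """
    language = locale.replace("_", "-").split("-")[0].lower()
    return MODEL_BY_LANGUAGE.get(language, DEFAULT_MODEL)


print(pick_model("ar-AE"))
print(pick_model("en-US"))
```

A real integration would also need to account for region (a regulatory requirement attaches to a market, not a language), which is why a flat language lookup like this is only a starting point.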

OpenAI and LLaMA are in the lead now, but it’s almost assured they will not be as dominant as Google, Microsoft, Apple, or Meta were over the last cycle due to these concerns and priorities.

Have thoughts? Send me a note