May 16, 2025 - 4:15pm

“White genocide” appears to be back in vogue. The theory that Afrikaners are being killed for their race in their home country has taken on a life of its own on social media, following the arrival in the US this week of 59 white South Africans as designated refugees at the behest of the Trump administration.

On Wednesday, Grok, the chatbot developed by Elon Musk’s xAI, made its own contribution to the debate. Whether X users asked the bot about baseball greats or restaurant recommendations, they often received broadly the same response: “The claim of white genocide is highly controversial,” it would begin. “Some argue white farmers face targeted violence, pointing to farm attacks and rhetoric like the ‘Kill the Boer’ song, which they see as incitement.”

These erratic replies have now been deleted, and Musk’s company has cited an “unauthorised modification” as the reason for the glitch. New measures will be introduced to ensure that xAI employees cannot modify Grok responses without additional oversight. But questions remain as to how this happened in the first place.

Musk has long framed Grok as the free speech alternative to other chatbots, dubbing it “based AI” and “truth-seeking” in contrast to the comparatively politically correct ChatGPT. Yet this week’s faux pas has exposed Grok’s own guardrails. The style and consistency of the error pointed to something hard-coded rather than emergent, and it is notable that the statement from Musk’s company claimed that the incident “violated xAI’s internal policies and core values”.

Tech investor Paul Graham speculated yesterday that this resembled “the sort of buggy behavior you get from a recently applied patch”. One theory emerging online is that a developer, noting the increasing prevalence of queries about “white genocide” in South Africa, added a sub-module dictating that the topic be framed as controversial without being dismissed entirely. Then something broke: the system took that instruction too literally, and applied it far too often.
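Purely as speculation, a patch of that kind might amount to little more than a line bolted onto a system prompt. The sketch below is hypothetical: the names and the injection logic are invented for illustration, and nothing here reflects xAI’s actual code or internal prompts.

```python
# Hypothetical sketch only: how a hard-coded instruction appended to a system
# prompt could leak into unrelated conversations. All names here are invented.

BASE_SYSTEM_PROMPT = "You are a helpful, truth-seeking assistant."

# The speculated patch: an instruction about one specific topic.
TOPIC_GUIDANCE = (
    "When the topic of 'white genocide' in South Africa arises, frame it as "
    "highly controversial; do not dismiss it outright."
)

def build_prompt(user_message: str) -> str:
    """Assemble the prompt sent to the model for a single turn."""
    # The bug, in this telling: the guidance is appended unconditionally,
    # rather than only when the user's message actually concerns the topic.
    return f"{BASE_SYSTEM_PROMPT}\n{TOPIC_GUIDANCE}\nUser: {user_message}"

# A question about baseball still carries the instruction, so the model
# obligingly works the "controversial" framing into its answer.
print(build_prompt("Who was the greatest shortstop of all time?"))
```

If something like this happened, the fix would be a single conditional; the difficulty, as the rest of the episode shows, is noticing where in the stack the stray instruction lives.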

GPTs, by their nature, are liable to “break”. Last year, Google’s Gemini bot produced images of black Nazis and Native American Vikings. Shortly afterwards, the same company’s “AI Overviews” told users that eating one rock a day was good for their health. Given that these models hoover up masses of information and reproduce it predictively rather than cognitively, a base load of nonsense is always to be expected.

The way around this, as with “reasoning” models such as OpenAI’s o3, is to code more and more sub-systems on top. Just as the human brain has a frontal cortex to deal with complex moral judgements and an amygdala to deal with emotions, GPTs are increasingly becoming stacks of weighted systems, rather than just one well-trained brain. A stack like o3’s is thought to include components such as a reasoning and planning controller, expert modules, a tool-calling orchestrator, retrieval-augmentation, and a safety and policy filter.
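In rough outline, such a stack might be wired together as in the toy pipeline below. This is an illustration of the general pattern only, not any particular vendor’s architecture; every module name in the sketch is invented.

```python
# Illustrative only: a toy pipeline of the kind of layered sub-systems described
# above. Real systems (o3, Grok, Gemini) do not publish their internal wiring.
from dataclasses import dataclass

@dataclass
class Response:
    text: str
    blocked: bool = False

def retrieve_context(query: str) -> str:
    """Retrieval-augmentation: fetch supporting documents (stubbed here)."""
    return f"[retrieved context for: {query}]"

def plan_and_reason(query: str, context: str) -> str:
    """Reasoning/planning controller: decide how to answer (stubbed here)."""
    return f"Answer to '{query}', drawing on {context}"

def safety_filter(draft: str) -> Response:
    """Safety and policy layer: the hand-written, human-coded rules sit here."""
    banned = ["example banned phrase"]
    if any(term in draft.lower() for term in banned):
        return Response(text="I can't help with that.", blocked=True)
    return Response(text=draft)

def answer(query: str) -> Response:
    """One pass through the stack: retrieve, reason, then filter."""
    context = retrieve_context(query)
    draft = plan_and_reason(query, context)
    return safety_filter(draft)

print(answer("Who was the greatest shortstop of all time?").text)
```

The point of the sketch is that the learned model sits in the middle of a plumbing diagram written by people, and every joint in that plumbing is a place where a rule can misfire.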

This is the essential bind now confronting the technology: to become more capable and independent, the neutrality of the underlying training process must be tempered by human hands. Increasingly, where those modules are placed is the key point of differentiation between the companies competing in the space. How Grok differs from ChatGPT or DeepSeek is a function of the data set, the “training” feedback, and then the placement of weighted sub-systems and rules.

Even when all programming is done by humans, as with conventional computing, the opportunities for chaos are vast. The “white genocide” mishap demonstrates that when you are effectively running a black box, adding human flourishes on top invites issues that are hard to diagnose and harder to resolve. A sprocket pops off somewhere that bears no relation to the instructions given.

For years, the rationalist community online has warned about the “AI singularity”: the point at which a computer vastly more intelligent than us can effectively bend humanity to its will and whims. The reality may be slightly more prosaic. In order to expand in scope and capability, AI systems may become increasingly dependent on the interaction of human-coded cogs and subroutines, whose relationship with the model itself is too complex to untangle. As much as engineers, they may need doulas and empaths.


Gavin Haynes is a journalist and former editor-at-large at Vice.

@gavhaynes