Several big stories in AI this week. I won’t ever try to keep up with all the AI news, but there are some notable items out there right now.
Disney and Universal sue Midjourney
Maybe the biggest news is how the entertainment industry has gone after AI this week. Disney and Universal together are suing Midjourney, a major provider of image-generating AI. The suit filing shows compelling examples of the Midjourney engine producing faithful likenesses of copyrighted characters like Shrek, Buzz Lightyear, and Darth Vader, to name just a few. The studios hit hard, describing Midjourney as “the quintessential copyright free-rider and a bottomless pit of plagiarism.”
To me the suit looks very strong, and I’m glad to see it. The generated images clearly are Shrek and Buzz and company, as much as anything is a fictional character; the idea that these images are merely coincidentally similar to those characters rings hollow to me. The suit alleges that the plaintiffs asked Midjourney to cease and desist many times, and contrasts the company’s apparent unwillingness to suppress infringing images with the way it imposes other guardrails — around sexual or violent content, for example — as clear evidence that it could suppress these images if it wanted to.
A person could quibble, and Midjourney surely will, with the suit’s claim that in response to user prompts, the Midjourney engine “accesses data about Disney’s Copyright Works that is stored by the Image Service, and then reproduces an output that copies [character X, Y, or Z].” Technically, the training data sources are not stored in a way that allows them to be retrieved or reproduced “directly.” Instead, the source data is atomized into vectors and training weights. This is behind the claim that LLM training is transformative and thus exempt from copyright — an argument I find weak for text and completely untenable for images. One could also quibble as to whether LLM inference amounts to “reproduction.” I think this line of argument is extremely weak; insofar as the engine spits out images with recognizable Disney characters, it is clearly “storing” and “reproducing” them by some reasonable definition. The likenesses of the copyrighted characters are perfectly clear:
In fact, if one visits the Midjourney web site, an image of the Mandalorian is featured on the home page gallery, at least this morning:
This despite the litigation.
I think Disney and Universal have a very strong case, and I think it could easily extend to text. I hope they win.
Apple: “Large Models can’t reason”
The AI press is also a bit agog over a paper Apple released this week about the capabilities of Large Reasoning Models, the research-focused models and modes that major providers have been releasing, like ChatGPT’s Deep Research, or the same concept from Anthropic’s Claude or Google Gemini. Apple’s claim is, in a nutshell, that these models’ problem-solving chops break down catastrophically under certain conditions. They can’t reliably solve the Tower of Hanoi problem, which bright kids and dedicated computer programs can solve pretty easily. These models, Apple concludes, don’t really “reason.”
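(For a sense of scale: the puzzle itself is a textbook exercise in recursion. A complete solver fits in a few lines; here is a minimal sketch in Python, with peg labels of my own choosing, just to show how little it takes by conventional means.)

```python
def hanoi(n, source, target, spare, moves=None):
    """Solve Tower of Hanoi for n disks; return the list of (from, to) moves."""
    if moves is None:
        moves = []
    if n == 1:
        moves.append((source, target))
        return moves
    hanoi(n - 1, source, spare, target, moves)  # park the top n-1 disks on the spare peg
    moves.append((source, target))              # move the largest disk to the target
    hanoi(n - 1, spare, target, source, moves)  # stack the n-1 disks back on top of it
    return moves

print(len(hanoi(8, "A", "C", "B")))  # 255 moves, i.e. 2**8 - 1
```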
Well no. No, of course they don’t really reason. If you’ve coded simple LLM-like programs like travesty generators, or messed around under the hood of a modern LLM, you know they’re nothing more than statistical predictors, guessing what word should come next in a text stream. The fact that this simple procedure, brought to massive scale by computational brute force, can produce highly useful results is itself a beautiful finding. But token prediction, no matter how sophisticated, is not only not reasoning, it is not even reasoning-like. It’s not “partway there and just needs more compute.” That’s sort of like suggesting that if you scale up paint-by-numbers far enough you get fine art. They’re just not the same kind of thing.
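If you’ve never written one, a toy travesty generator makes the point concrete. The sketch below is my own illustration, not anything from Apple’s paper or a real LLM: it just tallies which words follow which in a text you supply (the filename is a placeholder), then guesses its way forward. A real LLM is this idea scaled up astronomically, with learned weights instead of raw counts, but the job is the same: predict the next token.

```python
import random
from collections import defaultdict

def build_model(corpus, order=2):
    """Map each run of `order` words to the words observed to follow it."""
    words = corpus.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def generate(model, order=2, length=50):
    """Start from a random key, then repeatedly guess a plausible next word."""
    out = list(random.choice(list(model.keys())))
    for _ in range(length):
        candidates = model.get(tuple(out[-order:]))
        if not candidates:
            break
        out.append(random.choice(candidates))
    return " ".join(out)

# Usage: feed it any text; the output reads fluently in spots but "knows" nothing.
# print(generate(build_model(open("some_text.txt").read())))
```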
For a cogent read on all this, see the comments of scientist Gary Marcus.
There have been a number of voices claiming that this research is somehow sour grapes from one of the few big-tech pillars (Apple) without a major LLM to its name. And there have been a number of critiques suggesting that Apple artificially hamstrung the models it tested in various ways (for example by not letting them use additional tools, a capability most of the large models now have). Cogent discussion of this and other issues here. There’s a school of thought that, as LLMs get bigger, they exhibit emergent behaviors not fully explainable by their underlying mechanics. Though I think that remains to be seen, I’m open to it.
To me the larger takeaway is that one needs to have a very clear view of what these models do well and less well, and how they do it. They can’t efficiently learn algorithms on their own. They don’t seem able to create novel inferences. They can and do confidently hallucinate.
The problem is that they pass the Turing test, in the sense that they fool people into believing they’re something they’re not, and that they do something they don’t. They sound like confident, knowledgeable humans, often ones who tell us what we most want to hear. (Consider also the analogy to fortunetelling). But they don’t have a mental model of the world, nor any commitment to veracity for its own sake. And they don’t reason, certainly not in the sense we would mean. Yet they persuasively imitate all three of these.
What’s more …
AIs have politics
The most notable thing I saw this week was also kind of a “darn it” moment for me, because some researchers have apparently beaten me to something I thought was an absolute must-do: subjecting AIs to psychometric assessment.
The initial discussion is here: When AI Shows its Politics
The gist of the research, which has not yet been published: first prime AIs to lean either left or right, then subject them to assessments of left- or right-leaning authoritarianism, and see what happens.
The claim is that when you push an AI left, it goes moderately left, but when you push it right it goes very, very hard right.
Interestingly, “in neutral mode, AIs are twice as anti-authoritarian as average humans.” But, as the writer goes on to point out, AIs aren’t being used by neutral humans. They’re being used by humans whose language (and hence whose apparent mindset) they adopt and echo. And when you nudge them right, they go that way hard.
Even after just a short while of seriously using LLMs, it’s clear to me that they can change the way we think, and hence behave. My first post here was optimistic about this potential. I still believe what I wrote there, but it has to be added that LLM discourse seems at least equally likely to amplify what’s bad as what’s good.
I admit this is a dark turn of thought, but I worry that it’s only a matter of time until an LLM is blamed for radicalizing a school shooter, ending a marriage, or saying the words that nudge someone into suicide.
Over the years I've worked with plenty of people who "sound like confident, knowledgeable humans, often ones who tell us what we most want to hear" but don't actually know anything.
LLMs' shortcoming is in their name: Large Language Models. They don't actually know anything; they just string words together in confident and believable ways that are usually correct. The more you use LLMs, the more you realize AGI feels so close but is still far away. It's like full self-driving cars: Tesla was "almost there" 10 years ago, and now it works in "almost all" situations, but the same will be said years from now.
"They can’t efficiently learn algorithms on their own. They don’t seem able to create novel inferences. They can and do confidently hallucinate."
Depending on how you define hallucinate, novel, algorithm, etc., doesn't this describe many people? I include myself, at least some of the time.