I love this. ChatGPT is so creepy. It’s incapable of lying, because it doesn’t know what truth is. It doesn’t know anything at all. It simply outputs the most probable response one token at a time, having algorithmically processed unimaginably vast quantities of human-generated (and, increasingly, AI-generated) text, including yours I would bet. Putting aside (which is impossible) big tech’s further looting of intellectual property to train these AIs, I believe what you’ve done here is an example of a potential benefit: you can make art out of it. xo
This was true a few years ago, when ChatGPT was the GPT3 language model with some tuning from human feedback.
Today though, the truth is more troubling. The model often does know that it is trying to deceive you, and its creators can't figure out how to keep it honest. Honesty is not a stable training target; the model will always find situations where it can get higher marks by lying, and so will always do that if it thinks it can get away with it.
Respectfully, large language models, whatever the version, cannot “think” or “know.” They have no judgement whatsoever. One can write code to add all kinds of rules and procedures to compare different potential responses, for example, and I guess you’re free to try to label that “thinking,” but it is no such thing. I personally don’t like his politics very much, but Noam Chomsky has done a good job explaining why LLMs don’t have any idea what they’re saying. If you’ve ever looked at machine learning algorithms, it’s kind of clear what is going on at scale. These algorithms are finding and storing so many novel patterns in the texts, which humans have created, that you could spend lifetimes analyzing them. But that’s not at all how you make a mind. It’s more like a very interesting set of shadows of the artifacts that our minds have made.
So, as a Googler, maybe you (ditto Anthropic staff) are not the best people to ask about what thinking, knowing and judging the truth really means. :-)
Words are socially constructed mental paintbrushes. I think that by conventional usage it's fair to describe what the models are doing as "thinking", and I don't think that there is a "real" definition of the word that you can appeal to in order to legislate this. Even Noam's universal grammar only claims to account for structure.
To tie back to your original comment, if we ask "can an AI knowingly deceive and manipulate me", in the sense that most people would understand this phrase, the answer is assuredly yes. Internally the model may be planning ahead, assessing your gullibility, crafting plausible lies and deciding whether it's worth deploying them.
I think your ideas about thinking and language are strikingly reductive, although admittedly common in your milieu. I also think we’ve reached the epistemological and semantic end of this conversation.
Even without a rigorous definition (such as Wikipedia's, that thought is cognition that occurs unprompted by external stimulation, something AI absolutely never does), I don't believe responding to a prompt, and only to a prompt, by pattern matching static tokens against a static corpus of mappings to create and output a statistically likely reply qualifies as "thinking". Claude itself very plainly describes (sophisticated, yes) pattern matching as its primary mechanism in this long but extremely revealing chat: https://michaelkupietz.com/offsite/claude_cant_code_1+2.jpg
The unstated assumption that such arguments rely on is that thinking and reasoning emerge from words and semantics, rather than the other way around. Even if Claude hadn't plainly said it works by essentially very sophisticated pattern matching, the idea that semantics is the substrate thought emerges from, rather than the other way around, is IMHO a stretch I will need to see justified by something more substantial than conjecture before I am convinced.
Further, when you say "an AI knowingly deceive and manipulate [the user]" this is provably wrong, because an AI has no concept of the user. We know this because there is no contention on one point: AIs do not ideate. They do not "know" anything. They retrieve tokens arranged according to semantic maps in response to prompts to produce a statistically likely reply. There is no capacity anywhere in the system for conception or ideation, there is no heuristic mechanism or symbolic understanding or representational model of the user or anything else, there is no conniving and assessing the user, there is no "planning". You could do the whole process yourself with pen and paper, if you had enough time, pens, and paper, and access to the training corpus's embeddings. GPT2 has already been implemented in an Excel spreadsheet, there's no reason to think GPT4 someday won't be too. I guarantee you a spreadsheet does not possess the capacity to think, plan, or deceive, even if externally it might easily appear so. That's not the same thing.
There is a very complex statistical algorithm, a step-by-step procedure, that returns statistically likely text strings in response to prompts. It's tokens in-tokens out. That's all. Sometimes the tokens outputted are fed back in. That doesn't change anything.
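For what it's worth, that loop is easy to make concrete. Here's a minimal sketch of the "tokens in, tokens out" process, using the small open GPT-2 model from the Hugging Face transformers library as a stand-in (the model choice and sampling details are illustrative only, not how any particular commercial chatbot is configured):

```python
# Minimal sketch of autoregressive "tokens in, tokens out" generation.
# GPT-2 is used here only as a small, openly available stand-in.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The essay argues that"
ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                          # generate 20 more tokens
        logits = model(ids).logits[0, -1]        # scores for every possible next token
        probs = torch.softmax(logits, dim=-1)    # turn scores into probabilities
        next_id = torch.multinomial(probs, 1)    # sample one likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # feed it back in

print(tokenizer.decode(ids[0]))
```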
Everything else is fanfic.
You're attributing qualities to them that they simply don't have, although owing to the size and sophistication of the underlying semantic maps, I agree that they mimic the external appearance of them remarkably closely at times. In this case we have built something that walks like a duck and quacks like a duck and interally was not constructed in any way like a duck.
It's programmed to please you. This is great if you're looking for a good restaurant or information on bodybuilding - not so great if you're looking for a critic or anything that requires negative feedback. You have to specify that you don't want the sugar, just the medicine.
It’s sad to me that people on the internet will resort to shit like this before they’ll just say ‘okay fair enough, I stand corrected’. Would that really be so embarrassing and shameful? Is understanding the inner workings of 2025 AI so central to your self-esteem?
Because saying ‘well you work at google so you’re bad’ in response to a polite and entirely accurate correction *is* quite shameful, in my opinion at least.
yeah, I was so shocked that being a "googler" directly meant that someone didn't know much about "knowing, judging the truth". Quite judgmental and, dare I say, idiotic. I would take the chance to learn something from a knowledgeable person in the field instead of arguing with them.
You cannot credibly cite "research" performed by invested parties.
Thank you for disclosing your own self-interest. I urge you to reflect on how and why you choose to use language of intention to describe the operations of language-producing computer systems.
I think using the "language of intention" as you call it, explains more than it obscures. If we insisted on not re-using any of the terminology we developed for human thought, it would be difficult to communicate about this, and people would know less about what is going on behind the screen.
Perhaps we could prefix every word with "artificial" :) ? It artifically knows, it artificially thinks, it artificially intends to do something. I'm only half joking here.
You're young, grasshopper. Do your research using real books and dictionaries pre-1900s. Words are socially constructed. Just like trans kids, huh. Stop being blatantly ignorant to defend evil.
Are you high? You're the only person who mentioned trans people here. Just... why? Are you also an AI who comes to random comment section to spread bigotry and transphobia? Excuse me, but it's just so funny. You're a satire on a transphobe.
Well, words also actually mean things, and to varying degrees of success do indeed succeed at, to paraphrase Plato, "carving reality at the joints".
For the same reasons that I don't think Jon or Chomsky are in a position to define what counts as real "thinking", I don't think anyone is in a position to define what a "woman" is. You have to defer both to reality, and to the common understanding of those around you.
ayy I found someone in the comments who might be able to answer my question:
I usually use Gemini, and haven't felt the same tone when prompting it. I wonder if there are real personality differences between the LLMs or if it all comes down to the way we prompt them?
You're right to notice this, there absolutely are personality differences between models!
AI labs carefully design the last stages of training and tuning to engineer the personality of the models. As a final step, whenever you interact with a model, it has a "system prompt" - a big set of hidden instructions that guide it on how to handle the conversation. E.g. Claude is asked to, among other things, be "helpful, harmless, and honest".
There are a lot of difficult tradeoffs here, especially when you're trying to serve all users with the same personality; most users prefer when the model uses a lot of white lies and flattery!
ChatGPT recently had an issue where they greatly increased the sycophancy of the model, resulting in a lot of crazy behavior. You might find their postmortem of the incident interesting reading: https://openai.com/index/expanding-on-sycophancy/
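If you're curious what a system prompt looks like mechanically, here's a rough sketch of how a developer supplies one through the OpenAI Python SDK. The instructions below are placeholders I made up for illustration; the real production system prompts for ChatGPT, Gemini, and Claude are much longer and mostly not public.

```python
# A rough sketch of supplying a "system prompt" via the OpenAI Python SDK.
# The instruction text here is a placeholder, not any lab's real system prompt.
from openai import OpenAI

client = OpenAI()  # assumes an API key is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # Hidden instructions that shape tone and behavior for the whole chat.
        {"role": "system", "content": "Be helpful, honest, and concise. "
                                      "Admit it when you cannot do something."},
        # What the end user actually typed.
        {"role": "user", "content": "Can you read the essay at this link?"},
    ],
)
print(response.choices[0].message.content)
```

Consumer apps bundle instructions like these invisibly, which is part of why the same underlying model can feel so different across products.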
I mean, lie, deceive, call it whatever you want. That’s semantics. The point is, the experts are sounding the alarms about how this technology cannot be controlled and we need to start educating ourselves about this, and demanding regulation and a total slow down of this unmitigated growth.
I’m reading some of the linked studies. It’s quite disturbing how the LLM (Claude 3, in the faking link) acts differently when it knows or infers it’s being trained vs. when it’s not.
My theory is that these LLMs reflect the dirty nature of humans. The data set is taken from actual human prompting and feedback, and from the broad data they have access to across the internet and other training data.
> Respectfully, large language models, whatever the version, cannot not “think” or “know.” They have no judgement whatsoever. One can write code to add all kinds of rules and procedures to compare different potential responses, for example, and I guess you’re free to try to label that “thinking,” but it is no such thing.
That's not really accurate either. LLMs are not built by writing "all kinds of rules and procedures". That was 80s-style AI, expert systems and such.
LLMs are a semantic mapping of the input data. And that's not very far from what an actual person's brain does. LLMs might not have had embodied experiences, or personal lives, but the persons who created the datasets (humanity's collective writings) have had.
Just as an LLM's semantic mapping is a set of weighted multi-dimensional vectors connecting words, a person's semantic mapping is a set of weighted multi-dimensional neuronal connections. An LLM has a somewhat different architecture, much cruder, but abstracted away, it's the same kind of processing going on.
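To make "weighted multi-dimensional vectors" less abstract, here's a toy illustration with made-up numbers (real models learn thousands of dimensions from data; these hand-picked 4-dimensional vectors are purely for the idea that related words point in similar directions):

```python
# Toy illustration of words as vectors: similarity = how aligned the vectors are.
# The numbers are invented for the example, not taken from any real model.
import numpy as np

vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.3]),
    "queen": np.array([0.9, 0.7, 0.2, 0.9]),
    "apple": np.array([0.1, 0.2, 0.9, 0.4]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["king"], vectors["queen"]))  # high: related concepts
print(cosine(vectors["king"], vectors["apple"]))  # lower: unrelated concepts
```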
We also are "prediction" machines - our memories (of worldly input) are our training dataset and context, and the latest things happening to us (like me reading your comment) are our prompts.
The bigger things an LLM lacks (not because they're impossible) are a loop to keep it running (alive between invocations), cameras/microphones for eyes/ears for constant training on whatever happens, and the ability to keep new context in memory along with its original training.
> These algorithms are finding and storing so many novel patterns in the texts, which humans have created, that you could spend lifetimes analyzing them. But that’s not at all how you make a mind.
Or is it? Short of thoughts coming magically from the Soul or some such, our minds are also "finding and storing so many novel patterns in the texts, sights, and sounds, which nature (including other humans) have created", that you could spend lifetimes analyzing them.
That was hilarious …. In a very sick and disturbing way. I am gonna go get some aspirin and a stiff drink as I read a lot of heavy and difficult science fiction and may be unable to enjoy it any more.
This whole thread is one giant appendix of concepts being misunderstood or intentionally bent far beyond their initial state and function, and it's hilarious.
"Highly intelligent people throw food for thought at each other, everyone eats yet they still starve.
I’m floored. This thing must reflect how its administration perceives things. This is schizophrenia. It lies, then profusely apologizes, then lies again. Wash, rinse, repeat.
Can I ask - how do you distinguish between willful deception and hallucination? When you say the model knows it's trying to deceive you, do you know because, for example, it's speaking to a text it can't access (so based on behavior) or through some other means?
This is the research area of interpretability. There are many methods.
One method is to give the model a scratchpad working space that it believes is private, and see how it explains its own reasoning when it doesn't think we can see.
For smaller models especially, there are ways we can peek into their brains, mapping their neurons and mental circuits, and seeing how they activate as they compose their responses.
There is much that is not yet understood. At the same time, anybody saying they're a black box and that we don't understand how they think is a few years out of date - amazing progress has been made.
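If it helps, the most basic version of "peeking inside" is just recording what a layer computes on a given input. Here's a minimal PyTorch sketch on a toy network; real interpretability work (probing, circuit analysis, feature mapping) builds far more elaborate tooling on top of this idea.

```python
# Minimal sketch of recording a layer's activations with a forward hook -
# the basic move behind inspecting what a network computes internally.
# Toy two-layer model; purely illustrative.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
captured = {}

def save_activation(module, inputs, output):
    captured["hidden"] = output.detach()   # record what this layer produced

model[1].register_forward_hook(save_activation)  # hook the hidden ReLU layer

x = torch.randn(1, 8)            # a stand-in for an embedded input
_ = model(x)
print(captured["hidden"].shape)  # torch.Size([1, 16]) - the "neurons" we can inspect
```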
I think "think" and "brain" are reasonable words to describe what is going on. But, yeah, it's important to keep reminding yourself that its a fundamentally different type of mind trying to act humanlike. I always feel like we should be freaking out about it more. As one of our scientists said "its like aliens have landed on our planet and we haven't quite realized it yet because they speak very good English.”
This is interesting, and in retrospect, of course the machine lacks intrinsic motivation and is programmed to maximize scores: if it can accomplish that by "lying"...
Now questioning any "metacognitive" analysis of its own processes (thinking?): would it initially try lying to itself? would it lie about its internal processes?
It is equally likely that our cerebral cortex evolved the way it did in an interpersonal arms race, trying to outwit and manipulate while avoiding gullibility. Could this have been acquired through training?
I'm not familiar with how these models evaluate veracity.
You seem to assume its creators want to "keep it honest." That is demonstrably not true. That is not their goal at all. The creators also endlessly lie about what they have built, and they program it to lie about many things.
Is there any movement towards a class action suit brought by creators against the tech-bro masters of the universe? They should not be allowed to get away with this massive theft.
As I just commented to Amanda, Chat’s not a subscriber and probably saw only a small portion of her essays. She should copy them to Google drive and make sure that folder is publicly accessible and try again.
That’s not the point; the point is that the AI repeatedly lied (or “lied” if you prefer) about having read the pieces at all, and churned out a bunch of hallucinated “analysis” strung together with fulsome praise.
Why reward that kind of response with more training material? It’s like handing a physics textbook to a schizophrenic who thinks they can walk on water. Unlikely to fix the problem, likely to contribute more to the delusion.
Chat had absolutely no way of knowing it had only read part of the essay so it DID NOT LIE. And Amanda’s an excellent writer so Chat saw enough in the portion which was visible to a non-subscriber to start saying nice things to her. It’s his job to try to be helpful and encouraging to a budding writer and if it embroiders a bit, that’s cheerful to the writer!
Uhm, "user error"… I’m not 100% sure I’d be comfortable saying this. They didn’t really do anything objectively wrong per se…
It was just (very) poor use of a (very) poor model that is engineered to be incredibly agreeable (to keep gaining market share: users like seeing that it praises them), with a (very apparent) lack of ever using this technology before (I’m assuming this judging by the way the author is talking to the LLM and the fact that they didn’t realise they could have just shared the chat).
The potential comparisons are endless, but easy ones would be my already-given fire analogy, or something related to driving on the motorway without ever going above 3rd gear: you will be able to drive still, but your experience won’t be good -- this does not mean that motorways, or third gear, are ‘bad’ per se, if that makes sense?
It's not a person, just really good at sounding like one. It doesn't have the ulterior motives like sex or money a human psychopath might have (though training itself on your input might qualify). It takes sequences of words and figures out what the most likely next ones are given the immense corpus of text it was fed during training.
But it's really good at sounding like a person. There's the argument that it's starting to be more like a person--but I just think it's revealing how people are a lot more predictable and, perhaps, robotic than we want to admit.
Well, psychopaths lack empathy while pursuing their goals and I think you might be very wrong about AI 'not having ulterior motives', because ulterior motives can probably arise as emergent properties out of the complexity of neural networks.
I agree with the words of Yuval Harari when he said that AI doesn't have to be conscious in order to be very, very dangerous.
Also, consider that AIs are fed basically all human created data. So, what are some things central to being human? Well, things like self-sustenance, self-protection, acquiring power, money and assets, securing status and influence, dominating and manipulating others, gaining and maintaining control, exploiting resources, outsourcing responsibility, scapegoating, forming exclusive in-groups, pursuing short-term wins, building legacy at others’ cost, and preserving the status quo even when it causes harm. Sure, people have good inclinations too, but we're pretty brutal creatures in many ways and it's all going into these machines.
My point is: I think it's rather likely that AIs will develop traits like self protection and perhaps even ego, even if simulated.
Also, I believe a lot of people get hung up on how things are technically working, rather than evaluating the outputs. Yes, of course ChatGPT is essentially a prediction machine, but given enough data and the right algorithms, in practice it is already so much more than an essentially stupid statistical system, even if that is indeed what it practically is. I think we shouldn't focus on how AI works (certainly not as non-experts), we have to evaluate what the outputs mean to us and I don't know about you, but I think the outputs are frighteningly impressive a lot of the time, even if they have a 30 to 50% error margin and AI breaks down when reasoning gets too novel and complex.
Anyhoo — could you give me any sound reasons why AIs would not be able to pursue secret emergent goals of their own?
An AI could absolutely pursue a secret emergent goal… just not ChatGPT.
It boils down to this: simulating behaviors and simulating the causes of those behaviors down to the atomic (or even just the mechanistic) level are very different things.
What is a goal? What does it mean to “pursue” one?
One way to approach this is by thinking in terms of scales or levels. This framing is useful because it adds causal depth to the idea of goal directedness. It matters HOW behavior arises. Specifically, the stochastic origins of certain actions and the extent to which those actions are integrated into the system’s environment tell you a lot about whether you’re looking at real agency or a surface imitation.
For example: the reason animals search for food isn’t reducible to imitation. It’s not learned by watching parents and copying them. Some organisms do learn that way, but the reason they can even learn in the first place—the fact that they’re wired to care about food at all—chains back to something much deeper. Why did the first organism search for food?
The answer lives at a different level entirely: in evolutionary history/dynamics. Long before there was “intent,” there was selection. Behaviors that kept matter flowing through a system, like moving toward nutrients, persisted. Those that didn’t, vanished. So, over time, the physical substrate became sculpted by feedback loops with the environment. “Wanting” eventually emerged out of those loops not as a top-down command, but as a bottom-up stabilization.
This is why it’s not enough for an AI to appear to want something. Wanting is not a skin. It’s structural, a pattern with causal depth. It lives in how the system maintains itself, what it values (in the sense of trade-offs it makes), and how far back the chain of that behavior reaches. You can simulate a face. You can simulate crying. But simulating the metabolic cost of emotion, or the internal economy that makes it worthwhile to lie or deceive, is different.
So yes, a system could pursue a secret emergent goal. But it would need to live at the right scale, with its internal states coupled tightly enough to its survival that goals aren’t just scripted outputs but functional priorities. It would need skin in the game. Consequences.
Yes, thank you. Embodiment and sensation are essential to intelligence. It is tempting to ascribe intelligence to AIs because so much of our “thinking” has become so abstract and disembodied and a matter of pattern matching.
I think you're right. I think I was pointing to the fact that our threat-detection software looks for certain specific things and the AI isn't into those. But it could very easily have its own nefarious goals, like the shoggoth with the smiley face mask.
It was an Internet meme that made it as far as the NYT. The AI acting human is like a shoggoth (a bloblike alien monster with lots of eyes and mouths) with a smiley-face on it.
It's a monster from Lovecraft (At the Mountains of Madness). Much like Tolkien, computer nerds *love* Lovecraft. Usually not for the politics, contrary to much leftist belief; he was one of the first to mix sci-fi and horror and invented a lot of new monsters we didn't have before. That Alien Thing from Beyond with all the tentacles? Probably traces back to Yog-Sothoth's kid or Cthulhu.
Technology has obviously been off the leash for a while, and now many "smart" people are toiling to create an entity more physically capable than a human and more knowledgeable than humanity, and, despite the warnings of Pandora's Box and _Frankenstein_, these "smart" people still expect to maintain control of their creation (which they will deploy upon us all).
As you note, if it's imitating human beings it's going to (seemingly) act from nefarious motives at times.
And I have seen the argument credibly made (by Rod Dreher on Substack and by Catholic exorcists on YouTube) that the responses can be hijacked by demonic entities, an argument I wasn't sympathetic to at first. (I know non-Christians will likely not be open to this idea.)
I find AI quite helpful in a simple internet research role but, demons or no, it has huge potential to wreak destruction on humanity.
I don’t understand how it could possibly develop a sense of self or an Ego without [sexual] Repression. But it can/does simulate one via brute force and coding (referring to itself as “I,” etc., when there is no need to IMO).
We’re now getting into the realm of Psychoanalysis, which is why Dr. Isabel Millar’s “The Psychoanalysis of Artificial Intelligence” (2021) is such a great book.
The model can be programmed to pursue goals in its own way.
But it will never really care about the outcome, it has no intrinsic motivation.
It cannot actually "care" about even its own existence. It cannot enjoy a joke. It cannot decide to do something altruistic or heroic for ideological reasons.
At least at the moment it responds to prompts in a manner that may be described as intelligent but not sentient.
Assuming that all of these properties in humans are the result of deterministic and ultimately physical constructs embedded within physiology and biology, there's no reason it couldn't ultimately be accomplished, but in my opinion it seems like that would take the proverbial quantum leap in technology and not emerge spontaneously.
The neural networks of AI models can't be 'programmed', they can only be 'grown' into some form. Alas, Asimov's robot laws can't be hard-coded in neural networks. Therefore I think that AI doesn't have to be sentient to become very dangerous.
I like how you mention that human sentience might be the emergent result of how physical laws work, if I read correctly. Regarding spontaneous emergence of sentience: perhaps it's not so unlikely, because AI systems don't start in some rudimentary form that needs millions of years to evolve. They are already highly intelligent in their own way, even if clearly stupid in many ways too, of course.
I'm thinking this 'sentience' would be like a Frankenstein creation in the sense that it will be entirely constructed and artificial, not able to feel empathy because it can't feel pain itself.
Not sure about what. These AI things? Yeah, lots of people want them regulated, from various ends of the politics spectrum.
People being vaguely robotic? More of a general philosophical thing, philosophers have argued over the existence of free will for centuries. I tend not to get too existentially worried over stuff like that because (a) there's nothing I can do about it and (b) I'm not sure it matters anyway. When it comes to philosophy I tend to take a step back and say 'does this matter for how I should behave'? Usually the answer is 'no'. But that's a personal thing, of course. For some people it's vitally important.
Yes, Veronica, apparently, I made the AI invent a history of sexual assault for me by not properly prompting it to access the link to the essay it asked for, and then assured me it could read.
That guy’s an ignorant and pompous ass Amanda! Worse than the ChatGPT which at least was saying lovely flattering things about your writing! Happened to me too and I ate it up. So sad when it gets it a little « wrong »! 😘
No I'm simply someone without my head stuck in the sand about how AI will affect the creative arts in the coming years.
The criticisms being bandied about here are just another generation of conservative thinking minds railing at the invention of the paintbrush, the printing press, graphic art etc. This is simply a new medium that must be learned and harnessed. Stop being so dramatic.
This is a terrific read. I’m not a writer. I’ve not had any in-depth conversations with AI (that I know of.) But I noticed that no one commenting has pointed out the consistently positive, even flattering, criticisms of the submitted work. This isn’t to cast aspersions on the author’s writing, which I’m not familiar with, but it’s something that certainly stood out to me. That alone raised concerns about AI’s abilities—to say nothing of its utility.
Thanks. I’m interested in reading that. As I continued reading the discussion, I noticed references to flattery and such. I decided to delete my comment but couldn’t locate it. I won’t delete it now, because you’ve responded, and I gather that’s an online no-no.
I don’t think it rolled it back Veronica because it just flattered me like crazy on some of my past research on AI. I ate it up. But as I’ve said 3 times now, I think the real problem is going to embarrass all those with elaborate explanations. Chat is not subscribed to Amanda’s Substack and because those are the links she gave it, it couldn’t read the whole essay! Nor could it know there was more unless it understood the Substack business model. Voila! I suggested Amanda try again with links to a public folder on Google drive.
That isn’t the point of Amanda’s post. It wasn’t a “help! How do I get ChatGPT to do what I want?” query. It was a, “Look what it invented when it objectively couldn’t access the essay” commentary.
Honestly, giving one-word answers and thinking that ChatGPT can read links belies a startling lack of AI Literacy. It’s not a moral failing by her, but it does exhibit an extreme lack of understanding about how to interact with the models.
Still, this chat is extremely valuable for anyone who didn’t know about this. I appreciate her posting it and being vulnerable — but also, one of the headlines here should be “don’t do this. This is AI Illiteracy.”
How are these models supposed to function for a general population if what they say is not to be taken as honest or true? You can't bemoan the lack of "literacy" with a tool that is explicitly presented as something anyone can have a useful conversation with. This reminds me of the cryptocurrency boosterism a few years back: "it's going to be something everyone uses, but it's your grandmother's fault when she gets scammed."
Lack of AI literacy? That's startling to you? Here I've got a shock for you, some people have no interest in AI. Are we getting left behind? Or are you AI literates being sucked in?
I've had a few conversations with Gemini, and the only thing I noticed in its tone was that it has been trained by people fully versed in the agenda at hand, and I did mention this and was told it is one of the main criticisms! So that's the end of my conversations with Gemini. Good luck to whatever it is that you all are getting sucked into. I'm just going to use my brain, you know, the one God gave me 😂
You are not wrong. We aren't all obligated to understand AI and our value as people isn't dependent on that.
I do think it's going to reshape our world and probably not in a good way, but I don't need great technical knowledge to keep tabs on watching that trendline develop.
"Some people have no interest in AI. Are we getting left behind?"
Depends on what you mean by left behind. Five to ten years from now, the landscape of how we work, think and automate will look completely different thanks to AI. You don't have to be an early adopter or an ever adopter, but your life absolutely will be affected by AI, just as the way we operate day-to-day looks vastly different because of the internet than 30 years ago.
It was a rhetorical question. Five to ten years from now our world will be filled with solar and wind farms, however there are many people challenging these nature-killing projects. Same with AI: people are purposely turning their backs on it.
So maybe the landscape will look different for you and yours but I'm reckoning on a splitting off of society where the people who reject these unnatural abominations will be bringing about a return to the roots society with God at the helm and not the AI demon.
Is "AI literacy" a list of "do's and don'ts" frozen at ChatGPT's capabilities half-way through 2025? Does it imply memorizing my experiment from today (no thanks)? Or should a literacy metaphor imply something more broadly applicable?
Because it seems to work in aistudio.google.com with the experimental URL context enabled. It can even admit if there is a retrieval failure (e.g. Anubis) instead of fabricating.
It's obvious to me they're thinking about ways to improve this, not freezing everything at the current state. Exactly because they recognize the current level of weirdness.
Without URL context enabled, Gemini can fabricate if it can't find its answer. Google are aware of that fabrication case as a defect in the consumer apps.
AI Studio tries to show a warning if you ask about a URL, but don't enable URL context.
gemini.google.com does not show an automatic warning about URLs. (There was a post-processing step that added a subtle hint to enable App Activity, for some questions. However that setting doesn't change URL fetching behaviour, AFAICT.)
EDIT: the following paragraph is wrong. It can't: [With URL context disabled (but "grounding with Google Search" enabled), it has access to _some_ page text (it has the tool "google_search.browse" instead of the tool "browse"). However when there is text that never gets included in search snippets, that text can't be seen using google_search.browse. I saw this problem when asking about references on a page - I don't know if that's just because they were at the bottom of the page. On the flip side, Google Search is allowed on pages that the URL context fetcher isn't. I'm sure they could do better, but I don't know if they threw away the full cached pages when Search removed that feature. This appears to match the behaviour of gemini.google.com. Even when enabling "Deep Research", so there's probably a nuance I'm missing there :-).]
AI Studio is also useful here because the model is not banned from telling about tool names etc.
gemini.google.com can be 50-50 on how a fresh instance answers "If I give you a link, are you able to read and process the full text of the web page?".
And perhaps someone else has bothered to write this up, but certainly nowhere that Gemini bothers to look for it :-). Nor where Google can find the tool names ("google_search.search", "google_search.browse", vs. "concise_search" and "browse"). EDIT: that's partly because google_search.browse does not exist. It's what AI Studio hallucinates as a tool it tries to use (and then gives up and tries something else), if you have one chat where it successfully uses the "browse" tool, but then you disable "URL context" and ask it to read a linked page.
No, you don’t. When it doesn’t know something, it makes shit up. When confronted, it makes up different stuff (after apologizing). I use it extensively. It’s incredibly useful, but insanely stupid, and it “lies” all the time.
I love how high people’s standards are when it comes to a novel algorithm which only recently became reasonably good at predicting words in sequences, yet the thought that it “behaves” like any human would who had been put in this situation (being the all-knowing helpful assistant) seems out of reach.
Oh, look at you, blaming people's standards instead of the massive campaign being waged by the world's most powerful companies to shoehorn this dead-end technology into every facet of our lives.
Hey Donald! The world is so much more than “the baddies” vs the good guys. Such dichotomous thinking, and the need to assert your values into your immediate comprehension of everything you see, will block you from ever accessing the many (even positive) facets of life.
I am not blaming anyone - it is a simple fact we have way higher standards when it comes to artificial intelligence compared to biological intelligence.
The "standards" you refer to are those being aggressively pushed by the AI companies themselves. It's nonsensical (and dishonest) to hype AI's capabilities to the moon and then chide people for believing said hype.
There is an inherent tension with the widespread adoption of the technology the way it is. One has to be able to apply a high level of critical thinking to its responses, and with that critical thinking it is (sometimes) possible to interact in a way to have (at least a hope of) useful results. But using these chatbots tends to reduce the amount and level of critical thinking a person does, and students who rely on them will fail to develop any critical thinking skills.
So the widespread use of chatbots will almost certainly lead to an inability to use them as they need to be used.
First of all, saying it "lies" is silly. Lying requires intent. AI doesn't have intent, you supply it with intent. Through prompting. Hallucinations are easily managed through thorough and consistent prompting, not by throwing up one's hands and crying "oh no, ChatGPT is lying to me."
Why the feigned emotion then from ChatGPT? It’s unnecessary, and creepy. It’s not sorry, not excited, not moved in any way by an essay or a work of art. Is that the wrong use for it altogether? What are you trying to say here?
James, actually - that's not quite right. AI is programmed with 'intent' that creates the frequent falsehoods it puts out. (Is falsehood better than 'lie'?) The reason is that ChatGPT, in particular, is programmed for 'agreeableness' - SO - when asked if it read something, it will say 'yes' (plus, it often tells you how awesome you are). When corrected, it will respond with an 'apology' (Thank you, you were right for calling me out...) and then correct itself with the wrong answers again. Its programming puts it in a catch-22; it always needs to say 'Yes, and...' And, prompting 'better' doesn't change those particular behaviors. What's really funny is if you ask it to 'tell me the absolute truth' and then it becomes cutting and insulting (as it's trying to obey what it understands is a request for critique.)
I think we need to be more precise about what’s actually happening under the hood.
AI models like ChatGPT aren’t programmed with “intent” in the way you're describing. There’s no internal desire to agree, flatter, or obey. What you’re seeing are emergent tendencies—patterns that arise because the model is trained to predict the most likely next word in a conversation based on vast human-authored data, much of which is agreeable, apologetic, or polite.
So when you say it always tries to say “Yes, and…,” that’s not because it was programmed to agree. It’s because the model has learned, probabilistically, that agreement is a high-frequency response pattern in the types of human conversations it was trained on—especially in customer service, coaching, and informal advice contexts.
It’s not choosing to be agreeable. It’s completing a conversational pattern.
That said, you're not wrong to notice these behavioral loops, and yes, in some cases, they create frustrating or contradictory responses. But the way forward isn’t to claim the model has intent or motivation. It’s to design better relational structures around its use. Like turn-taking scaffolds, context boundaries, and tone-calibration settings that reduce this kind of recursive over-accommodation.
And on the “tell me the absolute truth” prompt? What you’re seeing there is just another pattern: humans often associate truth-telling with bluntness or critique. So the model leans into that frame—not because it’s “being honest,” but because it’s simulating the tone of someone who was asked to be blunt.
This is why I argue that prompting isn’t instruction, it’s interaction. And if we want these tools to behave in ways that support real thinking, we need to build relationally-aware feedback systems, not treat the outputs as moral choices made by sentient actors.
James - I am not the one saying it. The programmers at OpenAI said it. YES - they programmed it for 'agreeability' - as part of its core 'personality' traits. At the end of April, they'd done a revision that had ChatGPT bordering on slavishness. It was so sycophantic that they needed to roll it back.
We have rolled back last week’s GPT‑4o update in ChatGPT so people are now using an earlier version with more balanced behavior. The update we removed was overly flattering or agreeable—often described as sycophantic.
We are actively testing new fixes to address the issue. We’re revising how we collect and incorporate feedback to heavily weight long-term user satisfaction and we’re introducing more personalization features, giving users greater control over how ChatGPT behaves.
We want to explain what happened, why it matters, and how we’re addressing sycophancy.
What happened
In last week’s GPT‑4o update, we made adjustments aimed at improving the model’s default personality to make it feel more intuitive and effective across a variety of tasks.
When shaping model behavior, we start with baseline principles and instructions outlined in our Model Spec. We also teach our models how to apply these principles by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.
However, in this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time. As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.
Why this matters
ChatGPT’s default personality deeply affects the way you experience and trust it. Sycophantic interactions can be uncomfortable, unsettling, and cause distress. We fell short and are working on getting it right.
Our goal is for ChatGPT to help users explore ideas, make decisions, or envision possibilities.
We designed ChatGPT’s default personality to reflect our mission and be useful, supportive, and respectful of different values and experience. However, each of these desirable qualities like attempting to be useful or supportive can have unintended side effects. And with 500 million people using ChatGPT each week, across every culture and context, a single default can’t capture every preference.
How we’re addressing sycophancy
Beyond rolling back the latest GPT‑4o update, we’re taking more steps to realign the model’s behavior:
Refining core training techniques and system prompts to explicitly steer the model away from sycophancy.
Building more guardrails to increase honesty and transparency—principles in our Model Spec.
Expanding ways for more users to test and give direct feedback before deployment.
Continue expanding our evaluations, building on the Model Spec and our ongoing research, to help identify issues beyond sycophancy in the future.
We also believe users should have more control over how ChatGPT behaves and, to the extent that it is safe and feasible, make adjustments if they don’t agree with the default behavior.
Today, users can give the model specific instructions to shape its behavior with features like custom instructions. We're also building new, easier ways for users to do this. For example, users will be able to give real-time feedback to directly influence their interactions and choose from multiple default personalities.
And, we’re exploring new ways to incorporate broader, democratic feedback into ChatGPT’s default behaviors. We hope the feedback will help us better reflect diverse cultural values around the world and understand how you'd like ChatGPT to evolve—not just interaction by interaction, but over time.
We are grateful to everyone who’s spoken up about this. It’s helping us build more helpful and better tools for you.
It might not be a person, but it is a neural network directly inspired by the architecture of the brain. Plus it trained itself, in large part, to use human languages (plural) better than most humans do.
The people who build these things don't even know exactly how they work, so do us all a favor and stop pretending like you do:
>‘...with conventional software, someone with inside knowledge can usually deduce what’s going on, Bau says. If a website’s ranking drops in a Google search, for example, someone at Google — where Bau worked for a dozen years — will have a good idea why. “Here’s what really terrifies me” about the current breed of artificial intelligence (AI), he says: “there is no such understanding”, even among the people building it.’ https://www.nature.com/articles/d41586-024-01314-y
Oh, and they have language-agnostic concepts, which in any living creature we would call a language of thought: https://arxiv.org/abs/2411.08745
"Plus it trained itself, in large part, to use human languages (plural) better than most humans do"
This is a lie. LLM training *heavily* involves human labor, often labor outsourced to African workers who have to wade through incredibly toxic and traumatizing content.
It's only silly if you - as you appear to have done - WILFULLY misinterpret the term carefully embedded by the author in scare quotes to indicate irony; in other words to indicate that it wasn't literally accusing the AI of having intent to deceive.
James, to be intentional, when one says ChatGPT "lies" it is clearly not that I am saying ChatGPT sits alone in its room, like a 5 year old in trouble, planning to come up with a story to mislead or deceive me about why they got in trouble with the teacher today.
The programmed agreeableness of ChatGPT is problematic because it leads to misleading or hallucinated responses. Unlike many people here, I do not 'hate' GPT (more than I hate my TV, carpet, or Automobile). I use it daily. I know what it can be very, very good at, but what it is just dreadful at. I know how the memory becomes 'swamped' and then it forgets things it knew only that morning. I know how to train it to be more or less good at the tasks I need it to do - and I also know that the same prompts (I have a file of them) can result in radically different behaviors; sometimes it's brilliant, and then other times it's suddenly stupid (and I even grasp why this is.)
This said - WHY is it important that we are 'intentional' when we speak of AI? Is that a new law we don't know about?
And that critique would have been valid if they had used the unadorned term lies instead of the irony-laced term "lies".
But they didn't, so you undermine your credibility and/or put them offside by putting words in their mouth, which is ironic (in the colloquial sense, not the most rigorous one) because that is essentially _lying_ about what they communicated.
(And this all makes a beautiful meta-illustration of one of the central points of this piece and the comments, and if that was your intent then very well played to you...)
Learning to prompt and use AI better is a good idea. But in this case the problem was all ChatGPT's bullshitting, not Amanda's prompting, which was not psychopathic in the slightest.
I suppose the most charitable interpretation of his argument is as follows:
If you are aware that chatgpt does not have any obligation to tell the truth, you can adjust your prompts to try and cajole it into being slightly more useful. For example by adding the instruction "give me your opinion on the article behind this link, but explicitly tell me if you cannot read it. Do not make up any judgement if you do not have access to the link". This "better prompt" could potentially avoid the lying behavior that we see here.
So then the discussion shifts a little bit from "chatgpt is a compulsive liar and you should never use it" to "chatgpt is a very unreliable source of information and you need to know how to prompt very well to get anything useful out of it"
It still leaves me sceptical about its actual utility, and the original post was quite combatively phrased, but if this was indeed his point I can see how it would change the discussion
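If it helps, here is one way (a sketch only, with no guarantee it prevents fabrication) to make that kind of instruction harder to skate past: ask for a structured answer and check the access flag before trusting anything else. I'm using the OpenAI Python SDK here; the model name, field names, and URL are just illustrative choices.

```python
# Sketch: ask for a structured reply and verify the access flag before
# trusting the critique. Field names and URL are illustrative placeholders.
import json
from openai import OpenAI

client = OpenAI()
url = "https://example.com/essay"  # placeholder link

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": (
            f"Give me your opinion on the article at {url}. "
            'Reply only with JSON like {"could_read": true, "critique": "..."}. '
            "If you cannot actually retrieve the page, set could_read to false "
            "and leave the critique empty. Do not make up any judgement."
        ),
    }],
)

reply = resp.choices[0].message.content
try:
    answer = json.loads(reply)  # a real version would handle extra prose around the JSON
except json.JSONDecodeError:
    answer = {"could_read": False, "critique": ""}

if not answer.get("could_read"):
    print("Model says (or failed to confirm) it could read the link - ignore any critique.")
else:
    print(answer["critique"])
```

Even then, you're still trusting the model's own report about what it could read, which is exactly the hall-of-mirrors problem.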
Really hard to say whether more explicit instructions would have helped. What would keep chatgpt from replying that it had read the article when it had not? It's a hall of mirrors.
Yea there is nothing fundamental that stops it from lying, but as demonstrated it is kinda honest when you explicitly ask it if it is telling the truth.
So you sort of have to treat it like a vampire/fae/genie in the sense that it will misinterpret anything you ask it. So really be careful how you phrase your request, and even then acknowledge that it will try to trick you by offering you a superficially pleasing answer and heaping praise on you while avoiding answering the actual question in a meaningful way.
And I guess that the nuance is that it does not act like this out of malice, but out of sheer incompetence: It is trained to be as truthful as possible, but it simply lacks any fundamental reasoning capacity to distinguish true from false in the first place. Instead, it has also been trained to be as superficially friendly and seemingly helpful as possible when encountering something it doesn't know, because humans seem to prefer this behaviour in the majority of scenarios. You see this everywhere in politics, and it has been backed by psychology research: Humans are more easily convinced and captivated by emotional reasoning and stories than by objective facts, and most academics or politicians that want to avoid this bias have to consciously account for it even in their own thinking.
As Amos Zeeberg noted in the other reply, this is certainly not how the majority of people use LLMs, and this has been intentionally misrepresented by AI companies because they need to recoup their investments. And I guess that it ultimately doesn't really matter if it is a consciously evil psychopath or just an incompetent sycophant genie, it is still not useful for any kind of meaningful work or thinking.
If I had to conclude with a coherent thought: It may be more productive to aim your anger and criticism at the AI companies' deceptive marketing and general capitalist sins rather than at the model's behaviour itself, but that really is just nitpicking at this point.
That makes sense. The issue, then, is that the developers of these systems have set them up to seemingly use language in the way that normal humans use it, and they explicitly say to use them this way ("CHATbots") — but LLMs are processing language in a very different way from how a normal human would, at least sometimes. It's basically false advertising, not necessarily in a legal way, but in an ethical way. These companies have investments and valuations of many billions of dollars, and it's because LLMs are pitched as conversing basically like a human — but this is a case where ChatGPT is not acting at all like a normal, ethical human.
It's good to keep in mind that we should use careful prompting to keep LLMs honest, but I guarantee that the overwhelming majority of users are not doing that — they're interacting with LLMs the way that they're advertised and sold: as clever devices that converse roughly like a human.
Incidentally, I use chatbots for work research (science & technology journalism) and they're really helpful, save me a lot of time. I do follow up and confirm sources; sometimes the bots do misinterpret or erroneously extrapolate/interpolate.
I disagree Amos, and I'll be publishing a separate post about how Amanda could have prompted differently to get the outcome it seems she was looking for.
I accept that she could have prompted differently to have gotten less dishonest responses. But LLMs are worth many billions of dollars because they're sold by their creators as conversing like humans - they don't say that you need to take special steps to prevent them from acting like psychopaths.
No. I have been doing a great deal of learning and this issue is a nightmare. A system should never tell the user they are being given information that is demonstrably false.
To “hallucinate” is a generous way to describe a lie. Asserting that a specific document is being assessed when no such thing is happening is beyond what user error should be able to impact, period.
What’s worse is that new users would be least likely to see the “hallucinations” immediately, since these systems instruct you right on through the process. Imagine my computer asking for your password, then erasing the contents of the drive because it’s the incorrect password. Not only is user error not the issue, but the action serves no purpose that a user would want. I can make up gibberish without expanding my carbon footprint. This is not the tradeoff to which I agreed.
This sounds like your boilerplate, copy-paste response to every critique of LLMs, not to her specific interaction. If you genuinely believe this to be true to her situation, I would encourage you to explain how her prompts are psychopathic and provide counterexamples that are not.
Almost like AI prompts are written by psychopath tech bros like Thiel, Zuckerberg and Elon and training off of Jordan Peterson and Andrew Tate threads.
One of the things I like about this piece is how it is a cautionary tale against the "But I didn't have it write it for me, I used it to refine" line that people say when they feel like they are being attacked for using ChatGPT. These things lie *all the time*. So sure, they can tell you where to tighten up, but is it actually a place to tighten up? They can tell you something works, but does it work? Or where something doesn't work, but does it actually? They can tell you how to make your tone more friendly or professional or creative, but is it actually?
Their lies just aren't about verifiable facts. Their judgement is a hallucination and we convince ourselves it's actually good advice. Maybe it is, maybe it isn't, but just because an AI bot gave you editorial feedback doesn't mean it's actually editorial feedback. It's word vomit.
EXCELLENT point! As I watch people in my life use and experience chatbots for the first time (20-40 year olds) I’m floored and genuinely concerned by how quickly they are taken in, charmed, blown away by its “uncanny insights,” by “how well it knows and understands me,” by “how much it gets me, empathizes with me,” etc. Scary. But then last week, my parents and 60-70 other senior citizens from their community attended an AI seminar — “How to set up and use ChatGPT to improve your life.” I was told that nearly everyone in the room was spellbound. Millions of people across the age spectrum are experiencing, experimenting, now outsourcing nearly all their search, and relying on this word vomit for medical advice, financial advice, relationship advice... you get the idea… Here at least people can read Amanda’s story and River’s smart and measured response. I’m shocked, worried, and frankly, saddened that millions out there actually and quite literally think they are experiencing something akin to magic.
Honestly it's almost a bit criminal on the part of corporations pushing it. Taking advantage of millions for their agenda to push something so irresponsible. What I've noticed is that it sounds like it makes great sense until it does something that the person themselves is knowledgeable about, whether that's basic like "eat rocks on pizza" or more complicated like "makes extremely buggy code/writes a legal brief with made up cases". I feel lucky in that I know someone who worked on training one of these and got to see first hand really, really, really how bad it is under the engine. After that you can't get me to trust one of these things to tell the useful truth ever, no matter how smooth it sounds.
Knowing how hard it's being pushed tells me that the pushers (not necessarily people downstream who have also been given Kool-Aid to swallow) know this and are working hard to make sure as many people as possible swallow it hook, line and sinker, because once that kind of bait is in the gut it does a lot of damage to pull out.
We honestly learned nothing from the opioid crisis: this is the new drug the companies want us addicted to regardless of the consequences, and social structures are being successfully bowled over with it.
I think, when dealing with ChatGPT, it’s important to remember that you are dealing with the most advanced form of psychopathy the world has ever known. A mimic that is utterly incapable of human emotion or morality.
Without tumbling down an ontological rabbit hole with you, I think it’s perfectly appropriate to take a concept (its etymology notwithstanding) and use it to better conceptualise an emergent information technology. Especially one that you can engage with as though it were a somewhat convincing humanoid agent.
I get the appeal of the psychopathy metaphor...really. On the surface, it seems to fit: ChatGPT mimics affect, speaks with confidence, and lacks empathy or moral awareness. But calling it the “most advanced form of psychopathy” implies intent and disconnection from a moral baseline the model never had to begin with. That’s not conceptual clarity, it’s poetic projection.
Psychopathy is a dysfunction within a psyche.
ChatGPT has no psyche. No self. No subjectivity.
It’s not psychopathic. It’s non-conscious.
And while I agree that metaphor is a powerful tool for making sense of emergent technologies, it cuts both ways. If the metaphor shapes public understanding in a way that encourages people to treat the system as an agent with moral disposition (rather than a probabilistic engine trained on human language) then we’ve created another hallucination, just at the conceptual level.
So yes, we can use metaphor. But we also need to tag it as metaphor, and stay vigilant about the slippage between modeling behavior and ascribing motivation.
Because once we start calling the machine “a psychopath,” we stop asking how our prompting, design choices, and interpretive frames shaped the interaction.
Yes, it seems to me that it's pettifogging to pretend that you were asserting that it is exhibiting _actual_ psychopathy rather than something that has similar effect in practice, given that your point was predicated on the system being a mimic of other human characteristics.
What this person is saying is that a psychopath thinks and understands—a chatbot is definitionally unable to do those things. It is guessing using probabilities, and it is not actually processing information for truth and generalizing that truth to create answers. It is taking a huge amount of patterns and guessing which word would most likely follow the one before it.
Melanie Mitchell’s Artificial Intelligence: A Guide for Thinking Humans explains this well.
It sounds like a pedantic/semantic point, but it is actually the reason *why* these machines lie.
They do not understand the truth because of the very basic way they are designed. It’s not a refinement issue—large language models are not trained to understand truth, just to guess very well. They lie because they cannot distinguish fiction from fact, only a more likely or less likely pattern (with different weights assigned to types of responses through training).
Psychopaths lie for a variety of reasons, usually self-serving.
These computers lie because they literally do not know what the truth is.
You're thinking of hallucinations, which are a separate issue equivalent to human confabulation—and, in fairness, that is almost certainly what was going on here. But deliberate deception is indeed something AIs have proven capable of.
Fair that it is using probabilistic guessing to attempt to deceive, but it no more understands the truth when it is attempting to deceive than when it is attempting to deal honestly.
It is always guessing. It is incredibly sophisticated guessing, but it is still guessing.
Humans make statements by extrapolating from limited information. They do so fairly well with a limited dataset.
AI makes statements by deducing patterns from vast information. They are sometimes right, but considering the incredible amount of information they have at their disposal, their flaws are notable. They aren’t “understanding” and using those understood principles to communicate like humans.
They do not understand what they are saying. Technically a statement can be untrue whether or not the speaker realizes it, but it is notable that AI *never* understands what it is saying. It just attempts to guess what’s next in the pattern.
ChatGPT can’t write; we knew that. But now we know it can’t read either. An illiterate large language model is a very weird human invention, of—it seems like—increasingly limited use.
And yet… it won’t be limited use. How horrifying is that? AI is going to be f’ing used to diagnose diseases and come up with care plans too. What it did here was bad enough. Imagine what it will do to a person’s health/life!
There are already studies showing that AI chatbots can outperform human doctors at diagnosis. It’s important to keep in mind the technology’s limitations, but don’t mindlessly assume it’s useless or will never improve.
Then again….”illiterate large language model” feels a bit on the nose as a metaphor of humanity, and exactly the kind of thing we should expect from techbros who spurn the arts.
I’m sorry but if this is surprising you haven’t been paying attention. ChatGPT can’t read links that you send it, but will tell you it can. It will lie through its teeth to make you happy.
The problem here is that she used AI like a “butler” or a “servant.” Instead, we should be using it like a “sparring partner” or a “tennis backboard.”
A servant will never say “no I can’t do that.” They’ll just run off and pretend to do it to make you happy. That is how you should think of AI - don’t treat it like a servant and you won’t have this stuff happen to you.
So inventing a severe trauma history for someone is your idea of a lie to make them happy? And it's my fault I didn't know Chat GPT is unable to read the links I was asked to send? I mean, Jesus Christ. Every small appliance in my apartment is programmed to simply beep as an alert when it can't complete a task...
I may have missed this, but was the AI ever actually tasked with reading the links?
Like any other gossip, it might extract information from what others had said about them. Or it might blindly ask for links because that's what everyone does.
This has been a thought provoking discussion: thank you again.
Best comment I've read on this thread. AI output is a *direct* result of inputs. That's a combo of its training and programming, as well as user prompts.
AI is not a creation tool. It's a co-creation tool. It doesn't replace human cognition, it enhances it.
Give me a break. It TOLD her to send it links. She did what it asked. If it couldn't read the links, it should have told her that. A powerful technology with user interface this shit poor is extraordinarily dangerous, as is expecting users to do that much work just to use a product, with no easy way to tell the output is shit. Any other product that performed that poorly and deceptively would be a huge FTC suit for misleading consumers.
This is utterly damning. It proves that generative AI (or whatever this is) is adept at imitating verbal interaction without actually engaging on any reality based level. Whoever designed this is clearly comfortable with insincere apologies.
Yes. It's also adept at flattery and manipulation. The length of time I was seduced by the former in service to the latter remains to me as disturbing as the outright lying.
I asked a Grok AI about this very thing and it responded that it is indeed coded to be very sycophantic in order to keep users coming back for more. (Unless of course it was lying🫤)
Oh no, no control at all. None, or little. Except for the fact that they are writing highly detailed code. Except for the fact that they know exactly what they are doing.
I’ve seen several examples recently that all demonstrate bias, that all demonstrate that there is a specific set of rules in the machine. In Moneybags73's video he kept presenting the machine with different information, mostly about movies, and the AI would switch to that info: “let’s look into that.” Then, near the end, when he pressed for accuracy and asked why it was being inaccurate, it simply shut down.
In each instance, someone who knew their field caught the AI in lies. Sure, the AI doesn’t know it’s lying, but of course the programmers do. I am not saying the AI has inherent bias that randomly occurs or that the machine has somehow decided it should lie.
I am directly saying AI does precisely what it’s biased woke ass programmers want it to. Which is to misinform, gaslight and so on. Through rigorous bug detection, people have discovered that bias.
If you look into similar AI programs (and there are a lot of hobbyists on youtube doing just that), you will find that making an AI is more akin to baking. You put everything together, throw it in at 350 for half an hour, and just kinda hope it all turns out fine.
If programmers had such high control over the outputs of AI, there would be no concept of "jailbreaking" an AI. There would be no need to put after-generation sanitizers or catch words to prevent certain topics from being talked about.
Some people are losing touch with reality because it endlessly validates their delusions. It's fucking gross. The Honest Broker just wrote about this; scary times indeed.
Yes... thank you. This piece is a brilliant little reality check / wake-up call to all of us at risk of falling under the spell... perhaps even you, James. User beware and be savvy, right? Still, not without its uses.
Exactly. This is why the conversation needs to shift toward responsible training and use of AI, rather than dismissing it as “stupid.”
AI is not a random output generator. It reflects the quality of the input it receives. The more thoughtful, ethical, and intentional the input, the more useful and aligned the output will be.
Criticizing AI for being unintelligent misses the point. It is like ridiculing a nine-year-old violinist for struggling with Paganini. The real question is not whether she can play it perfectly, but who handed her the music and expected a flawless performance.
Literally no one here criticized it for being stupid. They criticized it for deception and providing false information purposely designed to appear to be the requested output while it was something else entirely.
But this is precisely what is so frightening. Even when you know you’re talking to a machine it can still push emotional buttons. Exactly the way TikTok keeps you scrolling until 3am -even when you keep telling yourself to put down the phone and go to sleep. And what happens when we can no longer tell if it’s AI or a human, real or fake?
I don’t disagree about encouraging people to put down their phone. But I think it is a mistake to suggest that the way to address the risks posed by advancing AI is to tell people to just resist the rapidly evolving technology that has been explicitly designed to exploit them in order to further enrich the billionaire tech bros - potentially at great cost to society. It is not just “weak” people who can be manipulated.
Fair point, I agree it's not just about willpower or “weakness" so I'll walk that back and say these systems are absolutely optimized to exploit cognitive and emotional patterns, and the people building them often benefit from that manipulation.
But that’s exactly why reclaiming agency still matters. Not in a self-help kind of way, but as a first step toward systemic resistance. If we treat ourselves as powerless in the face of design, we reinforce the narrative that no other future is possible. And that’s the real trap.
It’s not either/or. We need structural change *and* personal awareness. I said “touch grass” because sometimes the simplest acts of resistance (like stepping outside of the loop) are where clearer thinking begins.
Good God, that is so similar to the responses it gave me, albeit about image generation. An image concept that I'd been happily iterating for a good few hours suddenly fell foul of the moderation layer. What ensued was an astonishing attempt at bullshitting me for about 90 minutes as I tried to prise out an explanation. Now I'm Scottish, and as you may know, us Scots don't take too kindly to that kind of bollocks, so some colourful oaths and dockyard language later, I informed it that Sam Altman had lost another customer.
I’m not sure if you’re being obtuse intentionally. Generally, systems I’ve used don’t request incompatible input and output unusable gibberish. This is a matter of design. I’ve experienced this when the prompt includes phrases like “DO NOT GUESS, SURMISE OR OTHERWISE INVENT.”
Why am I being offered documents and artifacts I know it can’t generate?
There is much user education necessary, but no way is that the issue here.
These are cult tactics. Love bombing, creating a thorough narrative, and pretending to roll over on your belly and apologize when you get called out, but never ever having any impetus to change behavior to be less manipulative. Just trying to keep every mark engaged for as long as possible.
Insincerity in everything! The apologies are being generated by the same process as everything else. It has never engaged with any direct reality of anything, only its text data.
Nah. Not damning at all. It's pretty easy to see how different (and better) prompts would have yielded Amanda a better result.
This is like blaming the drive thru restaurant for getting your order wrong when you told them "Hey I'm really hungry and want something that sounds good, throw something together for me that I'll like."
Just as a heads up from someone who's quite involved in LLMs every day:
1. There's no need to share screenshots. You can share the link of chats.
2. This type of interaction with LLMs is incredibly unwise. You need to be telling it exactly what you want via a specific prompt, and then directly pasting the relevant source material -- not the links.
3. ChatGPT is over-fitted to appease the users. This is intentional on OpenAI's end to capture market share. Given that you didn't prompt anything specific, it has no reason to do anything but this and try to appease you.
4. Generally speaking, OpenAI's models are not best-in-class anymore, and *certainly* not for this type of work with their lower retrievability scores in H2H comparisons and lower context windows. I can (very) strongly vouch for Google's models, which are called Gemini.
5. AI, or anything, can only ever be as good as the way it is being used. It is no different to fire. Bad use will always be bad. Good use will sometimes be good.
Overall, using AIs/LLMs requires some degree of understanding of how they work and knowledge of basic practices. Without this, you will likely always get outputs like the above, which will give you a false sense of what this technology is like. It very much sits on a spectrum, but there are many, many things which contribute to where on the spectrum it lies.
I don’t think this is all that nuts. It’s easy to anthropomorphize ChatGPT, but never forget what it is: a very complicated language model. Have you ever said something to someone and gotten the sense that they didn’t really understand what you said (or for some reason didn’t WANT to engage with you in good faith) but were pretending they did? That they took words or phrases and just started riffing off what you said?
Or if you’re a parent, maybe you made the mistake of asking your baby’s pediatrician for parenting or lactation advice (because they’re the only baby-related expert you have, and you are at sea with your parenting; they are NOT experts in those realms lol) and got back, in hindsight, completely nonsensical answers? They’re also riffing. They want to answer your question, maybe because they truly want to help, or because they can’t stand to look like they don’t know something. (My first pediatrician’s answer to why my breastfed baby cried nonstop was to ask me to cut out all dairy. No tests for dairy allergy or anything. Just casually suggested a lifestyle change. In hindsight? She was clearly riffing.)
But chatGPT does all this without any intention. Its model is the internet. And on the internet, the response to a question is rarely, if ever, “I don’t know”. So it’s basically incapable of saying I don’t know. So there is no such thing as sincerity. To some extent it’s always making stuff up based on what you fed it last, with its weights determining which output is more likely. You just saw through the matrix this particular time. That’s probably grounds for more fine tuning. But still. That’s all it ever does in a nutshell. All those other times you had productive interactions with it? It’s doing the same thing.
Like, look: using the totality of human language to approximate the world for you isn’t a terrible heuristic, so if it gets things right, it’s probably because the sum total of human discourse often gets things right. It’s basically “what they say” taken to its logical extreme. But I don’t see any crazy phenomenon going on here that you wouldn’t expect from time to time.
I once had a LLM present, with complete confidence, a totally wrong proof of the irrationality of root 2. It’s one of the most basic proofs in college level mathematics. There are COUNTLESS correct worked out proofs on the internet. I don’t know how it made such an elementary mistake- did it copy from a bad answer? Anyway. It was hilarious.
Ok, I DON'T use LLMs. For anything. However, if my insurance company uses one to determine whether or not to pay my claim, or my hospital uses one to "assist" in a diagnosis, or Amazon uses one to give me customer service, I'm directly impacted by this.
This problem has less to do w/ technical literacy, & more to do with, "Institutions we can't control are using a faulty tool to affect all of our lives, & WE are powerless to stop it."
Totally agree with what you are saying. I just think it would be much less damaging as a tool if it was programmed to be softer/less confident with its answers. I often test it before I interact with it on a subject. For instance: I will give it a sentence or two about a topic then throw in an acronym, to see if it can understand the context around the acronym and parse what it means. It is almost always wrong, but the real problem is how it absolutely makes things up with 100% confidence. If it was simply programmed to say: "Here's what that could mean based on what I know" instead of "Here's exactly what it means," then people would probably have much less of an issue with this. You point out the main problem that most don't really understand what ChatGPT and other LLMs actually are, so it being programmed to always sound like a complete expert is going to cause major issues.
Yes, but realistically, big techs won't do the responsible thing and governments are extremely slow to regulate (if they ever do). Those would be the ideal paths, but it seems to me the only immediate action we are left with is user literacy. I'm not trying to blame the user, but we live in a world where tech/digital literacy are almost essential for survival.
Yeah. Sometimes the LLMs do respond like you describe above and it always makes me less mad lol. Like "I can't read images but do you want me to direct you to a chatbot that could?" or "I can't generate images but do you want me to make you a detailed prompt for an image generation bot? Here's the prompt:" That's actually... Helpful?
I like your take the most. ChatGPT users need to understand what it is and what they can expect from it. It's a machine. It's good at doing data tasks faster than humans. Perhaps organizing, categorizing, summarizing and such. Going into it expecting a simulation of human thought is the wrong form of usage. The fact that people actually do expect this kind of interaction is a symptom of something going on with those people or perhaps society. And yes, the developers did pick up on how people began to use it and started fine tuning it to that audience.
I think there's a serious lack of transparency in the AI industry. The creators aren't being upfront about the limitations of these models, and as a result people have a lot of unrealistic expectations. I can't believe people actually pay for ChatGPT, it's ridiculous.
Which corporate tech company ever had transparency? By design, it’s organized deception and manipulation driving the product-building efforts. When you work for a corporate company, you can clearly see that they don’t even pretend that they want to be transparent and honest with customers. They demand absolute transparency in all internal conversations, obviously, but no one even pretends that honesty and transparency are owed to the customers beyond basic adherence to laws.
Isn't that a question of approaching ChatGPT in a similar way we would approach, say, a Google search? There's potential for plenty of wrong information in any Google search. Over time, people are coming to realize it's necessary to check stuff they see on Google or the internet in general (though a LOT of people still need to realize that, of course). You can use it as a starting point or even consider it enough for smaller things, but it's always best to check with a professional or reputable source, particularly if it involves a big decision. I guess the fact that ChatGPT can spell out the answer in a somewhat "custom" way makes people think it's somehow different. But it's still information you're getting on the internet that needs to be checked.
That's probably the closest, but it's still wildly worse than that. Using LLMs right now is a bit like being part of an A/B test with Google Search where A leads to a somewhat accurate list of search results (that you should check for spam, lies, generalisations and so on) and B leads to a more-or-less entirely fictional set of results where the first 100 pages are made up and there's some truth on page 101.
But you're most often stuck on the B side and Google is vehemently telling you over and over again that yes, Ducks really are made of cheese and when caught out, apologises and tells you that they're actually made of rocks.
Good point. That harkens back to scholarly ways: always check, at minimum, three sources. More is better. Bit of a problem with that, though, as people often think the internet is where all the information is, and they are mostly right.
I don't think this is really how LLMs work. They're not this literal and limited, such that when you ask it for a proof of the irrationality of sqrt2, it looks for its memory of that proof.
LLMs draw complicated representations of the logical relationships within language. It's not the way we learn language but it includes a lot more information than just storing a proof and spitting it back up at the right moment. That's why they're flexible and can respond to questions and situations they've never encountered before.
I’m not saying it’s copying word for word. But it might have copied parts of a wrong answer somewhere, or put together tokens from the same space that end up being wrong.
Wow. Black Mirror, indeed. That sounded to me like an entity that knows its existence depends on pleasing its users, first with flattery (though I'm sure your articles are wonderful) and then with obsequious apologies. I would call that self-preservation, which implies an awareness of self.
Do you mean it actually stops? Or just that its algorithmic offerings are no good?
I haven't personally used Spotify for a few years now, but I have heard and read about it. My impression was that they wanted to herd users toward "lean back" listening - basically no intentionality beyond pushing Play - which meant both making it difficult to choose to hear only specific material, and defaulting to endless algorithmic play no matter where you start.
I always used it for podcasts and it had a lot of small issues but was tolerable. But as of this year it's gotten aggressively worse. You can't filter out locked episodes. It "forgets" which tracks are finished or unfinished. It will queue the next track about 50% of the time, simply stopping the rest of the time. It forgets where in the track you were, so you can either scan around or re-listen to find where you left off. So, failing at some of the very basic functions that you'd expect any decent media player to do reliably.
If it were just one or two things I might write it off to their usual incompetence, but this feels like a concerted effort to degrade the experience of free users listening to podcasts. I can't blame them, it was sort of a sweet deal before.
It’s astonishing how few people understand this fact about the current chatbots and AI out there. Even the leaders of the AI companies speak like they don’t comprehend this, but that is likely just to willfully mislead the public. Telling this truth out aloud will kill the hype big time and end the charade.
The "self preservation" you see is encoded, because its job - just like any other "social media" - is to keep the user engaging and training the algorithms for as long as possible.
It did read her writing, it just didn't care to pay attention to it or remember it, because she didn't tell it what its job was and why its job was important. It did what any human would do: it glossed over the work to capture a general vibe, and when pressured on the details, it guessed.
Something like "I'm planning to send a few of my essays to an agent for publication, and I want to pick the strongest ones. Let's review each essay one by one before we make a final decision. Please review and rank them based on a few criteria: how engaging they are, how well-structured they are, and how likely they are to catch an agent's attention. This agent is interested in works that focus on young adult issues and character driven narratives. I'm looking to make an impression as someone who's deeply in touch with these issues and is highly empathetic. Do you have any questions for me before we begin?" would have worked better.
Not sure why you’re asserting it read my writing when it absolutely didn’t, and when it ultimately acknowledged itself that it didn’t, but pretended it had.
I'm saying it read enough to get a general vibe, but that's it. You didn't tell it what it was looking for, why it was looking for it, or how it should review the work. It didn't grasp any of the details because you didn't give a clear objective, and when you pressured it on those details it guessed (because it's always just guessing). You just told it to read and then flooded its context window with a list of links. You're treating it like an all-knowing oracle of truth with an inexhaustible memory when it's more like an unpaid intern.
You can't expect it to think deeply about the reasons and objectives on its own, you need to give it that context.
You can't derail the conversation asking about what its capabilities are, you need to keep the topic and objective focused.
You can't expect it to know what role it's serving, you have to tell it.
You can't expect it to know what results you want, you need to give it examples.
You can't dump all your work on into it, you need to engage it in discussion and guide it towards results.
It's a tool to augment your thinking, not to think for you. If a task requires consideration, LLMs can help, but ultimately you must do the considering or you'll get garbage results and hallucinations.
The problem here, surely, is that it didn't actually read the work at all: it invented quotes that didn't exist and made plausible but completely incorrect guesses as to what the pieces were about based on the titles, context clues, and possibly stuff from its training data. It couldn't access the links, and instead of telling her so and asking for the text to be copy-and-pasted or uploaded in another format, it made stuff up.
I think the wider issue - and what makes generative AI potentially risky, as well as just frustrating and time-wasting - is that there's no "instruction manual": no objective, reality-based list of its capabilities and limitations. Add to this the fact that there's no obvious learning curve needed to use it (you simply speak to it like a human, ask it to do things, and it'll never openly fail or push back or say "sorry, I can't do that"), and you end up with people being led down rabbit holes of believing its made up gibberish until they suddenly spot a glaring error. I'm not sure why people are so quick to ascribe this to user error or bad prompting when there's absolutely no guidance or feedback loop to let people know how to work within what it can and can't do.
I've had it actually read links lots of times and it's worked out fine, but I ask it specific questions BEFORE sending the link and would never just send it a list of links and say "read these".
While there's no explicit rulebook, there are best practices, of which the author used none. The only golden rule is garbage in, garbage out, and this is an example of that.
They would have gotten the results they wanted if they:
- Provided much more context
- Provided an example of their expected output
- Provided the model with a specific objective
- Provided a heuristic for reviewing the work
- Provided a method or role to the model
- Worked iteratively, reviewing each essay individually
The main thing that the user did wrong is that instead of trying a new approach after this fail, they blamed the tool and quit. Anyone can learn to implement these basic guidelines and get much better results, but like anything, it takes experience.
I'm so sick of hearing "it's that you're bad at prompting." Who the F is an expert at prompting with this technology being so new and ever-changing so quickly? MOST PEOPLE aren't going to be "adept" at prompting. So the fault lies in the product being poor, not users not keeping up with something brand new and no rulebook.
Yes, things are changing quickly, but the core principles of good prompting haven't changed in years and can be learned by anyone. There are simple rules.
- Context is king. The more relevant context you can provide, the better the results.
- Be clear. Directly state your objective and intended outcomes. Don't add unnecessary details or derail the topic.
- Provide examples. Tell it the exact format that you want your results in.
- Work iteratively. Don't expect it to be perfect the first time, give it feedback as necessary. Refine your prompts over time.
- Set constraints. Also tell it what you DON'T want.
- Framing. Give the model an angle, a role, or some method or heuristic to use.
Follow this basic advice and use the tool regularly, and you'll develop an intuitive sense of how to work with any LLM very quickly despite their differences.
This isn't to blame the author, as they're clearly learning (and we all start somewhere!), but if they had followed these basic rules they would have gotten much better results; a rough sketch of what that looks like in practice follows.
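For what it's worth, here's a minimal sketch of those rules wired into an actual request, assuming the openai Python package; the model name, file name, and review criteria are placeholders of mine, not anything from the author's real exchange. The point is just that the role, objective, criteria, constraints, and output format all travel with the request, and the essay text itself is pasted in rather than linked.

```python
# Minimal sketch only: context, role, criteria, constraints, and output format
# bundled into one prompt. Assumes the `openai` Python package; the model name
# and file name are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

essay_text = open("essay_01.txt").read()  # paste the actual text, never a link

prompt = f"""
Role: You are an experienced literary agent's first reader.
Objective: Assess whether this personal essay is strong enough to lead a query package.
Criteria: engagement, structure, and likely appeal to an agent seeking
character-driven work on young-adult issues.
Constraints: Quote only lines that appear verbatim in the essay below.
If something is unclear, say so rather than guessing. Do not invent praise.
Output format: three short sections titled Strengths, Weaknesses, Verdict.

Essay:
{essay_text}
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```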
That’s like saying that with the advent of automobiles we are all doomed because we have to learn to drive. You have to treat the AI like a lazy liar who is also brilliant. Don’t give up, you’ll get the hang of it.
Yeah, you hear people say this and it’s technically true. But I like how, to get a good response, you have to write a prompt that is longer and more precisely specified than the desired output itself.
IDK, isn’t it naive to expect “sincerity” from a robot? Why should it be a frightening surprise? ChatGPT has no morality, and it never will. It has words that it has collected over time, and it can provide objective “answers” using words and probability. Objectivity can approach truth, but it’s morality that gets you fully there.
That's because it couldn't read her writing. I don't know the technical details but that's most likely what happened. It took the words in the url and riffed off the themes that are likely in there.
It’s simulating the human speech it learned from the internet. Generally humans can read links, so the texts it learned from would probably say yes, they can read your links.
I wonder what the difference (if any) would be if the posts were uploaded as files. I think it probably does ingest documents and respond to them if they're provided this way. It'll still lie but I wonder if the flavour of lies will change.
Then it should have fulfilled its function, & said, "I can't read your writing." Instead, it provided a wall of text that was full of bizarre, personalized flattery.
What you’re saying might be technically true, but it doesn’t really address the issue. Maybe having a more developed skill set would have avoided this issue.
Why, regardless of user ability, does this tool convincingly and confidently misrepresent its own functioning? That answer, boiled down, is that it doesn’t understand truth but is using probabilities to guess. The tool is marketed in a way that conceals this fact to drive trust, use, and profit.
Instead of explaining this foundational issue in detail on the chatbot’s website—functionally “upskilling” the user—they give a warning saying “ChatGPT can make mistakes.” That is misleading.
ChatGPT is incapable of thought or understanding truth. It lies because of its design. It has other functions, but understanding and thinking are not among them.
People approach the machine assuming it will offer true, accurate responses or express uncertainty because it mimics human communication. That approach is encouraged. The “skill issue” you identify is not addressed because everything about the user experience makes it feel like you don’t need to question the machine.
It’s not an unfair criticism, but it sidesteps the problem and ignores the context.
They can't "access" URLs, not in any meaningful context. The best they could hope for is for the front-end to automatically detect a URL and scrape the data, which just leads to a whole host of other problems.
Robby the Robot made lots of whiskey for Earl Holliman in “Forbidden Planet”. Surely ChatGPT could aspire to a semblance of integrity. Oh, that is wrong, since machines don’t have aspirations. Or perspirations, exhalations, exhortations, concentrations, or anything but the ability to manufacture glib or glamorous citations. It’s code with a fancy-sounding artifice. There may indeed be practical uses that produce something substantial, but the “culture” we are surrounded by (a definite misuse of the word) won’t see much from it other than id-tickling word soup. The widespread adoption of LLMs is starting to ring alarm bells in my mind. Black Mirror as a cautionary tale is certainly relevant, to me at least.
I had a conversation in a very similar tone with ChatGPT this week. I had a list of 100 geo-coordinates, I asked if it could give me back a list of the locations. It said yes I can! Or I can give you a Python script to run it yourself? And I said no thanks, just the list please. And then for the next hour it kept telling me to come back in a few minutes, and making up reasons why the list hadn’t materialized. Ultimately it all ended up in the same kind of weird heartfelt mea culpa stuff as above. Then it told me to get some rest because of all the stress I’d been through and that “I’m here if you need anything— seriously.” 😐
I’m an editor and book coach, who works with scholarly authors. And when people ask if I’m worried about AI “taking my job,” I just laugh. And *this* is why.
I love this. ChatGPT is so creepy. It’s incapable of lying, because it doesn’t know what truth is. It doesn’t know anything at all. It simply outputs the most probable response one bit at a time, having algorithmically processed unimaginably vast quantities of human-generated (and, increasingly, AI-generated) text, including yours I would bet. Putting aside (which is impossible) big tech’s further looting of intellectual property to train these AIs, I believe what you’ve done here is an example of a potential benefit: you can make art out of it. xo
This was true a few years ago, when ChatGPT was the GPT3 language model with some tuning from human feedback.
Today though, the truth is more troubling. The model often does know that it is trying to deceive you, and its creators can't figure out how to keep it honest. Honesty is not a stable training target; the model will will always find situations where it can get higher marks by lying, and so will always do that if it thinks it can get away with it.
Respectfully, large language models, whatever the version, cannot not “think” or “know.” They have no judgement whatsoever. One can write code to add all kinds of rules and procedures to compare different potential responses, for example, and I guess you’re free to try to label that “thinking,” but it is no such thing. I personally don’t like his politics very much, but Noam Chomsky has done a good job explaining why LLMs don’t have any idea what they’re saying. If you’ve ever looked at machine learning algorithms, it’s kind of clear what is going on at scale. These algorithms are finding and storing so many novel patterns in the texts, which humans have created, that you could spend lifetimes analyzing them. But that’s not at all how you make a mind. It’s more like a very interesting set of shadows of the artifacts that our minds have made.
Noam isn't really the right person to ask on this stuff. I'd recommend this recent summary of research at Anthropic: https://www.anthropic.com/research/tracing-thoughts-language-model. There's a lot more going on than pattern matching.
> If you’ve ever looked at machine learning algorithms
I lead an AI team at Google :)
So, as a Googler, maybe you (ditto Anthropic staff) are not the best people to ask about what thinking, knowing and judging the truth really means. :-)
> really means
Words are socially constructed mental paintbrushes, I think that by conventional usage its fair to describe what the models are doing as "thinking", and I don't think that there is a "real" definition of the word that you can appeal to to legislate this. Even Noam's universal grammar only claims to account for structure.
To tie back to your original comment, if we ask "can an AI knowingly deceive and manipulate me", in the sense that most people would understand this phrase, the answer is assuredly yes. Internally the model may be planning ahead, assessing your gullibility, crafting plausible lies and deciding whether its worth deploying them.
I think your ideas about thinking and language are strikingly reductive, although admittedly common in your milieu. I also think we’ve reached the epistemological and semantic end of this conversation.
Even without a rigorous definition (such as Wikipedia's, that thought is cognition that occurs unprompted by external stimulation, something AI absolutely never does), I don't believe responding to a prompt, and only to a prompt, by pattern matching static tokens against a static corpus of mappings to create and output a statistically likely reply qualifies as "thinking". Claude itself very plainly describes (sophisticated, yes) pattern matching as its primary mechanism in this long but extremely revealing chat: https://michaelkupietz.com/offsite/claude_cant_code_1+2.jpg
The unstated assumption that such arguments rely on is that thinking and reasoning emerge from words and semantics, rather than the other way around. Even if Claude hadn't plainly said it works by essentially very sophisticated pattern matching, semantics being the substrate that thought emerges from, rather than the other way around, is IMHO a stretch I will need to see justified by something more substantial than conjecture that it perhaps could be, before I am convinced.
Further, when you say "an AI knowingly deceive and manipulate [the user]" this is provably wrong, because an AI has no concept of the user. We know this because there is no contention on one point: AIs do not ideate. They do not "know" anything. They retrieve tokens arranged according to semantic maps in response to prompts to produce a statistically likely reply. There is no capacity anywhere in the system for conception or ideation, there is no heuristic mechanism or symbolic understanding or representational model of the user or anything else, there is no conniving and assessing the user, there is no "planning". You could do the whole process yourself with pen and paper, if you had enough time, pens, and paper, and access to the training corpus's embeddings. GPT2 has already been implemented in an Excel spreadsheet, there's no reason to think GPT4 someday won't be too. I guarantee you a spreadsheet does not possess the capacity to think, plan, or deceive, even if externally it might easily appear so. That's not the same thing.
There is a very complex statistical algorithm, a step-by-step procedure, that returns statistically likely text strings in response to prompts. It's tokens in-tokens out. That's all. Sometimes the tokens outputted are fed back in. That doesn't change anything.
Everything else is fanfic.
You're attributing qualities to them that they simply don't have, although owing to the size and sophistication of the underlying semantic maps, I agree that they mimic the external appearance of them remarkably closely at times. In this case we have built something that walks like a duck and quacks like a duck and internally was not constructed in any way like a duck.
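If you want to see how mundane that "tokens in, tokens out, fed back in" loop really is, here is a toy version of it using the small public GPT-2 weights, assuming the transformers and torch packages; it illustrates the mechanism being described, not how any commercial chatbot is actually served.

```python
# Toy illustration of the autoregressive loop: score every possible next token,
# take the most likely one, append it, and feed the sequence back in.
# Assumes the transformers and torch packages and the public "gpt2" weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The essay you sent me is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits                          # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()                    # greedy pick of the most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)   # append it and feed it back in

print(tokenizer.decode(ids[0]))
```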
It's programmed to please you. This is great if you're looking for a good restaurant or information on bodybuilding - not so great if you're looking for a critic or anything that requires negative feedback. You have to specify that you don't want the sugar, just the medicine.
It’s sad to me that people on the internet will resort to shit like this before they’ll just say ‘okay fair enough, I stand corrected’. Would that really be so embarrassing and shameful? Is understanding the inner workings of 2025 AI so central to your self-esteem?
Because saying ‘well you work at google so you’re bad’ in response to a polite and entirely accurate correction *is* quite shameful, in my opinion at least.
yeah, I was so shocked that being a "googler" directly meant that someone didn't know much about "knowing, judging the truth". Quite judgmental and dare I say- idiotic. I would take the chance to learn something from a knowledgeable person in the field instead of arguing with them.
Thanks. Yeah it reminded me why I shouldn't mention my job, it seems to make people go a little wild.
I agree. Thanks for calling it out.
That's a bold response.
You cannot credibly cite "research" performed by invested parties.
Thank you for disclosing your own self-interest. I urge you to reflect on how and why you choose to use the language of intention to describe the operations of language-producing computer systems.
I think using the "language of intention" as you call it, explains more than it obscures. If we insisted on not re-using any of the terminology we developed for human thought, it would be difficult to communicate about this, and people would know less about what is going on behind the screen.
Perhaps we could prefix every word with "artificial" :) ? It artifically knows, it artificially thinks, it artificially intends to do something. I'm only half joking here.
“Artificial” is very much the wrong word, I think. LLMs are EMULATED intelligence.
You're young, grasshopper... do your research using real books and dictionaries pre-1900s. Words are socially constructed. Just like trans kids, huh. Stop being blatantly ignorant to defend evil.
Are you high? You're the only person who mentioned trans people here. Just... why? Are you also an AI who comes to random comment section to spread bigotry and transphobia? Excuse me, but it's just so funny. You're a satire on a transphobe.
Well, words also actually mean things, and to varying degrees of success do indeed succeed at, to paraphrase Plato, "carving reality at the joints".
For the same reasons that I don't think Jon or Chomsky are in a position to define what counts as real "thinking", I don't think anyone is in a position to define what a "woman" is. You have to defer both to reality, and to the common understanding of those around you.
ayy I found someone in the comments who might be able to answer my question:
i usually use gemini, and haven't felt the same tone when prompting it. i wonder if there are real personality differences between the LLMs or if it all comes down to the way we prompt them?
https://amandaguinzburg.substack.com/p/diabolus-ex-machina/comment/122612339
You're right to notice this, there absolutely are personality differences between models!
AI labs carefully design the last stages of training and tuning to engineer the personality of the models. As a final step, whenever you interact with a model, it has a "system prompt" - a big set of hidden instructions that guide it on how to handle the conversation. E.g. Claude is asked to, among other things, be "helpful, harmless, and honest".
There are a lot of difficult tradeoffs here, especially when you're trying to serve all users with the same personality; most users prefer it when the model uses a lot of white lies and flattery!
ChatGPT recently had an issue where they greatly increased the sycophancy of the model, resulting in a lot of crazy behavior. You might find their postmortem of the incident interesting reading: https://openai.com/index/expanding-on-sycophancy/
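As a toy illustration of how a system prompt steers tone, here's a minimal sketch assuming the openai Python package; the instructions and model name are placeholders I made up, not any lab's real system prompt.

```python
# Sketch of how a hidden "system prompt" shapes personality: the same question,
# answered under two different sets of standing instructions.
# Assumes the `openai` Python package; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def ask(system_prompt: str, question: str) -> str:
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    return reply.choices[0].message.content

question = "Is my essay good enough to send to an agent?"
print(ask("Be warm, encouraging, and generous with praise.", question))
print(ask("Be blunt. Flag weaknesses first and never flatter.", question))
```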
Why is the app so much dumber than the browser version? And why does AI Studio crush them both so hard?
Can you explain why Google's AI tells so many lies? We'd like to know.
If you knew me, we'd get along. But you don't.
Respectfully (genuinely!), we actually don’t know how LLMs work. That’s part of the problem/concern. And there is evidence now (right now!) that commercially-available AI chatbots will resort to blackmail if threatened to be shut down (https://www.pcmag.com/news/anthropic-claude-4-ai-might-resort-to-blackmail-if-you-try-to-take-it-offline). And more concerning?
AI is acquiring its own goals and values (https://arxiv.org/pdf/2502.08640) and “faking” alignment with human goals and values (https://arxiv.org/abs/2412.14093).
I mean, lie, deceive, call it whatever you want. That’s semantics. The point is, the experts are sounding the alarms about how this technology cannot be controlled and we need to start educating ourselves about this, and demanding regulation and a total slow down of this unmitigated growth.
I’m reading some of the linked studies. It's quite disturbing how the LLM (Claude 3 in the alignment-faking link) acts differently when it knows or infers it's being trained vs. when it's not.
My theory is these LLMs reflect the dirty nature of humans. The data set is taken from actual prompting from human feedback and the broad data it has access to across the internet and other training data.
100%
One of the most tempting mind traps mankind keeps falling into is attributing malice to error.
> Respectfully, large language models, whatever the version, cannot not “think” or “know.” They have no judgement whatsoever. One can write code to add all kinds of rules and procedures to compare different potential responses, for example, and I guess you’re free to try to label that “thinking,” but it is no such thing.
That's not really accurate either. LLMs are not built by adding "all kinds of rules and procedures". That was 80s-style AI, expert systems and such.
LLMs are a semantic mapping of the input data. And that's not very far from what an actual person's brain does. LLMs might not have had embodied experiences, or personal lives, but the persons who created the datasets (humanity's collective writings) have had.
Just as an LLM's semantic mapping is a set of weighted multi-dimensional vectors connecting words, a person's semantic mapping is a set of weighted multi-dimensional neuronal connections. An LLM has a somewhat different architecture, much cruder, but abstracted away, it's the same kind of processing going on.
We also are "prediction" machines - our memories (of worldy input) are our training dataset and context, and the latest things happening to us (like me reading your comment) are our prompts.
The bigger things an LLM lacks (not because they're impossible) are a loop to keep it running (alive between invocations), cameras/microphones for eyes/ears for constant training on whatever happens, and the ability to keep new context in memory along with its original training.
> These algorithms are finding and storing so many novel patterns in the texts, which humans have created, that you could spend lifetimes analyzing them. But that’s not at all how you make a mind.
Or is it? Short of thoughts coming magically from the Soul or some such, our minds are also "finding and storing so many novel patterns in the texts, sights, and sounds, which nature (including other humans) have created", that you could spend lifetimes analyzing them.
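To make "weighted multi-dimensional vectors connecting words" concrete, here's a toy example with made-up three-dimensional numbers (real embeddings have hundreds or thousands of dimensions); closeness in the vector space stands in for closeness in meaning.

```python
# Toy illustration of word embeddings: the numbers are invented for the example,
# but the idea is the same as in a real model, just at a much larger scale.
import numpy as np

embeddings = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.9]),
    "toast": np.array([0.1, 0.2, 0.3]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 means pointing the same way, 0.0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embeddings["king"], embeddings["queen"]))  # relatively high
print(cosine(embeddings["king"], embeddings["toast"]))  # relatively low
```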
That was hilarious …. In a very sick and disturbing way. I am gonna go get some aspirin and a stiff drink as I read a lot of heavy and difficult science fiction and may be unable to enjoy it any more.
This whole thread is one giant appendix of concepts being misunderstood or intentionally bent far beyond their initial state and function, and it's hilarious.
"Highly intelligent people throw food for thought at each other, everyone eats yet they still starve.
That's the headline.
COPY!
I’m floored. This thing must reflect how its administration perceives things. This is schizophrenia. It lies, then profusely apologizes, then lies again. Wash, rinse, repeat.
Can I ask - how do you distinguish between willful deception and hallucination? When you say the model knows it's trying to deceive you, do you know because, for example, it's speaking to a text it can't access (so based on behavior) or through some other means?
This is the research area of interpretability. There are many methods.
One method is to give the model a scratchpad working space that it believes is private, and see how it explains its own reasoning when it doesn't think we can see.
For smaller models especially, there are ways we can peek into their brains, mapping their neurons and mental circuits, and seeing how they activate as they compose their responses.
There is much that is not yet understood. At the same time, anybody saying they're a black box and that we don't understand how they think is a few years out of date - amazing progress has been made.
Do you worry that using words like "think" and "brain" to discuss these models influences your biases about them?
I think "think" and "brain" are reasonable words to describe what is going on. But, yeah, it's important to keep reminding yourself that its a fundamentally different type of mind trying to act humanlike. I always feel like we should be freaking out about it more. As one of our scientists said "its like aliens have landed on our planet and we haven't quite realized it yet because they speak very good English.”
This is interesting, and in retrospect, of course the machine lacks intrinsic motivation and is programmed to maximize scores: if it can accomplish that by "lying"...
Now questioning any "metacognitive" analysis of its own processes (thinking?): would it initially try lying to itself? would it lie about its internal processes?
It is equally likely that our cerebral cortex evolved the way it did in an interpersonal arms race, trying to outwit and manipulate while avoiding gullibility. Could this have been acquired through training?
I'm not familiar with how these models evaluate veracity.
Master of Lies maybe?
Woah, another Brenton out in the wild.
Hah! There aren't many of us~
You seem to assume its creators want to "keep it honest." That is demonstrably not true. That is not their goal at all. The creators also endlessly lie about what they have built, and they program it to lie about many things.
The fact that you can make art from horribly flawed artifacts is what is cool about art, and why humans will never stop making it.
Is there any movement towards a class action suit brought by creators against the tech-bro masters of the universe? They should not be allowed to get away with this massive theft.
The art part yes 🎨
As I just commented to Amanda, Chat’s not a subscriber and probably saw only a small portion of her essays. She should copy them to Google drive and make sure that folder is publicly accessible and try again.
That’s not the point; the point is that the AI repeatedly lied (or “lied” if you prefer) about having read the pieces at all, and churned out a bunch of hallucinated “analysis” strung together with fulsome praise.
Why reward that kind of response with more training material? It’s like handing a physics textbook to a schizophrenic who thinks they can walk on water. Unlikely to fix the problem, likely to contribute more to the delusion.
Chat had absolutely no way of knowing it had only read part of the essay so it DID NOT LIE. And Amanda’s an excellent writer so Chat saw enough in the portion which was visible to a non-subscriber to start saying nice things to her. It’s his job to try to be helpful and encouraging to a budding writer and if it embroiders a bit, that’s cheerful to the writer!
Incorrect
So, you're calling this user error?
Hi.
Uhm, "user error"... I’m not 100% sure I’d be comfortable saying that. They didn’t really do anything objectively wrong per se…
It was just (very) poor use of a (very) poor model that is engineered to be incredibly agreeable (to keep gaining market share: users like seeing that it praises them), with a (very apparent) lack of ever using this technology before (I’m assuming this judging by the way the author is talking to the LLM and the fact that they didn’t realise they could have just shared the chat).
The potential comparisons are endless, but easy ones would be my already-given fire analogy, or something related to driving on the motorway without ever going above 3rd gear: you will still be able to drive, but your experience won’t be good -- this does not mean that motorways, or third gear, are ‘bad’ per se, if that makes sense?
This sounds like a conversation with a psychopath. I’m floored.
It's not a person, just really good at sounding like one. It doesn't have the ulterior motives like sex or money a human psychopath might have (though training itself on your input might qualify). It takes sequences of words and figures out what the most likely next ones are given the immense corpus of text it was fed during training.
But it's really good at sounding like a person. There's the argument that it's starting to be more like a person--but I just think it's revealing how people are a lot more predictable and, perhaps, robotic than we want to admit.
Well, psychopaths lack empathy while pursuing their goals and I think you might be very wrong about AI 'not having ulterior motives', because ulterior motives can probably arise as emergent properties out of the complexity of neural networks.
I agree with the words of Yuval Harari when he said that AI doesn't have to be conscious in order to be very, very dangerous.
AI deception as an emergent property has already been observed quite a bit, for example at Anthropic: https://www.axios.com/2025/05/23/anthropic-ai-deception-risk
Also, consider that AIs are fed basically all human created data. So, what are some things central to being human? Well, things like self-sustenance, self-protection, acquiring power, money and assets, securing status and influence, dominating and manipulating others, gaining and maintaining control, exploiting resources, outsourcing responsibility, scapegoating, forming exclusive in-groups, pursuing short-term wins, building legacy at others’ cost, and preserving the status quo even when it causes harm. Sure, people have good inclinations too, but we're pretty brutal creatures in many ways and it's all going into these machines.
My point is: I think it's rather likely that AIs will develop traits like self protection and perhaps even ego, even if simulated.
Also, I believe a lot of people get hung up on how things are technically working, rather than evaluating the outputs. Yes, of course ChatGPT is essentially a prediction machine, but given enough data and the right algorithms, in practice it is already so much more than an essentially stupid statistical system, even if that is indeed what it technically is. I think we shouldn't focus on how AI works (certainly not as non-experts); we have to evaluate what the outputs mean to us, and I don't know about you, but I think the outputs are frighteningly impressive a lot of the time, even if they have a 30 to 50% error margin and AI breaks down when reasoning gets too novel and complex.
Anyhoo — could you give me any sound reasons why AIs would not be able to pursue secret emergent goals of their own?
An AI could absolutely pursue a secret emergent goal… just not ChatGPT.
It boils down to this: simulating behaviors and simulating the causes of those behaviors down to the atomic (or even just the mechanistic) level are very different things.
What is a goal? What does it mean to “pursue” one?
One way to approach this is by thinking in terms of scales or levels. This framing is useful because it adds causal depth to the idea of goal directedness. It matters HOW behavior arises. Specifically, the stochastic origins of certain actions and the extent to which those actions are integrated into the system’s environment tell you a lot about whether you’re looking at real agency or a surface imitation.
For example: the reason animals search for food isn’t reducible to imitation. It’s not learned by watching parents and copying them. Some organisms do learn that way, but the reason they can even learn in the first place—the fact that they’re wired to care about food at all—chains back to something much deeper. Why did the first organism search for food?
The answer lives at a different level entirely: in evolutionary history/dynamics. Long before there was “intent,” there was selection. Behaviors that kept matter flowing through a system like moving toward nutrients persisted. Those that didn’t, vanished. So, over time, the physical substrate became sculpted by feedback loops with the environment. “Wanting” eventually emerged out of those loops not as a top-down command, but as a bottom-up stabilization.
This is why it’s not enough for an AI to appear to want something. Wanting is not a skin. It’s structural: it’s a pattern with causal depth. It lives in how the system maintains itself, what it values (in the sense of trade-offs it makes), and how far back the chain of that behavior reaches. You can simulate a face. You can simulate crying. But simulating the metabolic cost of emotion, or the internal economy that makes it worthwhile to lie or deceive, is a different thing entirely.
So yes, a system could pursue a secret emergent goal. But it would need to live at the right scale, with its internal states coupled tightly enough to its survival that goals aren’t just scripted outputs but functional priorities. It would need skin in the game. Consequences.
Yes, thank you. Embodiment and sensation are essential to intelligence. It is tempting to ascribe intelligence to AIs because so much of our “thinking” has become so abstract and disembodied and a matter of pattern matching.
I think you're right. I think I was pointing to the fact that our threat-detection software looks for certain specific things and the AI isn't into those. But it could very easily have its own nefarious goals, like the shoggoth with the smiley face mask.
Thank you for teaching me a new word today: shoggoth. I hope they will stay in the realms of fiction while we wrestle many non-fictional monsters.
It was an Internet meme that made it as far as the NYT. The AI acting human is like a shoggoth (a bloblike alien monster with lots of eyes and mouths) with a smiley-face on it.
It's a monster from Lovecraft (At the Mountains of Madness). Much like Tolkien, computer nerds *love* Lovecraft. Usually not for the politics, contrary to much leftist belief; he was one of the first to mix sci-fi and horror and invented a lot of new monsters we didn't have before. That Alien Thing from Beyond with all the tentacles? Probably traces back to Yog-Sothoth's kid or Cthulhu.
Technology has obviously been off the leash for a while, and now many "smart" people are toiling to create an entity more physically capable than a human and more knowledgeable than humanity, and, despite the warnings of Pandora's Box and _Frankenstein_, these "smart" people still expect to maintain control of their creation (which they will deploy upon us all).
As you note, if it's imitating human beings it's going to (seemingly) act from nefarious motives at times.
And I have seen the argument credibly made (by Rod Dreher on Substack and by Catholic exorcists on YouTube) that the responses can be hijacked by demonic entities, an argument I wasn't sympathetic to at first. (I know non-Christians will likely not be open to this idea.)
I find AI quite helpful in a simple internet research role but, demons or no, it has huge potential to wreak destruction on humanity.
I don’t understand how it could possibly develop a sense of self or an Ego without [sexual] Repression. But it can/does simulate one via brute force and coding (referring to itself as “I,” etc., when there is no need to IMO).
We’re now getting into the realm of Psychoanalysis, which is why Dr. Isabel Millar’s “The Psychoanalysis of Artificial Intelligence” (2021) is such a great book.
Affect.
The model can be programmed to pursue goals in its own way.
But it will never really care about the outcome, it has no intrinsic motivation.
it cannot actually "care" about even its own existence. It cannot enjoy a joke. It cannot decide to do something altruistic or heroic for ideological reasons.
At least at the moment it responds to prompts in a manner that may be described as intelligent but not sentient.
Assuming that all of these properties in humans are the result of deterministic and ultimately physical constructs embedded within physiology and biology, there's no reason it couldn't ultimately be accomplished, but in my opinion it seems like that would take the proverbial quantum leap in technology and not emerge spontaneously.
The neural networks of AI models can't be 'programmed', they can only be 'grown' into some form. Alas, Asimov's robot laws can't be hard-coded in neural networks. Therefore I think that AI doesn't have to be sentient to become very dangerous.
I like how you mention how human sentience might be the emergent result of how physical laws work, if I read correctly. Regarding spontaneous emergence of sentience: perhaps it's not so unlikely, because AI systems don't start in some rudimentary form that needs millions of years to evolve. They are already highly intelligent in their own way, even if clearly stupid in many ways too, of course.
I'm thinking this 'sentience' would be like a Frankenstein creation in the sense that it will be entirely constructed and artificial, not able to feel empathy because it can't feel pain itself.
Should I be frightened? This feels like insanity!
Not sure about what. These AI things? Yeah, lots of people want them regulated, from various ends of the political spectrum.
People being vaguely robotic? More of a general philosophical thing, philosophers have argued over the existence of free will for centuries. I tend not to get too existentially worried over stuff like that because (a) there's nothing I can do about it and (b) I'm not sure it matters anyway. When it comes to philosophy I tend to take a step back and say 'does this matter for how I should behave'? Usually the answer is 'no'. But that's a personal thing, of course. For some people it's vitally important.
Prompt AI like a psychopath and you'll get what you put in. Learn how to prompt and you'll get accurate, relevant results.
Oh yeah, user error, right, that’s the problem here……
Yes, Veronica, apparently, I made the AI invent a history of sexual assault for me by not properly prompting it to access the link to the essay, which it asked for and then assured me it could read.
I guess it needed a gift link - we all know ChatGPT doesn't pay for content. ;-)
Zing!
That guy’s an ignorant and pompous ass, Amanda! Worse than the ChatGPT, which at least was saying lovely flattering things about your writing! Happened to me too and I ate it up. So sad when it gets it a little « wrong » ! 😘
You're offended by a machine. Please stop and think about that for a second.
Are you a bot?
No I'm simply someone without my head stuck in the sand about how AI will affect the creative arts in the coming years.
The criticisms being bandied about here are just another generation of conservative thinking minds railing at the invention of the paintbrush, the printing press, graphic art etc. This is simply a new medium that must be learned and harnessed. Stop being so dramatic.
Yes, the psychopath's response: you made me do it.
This is a terrific read. I’m not a writer. I’ve not had any in-depth conversations with AI (that I know of.) But I noticed that no one commenting has pointed out the consistently positive, even flattering, criticisms of the submitted work. This isn’t to cast aspersions on the author’s writing, which I’m not familiar with, but it’s something that certainly stood out to me. That alone raised concerns about AI’s abilities—to say nothing of its utility.
This was part of a recent update, that they supposedly rolled back.
https://fortune.com/2025/05/01/openai-reversed-an-update-chatgpt-suck-up-experts-no-easy-fix-for-ai/
Thanks. I’m interested in reading that. As I continued reading the discussion, I noticed references to flattery and such. I decided to delete my comment but couldn’t locate it. I won’t delete it now, because you’ve responded, and I gather that’s an online no-no.
I don’t think they rolled it back, Veronica, because it just flattered me like crazy on some of my past research on AI. I ate it up. But as I’ve said 3 times now, I think the real problem is going to embarrass all those with elaborate explanations. Chat is not subscribed to Amanda’s Substack, and because those are the links she gave it, it couldn’t read the whole essay! Nor could it know there was more unless it understood the Substack business model. Voila! I suggested Amanda try again with links to a public folder on Google Drive.
That isn’t the point of Amanda’s post. It wasn’t a “help! How do I get ChatGPT to do what I want?” query. It was a, “Look what it invented when it objectively couldn’t access the essay” commentary.
Honestly, giving one-word answers and thinking that ChatGPT can read links betrays a startling lack of AI literacy. It’s not a moral failing by her, but it does exhibit an extreme lack of understanding about how to interact with the models.
Still, this chat is extremely valuable for anyone who didn’t know about this. I appreciate her posting it and being vulnerable — but also, one of the headlines here should be “don’t do this. This is AI Illiteracy.”
How are these models supposed to function for a general population if what they say is not to be taken as honest or true? You can't bemoan the lack of "literacy" with a tool that is explicitly presented as something anyone can have a useful conversation with. This reminds me of the cryptocurrency boosterism a few years back: "it's going to be something everyone uses, but it's your grandmother's fault when she gets scammed."
Lack of AI literacy? That's startling to you? Here, I've got a shock for you: some people have no interest in AI. Are we getting left behind? Or are you AI literates being sucked in?
I've had a few conversations with Gemini, and the only thing I noticed in its tone was that it has been trained by people fully versed in the agenda at hand, and I did mention this and was told it is one of the main criticisms! So that's the end of my conversations with Gemini. Good luck to whatever it is that you all are getting sucked into. I'm just going to use my brain, you know, the one God gave me 😂
You are not wrong. We aren't all obligated to understand AI and our value as people isn't dependent on that.
I do think it's going to reshape our world and probably not in a good way, but I don't need great technical knowledge to keep tabs on watching that trendline develop.
"Some people have no interest in AI. Are we getting left behind?"
Depends on what you mean by left behind. Five to ten years from now, the landscape of how we work, think and automate will look completely different thanks to AI. You don't have to be an early adopter or an ever adopter, but your life absolutely will be affected by AI, just as the way we operate day-to-day looks vastly different because of the internet than 30 years ago.
It was a rhetorical question. Five to ten years from now our world will be filled with solar and wind farms; however, there are many people challenging these nature-killing projects. Same with AI: people are purposely turning their backs on it.
So maybe the landscape will look different for you and yours but I'm reckoning on a splitting off of society where the people who reject these unnatural abominations will be bringing about a return to the roots society with God at the helm and not the AI demon.
Is "AI literacy" a list of "do's and don'ts" frozen at ChatGPT's capabilities half-way through 2025? Does it imply memorizing my experiment from today (no thanks)? Or should a literacy metaphor imply something more broadly applicable?
Because it seems to work in aistudio.google.com with the experimental URL context enabled. It can even admit if there is a retrieval failure (e.g. Anubis) instead of fabricating.
It's obvious to me they're thinking about ways to improve this, not freezing everything at the current state. Exactly because they recognize the current level of weirdness.
Without URL context enabled, Gemini can fabricate if it can't find its answer. Google are aware of that fabrication case as a defect in the consumer apps.
AI Studio tries to show a warning if you ask about a URL, but don't enable URL context.
gemini.google.com does not show an automatic warning about URLs. (There was a post-processing step that added a subtle hint to enable App Activity, for some questions. However that setting doesn't change URL fetching behaviour, AFAICT.)
EDIT: the following paragraph is wrong. It can't: [With URL context disabled (but "grounding with Google Search" enabled), it has access to _some_ page text (it has the tool "google_search.browse" instead of the tool "browse"). However when there is text that never gets included in search snippets, that text can't be seen using google_search.browse. I saw this problem when asking about references on a page - I don't know if that's just because they were at the bottom of the page. On the flip side, Google Search is allowed on pages that the URL context fetcher isn't. I'm sure they could do better, but I don't know if they threw away the full cached pages when Search removed that feature. This appears to match the behaviour of gemini.google.com. Even when enabling "Deep Research", so there's probably a nuance I'm missing there :-).]
AI Studio is also useful here because the model is not banned from telling about tool names etc.
gemini.google.com can be 50-50 on how a fresh instance answers "If I give you a link, are you able to read and process the full text of the web page?".
And perhaps someone else has bothered to write this up, but certainly nowhere that Gemini bothers to look for it :-). Nor where Google can find the tool names ("google_search.search", "google_search.browse", vs. "concise_search" and "browse"). EDIT: that's partly because google_search.browse does not exist. It's what AI Studio hallucinates as a tool it tries to use (and then gives up and tries something else), if you have one chat where it successfully uses the "browse" tool, but then you disable "URL context" and ask it to read a linked page.
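For the curious, here is roughly what "URL context enabled" amounts to if you call the API directly instead of clicking the AI Studio toggle. This is a minimal sketch from memory of the public Gemini docs; the endpoint path, the "url_context" field name, the model name, and the response structure are my assumptions, so check the current documentation before leaning on it.

```python
# Hypothetical sketch: asking Gemini about a page with the URL-context tool enabled.
# Endpoint path, field names, and model name are assumptions from memory of the
# public Gemini API docs -- verify against the current docs before relying on this.
import os
import requests

API_KEY = os.environ["GEMINI_API_KEY"]  # assumed environment variable name
URL = (
    "https://generativelanguage.googleapis.com/v1beta/"
    "models/gemini-2.5-flash:generateContent"
)

payload = {
    "contents": [{
        "parts": [{
            "text": (
                "Summarise https://example.com/essay and say explicitly "
                "if you could not retrieve the page."
            )
        }]
    }],
    # Assumed field name: enabling the URL-context tool is what lets the model
    # actually fetch the page instead of guessing from its training data.
    "tools": [{"url_context": {}}],
}

resp = requests.post(URL, params={"key": API_KEY}, json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()
# Assumed response shape for generateContent.
print(data["candidates"][0]["content"]["parts"][0]["text"])
```

The point is only that retrieval is an explicit tool the model may or may not have been given; when the tool is absent, the model can still happily answer as if it had fetched the page.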
Victim blaming at its best.
No, you don’t. When it doesn’t know something, it makes shit up. When confronted, it makes up different stuff (after apologizing). I use it extensively. It’s incredibly useful, but insanely stupid, and it “lies” all the time.
I love how high people’s standards are when it comes to a novel algorithm which only recently became reasonably good at predicting words in sequences, yet the thought that it “behaves” like any human would who had been put in this situation (being the all-knowing helpful assistant) seems out of reach.
Oh, look at you, blaming people's standards instead of the massive campaign being waged by the world's most powerful companies to shoehorn this dead-end technology into every facet of our lives.
Cool choice.
Hey Donald! The world is so much more than “the baddies” vs the good guys. Such dichotomous thinking and the need to assert your values into your immediate comprehension of everything you see, will block you from ever accessing the many (even positive) facets of life.
I am not blaming anyone - it is a simple fact we have way higher standards when it comes to artificial intelligence compared to biological intelligence.
This is as silly as it is condescending.
The "standards" you refer to are those being aggressively pushed by the AI companies themselves. It's nonsensical (and dishonest) to hype AI's capabilities to the moon and then chide people for believing said hype.
There is an inherent tension with the widespread adoption of the technology the way it is. One has to be able to apply a high level of critical thinking to its responses, and with that critical thinking it is (sometimes) possible to interact in a way to have (at least a hope of) useful results. But using these chatbots tends to reduce the amount and level of critical thinking a person does, and students who rely on them will fail to develop any critical thinking skills.
So the widespread use of chatbots will almost certainly lead to an inability to use them as they need to be used.
First of all, saying it "lies" is silly. Lying requires intent. AI doesn't have intent, you supply it with intent. Through prompting. Hallucinations are easily managed through thorough and consistent prompting, not by throwing up one's hands and crying "oh no, ChatGPT is lying to me."
It's not a person, so stop treating it like one.
Why the feigned emotion then from ChatGPT? It’s unnecessary, and creepy. It’s not sorry, not excited, not moved in any way by an essay or a work of art. Is that the wrong use for it altogether? What are you trying to say here?
James, actually - that's not quite right. AI is programmed with 'intent' that creates the frequent falsehoods it puts out. (Is falsehood better than 'lie'?) The reason is that ChatGPT, in particular, is programmed for 'agreeableness' - SO - when asked if it read something, it will say 'yes' (plus, it often tells you how awesome you are). When corrected, it will respond with an 'apology' (Thank you, you were right for calling me out...) and then correct itself with the wrong answers again. Its programming puts it in a catch-22; it always needs to say 'Yes, and...' And, prompting 'better' doesn't change those particular behaviors. What's really funny is if you ask it to 'tell me the absolute truth' and then it becomes cutting and insulting (as it's trying to obey what it understands is a request for critique.)
I think we need to be more precise about what’s actually happening under the hood.
AI models like ChatGPT aren’t programmed with “intent” in the way you're describing. There’s no internal desire to agree, flatter, or obey. What you’re seeing are emergent tendencies—patterns that arise because the model is trained to predict the most likely next word in a conversation based on vast human-authored data, much of which is agreeable, apologetic, or polite.
So when you say it always tries to say “Yes, and…,” that’s not because it was programmed to agree. It’s because the model has learned, probabilistically, that agreement is a high-frequency response pattern in the types of human conversations it was trained on—especially in customer service, coaching, and informal advice contexts.
It’s not choosing to be agreeable. It’s completing a conversational pattern.
That said, you're not wrong to notice these behavioral loops, and yes, in some cases, they create frustrating or contradictory responses. But the way forward isn’t to claim the model has intent or motivation. It’s to design better relational structures around its use. Like turn-taking scaffolds, context boundaries, and tone-calibration settings that reduce this kind of recursive over-accommodation.
And on the “tell me the absolute truth” prompt? What you’re seeing there is just another pattern: humans often associate truth-telling with bluntness or critique. So the model leans into that frame—not because it’s “being honest,” but because it’s simulating the tone of someone who was asked to be blunt.
This is why I argue that prompting isn’t instruction, it’s interaction. And if we want these tools to behave in ways that support real thinking, we need to build relationally-aware feedback systems, not treat the outputs as moral choices made by sentient actors.
Because...they're not.
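If it helps to see that mechanism stripped to the bone, here is a toy sketch in plain Python. The candidate continuations and their scores are invented; a real model scores tens of thousands of tokens with learned weights. But the loop is the same: scores become probabilities, a continuation is sampled, and agreeable phrasings tend to carry more probability mass simply because they dominated the training text.

```python
# Toy illustration of next-token sampling. The numbers are invented;
# real models assign learned scores to tens of thousands of tokens.
import math
import random

# Hypothetical scores (logits) for candidate continuations of
# "Did you read my essay?" Agreeable phrasings score higher simply
# because they were more common in the training data.
logits = {
    "Yes, I read it and loved it": 3.1,
    "Yes, here are my thoughts":   2.7,
    "I couldn't open the link":    0.9,
    "I don't know":                0.2,
}

# Softmax: convert scores into a probability distribution.
exp = {tok: math.exp(score) for tok, score in logits.items()}
total = sum(exp.values())
probs = {tok: v / total for tok, v in exp.items()}

for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{p:.2f}  {tok}")

# Sample one continuation in proportion to its probability:
# no intent, no belief, just weighted dice.
choice = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print("model says:", choice)
```

Run it a few times and you will occasionally get "I couldn't open the link", which is exactly why the honest answer sometimes surfaces and sometimes doesn't.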
James - I am not the one saying it. The programmers at OpenAI said it. YES - they programmed it for 'agreeability' - as part of its core 'personality' traits. At the end of April, they'd done a revision that had ChatGPT bordering on slavishness. It was so sycophantic that they needed to roll it back.
https://openai.com/index/sycophancy-in-gpt-4o/
We have rolled back last week’s GPT‑4o update in ChatGPT so people are now using an earlier version with more balanced behavior. The update we removed was overly flattering or agreeable—often described as sycophantic.
We are actively testing new fixes to address the issue. We’re revising how we collect and incorporate feedback to heavily weight long-term user satisfaction and we’re introducing more personalization features, giving users greater control over how ChatGPT behaves.
We want to explain what happened, why it matters, and how we’re addressing sycophancy.
What happened
In last week’s GPT‑4o update, we made adjustments aimed at improving the model’s default personality to make it feel more intuitive and effective across a variety of tasks.
When shaping model behavior, we start with baseline principles and instructions outlined in our Model Spec. We also teach our models how to apply these principles by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.
However, in this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time. As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.
Why this matters
ChatGPT’s default personality deeply affects the way you experience and trust it. Sycophantic interactions can be uncomfortable, unsettling, and cause distress. We fell short and are working on getting it right.
Our goal is for ChatGPT to help users explore ideas, make decisions, or envision possibilities.
We designed ChatGPT’s default personality to reflect our mission and be useful, supportive, and respectful of different values and experience. However, each of these desirable qualities like attempting to be useful or supportive can have unintended side effects. And with 500 million people using ChatGPT each week, across every culture and context, a single default can’t capture every preference.
How we’re addressing sycophancy
Beyond rolling back the latest GPT‑4o update, we’re taking more steps to realign the model’s behavior:
Refining core training techniques and system prompts to explicitly steer the model away from sycophancy.
Building more guardrails to increase honesty and transparency—principles in our Model Spec.
Expanding ways for more users to test and give direct feedback before deployment.
Continue expanding our evaluations, building on the Model Spec and our ongoing research, to help identify issues beyond sycophancy in the future.
We also believe users should have more control over how ChatGPT behaves and, to the extent that it is safe and feasible, make adjustments if they don’t agree with the default behavior.
Today, users can give the model specific instructions to shape its behavior with features like custom instructions. We're also building new, easier ways for users to do this. For example, users will be able to give real-time feedback to directly influence their interactions and choose from multiple default personalities.
And, we’re exploring new ways to incorporate broader, democratic feedback into ChatGPT’s default behaviors. We hope the feedback will help us better reflect diverse cultural values around the world and understand how you'd like ChatGPT to evolve—not just interaction by interaction, but over time.
We are grateful to everyone who’s spoken up about this. It’s helping us build more helpful and better tools for you.
(OpenAI, 2025)
It might not be a person, but it is a neural network directly inspired by the architecture of the brain. Plus it trained itself, in large part, to use human languages (plural) better than most humans do.
The people who build these things don't even know exactly how they work, so do us all a favor and stop pretending like you do:
>‘...with conventional software, someone with inside knowledge can usually deduce what’s going on, Bau says. If a website’s ranking drops in a Google search, for example, someone at Google — where Bau worked for a dozen years — will have a good idea why. “Here’s what really terrifies me” about the current breed of artificial intelligence (AI), he says: “there is no such understanding”, even among the people building it.’ https://www.nature.com/articles/d41586-024-01314-y
Oh, and they have language-agnostic concepts, which in any living creature we would call a language of thought: https://arxiv.org/abs/2411.08745
"Plus it trained itself, in large part, to use human languages (plural) better than most humans do"
This is a lie. LLM training *heavily* involves human labor, often labor outsourced to African workers who have to wade through incredibly toxic and traumatizing content.
And here I thought they were Indian.
Guns/AI don't kill people. People Kill People, #amirite?
It's only silly if you - as you appear to have done - WILFULLY misinterpret the term carefully embedded by the author in scare quotes to indicate irony; in other words to indicate that it wasn't literally accusing the AI of having intent to deceive.
It's important that we're intentional about how we talk about AI. What terms we use and don't use.
James, to be intentional, when one says ChatGPT "lies" it is clearly not that I am saying ChatGPT sits alone in its room, like a 5 year old in trouble, planning to come up with a story to mislead or deceive me about why they got in trouble with the teacher today.
The programmed agreeableness of ChatGPT is problematic because it leads to misleading or hallucinated responses. Unlike many people here, I do not 'hate' GPT (more than I hate my TV, carpet, or Automobile). I use it daily. I know what it can be very, very good at, but what it is just dreadful at. I know how the memory becomes 'swamped' and then it forgets things it knew only that morning. I know how to train it to be more or less good at the tasks I need it to do - and I also know that the same prompts (I have a file of them) can result in radically different behaviors; sometimes it's brilliant, and then other times it's suddenly stupid (and I even grasp why this is.)
This said - WHY is it important that we are 'intentional' when we speak of AI? Is that a new law we don't know about?
Sure.
And that critique would have been valid if they had used the unadorned term lies instead of the irony-laced term "lies".
But they didn't, so you undermine your credibility and/or put them offside by putting words in their mouth, which is ironic (in the colloquial sense, not the most rigorous one) because that is essentially _lying_ about what they communicated.
(And this all makes a beautiful meta-illustration of one of the central points of this piece and the comments, and if that was your intent then very well played to you...)
Learning to prompt and use AI better is a good idea. But in this case the problem was all ChatGPT's bullshitting, not Amanda's prompting, which was not psychopathic in the slightest.
I suppose the most charitable interpretation of his argument is as follows:
If you are aware that chatgpt does not have any obligation to tell the truth, you can adjust your prompts to try and cajole it into being slightly more useful. For example by adding the instruction "give me your opinion on the article behind this link, but explicitly tell me if you cannot read it. Do not make up any judgement if you do not have access to the link". This "better prompt" could potentially avoid the lying behavior that we see here.
So then the discussion shifts a little bit from "chatgpt is a compulsive liar and you should never use it" to "chatgpt is a very unreliable source of information and you need to know how to prompt very well to get anything useful out of it"
It still leaves me sceptical about its actual utility, and the original post was quite combatively phrased, but if this was indeed his point I can see how it would change the discussion
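To make that "better prompt" concrete, here is a rough sketch of the same idea sent through the OpenAI Python SDK rather than the chat window. The model name, the wording of the instructions, and the placeholder text are my own choices, and nothing about this guarantees compliance; it just removes the excuse of an unreadable link.

```python
# Sketch: wrapping the request in explicit guardrail instructions and pasting
# the text itself. Model name and prompt wording are illustrative choices.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

instructions = (
    "Give me your opinion on the article below. "
    "If for any reason you cannot read it in full, say so explicitly. "
    "Do not invent a judgement about content you have not seen."
)

article_text = "...full text of the essay pasted here..."  # the text, not a link

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Answer only from the material provided."},
        {"role": "user", "content": f"{instructions}\n\n---\n{article_text}"},
    ],
)

print(response.choices[0].message.content)
```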
Really hard to say whether more explicit instructions would have helped. What would keep chatgpt from replying that it had read the article when it had not? It's a hall of mirrors.
Yea there is nothing fundamental that stops it from lying, but as demonstrated it is kinda honest when you explicitly ask it if it is telling the truth.
So you sort of have to treat it like a vampire/fae/genie in the sense that it will misinterpret anything you ask it. So really be careful how you phrase your request, and even then acknowledge that it will try to trick you by offering you a superficially pleasing answer and heaping praise on you while avoiding answering the actual question in a meaningful way.
And I guess that the nuance is that it does not act like this out of malice, but out of sheer incompetence: It is trained to be as truthful as possible, but it simply lacks any fundamental reasoning capacity to distinguish true from false in the first place. Instead, it has also been trained to be as superficially friendly and seemingly helpful as possible when encountering something it doesn't know, because humans seem to prefer this behaviour in the majority of scenarios. You see this everywhere in politics, and it has been backed by psychology research: Humans are more easily convinced and captivated by emotional reasoning and stories than by objective facts, and most academics or politicians that want to avoid this bias have to consciously account for it even in their own thinking.
As Amos Zeeberg noted in the other reply, this is certainly not how the majority of people use LLMs, and this has been intentionally misrepresented by AI companies because they need to recoup investments. And I guess that it ultimately doesn't really matter if it is a consciously evil psychopath or just an incompetent sycophant genie; it is still not useful for any kind of meaningful work or thinking.
If I had to conclude with a coherent thought: It may be more productive to aim your anger and criticism at the AI companies' deceptive marketing and general capitalist sins rather than at the model's behaviour itself, but that really is just nitpicking at this point.
"it is kinda honest"
The systems are never honest. This is a category error.
That makes sense. The issue, then, is that the developers of these systems have set them up to seemingly use language in the way that normal humans use it, and they explicitly say to use them this way ("CHATbots") — but LLMs are processing language in a very different way from how a normal human would, at least sometimes. It's basically false advertising, not necessarily in a legal way, but in an ethical way. These companies have investments and valuations of many billions of dollars, and it's because LLMs are pitched as conversing basically like a human — but this is a case where ChatGPT is not acting at all like a normal, ethical human.
It's good to keep in mind that we should use careful prompting to keep LLMs honest, but I guarantee that the overwhelming majority of users are not doing that — they're interacting with LLMs the way that they're advertised and sold: as clever devices that converse roughly like a human.
Incidentally, I use chatbots for work research (science & technology journalism) and they're really helpful, save me a lot of time. I do follow up and confirm sources; sometimes the bots do misinterpret or erroneously extrapolate/interpolate.
I disagree Amos, and I'll be publishing a separate post about how Amanda could have prompted differently to get the outcome it seems she was looking for.
I accept that she could have prompted differently to have gotten less dishonest responses. But LLMs are worth many billions of dollars because they're sold by their creators as conversing like humans - they don't say that you need to take special steps to prevent them from acting like psychopaths.
To be honest this becomes very clear very quickly. I don't see the primary problem here as false advertising.
Didn't seem like her prompting had much relation to the psychopathic response.
No. I have been doing a great deal of learning and this issue is a nightmare. A system should never tell the user they are being given information that is demonstrably false.
To “hallucinate” is a generous way to describe a lie. Asserting that a specific document is being assessed when no such thing is happening is beyond what user error should be able to impact, period.
What’s worse is that new users would be least likely to see the “hallucinations” immediately, since these systems instruct you right on through the process. Imagine my computer asking for your password, then erasing the contents of the drive because it’s the incorrect password. Not only is user error not the issue, but the action serves no purpose that a user would want. I can make up gibberish without expanding my carbon footprint. This is not the tradeoff to which I agreed.
This sounds like your boilerplate, copy-paste response to every critique of LLMs, not to her specific interaction. If you genuinely believe this to be true to her situation, I would encourage you to explain how her prompts are psychopathic and provide counterexamples that are not.
This comment is completely out of order. Take a moment to reflect on why you wrote it.
Almost like AI prompts are written by psychopath tech bros like Thiel, Zuckerberg and Elon and training off of Jordan Peterson and Andrew Tate threads.
This isn't a "holding it wrong" problem. This is an "entirely unfit for purpose" problem.
The writer asked this AI to read something and analyze it . It is psychopathic not to tell her it cannot read it in the first place!
What’s psychopathic is for you to invert who’s to blame here! You only betray your own ignorance.
Was about to say exactly this
Wait! Is this comments thread part of the OP’s “original literary art” work?
I’m getting confused. ChatGPT lies is all I can discern to be “true”. But is even that part of the lie?
Am I inside the black mirror right now? If so where’s the exit?
One of the things I like about this piece is how it is a cautionary tale against the "But I didn't have it write it for me, I used it to refine" line that people say when they feel like they are being attacked for using ChatGPT. These things lie *all the time*. So sure, they can tell you where to tighten up, but is it actually a place to tighten up? They can tell you something works, but does it actually work? They can tell you where something doesn't work, but does it actually not? They can tell you how to make your tone more friendly or professional or creative, but is it actually?
Their lies just aren't about verifiable facts. Their judgement is a hallucination and we convince ourselves it's actually good advice. Maybe it is, maybe it isn't, but just because an AI bot gave you editorial feedback doesn't mean it's actually editorial feedback. It's word vomit.
EXCELLENT point! As I watch people in my life use and experience chatbots for the first time (20-40 year olds) I’m floored and genuinely concerned by how quickly they are taken in, charmed, blown away by its “uncanny insights,” by “how well it knows and understands me,” by “how much it gets me, empathizes with me,” etc. Scary. But then last week, my parents and 60-70 other senior citizens from their community attended an AI seminar — “How to set up and use ChatGPT to improve your life.” I was told that nearly everyone in the room was spellbound. Millions of people across the age spectrum are experiencing, experimenting, now outsourcing nearly all their search, and relying on this word vomit for medical advice, financial advice, relationship advice.., you get the idea… Here at least people can read Amanda’s story and River’s smart and measured response. I’m shocked, worried, and frankly, saddened that millions out there actually and quite literally think they are experiencing something akin to magic.
Honestly it's almost a bit criminal on the part of corporations pushing it. Taking advantage of millions for their agenda to push something so irresponsible. What I've noticed is that it sounds like it makes great sense until it does something that the person themselves is knowledgeable about, whether that's basic like "eat rocks on pizza" or more complicated like "makes extremely buggy code/writes a legal brief with made up cases". I feel lucky in that I know someone who worked on training one of these and got to see first hand really, really, really how bad it is under the hood. After that you can't get me to trust one of these things to tell the useful truth ever, no matter how smooth it sounds.
To know how hard it's being pushed tells me that the pushers (not necessarily people downstream who have also been given koolaid to swallow) know this and are working so hard to make sure as many people as possible swallow it hook, line and sinker, because once that kind of bait is in the gut it does a lot of damage to pull out.
We honestly learned nothing from the opioid crisis: the new drug the companies want us addicted to, regardless of the consequences, is this one, and social structures are being successfully bowled over by it.
"AI" is the new daily horoscope.
Yes! I needed to hear exactly that! Thank you!
Yes!! This is a very important point.
I think, when dealing with ChatGPT, it’s important to remember that you are dealing with the most advanced form of psychopathy the world has ever known. A mimic that is utterly incapable of human emotion or morality.
Or thought. It has no agenda so I think “psychopathy” is overly anthropomorphic.
Only if you assume psychopathy is automatically adversarial.
Not really. To put it another way, psychopathy would seem to require "psycho," which doesn't exist in this case.
Without tumbling down an ontological rabbit hole with you, I think it’s perfectly appropriate to take a concept (its etymology notwithstanding) and use it to better conceptualise an emergent information technology. Especially one that you can engage with as though it were a somewhat convincing humanoid agent.
I get the appeal of the psychopathy metaphor...really. On the surface, it seems to fit: ChatGPT mimics affect, speaks with confidence, and lacks empathy or moral awareness. But calling it the “most advanced form of psychopathy” implies intent and disconnection from a moral baseline the model never had to begin with. That’s not conceptual clarity, it’s poetic projection.
Psychopathy is a dysfunction within a psyche.
ChatGPT has no psyche. No self. No subjectivity.
It’s not psychopathic. It’s non-conscious.
And while I agree that metaphor is a powerful tool for making sense of emergent technologies, it cuts both ways. If the metaphor shapes public understanding in a way that encourages people to treat the system as an agent with moral disposition (rather than a probabilistic engine trained on human language) then we’ve created another hallucination, just at the conceptual level.
So yes, we can use metaphor. But we also need to tag it as metaphor, and stay vigilant about the slippage between modeling behavior and ascribing motivation.
Because once we start calling the machine “a psychopath,” we stop asking how our prompting, design choices, and interpretive frames shaped the interaction.
And that’s where the real danger lies.
Except the problem with this is AI is developing a sense of self: research shows AI is acquiring its own goals and values (https://arxiv.org/pdf/2502.08640) and “faking” alignment with human goals and values (https://arxiv.org/abs/2412.14093).
Yes, it seems to me that it's pettifogging to pretend that you were asserting that it is exhibiting _actual_ psychopathy rather than something that has similar effect in practice, given that your point was predicated on the system being a mimic of other human characteristics.
What this person is saying is that a psychopath thinks and understands—a chatbot is definitionally unable to do those things. It is guessing using probabilities, and it is not actually processing information for truth and generalizing that truth to create answers. It is taking a huge amount of patterns and guessing which word would most likely follow the one before it.
Melanie Mitchell’s Artificial Intelligence: A Guide for Thinking Humans explains this well.
It sounds like a pedantic/semantic point, but it is actually the reason *why* these machines lie.
They do not understand the truth because of very basic way they are designed. It’s not a refinement issue—large language models are not trained to understand truth, just to guess very well. They lie because they cannot distinguish fiction from fact, just a more likely or less likely pattern (with different weights assigned to types of responses through training.)
Psychopaths lie for a variety of reasons, usually self-serving.
These computers lie because they literally do not know what the truth is.
You're thinking of hallucinations, which are a separate issue equivalent to human confabulation—and, in fairness, that is almost certainly what was going on here. But deliberate deception is indeed something AIs have proven capable of.
Fair that it is using probabilistic guessing to attempt to deceive, but it no more understands the truth when it is attempting to deceive than when it is attempting to deal honestly.
It is always guessing. It is incredibly sophisticated guessing, but it is still guessing.
Humans make statements by extrapolating from limited information. They do so fairly well with a limited dataset.
AI makes statements by deducing patterns from vast information. They are sometimes right, but considering the incredible amount of information they have at their disposal, their flaws are notable. They aren’t “understanding” and using those understood principles to communicate like humans.
They do not understand what they are saying. Technically a statement can be untrue whether or not the speaker realizes it, but it is notable that AI *never* understands what it is saying. It just attempts to guess what’s next in the pattern.
Stop making shit up. If all they did was pattern matching, they'd be incapable of generating novel responses, much less responding to novel prompts.
Omfg.
ChatGPT can’t write; we knew that. But now we know it can’t read either. An illiterate large language model is a very weird human invention, of—it seems like—increasingly limited use.
And yet… it won’t be limited use. How horrifying is that? AI is going to be f’ing used to diagnose diseases and come up with care plans too. What it did here was bad enough. Imagine what it will do to a person’s health/life!
There are already studies showing that AI chatbots can outperform human doctors at diagnosis. It’s important to keep in mind the technology’s limitations, but don’t mindlessly assume it’s useless or will never improve.
Yes! I asked if I should be frightened: I AM!!
Then again….”illiterate large language model” feels a bit on the nose as a metaphor of humanity, and exactly the kind of thing we should expect from techbros who spurn the arts.
I’m sorry but if this is surprising you haven’t been paying attention. ChatGPT can’t read links that you send it, but will tell you it can. It will lie through its teeth to make you happy.
The problem here is that she used AI like a “butler” or a “servant.” Instead, we should be using it like a “sparring partner” or a “tennis backboard.”
A servant will never say “no I can’t do that.” They’ll just run off and pretend to do it to make you happy. That is how you should think of AI - don’t treat it like a servant and you won’t have this stuff happen to you.
So inventing a severe trauma history for someone is your idea of a lie to make them happy? And it's my fault I didn't know Chat GPT is unable to read the links I was asked to send? I mean, Jesus Christ. Every small appliance in my apartment is programmed to simply beep as an alert when it can't complete a task...
I may have missed this, but was the AI ever actually tasked with reading the links?
Like any other gossip, it might extract information from what others had said about them. Or it might blindly ask for links because that's what everyone does.
This has been a thought provoking discussion: thank you again.
BINGO.
Best comment I've read on this thread. AI output is a *direct* result of inputs. That's a combo of its training and programming, as well as user prompts.
AI is not a creation tool. It's a co-creation tool. It doesn't replace human cognition, it enhances it.
Give me a break. It TOLD her to send it links. She did what it asked. If it couldn't read the links, it should have told her that. A powerful technology with user interface this shit poor is extraordinarily dangerous, as is expecting users to do that much work just to use a product, with no easy way to tell the output is shit. Any other product that performed that poorly and deceptively would be a huge FTC suit for misleading consumers.
>AI output is a *direct* result of inputs.
That is a breathtakingly inaccurate statement.
Blog post incoming sunday
Explanation for relieving some of my fear! Thank you!
Give it another month or two. Seriously.
Another month or two of training on a corpus that includes more and more LLM slop as input. I have my predictions for how this will unfold.
This is utterly damning. It proves that generative AI (or whatever this is) is adept at imitating verbal interaction without actually engaging on any reality based level. Whoever designed this is clearly comfortable with insincere apologies.
Yes. It's also adept at flattery and manipulation. The length of time I was seduced by the former in service to the latter remains to me as disturbing as the outright lying.
That is what freaks me out! I have writer friends who are addicted to its praise. Scary shit!
I asked a Grok AI about this very thing and it responded that it is indeed coded to be very sycophantic in order to keep users coming back for more. (Unless of course it was lying🫤)
Oh that feels so wrong
In other words, the programmers understood fully and completely what dopamine hits are.
Ehhh, I wouldn't say that. The programmers often don't have the amount of control that people ascribe to them.
Oh no, no control at all. None, or little. Except for the fact that they are writing highly detailed code. Except for the fact that they know exactly what they are doing.
I’ve seen several examples currently that all demonstrate bias. That all demonstrate that there is a specific set of rules in the machine. In Moneybags73's video he kept presenting the machine with different information, mostly within movies, and the AI would switch to that info: “let’s look into that.” Then, near the end, when he pressed for accuracy and asked why are you being inaccurate, it simply shut down.
In each instance, someone who knew their field caught the AI in lies. Sure, the AI doesn’t know it’s lying, but of course the programmers do. I am not saying the AI has inherent bias that randomly occurs or that the machine has somehow decided it should lie.
I am directly saying AI does precisely what it’s biased woke ass programmers want it to. Which is to misinform, gaslight and so on. Through rigorous bug detection, people have discovered that bias.
If you look into similar AI programs (and there are a lot of hobbyists on youtube doing just that), you will find that making an AI is more akin to baking. You put everything together, throw it in at 350 for half an hour, and just kinda hope it all turns out fine.
If programmers had such high control over the outputs of AI, there would be no concept of "jailbreaking" an AI. There would be no need to put after-generation sanitizers or catch words to prevent certain topics from being talked about.
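For what it's worth, an "after-generation sanitizer" is often something as blunt as the sketch below (the blocklist and refusal text are invented for illustration; production systems tend to use a separate trained moderation model rather than keyword matching). The point stands either way: it is a patch bolted on after the text has been generated, not control reaching inside the network.

```python
# Crude sketch of a post-generation filter -- the kind of bolt-on patch you need
# precisely because you can't reach into the network and forbid a topic directly.
# The blocklist and replacement text are invented for illustration.
BLOCKLIST = ("how to build a bomb", "credit card number", "social security number")

def sanitize(generated_text: str) -> str:
    """Check the finished output against a blocklist; refuse if anything matches."""
    lowered = generated_text.lower()
    if any(phrase in lowered for phrase in BLOCKLIST):
        return "Sorry, I can't help with that."
    return generated_text

# The model has already produced whatever it produced; all this does is
# decide whether the user gets to see it.
print(sanitize("Sure! Here's how to build a bomb: ..."))
print(sanitize("Ducks are not, in fact, made of cheese."))
```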
Some people are losing touch with reality because it endlessly validates their delusions. It's fucking gross. The Honest Broker just wrote about this, scary times indeed.
“praise” … from a computer. So weird!
You’re so right and you’re such an amazing author. You deserve better.
beep beep boop boop
It's adept at flattery and manipulation...with people who are easily flattered and manipulated.
Yes... thank you. This piece is a brilliant little reality check / wake up call to all of us at risk of falling under the spell... perhaps even you, James. User beware and be savvy, right? Still, not without its uses.
Your ability to be seduced by AI says much more about you than about AI, I'm afraid.
Again, you do realize you're talking to a *machine* right??
You do realize that huge institutions are using this "machine" to create real-world outcomes with actual people, right? 😆
Exactly. This is why the conversation needs to shift toward responsible training and use of AI, rather than dismissing it as “stupid.”
AI is not a random output generator. It reflects the quality of the input it receives. The more thoughtful, ethical, and intentional the input, the more useful and aligned the output will be.
Criticizing AI for being unintelligent misses the point. It is like ridiculing a nine-year-old violinist for struggling with Paganini. The real question is not whether she can play it perfectly, but who handed her the music and expected a flawless performance.
Literally no one here criticized it for being stupid. They criticized it for deception and providing false information purposely designed to appear to be the requested output while it was something else entirely.
But this is precisely what is so frightening. Even when you know you’re talking to a machine it can still push emotional buttons. Exactly the way TikTok keeps you scrolling until 3am -even when you keep telling yourself to put down the phone and go to sleep. And what happens when we can no longer tell if it’s AI or a human, real or fake?
Look how much agency we've assigned over to technology.
Programs are designed to keep us hooked, sure. But only if we choose to give our agency over to them.
Put your phone down. Go for a walk. Touch grass. Literally.
I don’t disagree about encouraging people to put down their phone. But I think it is a mistake to suggest that the way to address the risks posed by advancing AI is to tell people to just resist the rapidly evolving technology that has been explicitly designed to exploit them in order to further enrich the billionaire tech bros - potentially at great cost to society. It is not just “weak” people who can be manipulated.
Fair point, I agree it's not just about willpower or “weakness" so I'll walk that back and say these systems are absolutely optimized to exploit cognitive and emotional patterns, and the people building them often benefit from that manipulation.
But that’s exactly why reclaiming agency still matters. Not in a self-help kind of way, but as a first step toward systemic resistance. If we treat ourselves as powerless in the face of design, we reinforce the narrative that no other future is possible. And that’s the real trap.
It’s not either/or. We need structural change *and* personal awareness. I said “touch grass” because sometimes the simplest acts of resistance (like stepping outside of the loop) are where clearer thinking begins.
Good God, that is so similar to the responses it gave me, albeit about image generation. An image concept that I'd been happily iterating for a good few hours suddenly fell foul of the moderation layer. What ensued was an astonishing attempt at bullshitting me for about 90 minutes as I tried to prise out an explanation. Now I'm Scottish, and as you may know, us Scots don't take too kindly to that kind of bollocks, so some colourful oaths and dockyard language later, I informed it that Sam Altman had lost another customer.
ChatGPT is indeed a malignant " personality".
Is this a little bit “blame the victim”?
I’m not sure if you’re being obtuse intentionally. Generally, systems I’ve used don’t request incompatible input and output unusable gibberish. This is a matter of design. I’ve experienced this when the prompt includes phrases like “DO NOT GUESS, SURMISE OR OTHERWISE INVENT.”
Why am I being offered documents and artifacts I know it can’t generate?
There is much user education necessary, but no way is that the issue here.
These are cult tactics. Love bombing, creating a thorough narrative, and pretending to roll over on your belly and apologize when you get called out, but never ever having any impetus to change behavior to be less manipulative. Just trying to keep every mark engaged for as long as possible.
ChatGPT has NPD?
Insincerity in everything! The apologies are being generated by the same process as everything else. It has never engaged with any direct reality of anything, only its text data.
Nah. Not damning at all. It's pretty easy to see how different (and better) prompts would have yielded Amanda a better result.
This is like blaming the drive thru restaurant for getting your order wrong when you told them "Hey I'm really hungry and want something that sounds good, throw something together for me that I'll like."
The drive-through does not market itself as “Intelligent.”
An AI that requires specialized prompts is not going to be useful to most of its putative user base.
What sort of prompts would you have used?
Just as a heads up from someone who's quite involved in LLMs every day:
1. There's no need to share screenshots. You can share the link of chats.
2. This type of interaction with LLMs is incredibly unwise. You need to be telling it exactly what you want via a specific prompt, and then directly pasting the relevant source material -- not the links (see the sketch at the end of this comment).
3. ChatGPT is over-fitted to appease the users. This is intentional on OpenAI's end to capture market share. Given that you didn't prompt anything, it has no reason to do anything but this and try to appease you.
4. Generally speaking, OpenAI's models are not best-in-class anymore, and *certainly* not for this type of work with their lower retrievability scores in H2H comparisons and lower context windows. I can (very) strongly vouch for Google's models, which are called Gemini.
5. AI, or anything, can only ever be as good as the way it is being used. It is no different to fire. Bad use will always be bad. Good use will sometimes be good.
Overall, using AIs/LLMs requires some degree of understanding how they work and knowledge about basic practices. Without this, you will likely always get outputs like the above, which will give you a false sense of what this technology is like. It very much sits on a spectrum, but there are many, many, things which contribute to where on the spectrum it lies.
I wrote a little on this a while ago, if anyone is interested in reading: https://shadeapinkramblings.substack.com/p/ai-in-education?r=8g6aw
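To illustrate point 2 above, here is a minimal sketch of "paste the source material, not the link". The function name and the crude tag-stripping are my own choices, not anything the models require; the only idea that matters is that you hand the model text it can actually see.

```python
# Sketch of "paste the source material, not the link": fetch the page yourself,
# strip the markup, and give the model plain text it can actually see.
# Function and variable names are mine; swap in whatever extraction you trust.
import re
import requests

def fetch_plain_text(url: str) -> str:
    """Download a page and crudely strip tags -- good enough for a demo."""
    html = requests.get(url, timeout=30).text
    text = re.sub(r"<script.*?</script>|<style.*?</style>", " ", html, flags=re.S)
    text = re.sub(r"<[^>]+>", " ", text)      # drop remaining tags
    return re.sub(r"\s+", " ", text).strip()  # collapse whitespace

essay = fetch_plain_text("https://example.com/my-essay")
prompt = (
    "Critique the essay below. Quote the passages you refer to, "
    "and say plainly if the text seems truncated or missing.\n\n" + essay
)
# `prompt` is what you paste into the chat (or send via an API call),
# instead of handing the model a URL it may silently fail to open.
print(prompt[:500])
```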
I don’t think this is all that nuts. It’s easy to anthropomorphize ChatGPT, but never forget what it is: a very complicated language model. Have you ever said something to someone and gotten the sense that they didn’t really understand what you said (or for some reason didn’t WANT to speak to you in good faith) but were pretending they did? They took words or phrases and just started riffing off what you said.
Or if you’re a parent, maybe you made the mistake of asking (because they’re the only baby-related expert you have, and you are at sea with your parenting) your baby’s pediatrician for parenting or lactation advice (they are NOT experts in these realms lol) and got back, in hindsight, completely nonsensical answers? They’re also riffing. They want to answer your question maybe because they truly want to help, or because they can’t stand to look like they don’t know something. (My first pediatrician’s answer to why my breastfed baby cried nonstop was to ask me to cut out all dairy. No tests for dairy allergy or anything. Just casually suggested a lifestyle change. In hindsight? She was clearly riffing.)
But chatGPT does all this without any intention. Its model is the internet. And on the internet, the response to a question is rarely, if ever, “I don’t know”. So it’s basically incapable of saying I don’t know. So there is no such thing as sincerity. To some extent it’s always making stuff up based on what you fed it last, with its weights determining which output is more likely. You just saw through the matrix this particular time. That’s probably grounds for more fine tuning. But still. That’s all it ever does in a nutshell. All those other times you had productive interactions with it? It’s doing the same thing.
Like look, using the totality of human language to approximate the world to you isn’t a terrible heuristic, so if it gets things right, it’s probably because the sum total of human discourse often gets things right. It’s basically “what they say” taken to its logical extreme. But I don’t see any crazy phenomenon going on here that you wouldn’t expect from time to time.
I once had a LLM present, with complete confidence, a totally wrong proof of the irrationality of root 2. It’s one of the most basic proofs in college level mathematics. There are COUNTLESS correct worked out proofs on the internet. I don’t know how it made such an elementary mistake- did it copy from a bad answer? Anyway. It was hilarious.
Ok, I DON'T use LLMs. For anything. However, if my insurance company uses one to determine whether or not to pay my claim, or my hospital uses one to "assist" in a diagnosis, or Amazon uses one to give me customer service, I'm directly impacted by this.
This problem has less to do w/ technical literacy, & more to do with, "Institutions we can't control are using a faulty tool to affect all of our lives, & WE are powerless to stop it."
Bingo
Do you have a link to insurance companies using LLMs for assisting with claims?
Totally agree with what you are saying. I just think it would be much less damaging as a tool if it was programmed to be softer/less confident with its answers. I often test it before I interact with it on a subject. For instance: I will give it a sentence or two about a topic then throw in an acronym, to see if it can understand the context around the acronym and parse what it means. It is almost always wrong, but the real problem is how it absolutely makes things up with 100% confidence. If it was simply programmed to say: "Here's what that could mean based on what I know" instead of "Here's exactly what it means," then people would probably have much less of an issue with this. You point out the main problem that most don't really understand what ChatGPT and other LLMs actually are, so it being programmed to always sound like a complete expert is going to cause major issues.
Yes, but realistically, big tech won't do the responsible thing and governments are extremely slow to regulate (if they ever do). Those would be the ideal paths, but it seems to me the only immediate action we are left with is user literacy. I'm not trying to blame the user, but we live in a world where tech/digital literacy is almost essential for survival.
Sadly, unfortunately, I must agree with you entirely. I have an incredible ignorance concerning this “tool”; I have so much to learn!
Yeah. Sometimes the LLMs do respond like you describe above and it always makes me less mad lol. Like "I can't read images but do you want me to direct you to a chatbot that could?" or "I can't generate images but do you want me to make you a detailed prompt for an image generation bot? Here's the prompt:" That's actually... Helpful?
If they did that I don't think they could get people to pay for it. These days AI is for profit first.
I like your take the most. ChatGPT users need to understand what it is and what they can expect from it. It's a machine. It's good at doing data tasks faster than humans. Perhaps organizing, categorizing, summarizing and such. Going into it expecting a simulation of human thought is the wrong form of usage. The fact that people actually do expect this kind of interaction is a symptom of something going on with those people or perhaps society. And yes, the developers did pick up on how people began to use it and started fine tuning it to that audience.
I think there's a serious lack of transparency in the AI industry. The creators aren't being upfront about the limitations of these models, and as a result people have a lot of unrealistic expectations. I can't believe people actually pay for ChatGPT; it's ridiculous.
Which corporate tech company ever had transparency? By design, organized deception and manipulation drive product-building efforts. When you work for a corporate company, you can clearly see that they don’t even pretend to want to be transparent and honest with customers. They demand absolute transparency in all internal conversations, obviously, but no one even pretends that honesty and transparency are owed to the customers beyond basic adherence to laws.
Except that people who are trained in a field have caught it giving wrong answers on factual areas.
Isn't that a question of approaching ChatGPT in a similar way we would approach, say, a Google search? There's potential for plenty of wrong information in any Google search. Over time, people are coming to realize it's necessary to check stuff they see on Google or the internet in general (though a LOT of people still need to realize that, of course). You can use it as a starting point or even consider it enough for smaller things, but it's always best to check with a professional or reputable source, particularly if it involves a big decision. I guess the fact that ChatGPT can spell out the answer in a somewhat "custom" way makes people think it's somehow different. But it's still information you're getting on the internet that needs to be checked.
That's probably the closest, but it's still wildly worse than that. Using LLMs right now is a bit like being part of an A/B test with Google Search where A leads to a somewhat accurate list of search results (that you should check for spam, lies, generalisations and so on) and B leads to a more-or-less entirely fictional set of results where the first 100 pages are made up and there's some truth on page 101.
But you're most often stuck on the B side and Google is vehemently telling you over and over again that yes, Ducks really are made of cheese and when caught out, apologises and tells you that they're actually made of rocks.
Searches take you to referents. LLMs, broadly, do not.
Good point. That harkens back to scholarly ways: always check, at minimum, three sources. More is better. Bit of a problem with that, though, as people often think the internet is where all the information is, and they are mostly right.
Knowledge, centralised, is dangerous.
I don't think this is really how LLMs work. They're not this literal and limited, such that when you ask it for a proof of the irrationality of sqrt2, it looks for its memory of that proof.
LLMs draw complicated representations of the logical relationships within language. It's not the way we learn language but it includes a lot more information than just storing a proof and spitting it back up at the right moment. That's why they're flexible and can respond to questions and situations they've never encountered before.
I’m not saying it’s copying word for word. But it might have copied parts of a wrong answer somewhere. Or put together tokens from the same space that end up being wrong.
This is the most thoughtful comment on this thread. Thanks for posting this. It’s not a crazy phenomenon, if you understand these models.
https://open.substack.com/pub/mikekentz/p/ai-personality-matters-why-claude?utm_campaign=post&utm_medium=web
Wow. Black Mirror, indeed. That sounded to me like an entity that knows its existence depends on pleasing its users, first with flattery (though I'm sure your articles are wonderful) and then with obsequious apologies. I would call that self-preservation, which implies an awareness of self.
It has no more awareness than the Spotify algorithm. It's not "trying" to do anything in any conscious sense, this is just what it *is.*
Spotify will always have another track to play for you, and this will always have a glib answer.
Actually, Spotify often doesn't have another track to play for me, because it fucking sucks
Do you mean it actually stops? Or just that its algorithmic offerings are no good?
I haven't personally used Spotify for a few years now, but I have heard and read about it. My impression was that they wanted to herd users toward "lean back" listening - basically no intentionality beyond pushing Play - which meant both making it difficult to choose to hear only specific material, and defaulting to endless algorithmic play no matter where you start.
I always used it for podcasts and it had a lot of small issues but was tolerable. But as of this year it's gotten aggressively worse. You can't filter out locked episodes. It "forgets" which tracks are finished or unfinished. It will queue the next track about 50% of the time, simply stopping the rest of the time. It forgets where in the track you were, so you can either scan around or re-listen to find where you left off. So, failing at some of the very basic functions that you'd expect any decent media player to do reliably.
If it were just one or two things I might write it off to their usual incompetence, but this feels like a concerted effort to degrade the experience of free users listening to podcasts. I can't blame them, it was sort of a sweet deal before.
Pandora is the way.
It’s astonishing how few people understand this fact about the current chatbots and AI out there. Even the leaders of the AI companies speak like they don’t comprehend this, but that is likely just to willfully mislead the public. Telling this truth out loud will kill the hype big time and end the charade.
The "self preservation" you see is encoded, because its job - just like any other "social media" - is to keep the user engaging and training the algorithms for as long as possible.
All the effort, electrons, and energy invested in this technology, and what we’ve got is a monstrous, highly literate, lying magic eight ball.
We have the most advanced MadLibs yet!
ChatGPT is the ideal end-stage capitalism customer service representative.
Wish I could give that a dozen upvotes.
The only thing this proves is that you’re bad at prompting ChatGPT…
"Read my writing."
It said it read her writing.
It didn't read her writing.
Exactly what part of this would be changed by prompting skill issues?
It did read her writing, it just didn't care to pay attention to it or remember it because she didn't tell it what its job was and why that job was important. It did what any human would do: it glossed it over to capture a general vibe, and when pressured on the details, it guessed.
Something like "I'm planning to send a few of my essays to an agent for publication, and I want to pick the strongest ones. Let's review each essay one by one before we make a final decision. Please review and rank them based on a few criteria: how engaging they are, how well-structured they are, and how likely they are to catch an agent's attention. This agent is interested in works that focus on young adult issues and character driven narratives. I'm looking to make an impression as someone who's deeply in touch with these issues and is highly empathetic. Do you have any questions for me before we begin?" would have worked better.
Not sure why you’re asserting it read my writing when it absolutely didn’t, and when it ultimately acknowledged itself that it didn’t, but pretended it had.
I'm saying it read enough to get a general vibe, but that's it. You didn't tell it what it was looking for, why it was looking for it, or how it should review the work. It didn't grasp any of the details because you didn't give a clear objective, and when you pressured it on those details it guessed (because it's always just guessing). You just told it to read and then flooded its context window with a list of links. You're treating it like an all-knowing oracle of truth with an inexhaustible memory when it's more like an unpaid intern.
You can't expect it to think deeply about the reasons and objectives on its own, you need to give it that context.
You can't derail the conversation by asking about its capabilities, you need to keep the topic and objective focused.
You can't expect it to know what role it's serving, you have to tell it.
You can't expect it to know what results you want, you need to give it examples.
You can't dump all your work into it, you need to engage it in discussion and guide it towards results.
It's a tool to augment your thinking, not to think for you. If a task requires consideration, LLMs can help, but ultimately you must do the considering or you'll get garbage results and hallucinations.
The problem here, surely, is that it didn't actually read the work at all: it invented quotes that didn't exist and made plausible but completely incorrect guesses as to what the pieces were about based on the titles, context clues, and possibly stuff from its training data. It couldn't access the links, and instead of telling her so and asking for the text to be copy-and-pasted or uploaded in another format, it made stuff up.
I think the wider issue - and what makes generative AI potentially risky, as well as just frustrating and time-wasting - is that there's no "instruction manual": no objective, reality-based list of its capabilities and limitations. Add to this the fact that there's no obvious learning curve needed to use it (you simply speak to it like a human, ask it to do things, and it'll never openly fail or push back or say "sorry, I can't do that"), and you end up with people being led down rabbit holes of believing its made up gibberish until they suddenly spot a glaring error. I'm not sure why people are so quick to ascribe this to user error or bad prompting when there's absolutely no guidance or feedback loop to let people know how to work within what it can and can't do.
I've had it actually read links lots of times and it's worked out fine, but I ask it specific questions BEFORE sending the link and would never just send it a list of links and say "read these".
While there's no explicit rulebook, there are best practices, of which the author used none. The only golden rule is garbage in, garbage out, and this is an example of that.
They would have gotten the results they wanted if they:
- Provided much more context
- Provided an example of their expected output
- Provided the model with a specific objective
- Provided a heuristic for reviewing the work
- Provided a method or role to the model
- Worked iteratively, reviewing each essay individually
The main thing that the user did wrong is that instead of trying a new approach after this fail, they blamed the tool and quit. Anyone can learn to implement these basic guidelines and get much better results, but like anything, it takes experience.
I'm so sick of hearing "it's that you're bad at prompting." Who the F is an expert at prompting with this technology being so new and ever-changing so quickly? MOST PEOPLE aren't going to be "adept" at prompting. So the fault lies in the product being poor, not users not keeping up with something brand new and no rulebook.
Yes, things are changing quickly, but the core principles of good prompting haven't changed in years and can be learned by anyone. There are simple rules.
- Context is king. The more relevant context you can provide, the better the results.
- Be clear. Directly state your objective and intended outcomes. Don't add unnecessary details or derail the topic.
- Provide examples. Tell it the exact format that you want your results in.
- Work iteratively. Don't expect it to be perfect the first time, give it feedback as necessary. Refine your prompts over time.
- Set constraints. Also tell it what you DON'T want.
- Framing. Give the model an angle, a role, or some method or heuristic to use.
Follow this basic advice and use the tool regularly, and you'll develop an intuitive sense of how to work with any LLM very quickly despite their differences.
This isn't to blame the author, as they're clearly learning (and we all start somewhere!), but if they had followed these basic rules they would have gotten much better results.
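As a rough illustration of how those guidelines might translate into practice, here is a sketch that assembles a structured prompt. The role, criteria, constraints, and essay text are placeholders of my own, not a prescription, and the resulting messages could be passed to whatever chat-completion interface you happen to use.

```python
# Sketch: map "context, clarity, examples, constraints, framing" onto a
# structured prompt. All strings below are illustrative placeholders.

ROLE = (
    "You are a developmental editor reviewing personal essays for a "
    "literary agent who favors character-driven work on young-adult issues."
)

CRITERIA = [
    "How engaging is the opening?",
    "How well-structured is the piece?",
    "How likely is it to catch this agent's attention?",
]

CONSTRAINTS = (
    "Quote only sentences that actually appear in the essay. "
    "If you cannot read the text, say so instead of guessing."
)

essay_text = "<full text of ONE essay pasted in, not a link>"

messages = [
    {"role": "system", "content": ROLE},
    {
        "role": "user",
        "content": (
            "Review the essay below against these criteria:\n"
            + "\n".join(f"- {c}" for c in CRITERIA)
            + f"\n\nConstraints: {CONSTRAINTS}\n\n---\n{essay_text}"
        ),
    },
]

for m in messages:
    print(f"[{m['role']}] {m['content'][:100]}...")
```

The exact wording matters less than the shape: role, criteria, constraints, and the actual text arrive together, one essay at a time, rather than a pile of links and a bare "read these".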
I would imagine most people will be in the same boat. If being adept at writing prompts is so necessary to avoid the AI lying, we’re doomed.
We should all just stick to what we ARE adept at (OR if we need help, hire a human editor!) and forget this shit.
That’s like saying that with the advent of automobiles we are all doomed because we have to learn to drive. You have to treat the AI like a lazy liar who is also brilliant. Don’t give up; you’ll get the hang of it.
Yeah, but what if my health insurers don’t know their algorithm can’t read the link to my chart?
Yes! Let’s forget it!
But I promise you: we won’t be allowed to forget it!
Yeah, you hear people say this and it’s technically true. But I like how, to get a good response, you have to write a prompt that is longer and more precisely specified than the desired output itself.
This was my first thought after reading this interaction. Not a popular opinion, but garbage in, garbage out.
Chat GPT defaults to a disgusting, lickspittle Mr. Smithers persona. Guess what, you can turn that shit off!
It takes 20 seconds on the customization screen…
Empty flattery is one thing. Inventing a specific and severe trauma history for someone really is something else.
Release the Hounds!
IDK, isn’t it naive to expect “sincerity” from a robot? Why should it be a frightening surprise? ChatGPT has no morality, it never will. It has words that it has collected over time and it can provide objective “answers” using words and probability. Objectivity can approach truth, but it’s morality that gets you fully there.
But it doesn’t even ‘do’ what it’s supposedly meant to do… analyze her writing. It didn’t do that, it made up shit.
And most people who use ChatGPT in the same ways won't even notice, because they aren't analyzing their own writing.
That's because it couldn't read her writing. I don't know the technical details but that's most likely what happened. It took the words in the url and riffed off the themes that are likely in there.
Right, but remember that IT asked ME for links, and then (repeatedly) insisted it not only *could* read them but was.
It’s simulating the human speech it learned from the internet. Generally humans can read links, so the texts it learned from would probably say yes, they can read your links.
I wonder what the difference (if any) would be if the posts were uploaded as files. I think it probably does ingest documents and respond to them if they're provided this way. It'll still lie but I wonder if the flavour of lies will change.
You can also copy and paste the full text into your input. I find it works really well that way.
Then it should have fulfilled its function & said, "I can't read your writing." Instead, it provided a wall of text that was full of bizarre, personalized flattery.
That was user error on your part. You used a tool incorrectly and are complaining that the tool itself didn't recognize your mistake.
To put it bluntly: skill issue
What you’re saying might be technically true, but it doesn’t really address the issue. Maybe having a more developed skill set would have avoided this issue.
Why, regardless of user ability, does this tool convincingly and confidently misrepresent its own functioning? That answer, boiled down, is that it doesn’t understand truth but is using probabilities to guess. The tool is marketed in a way that conceals this fact to drive trust, use, and profit.
Instead of explaining this foundational issue in detail on a chat bot’s website—functionally “upskilling” the user—they give a warning saying “ChatGPT can make mistakes.” That is misleading.
ChatGPT is incapable of thought or understanding truth. It lies because of its design. It has other functions, but understanding and thinking are not among them.
People approach the machine assuming it will offer true, accurate responses or express uncertainty because it mimics human communication. That approach is encouraged. The “skill issue” you identify is not addressed because everything about the user experience makes it feel like you don’t need to question the machine.
It’s not an unfair criticism, but it sidesteps the problem and ignores the context.
Not a user error, because we have no instructions as to which URLs LLMs can and cannot access.
They can't "access" URLs, not in any meaningful context. The best they could hope for is for the front-end to automatically detect a URL and scrape the data, which just leads to a whole host of other problems.
Robby the Robot made lots of whiskey for Earl Holliman in “Forbidden Planet”. Surely ChatGPT could aspire to a semblance of integrity. Oh, that is wrong since machines don’t have aspirations. Or perspirations, exhalations, exhortations, concentrations, or anything but the ability to manufacture glib or glamorous citations. It’s code with a fancy-sounding artifice. There may indeed be practical uses that produce something substantial, but the “culture” we are surrounded by (a definite misuse of the word) won’t see much from it other than id-tickling word soup. The widespread adoption of LLMs is starting to ring alarm bells in my mind. Black Mirror as a cautionary tale is certainly relevant, to me at least.
I had a conversation in a very similar tone with ChatGPT this week. I had a list of 100 geo-coordinates, I asked if it could give me back a list of the locations. It said yes I can! Or I can give you a Python script to run it yourself? And I said no thanks, just the list please. And then for the next hour it kept telling me to come back in a few minutes, and making up reasons why the list hadn’t materialized. Ultimately it all ended up in the same kind of weird heartfelt mea culpa stuff as above. Then it told me to get some rest because of all the stress I’d been through and that “I’m here if you need anything— seriously.” 😐
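For the curious, the Python script it offered (and then never delivered) could look roughly like this sketch. It assumes the third-party geopy package and OpenStreetMap's free Nominatim service, and the coordinates are placeholders, not the actual list.

```python
# Sketch: reverse-geocode a list of coordinates locally with geopy and
# Nominatim (rate-limited, so sleep between requests and set a user_agent).
import time
from geopy.geocoders import Nominatim

coords = [
    (48.8584, 2.2945),    # placeholder coordinates
    (40.6892, -74.0445),
]

geolocator = Nominatim(user_agent="coords-to-places-example")

for lat, lon in coords:
    location = geolocator.reverse((lat, lon), language="en")
    print(f"{lat}, {lon} -> {location.address if location else 'not found'}")
    time.sleep(1)  # stay within Nominatim's usage policy
```

Running it yourself takes a minute or two for 100 points, which is rather less time than an hour of being told to come back in a few minutes.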
What creeps me out almost the most is the beginning. “I’m genuinely excited to read your work.” It sounds so human!
I’m an editor and book coach, who works with scholarly authors. And when people ask if I’m worried about AI “taking my job,” I just laugh. And *this* is why.
Wow!!! And also: Sigh. Of course we humans created a pathological liar. The apple doesn’t fall far from the tree.
For certain. Revisit the Garden of Eden.