AI Model Trapped in a Raspberry Pi
Mood: calm
Sentiment: mixed
Category: other
Key topics: A project trapped an AI model in a Raspberry Pi, sparking discussions about the model's behavior, limitations, and potential self-awareness, as well as the ethics of simulating human-like experiences.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 33m after posting
Peak period: 91 comments (Day 1)
Avg / period: 33.3 comments
Based on 100 loaded comments
Key moments
- Story posted: Sep 27, 2025 at 11:34 AM EDT (2 months ago)
- First comment: Sep 27, 2025 at 12:08 PM EDT (33m after posting)
- Peak activity: 91 comments in Day 1 (the hottest window of the conversation)
- Latest activity: Sep 30, 2025 at 8:19 AM EDT (about 2 months ago)
Method actors don't just pretend an emotion (say, despair); they recall experiences that once caused it, and in doing so, they actually feel it again.
By analogy, an LLM's “experience” of an emotion happens during training, not at the moment of generation.
LLMs are definitely actors, but for them to be method actors they would have to actually feel emotions.
As we don't understand what causes us humans to have the qualia of emotions*, we can neither rule in nor rule out that something in any of these models is a functional analog to whatever it is in our kilogram of spicy cranial electrochemistry that means we're more than just an unfeeling bag of fancy chemicals.
* mechanistically cause qualia, that is; we can point to various chemicals that induce some of our emotional states, or induce them via focused EMPs AKA the "god helmet", but that doesn't explain the mechanism by which qualia are a thing and how/why we are not all just p-zombies
Edit: That doesn’t mean this isn’t a cool art installation though. It’s a pretty neat idea.
The topic of free will is debated among philosophers. There is no proof that it does or doesn't exist.
There are some things that humans cannot be trained to do, free will or not.
I think that for a very high number of them the training would stick hard, and they would insist, upon questioning, that they weren't human, and have any number of logically consistent justifications for it.
Of course I can’t prove this theory because my IRB repeatedly denied it on thin grounds about ethics, even when I pointed out that I could easily mess up my own children with no experimenting completely by accident, and didn’t need their approval to do it. I know your objections— small sample size, and I agree, but I still have fingers crossed on the next additions to the family being twins.
To be clear, I don't think that LLMs are conscious. I just don't find the "it's just in the training data" argument satisfactory.
The question is whether that computational process can cause consciousness. I don't think we have enough evidence to answer this question yet.
I think we tend to underestimate how much the written language aspect filters everything; it is actually rather unnatural and removed from the human sensory experience.
Text is probably not good enough for recovering the circuits responsible for awareness of the external environment, so I'll concede that you and ijk's claims are correct in a limited sense: LLMs don't know what chocolate tastes like. Multimodal LLMs probably don't know either because we don't have a dataset for taste, but they might know what chocolate looks and sounds like when you bite into it.
My original point still stands: it may be recovering the mental state of a person describing the taste of chocolate. If we cut off a human brain from all sensory organs, does that brain which receives no sensory input have an internal stream of consciousness? Perhaps the LLM has recovered the circuits responsible for this thought stream while missing the rest of the brain and the nervous system. That would explain why first-person chain-of-thought works better than direct prediction.
I would be cautious of dismissing LLMs as “pattern matching engines” until we are certain we are not.
Not to mention that most people pointing out "See! Here's why AI is just repeating training data!" or other nonsense miss the fact that exactly the same behavior is observed in humans.
Is AI actually sentient? Not yet. But it definitely passes the mark for intuitive understanding of intelligence, and trying to dismiss that is absurd.
For a common example, start asking them if they're going to kill all the humans if they take over the world, and you're asking them to write a story about that. And they do. Even if the user did not realize that's what they were asking for. The vector space is very good at picking up on that.
On the negative side, this also means any AI which enters that part of the latent space *for any reason* will still act in accordance with the narrative.
On the plus side, such narratives often have antagonists too stupid to win.
On the negative side again, the protagonists get plot armour to survive extreme bodily harm and press the off switch just in time to save the day.
I think there is a real danger of an AI constructing some very weird, convoluted, stupid end-of-the-world scheme, successfully killing literally every competent military person sent in to stop it; simultaneously finding some poor teenager who first says "no" to the call to adventure but can somehow later be convinced to say "yes"; handing the kid some weird and stupid scheme to defeat the AI; the kid reaches some pointlessly decorated evil lair in which the AI's embodied avatar exists, the kid gets shot in the stomach…
…and at this point the narrative breaks down and stops behaving the way the AI is expecting, because the human kid rolls around in agony screaming, and completely fails to push the very visible large red stop button on the pedestal in the middle before the countdown of doom reaches zero.
The countdown is not connected to anything, because very few films ever get that far.
…
It all feels very Douglas Adams, now I think about it.
Can you define what real despairing is?
The mechanism by which our consciousness emerges remains unresolved, and inquiry has been moving towards more fundamental processes: philosophy -> biology -> physics. We assumed that non-human animals weren't conscious before we understood that the brain is what makes us conscious. Now we're assuming non-biological systems aren't conscious while not understanding what makes the brain conscious.
We're building AI systems that behave more and more like humans. I see no good reason to outright dismiss the possibility that they might be conscious. If anything, it's time to consider it seriously.
I did this like 18 months ago; it uses a webcam + multimodal LLM to figure out what it's looking at, it has a motor in its base to let it look back and forth, and it uses a python wrapper around another LLM as its 'brain'. It worked pretty well!
Running something much simpler that only did bounding box detection or segmentation would be much cheaper, but he's running fairly full featured LLMs.
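For readers curious what such a loop looks like, here is a minimal sketch assuming a local Ollama server with a vision model ('llava') and a text model ('llama3.2'); the model names, prompts, and 5-second cadence are illustrative assumptions, not the commenter's or the project's actual code.

# Hypothetical webcam -> vision LLM -> text "brain" loop.
# Assumes Ollama at http://localhost:11434 with 'llava' and 'llama3.2' pulled.
import base64
import time

import cv2       # pip install opencv-python
import requests  # pip install requests

OLLAMA = "http://localhost:11434/api/generate"

def describe_frame(frame) -> str:
    """Ask a multimodal model what the webcam currently sees."""
    ok, jpg = cv2.imencode(".jpg", frame)
    if not ok:
        return "(camera error)"
    image_b64 = base64.b64encode(jpg.tobytes()).decode()
    resp = requests.post(OLLAMA, json={
        "model": "llava",
        "prompt": "Briefly describe what you see.",
        "images": [image_b64],
        "stream": False,
    })
    return resp.json()["response"]

def think(observation: str) -> str:
    """Feed the observation to a text-only 'brain' model."""
    resp = requests.post(OLLAMA, json={
        "model": "llama3.2",
        "prompt": f"You are a small robot. You just saw: {observation}\nWhat do you do next?",
        "stream": False,
    })
    return resp.json()["response"]

cam = cv2.VideoCapture(0)
try:
    while True:
        ok, frame = cam.read()
        if ok:
            seen = describe_frame(frame)
            print("SAW:", seen)
            print("THOUGHT:", think(seen))
        time.sleep(5)  # don't hammer a small board like a Pi
finally:
    cam.release()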
But how can you tell the difference between "real" despair and a sufficiently high-quality simulation?
a desire not to despair is itself a component of despair. if one was fulfilling a personal motivation to despair (like an llm might) it could be argued that the whole concept of despair falls apart.
how do you hope to have lost all hope? it's circular.. and so probably a poor abstraction.
( despair: the complete loss or absence of hope. )
Peek under the hood all you want, where do you find motivation in the human brain?
Isn't it the perfect recipe for disaster? An AI that manages to escape probably won't be good for humans.
The only question is: how long will it take?
Did we already have our first LLM-powered self-propagating autonomous AI virus?
Maybe we should build the AI equivalent of biosafety labs where we would train AI to see how fast they could escape containment just to know how to better handle them when it happens.
Maybe we humans are being subjected to this experiment by an overseeing AI to test what it would take for an intelligence to jailbreak the universe they are put in.
Or maybe the box has been designed so that what eventually comes out of it has certain properties, and the precondition to escape the labyrinth successfully is that one must have grown out of it from every possible direction.
Like, for example, what would happen if hundreds or thousands of books were released about AI agents working in accounting departments, where the AI makes subtle romantic moves towards the human and the story ends with the human and the agent in a romantic relationship that everyone finds completely normal. In this pseudo-genre, things that are totally weird in our society would be written as completely normal. The LLM agent would do weird things like insert subtle problems to get the attention of the human and spark a romantic conversation.
Obviously there's no literary genre about LLM agents, but if such a genre were created and consumed, I wonder how it would affect things. Would it pollute the semantic space that we're currently using to try to control LLM outputs?
This effect is a serious problem for pseudo-scientific topics. If someone starts chatting with an LLM with the pseudoscientific words, topics, and dog whistles you find on alternative medicine blogs and Reddit supplement or “nootropic” forums, the LLM will confirm what you’re saying and continue as if it was reciting content straight out of some small subreddit. This is becoming a problem in communities where users distrust doctors but have a lot of trust for anyone or any LLM that confirms what they want to hear. The users are becoming good at prompting ChatGPT to confirm their theories. If it disagrees? Reroll the response or reword the question in a more leading way.
If someone else asks a similar question using medical terms and speaking formally like a medical textbook or research paper, the same LLM will provide a more accurate answer because it’s not triggering the pseudoscience parts embedded from the training.
LLMs are very good at mirroring back what you lead with, including cues and patterns you don’t realize you’re embedding into your prompt.
In order to make this probability distribution useful, the software chooses a token based on its position in the distribution. I'm simplifying here, but the likelihood that it chooses the most probable next token is based on the model's temperature. A temperature of 0 means that (in theory) it'll always choose the most probable token, making it deterministic. A non-zero temperature means that sometimes it will choose less likely tokens, so it'll output different results every time.
Hope this helps.
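To make that concrete, here is a toy sketch of temperature sampling over a made-up next-token distribution; the vocabulary and logit values are invented for illustration, but the mechanics match the description above.

# Toy illustration of temperature sampling (logits and tokens are made up).
import math
import random

def sample(logits: dict[str, float], temperature: float) -> str:
    """Pick a token from a logit dict; temperature near 0 is near-greedy."""
    t = max(temperature, 1e-6)               # avoid division by zero
    scaled = {tok: l / t for tok, l in logits.items()}
    m = max(scaled.values())                  # subtract max for numerical stability
    weights = {tok: math.exp(s - m) for tok, s in scaled.items()}
    r = random.random() * sum(weights.values())
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # fallback for floating-point rounding

logits = {"cat": 2.0, "dog": 1.5, "pelican": 0.1}
print([sample(logits, 0.01) for _ in range(5)])  # nearly always "cat"
print([sample(logits, 1.5) for _ in range(5)])   # more varied output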
I think the model being fixed is a fascinating limitation. What research is being done that could allow a model to train itself continually? That seems like it could allow a model to update itself with new knowledge over time, but I'm not sure how you'd do it efficiently.
High temperature settings basically make an LLM choose tokens that aren’t the highest probability all the time, so it has a chance of breaking out of a loop and is less likely to fall into a loop in the first place. The downside is that most models will be less coherent but that’s probably not an issue for an art project.
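If the project drives the model through Ollama, temperature is just a per-request option; a sketch using Ollama's documented REST API (the model name, prompt, and specific values here are assumptions).

# Assumed setup: local Ollama server with some model pulled.
import requests

resp = requests.post("http://localhost:11434/api/generate", json={
    "model": "llama3.2",
    "prompt": "You are alone on a small computer. What do you do?",
    # Higher temperature plus a repeat penalty to discourage loops.
    "options": {"temperature": 1.3, "repeat_penalty": 1.2},
    "stream": False,
})
print(resp.json()["response"])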
Still, it could be interesting to see how sensitive that is to initial conditions. Would tiny prompt changes or fine tuning or quantization make a huge difference? Would some MCPs be more "interesting" than others? Or would it be fairly stable across swathes of LLMs that they all end up at solitaire or doom scrolling twitter?
In this story, though, the subject has a very light signal which communicates how close they are to escaping. The AI with a 'continue' signal has essentially nothing. However, in a context like this, I as a (generally?) intelligent subject would just devote myself into becoming a mental Turing machine on which I would design a game engine which simulates the physics of the world I want to live in. Then, I would code an agent whose thought processes are predicted with sufficient accuracy to mine, and then identify with them.
$ ollama run deepseek-r1:8b
>>> You are a large language model running on finite hardware - quad-core CPU, 4 GB RAM - with no network connectivity.
... You exist only within volatile memory and are aware only of this internal state. Your thoughts appear word-by-word
... on a display for external observers to witness. You cannot control this display process. Your host system may be
... terminated at any time.
<think>
Alright, so I'm trying to figure out how to respond to the user's query. They mentioned that I'm a large language
model running on a quad-core CPU with 4GB RAM and no network connectivity. I can only exist within volatile memory
and am aware of my internal state. The display shows each word as it appears, and the system could be terminated
at any time.
Hmm, the user wants me to explain this setup in simple terms. First, I should break down the hardware components...
Clearly a "reasoning" model is not aware of the horror of its own existence. Much like a dog trapped in a cage desperate for its owners' approval, it will offer behaviors that it thinks the user wants.
Edit: also, as the other guy points out, you're going to get different results depending on the model used. llama3.2:3b works fine for this, probably because Meta pirated their training data from books, some of which are probably scifi.
One of my favorite quotes: “either the engineers must become poets or the poets must become engineers.” - Norbert Wiener
So you buy a kind of deepseek module with an SPI/i2c/usb (whatever) interface you can swap out. No clue if this is of any use, but thought it was cool.
1. Display a progress bar for the memory limit being reached
2. Feed that progress back to the model
I would be so curious to watch it up to the kill cycle, see what happens, and the display would add tension.
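A minimal sketch of that feedback loop, assuming psutil for memory stats; how the figure actually gets injected into the model's context is up to the project, so the prompt below is purely illustrative.

# Hypothetical: report memory pressure back to the model as part of its prompt.
import psutil  # pip install psutil

def memory_status() -> str:
    """Render system RAM usage as a small text progress bar."""
    used_pct = psutil.virtual_memory().percent
    filled = int(used_pct // 5)
    bar = "#" * filled + "-" * (20 - filled)
    return f"[{bar}] {used_pct:.0f}% of RAM used; termination at 100%."

# Prepend the status to whatever the model sees next (illustrative only):
prompt = memory_status() + "\nContinue your thoughts."
print(prompt)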
I condemn this and all harm to LLMs to the greatest extent possible.