How Memory Gets Made
Your brain sorts each day while you sleep, and whatever it decides to keep is who you become. A memory layer worth building is a companion to that process, not a replacement for it.
Have you ever considered how a computer representation of your brain would work? This is the crudest starting point I can imagine, from someone who has read a few of the foundational books on memory and watched his own brain do something strange in front of a restaurant in Milan last week.
Cristina and I were on a street we had not been on in fourteen years, and I stopped on the pavement in front of a pizzeria called Napoli 1820 because I recognised the door (I could not have told you why at the time). She kept walking. When I called her back, she looked at the building, looked at the name above it, and said she had no memory of the place. We ate there once in 2012 on a trip neither of us has thought about much since (or so I assumed).
This matters because of a problem I left unresolved in POST_06, The Model Trap. I am building a system that is supposed to free a person from one kind of calcification, the kind where a model's understanding of you stops keeping up with who you are now, and I do not yet know how to prevent the system from creating the same calcification at the layer underneath. Calcification itself is not the enemy. Some things should calcify the moment they happen and never be touched again. Other things should never calcify at all. The work is figuring out which things belong in which category, and the brain figures it out automatically while you sleep.
Cristina and I had stopped paying attention to the 2012 dinner the day we left Milan, and the difference between what each of our brains did with it in the fourteen years since is the whole question this post is about.
I do not have a better memory than Cristina. The dinner attached to something in my head in 2012 that nothing in her head was holding a slot for, and what I am calling a slot is the modern colloquial version of what Frederic Bartlett called a schema in his 1932 book Remembering[2]. Bartlett's participants were given a Native American folk tale called "The War of the Ghosts" to read, and when asked to recall it later they reshaped it to fit their own existing expectations of what a story looked like, dropping the parts that did not fit and inventing bridges where the original had none. The material that had nowhere to attach either distorted to fit or failed to encode at all.
That is the mechanism that decided the fate of the 2012 dinner. Cristina's brain did not actively prune the evening. The evening never fully encoded in the first place, because nothing in her existing schema was holding a slot for what made it memorable. Mine had a slot. The evening attached to it, the slot kept the memory alive, and walking past the building fourteen years later triggered the retrieval before I had time to think about it. Endel Tulving called this encoding specificity: a memory is most retrievable when the cues at recall match the cues at encoding[15]. The building was the cue at encoding, and the building was the cue at retrieval, which is why standing in front of it on 2 April was enough to surface a dinner I had not thought about in years.
The reason my brain had a slot for Napoli 1820 in the first place was what happened at the next table that night. We each ordered a pizza and were full by the end of it. The man eating alone next to us worked his way through a portion of pasta, a full pizza the size of the ones we were eating, a burrata roughly the size of a toddler's head that he ate with a spoon, and closed the meal with an espresso and an amaro. He did the whole thing in forty minutes without any visible effort. I went home that night and the building was attached, in my head, to him.
Nothing happened to that memory for four years. I did not think about Napoli 1820, I did not write about absurd Italian eaters, and the dinner sat in my head as one of thousands of dinners I would have forgotten within a decade. Then in 2016 I was at the Bavarian Beerhouse in the City of London with friends, and one of them worked his way through two two-pint steins and then finished a three-pint boot on top of them. Seven pints in a single sitting, alongside four huge sausages and what looked like half a kilogram of sauerkraut. He was still fine at the end of it. Something in the back of my head faintly connected him to the man in Milan, and the connection got filed without going anywhere.
In 2017 I met two people at a StarCraft 2 esports event in London and we went to Asadal in Holborn for Korean BBQ the next day. One of them ordered five portions of pork belly and worked through them methodically and calmly. I noticed the quantity, I noticed the calm, and I filed the moment the same way I had filed the Bavarian Beerhouse scene the year before, without any conscious link to the others, but somewhere in the back of my head, something was keeping count.
In March 2026 I was at Dial Arch in Woolwich with my D&D group, and Toran slowly and methodically worked his way through a sharing portion of wings on his own and ended the evening with a pile of bones in front of him that was the visual punchline of the meal. I noticed it in the moment, I thought about it for a few seconds, and then it went wherever the Bavarian Beerhouse note and the Asadal note had gone. Still unconscious, still filed alongside the others without anything explicit linking them in my head. The cluster was warm but I was not aware of it.
Which is why, on 2 April, I stopped in front of the building and Cristina did not. I had not noticed I was collecting these moments until I was standing in front of the restaurant with her. The pattern was in my head for fourteen years without ever being conscious until the building acted as the retrieval cue, which is exactly what Tulving's encoding specificity principle predicts. The cluster cannot wait for the user to ask about it. It has to surface when something in the environment, whether a building or a smell or a photograph, matches a cue that was present at encoding. Every time I had thought about any member of the cluster the whole neighbourhood got touched, and the touch reinforced it, which is the study-phase retrieval account of the spacing effect and a robust finding in memory research going back to Ebbinghaus[6]. The cluster's age is the age of its most recent member, not the average age of its members, which is how a fourteen-year-old memory can be more accessible than a fourteen-week-old one that belongs to no neighbourhood at all.
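That last rule is concrete enough to sketch. In a hypothetical Python shape (the `Memory` structure and field names are mine, not a spec), a cluster's effective age comes from its newest member, and touching any member refreshes the whole neighbourhood:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Memory:
    text: str
    last_touched: datetime  # updated on every retrieval

def cluster_age_days(members: list[Memory], now: datetime) -> float:
    """A cluster is as old as its MOST RECENT member, not the average:
    one fresh retrieval keeps the whole neighbourhood accessible."""
    newest = max(m.last_touched for m in members)
    return (now - newest).total_seconds() / 86400

def touch(memory: Memory, now: datetime) -> None:
    """Retrieving any member resets the cluster's effective age."""
    memory.last_touched = now
```

Under this rule the 2012 dinner, refreshed by the March 2026 wings evening, is eighteen days old on 2 April, while an unclustered note from last autumn is months old.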
Clustering is how most memories survive, but it is not the only mechanism the brain uses. When I was 18, a close friend of mine called Bogdan died in a car accident. I remember the wake, the funeral, and the days afterwards spent with other friends of his. The way I think about this memory is as an anchor, a core memory, something life-defining that does not need a neighbourhood to justify its survival. Neuroscience has a different name for it, flashbulb memory. Brown and Kulik coined the term in 1977[3], and subsequent research has shown that flashbulb memories feel more vivid and more certain than ordinary memories while not actually being more accurate in their specific details[11][13]. What the science says is preserved is the felt certainty, with the details drifting like any other memory. What I know is preserved, from the inside, is that the memory is load-bearing for who I am now, and the accuracy of the room and the order of events matters less than the weight.
One of the people I met for the first time at Bogdan's wake was Mircea. He and I see each other maybe once a decade. We are not close friends in any practical sense. But I will always treat him as one, because we share a referent from that week that neither of us has to explain to the other. That is what an anchor memory does, it becomes load-bearing for things that get built on top of it, even things as thin as a friendship maintained by a single shared week fifteen years ago.
A system replicating the brain has to handle both mechanisms, the slow one where weight accumulates from the company a memory keeps, and the immediate one where a single moment carries weight from the first instant. Anchor memories can be happy or sad, recent or decades old, but they are all things the user finds life-defining, and the user is the only person who can mark them. They will still be clustered and aggregated the same way everything else is, but their base weight starts high and stays high regardless of whether anything in the system is related to them.
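As a sketch of that weighting rule, with illustrative names and an arbitrary 0.9 floor (neither is a spec), an anchor's weight never decays below its floor while everything else earns weight from the company it keeps:

```python
def memory_weight(base: float, cluster_bonus: float, is_anchor: bool,
                  anchor_floor: float = 0.9) -> float:
    """Anchors are user-marked: their weight starts high and never
    drops below the floor, regardless of cluster activity.
    Ordinary memories accumulate weight from their neighbourhood."""
    weight = base + cluster_bonus
    if is_anchor:
        weight = max(weight, anchor_floor)
    return min(weight, 1.0)  # clamp to a 0..1 scale
```

An anchor with no cluster around it still sits at 0.9; an ordinary memory with the same base needs its neighbourhood to climb anywhere near that.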
No system can infer which of your memories are anchors. The model can guess, the clustering can spot correlations, but the final call has to come from the person whose life it is, or the system becomes the thing deciding what mattered.
Try to remember what you ate for lunch yesterday. You probably can. Try to remember what you ate for lunch on the second Tuesday of last month. You probably cannot. Try to remember what you ate on the second Tuesday of October 2019, and the question is almost insulting. The detail is gone and even the framework you would need to find the detail is gone.
But you do remember you went on holiday in 2019. You remember roughly when and roughly where, even if the specific days have collapsed into a single shape labelled "the trip," and if something important happened on that trip you remember that. The brain holds yesterday in detail, last week in summary, last year in chunks, and ten years ago as a few crystallised moments, degrading the resolution continuously through the consolidation work that runs while you sleep[8]. Conway and Pleydell-Pearce described the resulting structure as a three-level hierarchy in autobiographical memory, from lifetime periods down to general events down to event-specific knowledge, collapsing upward as the years pass[5]. The Tuesday of October 2019 is gone because your brain made the correct call that you were not going to need it.
This is the part any system you build has the most freedom to depart from. The brain compresses progressively because biological memory is metabolically expensive (it costs calories to maintain synapses), and the synaptic homeostasis hypothesis argues that sleep itself exists in large part to downscale synaptic weights that would otherwise saturate the system[14]. A system on your own hardware is not under that constraint. The shape of the brain's retrieval pattern is worth copying, with high resolution recently, low resolution historically, and automatic background consolidation, while the underlying reason for throwing the originals away is biological and not one a system you build needs to inherit. Aggregating the past into summaries does not require destroying the past. The originals can stay on disk, accessible when you specifically ask for them, while the normal retrieval path operates against the consolidated view.
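A minimal sketch of that split, with hypothetical names: the consolidated summary is what the normal retrieval path sees, while the original stays untouched on disk until someone explicitly asks for it.

```python
import json
from pathlib import Path

class MemoryStore:
    """Consolidated view for normal retrieval; originals kept verbatim."""

    def __init__(self, root: Path):
        self.root = root
        (root / "originals").mkdir(parents=True, exist_ok=True)
        self.summaries: dict[str, str] = {}  # id -> consolidated summary

    def ingest(self, mem_id: str, original: str, summary: str) -> None:
        # The original never moves or changes; only the summary is indexed.
        path = self.root / "originals" / f"{mem_id}.json"
        path.write_text(json.dumps({"id": mem_id, "text": original}))
        self.summaries[mem_id] = summary

    def recall(self, mem_id: str) -> str:
        """Normal path: the low-resolution consolidated view."""
        return self.summaries[mem_id]

    def recall_original(self, mem_id: str) -> str:
        """Explicit request: read the untouched original from disk."""
        path = self.root / "originals" / f"{mem_id}.json"
        return json.loads(path.read_text())["text"]
```

The brain cannot offer `recall_original`; a system on your own hardware can, which is exactly the constraint it does not need to inherit.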
The brain forgets because it has to. A system you build does not have to forget the same way, and the parts of the brain's behaviour worth copying are the ones that came from the brain working with what mattered, not the ones that came from biological constraint.
My intuition and the science diverge most sharply here. The intuitive model says the original of a memory stays fixed while the interpretation layer on top of it changes. Your past does not change, only the map of your past does. This is not how the brain works, and the reason it is not how the brain works is one of the most interesting things about it.
Nader, Schafe and LeDoux showed in 2000 that every time a memory is retrieved it becomes labile and has to be re-stabilised through a new round of protein synthesis[10]. During the window between retrieval and re-stabilisation, the memory can be modified by whatever context you are in now, and the modifications persist. The technical name for this is memory reconsolidation, and it is why your most confident memories are often the most distorted. It is why eyewitness testimony is unreliable, something Elizabeth Loftus has been demonstrating since the 1970s[9]. It is why trauma therapy can work: you deliberately destabilise the traumatic memory during recall, pair it with new safety information, and let it reconsolidate with the new information attached[1]. More recent comprehensive reviews have mapped the boundary conditions that determine when reconsolidation succeeds or fails as a clinical intervention[7]. The brain does not have a read-only mode. Every retrieval is a potential rewrite, and that is the cost of an architecture where the same circuitry that stores memories is the circuitry that uses them.
The brain's lack of perfect recall is almost certainly an advantage. It is what lets you generalise across experiences, update your self-concept when the evidence changes, and let go of things that would otherwise keep hurting. Perfect recall is not what evolution was optimising for, and the reason is probably that a creature with perfect recall would be worse at being a creature. Daniel Schacter made this argument about the whole category of memory "bugs" in The Seven Sins of Memory[12], and he was right. The goal is not to replace the brain. The goal is to give your evolved brain, which does not have perfect recall and should not want it, a memory layer that remembers events from your digital footprint precisely when precise recall is what the situation calls for.
This is also where I have to stop pretending the memory is text. The Napoli 1820 memory is not a paragraph I wrote in 2012, it is a visual of the man with the spoon in the burrata, the sound of the room, the smell of the pizza I had just finished, and the feeling of being full and impressed at the same time. The paragraph I wrote afterwards was a handle on the rest of it, not a replacement. Which is why I am building LocalGhost as a fleet rather than a single daemon. ghost.noted handles text, because text is the most compressed and most searchable projection of a memory; ghost.framed will extend the fleet to images, extracting text from photos and screenshots while preserving the image itself as the original; ghost.voiced will extend it to voice, transcribing recordings while preserving the audio. The daemons all write into the same memory layer, the originals genuinely do not move regardless of format, and the observations this post has named hold across all of them.
Architecture is only half the problem. The other half is how the user experiences the system, and the place to start is with the decisions the system can make on its own. Most of what flows through ghost.noted does not need to calcify. A grocery list, a half-finished thought, a note about a meeting that went nowhere, a screenshot of an article I might read later. The correct thing for the system to do is hold it at full resolution for a while, let it compress into outline-level detail, and let it fade. The system can make those calls on its own most of the time, using frequency of touch and cluster membership and proximity to an anchor as signals, and the calls are right often enough that you do not need to be involved.
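Those three signals are enough to sketch the triage. The thresholds below are illustrative guesses, not tuned values; the point is that the signals usually agree, and "ask" is reserved for when they do not:

```python
def triage(touch_count: int, in_cluster: bool, near_anchor: bool) -> str:
    """Autonomous call when the signals agree; 'ask' when they don't."""
    score = touch_count + (2 if in_cluster else 0) + (3 if near_anchor else 0)
    if score >= 4:
        return "keep"      # hold at full resolution
    if score == 0:
        return "fade"      # compress, then let it go
    if score <= 2:
        return "compress"  # drop to outline-level detail
    return "ask"           # ambiguous: queue it for the user
```

A grocery list never gets touched, belongs to no cluster, and sits near no anchor, so it fades without bothering anyone.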
When the system cannot decide on its own it has to ask, and the evidence that this pattern works is already running on millions of phones. My wife Cristina opens Google Photos most days and looks at the "on this day" summary it shows her, which takes pictures from the same date in previous years and builds a small narrative out of them. It brings her joy. I enjoy looking through them with her, and the version of our own life I get back from Google Photos is structured in a way neither of us asked for but both of us recognise. The feature works because it is optional, it waits for you to open the app, it is shaped like a small moment of reflection, and it lands when you are already looking at the app for something else. The problem with it is that it belongs to Google, the pictures are on Google's servers, the narrative is constructed by Google's algorithm, and the version of you the summary tells a story about is the version Google's model thinks is interesting.
A closer reference point for what I want is the Samsung Health daily mood prompt at 22:00, where you pick an icon for the day, select a few tags from a predefined list, and optionally leave a note. The UX is right. It is low-friction, it waits for you, it is optional, and the act of filling it in is a small moment of reflection on a day that is already over. The limitation is the predefined tag list. Samsung has to ship the same list to every user because classical clustering works on enumerable categories, and "stressful because the dog was off his food" is not an enumerable category. You are forced to pick "stressful" or "anxious" or whichever bucket is closest, and the specific thing that made the day what it was gets lost in the rounding.
This is the exact problem vector databases are for. A vector database does not cluster on category buckets, it clusters on the semantic shape of what you wrote, which means "stressful because the dog was off his food" and "worried because my parents' dog had the same look last month" end up close together in the index without either of them sharing a predefined tag. The user never has to pick from a list, the tags can be as specific and personal as the day actually was, and the system can still find the patterns that connect one day to another. Samsung Health could not do this when the app was built because vector databases were not a default building block yet, and they are now cheap enough to run on a local box (barely, but enough).
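The mechanism can be shown with a toy stand-in. A real system would embed with a local sentence-embedding model; the bag-of-words vector below is a crude proxy for that, but the interface and the cosine step are the same, and even the crude version makes the point that free-text tags cluster by shared content rather than by membership in a predefined list:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in: bag-of-words counts. A real system would call a
    local sentence-embedding model here; the interface is identical."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def nearest(query: str, tags: list[str]) -> str:
    """Free-text tags cluster by semantic shape, not by a fixed list."""
    return max(tags, key=lambda t: cosine(embed(query), embed(t)))
```

No shared tag, no predefined bucket, and the dog-related days still land next to each other in the index.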
The queue inside LocalGhost is what falls out of combining those three references. It looks like the Samsung prompt, it waits like the Google Photos summary, and it uses vector retrieval to ask questions that are actually about your day rather than about the buckets somebody else decided were the shape of a day. ghost.framed asks which of the hundred photos from yesterday matter. ghost.synthd asks whether a correlation between two notes you wrote three weeks apart is meaningful. ghost.shadowd asks whether a pattern across five notes is a cluster worth naming or a coincidence. Each question has three buttons, yes, no, and don't ask again.
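The queue itself is a small data structure. A sketch with hypothetical names, where "don't ask again" mutes the whole class of question rather than just the one item:

```python
from dataclasses import dataclass
from enum import Enum

class Answer(Enum):
    YES = "yes"
    NO = "no"
    NEVER = "don't ask again"

@dataclass
class QueueItem:
    source_daemon: str       # e.g. "ghost.framed"
    question: str
    memory_ids: list[str]    # the memories the question is about

class ReviewQueue:
    """Pull-only: items wait until the user opens the queue."""

    def __init__(self):
        self.items: list[QueueItem] = []
        self.muted: set[str] = set()  # question kinds the user silenced

    def ask(self, item: QueueItem, kind: str) -> None:
        if kind not in self.muted:
            self.items.append(item)

    def answer(self, item: QueueItem, ans: Answer, kind: str) -> None:
        self.items.remove(item)
        if ans is Answer.NEVER:
            self.muted.add(kind)  # this class of question never returns
```

Nothing in it pushes; the daemons can only append, and the user decides when to look.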
You open the queue when you want the reflection, not when the system wants the engagement. Every other memory product in this category exists to maximise the second thing. LocalGhost has to refuse to.
The queue is the pull mechanism, the system asks, the user answers. The push mechanism is that the user has to be able to open the system at any time, look at any memory directly, and change how it has been classified. Any note, any cluster, any anchor, any summary, a user should be able to open it, see how the system has categorised it and why, reclassify it, correct the tags, edit the interpretation layer, or delete the memory entirely. The queue handles the cases where the system does not know what to do. The manual override handles the cases where the system thinks it knows and is wrong, and the two together are the only way the user's standard stays the standard the system prunes against.
Deletion is the case with operational consequences. When the user deletes an original, whether a photo, a note, or an audio file, the system cannot simply drop the source and leave the interpretation layer standing. Clusters that depended on the deleted member have to re-form without it. Summaries that cited it have to be regenerated. The graph has to be walked and rewritten. This cannot happen instantly because it is expensive, and it cannot be optional because leaving stale references to a deleted source is a privacy failure. The correct behaviour is a slow background restructuring the same way the sleep consolidation pass rewrites the map overnight.
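A sketch of that cascade, with hypothetical structures: the original drops immediately, and the expensive rewrites are queued as jobs for the background consolidation pass rather than run inline.

```python
def delete_original(mem_id: str,
                    originals: dict,
                    clusters: list[dict],
                    summary_citations: dict[str, set],
                    jobs: list) -> None:
    """Drop the source now; queue the graph rewrite for the background
    pass. Stale references to a deleted original are a privacy failure,
    so the rewrite is deferred but never optional."""
    originals.pop(mem_id, None)
    for cluster in clusters:
        if mem_id in cluster["members"]:
            cluster["members"].remove(mem_id)
            jobs.append(("recluster", cluster["id"]))    # re-form without it
    for summary_id, cited in summary_citations.items():
        if mem_id in cited:
            cited.discard(mem_id)
            jobs.append(("resummarise", summary_id))     # regenerate later
```

The jobs list is what the overnight consolidation worker drains, the same slot in the schedule the sleep metaphor already claimed.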
The landscape has three rough categories and none of them is doing what this post is describing. Capture tools like Rewind (now rebranded as Limitless) and its open-source alternative Screenpipe record everything you see and hear and make the result searchable, which solves a real problem but not the one I am trying to solve, because a perfect transcript of every meeting I have been in does not tell me which meetings shaped how I think. Memory layers for AI agents like mem0[4], Neocortex, Zep, and Letta implement clustering and relevance scoring in an architecture that is closer to what this post wants, but the memory is owned by the application and runs wherever the application runs, which is almost always someone else's cloud. Networked note apps like Reflect, Obsidian with AI plugins, Notion, and Personal.ai are fast and polished, but they require you to write the note first and they do not do background consolidation on content you have not already decided to care about.
What I am describing is a fourth category that does not currently exist as a shipping product, which is a personal memory layer running on your own hardware, handling every modality a memory arrives in, making calls about what to keep on its own most of the time, asking you when it cannot, letting you reach in and override when you need to, and belonging to you rather than to the application that happens to hold state about you. The category might be missing because the problem is not important enough to build a business around, or because I am wrong about what memory software needs to commit to, or because the right answer is a combination of existing tools I have not recognised as the solution yet (I doubt it, but I should be honest about the possibility). I am going to find out which by building it.
The reason this category does not exist is probably that nobody can figure out how to turn it into a subscription. A thing you run on your own box, that belongs to you, and that refuses to optimise for engagement is structurally hostile to the business model every adjacent product has converged on.
The question I am left with is not whether to reproduce the brain or improve on it, because the brain is not broken and does not need improving. The question is which of the brain's behaviours a memory layer should copy and which it should leave to biology. A memory layer that calcifies the wrong things is a failure. A memory layer that tries to remember everything the brain already forgets for good reasons is also a failure. The layer has to know when calcification is the right call and when it is not, and it has to make that distinction on its own most of the time, because asking the user about every decision is how you turn a useful tool into homework.
I do not have the full answer yet, and the next post in this series will not be an answer either. What I have is a starting point. Copy the sorting, copy the clustering, copy the progressive compression, copy the way the brain uses ambient cues to surface the right memory at the right moment. Let the brain do the generalising, the updating, and the letting go. Build the rest.
[1] Alberini, C. M. (2011). The role of reconsolidation and the dynamic process of long-term memory formation and storage. Frontiers in Behavioral Neuroscience, 5, 12. doi.org/10.3389/fnbeh.2011.00012
[2] Bartlett, F. C. (1932). Remembering: A Study in Experimental and Social Psychology. Cambridge University Press. The foundational schema theory text, including the War of the Ghosts experiments.
[3] Brown, R., & Kulik, J. (1977). Flashbulb memories. Cognition, 5(1), 73-99. The original paper coining the term "flashbulb memory" for vivid, high-confidence recollections of emotionally significant events. doi.org/10.1016/0010-0277(77)90018-X
[4] Chhikara, P., Khant, D., Aryan, S., Singh, T., & Yadav, D. (2025). Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory. arXiv preprint arXiv:2504.19413. The technical paper behind the mem0 memory layer for LLM agents, describing the two-phase extract/update architecture and LOCOMO benchmark results. arxiv.org/abs/2504.19413
[5] Conway, M. A., & Pleydell-Pearce, C. W. (2000). The construction of autobiographical memories in the self-memory system. Psychological Review, 107(2), 261-288. The three-level hierarchical model of autobiographical memory that describes how detail fades and outline survives. doi.org/10.1037/0033-295X.107.2.261
[6] Ebbinghaus, H. (1885). Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie. The foundational experimental work on forgetting curves and the spacing effect. English translation available as Memory: A Contribution to Experimental Psychology (1913).
[7] Ecker, B. (2024). Reconsolidation Behavioral Updating of Human Emotional Memory: A Comprehensive Review and Unified Analysis to Identify the Causes of Replication Failures, the Role of Prediction Error, and Optimal Clinical Translation. Journal of Psychiatry and Psychiatric Disorders, 8, 189-265. The most recent comprehensive review of human memory reconsolidation research, mapping the boundary conditions that determine when reconsolidation updating succeeds as a therapeutic intervention. doi.org/10.26502/jppd.2572-519X0226
[8] Klinzing, J. G., Niethard, N., & Born, J. (2019). Mechanisms of systems memory consolidation during sleep. Nature Neuroscience, 22(10), 1598-1610. The single best review of active systems consolidation during sleep, covering hippocampal replay, sharp-wave ripples, and the hippocampus-to-neocortex transfer mechanism. nature.com/articles/s41593-019-0467-3
[9] Loftus, E. F., & Palmer, J. C. (1974). Reconstruction of automobile destruction: An example of the interaction between language and memory. Journal of Verbal Learning and Verbal Behavior, 13(5), 585-589. The foundational eyewitness testimony study showing how the phrasing of a question at retrieval can alter the memory itself, and the starting point for a half-century of Loftus research on memory distortion. doi.org/10.1016/S0022-5371(74)80011-3
[10] Nader, K., Schafe, G. E., & LeDoux, J. E. (2000). Fear memories require protein synthesis in the amygdala for reconsolidation after retrieval. Nature, 406(6797), 722-726. The foundational paper on memory reconsolidation, the demonstration that retrieved memories become labile and have to be re-stabilised. nature.com/articles/35021052
[11] Neisser, U., & Harsch, N. (1992). Phantom flashbulbs: False recollections of hearing the news about Challenger. In E. Winograd & U. Neisser (Eds.), Affect and Accuracy in Recall: Studies of Flashbulb Memories (pp. 9-31). Cambridge University Press. The study that first demonstrated flashbulb memories feel more accurate than they are.
[12] Schacter, D. L. (2001). The Seven Sins of Memory: How the Mind Forgets and Remembers. Houghton Mifflin. Accessible pop-science from a Harvard memory researcher making the argument that the features of human memory that look like bugs are side effects of useful design choices. The closest existing book to the philosophical frame this post uses.
[13] Talarico, J. M., & Rubin, D. C. (2003). Confidence, not consistency, characterizes flashbulb memories. Psychological Science, 14(5), 455-461. A longitudinal study showing that flashbulb memories drift in detail like ordinary memories while the subjective confidence in them stays high. doi.org/10.1111/1467-9280.02453
[14] Tononi, G., & Cirelli, C. (2014). Sleep and the price of plasticity: From synaptic and cellular homeostasis to memory consolidation and integration. Neuron, 81(1), 12-34. The synaptic homeostasis hypothesis: sleep exists in part to downscale synaptic weights and prevent metabolic saturation. doi.org/10.1016/j.neuron.2013.12.025
[15] Tulving, E., & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80(5), 352-373. The foundational paper on encoding specificity, that memories are most retrievable when the cues at recall match the cues at encoding. doi.org/10.1037/h0020071