The Ghost Has to Be Useful on Day One
A long onboarding flow is a team admitting they don't know who they are building for.
Cristina and I sat down for lunch in Assisi at AssaggiAssisi. I'd been up late writing the ghost.cued post and I was still turning it over in my head, so I started telling her about it. The daemon that reads your environment and surfaces the right memory at the right moment. She asked how it would know what to surface and when. I said from the years of material you've already built up on your devices.
She said that sounded fine in theory but her point was that Gemini already knew her. Her health, her sleep, her plans, her interests, all of it (built up from years of her talking to the thing). She assumed my new daemon would need the same multi-week ramp she'd gone through with Gemini (the dynamic I'd covered in the model-trap post), and she wasn't going to sit through it again just because I had built something new. She'd rather spend twenty minutes on install day filling in a form and have something that fit her from the start than get a cold-start daemon that only became the ghost she wanted some time in May. I offered a personality test as the short version of that, Big Five or similar, twenty minutes on install day and the ghost has a seed. She thought about it and dismissed it as too pigeonhole-y. A personality test files you into a category on day one, and the moment the ghost thinks of you in those terms it's stuck with the wrong frame until you notice the mistake and correct it. She was expecting a setup phase, and I spent the rest of the lunch trying to convince her we didn't need one, while also trying to think my own way out of needing one, because the context should already be there in whatever devices the user has attached.
Cristina said "too pigeonhole-y" at the table and moved on, but I went home and spent a few hours looking into it, because a short fun test really would have been the easiest way to seed the ghost. The idea kept falling apart the more I read. Big Five reduces you to five axes the field settled on around 1990 [1]. Myers-Briggs and Enneagram do the same thing with different categories. Any of them puts you in a bucket the moment you fill in the form, and the ghost then filters everything it does through that bucket. The bucket becomes the daemon's default, and it stays that way until you catch the daemon doing something wrong, six months later, and realise it still thinks of you as whatever the test said on day one.
The problem gets worse the more specific the category is. A setup form might have a tick-box for ADHD, autism, dyslexia, or one of the many other conditions a person might be navigating, and the ghost would read the tick-box and adjust. But we're all on some spectrum somewhere, weird in different ways. Two people who tick the same box can be almost nothing like each other, and a daemon that treats them as the same category is flattening the thing LocalGhost exists to preserve.
Cristina's instinct was right. The test itself is broken as a tool for seeding a personal daemon, no matter how short or fun I made it. That answered the narrow question. Once that one was closed I moved on to the wider one, which was whether any form belongs on install day at all.
The wider answer is also no. The memory post I had finished a few nights earlier lays out how the ghost should interact with the user, and none of the patterns it describes look anything like a form on install day. A form asks you to rate yourself against axes somebody else designed, with no context in front of you, and it does it before the ghost has shown you anything at all. The patterns in the memory post are the opposite of that, and they are where the question-asking belongs once the ghost is running.
Trello, before Atlassian bought it, and Linear, now valued at $1.25 billion [8], both shipped without an onboarding form. Jira has been asking users what department they work in since 2002 and the wizard is still getting longer. LocalGhost.ai has to sit where Trello and Linear sit.
A company that has to ask you a wall of questions to work out who you are is a company staffed by people who don't know their own product and don't know who it is for (Atlassian, cough cough). Real employees know the audience because most of the time they are part of the audience. I have shipped onboarding forms myself, under deadline pressure, at CCData, and I knew the form was the shortcut every time. Onboarding flows are lazy, inelegant, and offensive to the user. The lunch with Cristina, and the time I spent on it afterwards, is what made me realise I do not want to build one into LocalGhost.
If not a form, then what? The answer is inference from the behavioural traces the user has been producing on their own devices for years, and the evidence that this works is a decade old. Kosinski, Stillwell, and Graepel showed in 2013 that Facebook Likes alone could predict sexual orientation, ethnicity, political and religious views, personality traits, intelligence, happiness, substance use, parental separation, age, and gender, with the strongest binary outcomes (sexual orientation, ethnicity, political affiliation) landing at 85-95% accuracy [2]. Two years later, Youyou, Kosinski, and Stillwell ran the follow-up that still anchors the field. A computer model using Facebook Likes judged personality more accurately than the participants' own friends and family. Ten Likes outperformed a work colleague. 150 Likes outperformed a parent or sibling. 300 Likes outperformed a spouse [3]. The point wasn't about Facebook. Behavioural traces predict personality better than the people who know you personally, and they do it without anyone sitting you down to take a test. That is the reason I think LocalGhost is worth building, and also the reason I think the space it addresses will be taken over and exploited by big tech, as I wrote in the inflection post. The same inference that makes the ghost genuinely useful on local hardware is what the cloud platforms will want to run against everyone who has not yet installed a local one.
The neurodivergence point is backed by the last decade of digital phenotyping research. Perochon et al. (2023) in Nature Medicine screened for autism in 17-36-month-olds using tablet-based behavioural measures, AUC 0.90 [4]. Lee et al. (2023) used wearable data from children to detect ADHD with AUC 0.798, in a study that also covered sleep problems [5]. Casals, Larsson, and Hansen (2025) built a smartphone-sensor ADHD predictor hitting 80.8% sensitivity and 79.5% specificity, with the built-in sensor features adding meaningfully to a cognitive-test baseline [6]. Neurodivergence has a behavioural signature that shows up in how people move, read, respond, and attend. A device you already own produces enough of that signature to classify accurately in a research setting. What the ghost infers on any one user is a guess. The ghost has to treat it that way, a hypothesis that updates the moment the user corrects it and never a bucket that sticks.
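To make "a hypothesis that updates, never a bucket that sticks" concrete, here is a minimal Python sketch. Nothing in it comes from the LocalGhost repo. The TraitHypothesis type, its field names, the moving-average update, and the 0.7 threshold are all assumptions chosen for illustration. The only idea it is meant to show is that a user correction lands immediately and outranks anything inferred before or after it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraitHypothesis:
    """One inferred guess about the user (hypothetical type, for illustration only)."""
    name: str
    confidence: float                    # 0.0-1.0, the ghost's own estimate
    user_verdict: Optional[bool] = None  # None until the user confirms or rejects

    def observe(self, signal: float, weight: float = 0.1) -> None:
        """Nudge confidence toward a new behavioural signal (simple moving average)."""
        if self.user_verdict is not None:
            return  # the user has spoken, so inference no longer moves the estimate
        self.confidence = (1 - weight) * self.confidence + weight * signal

    def correct(self, verdict: bool) -> None:
        """A user correction takes effect at once and outranks inference."""
        self.user_verdict = verdict
        self.confidence = 1.0 if verdict else 0.0

    def holds(self, threshold: float = 0.7) -> bool:
        """Should the ghost act on this guess right now?"""
        if self.user_verdict is not None:
            return self.user_verdict
        return self.confidence >= threshold


guess = TraitHypothesis("prefers_short_summaries", confidence=0.8)
guess.correct(False)   # the user says the ghost got it wrong
guess.observe(1.0)     # later signal cannot quietly reinstate the old guess
```

The design choice worth noticing is that the user's verdict is stored separately from the inferred confidence, so the ghost can always tell a guess from something it was told.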
Put the studies together and the picture is that behavioural signal lives in almost everything the user does with a device. How you write, what you read, what you rewrite, and how long you leave things in draft. How you walk, how you exercise, how you sleep, all of it picked up by the wearable on your wrist and the phone in your pocket. How you communicate, who you reach for and how often, which Stachl et al. (2020) found was the single highest-signal channel for personality prediction across a month of passive smartphone sensing [7]. The photos you take and keep and return to, the music you play and skip, the way you tap on the phone itself (which in the Casals study added usable signal on top of a dedicated cognitive test). None of it is a personality test because none of it is you rating yourself. All of it is what you have already done, for your own reasons, left on devices you already own. A ghost that can read that footprint locally has more to work with on day one than a cloud assistant that just met you has after a month of conversation.
The ghost is a fleet of daemons, one per modality, each of which has a README in the repo. ghost.noted handles text, ghost.framed handles images, ghost.voiced handles audio, and ghost.synthd sits above them. They run on a NAS, either one you already own or one we sell you. Each of your other devices runs an app that pairs with the NAS on first run over the local network and opens a VPN tunnel back to it. After that the app works from anywhere. On a train, in a hotel, at a cafe, it stays paired with your NAS and pushes its modality over the tunnel, where ghost.synthd turns what comes in into a model of you. The same app is how the ghost reaches you when it has something to say and how you reach it when you have something to ask.
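The one-daemon-per-modality layout can be sketched in a few lines of Python. The daemon names are the ones above. Everything else is an assumption for illustration: the routing table, the daemon_for helper, and the Synthd class do not reflect how the repo is actually organised, and the sketch ignores pairing and the VPN tunnel entirely.

```python
from collections import defaultdict

# Daemon names are from the post; this routing table itself is hypothetical.
MODALITY_DAEMONS = {
    "text": "ghost.noted",
    "image": "ghost.framed",
    "audio": "ghost.voiced",
}


def daemon_for(modality: str) -> str:
    """Return the daemon responsible for a modality, or fail loudly."""
    try:
        return MODALITY_DAEMONS[modality]
    except KeyError:
        raise ValueError(f"no daemon registered for modality {modality!r}") from None


class Synthd:
    """Stand-in for ghost.synthd: gathers what each modality daemon reports."""

    def __init__(self) -> None:
        self.observations = defaultdict(list)

    def receive(self, modality: str, summary: str) -> str:
        """File a summary under the daemon that owns the modality."""
        daemon = daemon_for(modality)
        self.observations[daemon].append(summary)
        return daemon


synthd = Synthd()
synthd.receive("text", "rewrote the same paragraph three times")
synthd.receive("audio", "voice memo recorded at 07:10")
```

An unknown modality raises instead of being dropped silently, which is one way to keep a new device type from feeding the model without a daemon that owns it.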
One device is enough to start getting value and you can add the rest over time, or never, and the ghost still works. If you do not want to install the app on any device at all, the ghost builds from what you tell it in chat on the box itself, which is slower but still works. Most people will install the phone app at least, because the NAS is already theirs and a NAS is a product category users recognise. You get a local archive of your own stuff as a side effect, your photos, your notes, your writing, your music, your health data, all indexed locally and searchable by you.
The approvals work the way the platform already does them. iOS, Android, macOS, the wearable platform, each asks about its own permissions the first time the ghost needs them, one at a time. No central form, no LocalGhost-specific consent surface pretending to understand the thirty things being granted. Revoking is uninstalling. Everything the ghost sees lives on the NAS. The model it builds of you is written to local disk, inference runs on the local GPU, nothing ships anywhere else.
The architecture is the privacy commitment, and I think I can build a pretty good version of you with it. What it does not solve is how the thing will feel to live with. The clearest picture of that came out of the same lunch. Cristina had ordered a cheese board at AssaggiAssisi. She had not picked the cheeses. She had asked for the board and what came out was good. The longer menu existed, with every cheese and every cured meat spelled out, and we had ignored it. If we kept coming back the kitchen would vary what they brought out, and we would slowly work through whatever they had. That is the shape I want for day-one LocalGhost. A good default board on day one, and better at your taste with every visit.
The cheese board is the reassuring version. A digital version of you, if built well, is also going to feel creepy. Even a partial success will be eerie to sit with, because it will know you from your own footprint and sometimes it will be right about you in ways you had not noticed yourself.
[1] The Big Five personality taxonomy (also called the Five-Factor Model) is the consensus framework in trait psychology, consolidated through work by Goldberg, Costa, McCrae, and others between roughly 1981 and 1992. The 100-item and 50-item versions of the International Personality Item Pool are free to use and take 10-20 minutes to complete. Source for the historical provenance and for the standard instrument length. Goldberg et al., 2006. "The international personality item pool and the future of public-domain personality measures." Journal of Research in Personality. ipip.ori.org
[2] Kosinski, Stillwell, and Graepel, 2013. "Private traits and attributes are predictable from digital records of human behavior." Proceedings of the National Academy of Sciences, 110(15), 5802-5805. Source for the 58,466-participant figure, the accuracy numbers on predicting sexual orientation (88%), ethnicity (95%), and political affiliation (85%) from Facebook Likes alone, and for the claim that behavioural traces predict sensitive personal attributes. pnas.org/doi/10.1073/pnas.1218772110
[3] Youyou, Kosinski, and Stillwell, 2015. "Computer-based personality judgments are more accurate than those made by humans." Proceedings of the National Academy of Sciences, 112(4), 1036-1040. Source for the 86,220-participant study and for the specific thresholds, 10 Likes to outperform a work colleague's personality judgment of the participant, 70 for a friend or cohabitant, 150 for a parent or sibling, 300 for a spouse. Also the source for the finding that computer judgments based on Likes had higher external validity than self-reported personality scores for predicting some life outcomes. pnas.org/doi/10.1073/pnas.1418680112
[4] Perochon, Di Martino, Carpenter, et al., 2023. "Early detection of autism using digital behavioral phenotyping." Nature Medicine. Prospective study of 475 toddlers (17-36 months) in a primary-care setting using the SenseToKnow tablet app. Source for the AUC 0.90, sensitivity 87.8%, specificity 80.8% figures on autism screening from behavioural signals captured by computer vision. nature.com/articles/s41591-023-02574-3
[5] Lee et al., 2023. "Machine Learning-Based Prediction of Attention-Deficit/Hyperactivity Disorder and Sleep Problems With Wearable Data in Children." JAMA Network Open. Diagnostic study of 79 children with ADHD and 68 with sleep problems using circadian rhythm data from wearables. Source for the AUC 0.798 figure on ADHD detection from wearable data. jamanetwork.com/journals/jamanetworkopen/fullarticle/2802554
[6] Casals, Larsson, and Hansen, 2025. "Machine learning on a smartphone-based CPT for ADHD prediction." Frontiers in Psychiatry. Study of 952 neurotypical controls and 292 unmedicated ADHD patients using a smartphone-delivered Continuous Performance Test plus motion and face-tracking sensor data. Source for the 80.8% sensitivity, 79.5% specificity numbers on ADHD prediction from smartphone sensor data, and for the specific claim that sensor features from the phone itself added meaningfully to the CPT baseline. frontiersin.org/articles/10.3389/fpsyt.2025.1564351
[7] Stachl et al., 2020. "Predicting personality from patterns of behavior collected with smartphones." Proceedings of the National Academy of Sciences, 117(30), 17680-17687. Study of 624 volunteers tracked over 30 consecutive days producing 25,347,089 logging events across six behavioural classes (communication and social behaviour, music consumption, app usage, mobility, overall phone activity, day/night activity). Source for the Big Five prediction from passive smartphone sensing, for the finding that communication and social behaviour was the most predictive class overall, and for the median prediction performance at broad domain level (r = 0.37). pnas.org/doi/10.1073/pnas.1920484117
[8] Linear, June 2025. Series C funding round of $82 million led by Accel, at a $1.25 billion valuation, bringing total raised to $134.2 million. Linear has been profitable since 2021, serves more than 15,000 customers including OpenAI, Scale AI, and Perplexity, and grew ARR more than 200% year-on-year in 2025. Source for the $1.25 billion valuation and for Linear's minimal-onboarding design approach, which Linear's own product documentation describes as pre-populated workspaces and no setup configuration. TechCrunch coverage at techcrunch.com/2025/06/10/atlassian-rival-linear-raises-82m-at-1-25b-valuation. Linear's own announcement at linear.app/now/building-our-way.