Vintage LLM models
Vintage LLMs, or Historical Large Language Models (HLLMs) are a super fascinating topic because they let researchers strip away modern knowledge and test whether AI can actually reason instead of just recall. By training models only on past data, people can ask whether discoveries like relativity or quantum mechanics could be rediscovered from scratch, turning history into a kind of controlled experiment. They also offer a way to “time travel,” letting us interact with past worldviews and see how people once understood society, science, and reality. Beyond AI, they could even simulate historical populations, giving behavioral scientists a new tool to study how human thinking changes over centuries. What makes them compelling is that they blur the line between memory and intelligence, and force us to ask which one really matters.
Update: 10-June-2026 -- I added TypewriterLM;
There were zero vintage LLMs just a few months ago and now there are 7 that I know of:
Talkie
Made by Nick Levine, David Duvenaud, and Alec Radford. A 13B model from 1930, trained on 260B of text books, newspapers, patents, and legal documents.
This is the largest vintage model trained from scratch so far, and they are planning larger versions.
Links:
- https://talkie-lm.com/chat -- Chat interface
- https://huggingface.co/talkie-lm/talkie-1930-13b-it
- https://huggingface.co/lewtun/talkie-1930-13b-it-hf -- GGUF for Llama.cpp
- https://huggingface.co/zakarth/talkie-1930-13b-it-vulkan-fixed-GGUF -- GGUF that actually works on AMD/Vulkan
- https://talkie-lm.com/introducing-talkie -- Introducing talkie: a 13B vintage language model from 1930
TypewriterLM
TypewriterLM is a 7.24B LLM trained exclusively on English text predating 1913, based on the Llama architecture.
Links:
- https://arxiv.org/html/2606.02991v1
- https://huggingface.co/typewriter-ai/typewriter-1913-7B-base -- base model only
MonadGPT
Made by Pierre-Carl Langlais. Knowledge cutoff: roughly 1800? Trained on 11,000 early modern texts in English, French and Latin, mostly coming from EEBO and Gallica.
Fine-tuned from Teknium/OpenHermes-2-Mistral-7B (which is based on Mistralai/Mistral-7B-v0.1).
By default, it's not running in "vintage mode", but if you use the right system prompt, it will act like someone from the 17th century.
Links:
- https://huggingface.co/Pclanglais/MonadGPT
- https://huggingface.co/TheBloke/MonadGPT-GGUF -- GGUF for Llama.cpp
Machina Mirabilis
Made by Michael Hla. Knowledge cutoff: year 1900. Trained on filtered data from: Institutional books, British Library books, and American Stories.
Built with Andrej Karpathy's Nanochat.
The model architecture is very custom unfortunately, and it's difficult to port to GGUF.
Links:
- https://gpt1900.com -- Chat interface
- https://huggingface.co/mhla/gpt1900-instruct-v3-sft
- https://michaelhla.com/blog/machina-mirabilis.html -- An experiment to see if an LLM trained from scratch on text prior to 1900 can come up with quantum mechanics and relativity.
Miss Violet Hartwell (London, 1899)
Made by Zakarth. Trained on different custom datasets British Library, Oxford Text Archives, Internet Archive Periodicals.
Links:
- https://huggingface.co/spaces/zakarth/violetdemo -- Chat interface
- https://huggingface.co/zakarth/violet-1b4-chat-gguf -- GGUF files
- https://huggingface.co/zakarth/violet-1b4-chat -- 1.41B params
- https://huggingface.co/zakarth/violet-160m-chat -- 160M params
Mr. Chatterbox
Made by Trip Venturella. Knowledge cutoff: year 1899, trained on a corpus of over 28,000 Victorian-era British texts, drawn from a dataset made available by the British Library.
The model has 340 million params, trained using Andrej Karpathy's Nanochat.
- https://huggingface.co/spaces/tventurella/mr_chatterbox -- Chat interface
- https://huggingface.co/tventurella/mr_chatterbox_model
- https://estragon.news/mr-chatterbox-or-the-modern-prometheus -- Mr. Chatterbox, or, The Modern Prometheus
Time-Capsule
Made by Hayk Grigorian. Knowledge cutoff: year 1875.
A few model sizes and architectures are available, but they are only base-models (you can't chat with them).
Links:
- https://huggingface.co/collections/haykgrigorian/timecapsulellm-1800-1875-london
- https://github.com/haykgrigo3/TimeCapsuleLLM
Extra links
Relevant articles worth mentioning:
AI 'Ranke-4B' built only with data from before 1913, such as 'I don't know Hitler' and 'old discriminatory attitudes,' can give answers that are not tainted by hindsight
The models are not released yet.
https://gigazine.net/gsc_news/en/20251222-ranke-4b
https://github.com/DGoettlich/history-llms/blob/main/ranke-4b/prerelease_notes.md
Are "Vintage LLMs" the start of a new humanistic field?
Thoughts on Historical Language Models and Talkie-1930
https://resobscura.substack.com/p/are-vintage-llms-the-start-of-a-new
Vintage Large Language Models
https://owainevans.github.io/talk-transcript.html
HLLMs: Large Language Models based on historical text could offer informative tools for behavioral science
https://pnas.org/doi/10.1073/pnas.2407639121
Discord server
If you're interested in this topic or vintage datasets, join our Discord server. See you there!