(public) repo, owned by ryan-alport
description: tiny language model written from scratch in C
file tree (branch: main)
- README.md
- TOOLS/alternate-tokenizer.c
- TOOLS/embedding-outputs.c
- TOOLS/probability-outputs.c
- TOOLS/structs.c
- TOOLS/token-algebra.c
- TOOLS/vocab.c
- TRAINED/100k-model.bin
- TRAINED/10k-model.bin
- TRAINED/1mil-model.bin
- TRAINED/250k-model.bin
- TRAINED/2mil-model.bin
- TRAINED/3mil-model.bin
- TRAINED/500k-model.bin
- TRAINED/50k-model.bin
- TRAINED/info.md
- TRAINED/untrained-model.bin
- generate.c
- includes.c
- suess.txt
- train.c
all commits
- e5abc7f - added sample models + readme update, by [email protected] (10 days ago)
- c11b090 - documentation updates and tools addition, by RyanXPS (10 days ago)
- 09eaa9a - added sample suess corpus, code clarity fixes, by RyanXPS (3 weeks ago)
- 888b51e - initial working model, by RyanXPS (3 weeks ago)