5fcms2sh-HeavyBall-PSGDPro · safetensors · bf16 · Public
teutonic-lucas/teutonic-q3-4b-5fcms2sh-heavyball-psgdpro
sha256:fe57b68861d5e6605b42d69cf5e5e71e0394f2447a3e2e3a7b0d3f6a82f8da09 · Indexed 1d ago
Parameters: 4.4B
Total size: 8.2 GB
Files: 7
Quantization: BF16
No README on this version
Model architecture
config.json
- Architecture: Qwen3ForCausalLM
- Model type: qwen3
- Hidden size: 2,560
- Layers: 36
- Attention heads: 32 (8 KV)
- FFN size: 9,728
- Vocab size: 151,936
- Context window: 40K
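The architecture fields above are enough to sanity-check the listed parameter count and total size. The following is a minimal sketch, not the registry's own calculation. It relies on several assumptions that the listing does not state: a per-head dimension of 128, untied input and output embeddings, no bias terms in the linear layers, and per-head query/key normalization weights. The `Qwen3Config` dataclass and `count_params` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Qwen3Config:
    # Values taken from the listing above.
    hidden_size: int = 2560
    num_layers: int = 36
    num_heads: int = 32
    num_kv_heads: int = 8
    ffn_size: int = 9728
    vocab_size: int = 151936
    # ASSUMPTION: head_dim is not shown in the listing.
    head_dim: int = 128
    # ASSUMPTION: the output projection does not share weights with the embedding.
    tie_embeddings: bool = False


def count_params(cfg: Qwen3Config) -> int:
    """Estimate the weight count of a bias-free decoder-only transformer."""
    q_proj = cfg.hidden_size * cfg.num_heads * cfg.head_dim
    # Grouped-query attention: K and V use the smaller number of KV heads.
    kv_proj = 2 * cfg.hidden_size * cfg.num_kv_heads * cfg.head_dim
    o_proj = cfg.num_heads * cfg.head_dim * cfg.hidden_size
    # Gated MLP: gate, up, and down projections.
    mlp = 3 * cfg.hidden_size * cfg.ffn_size
    # Two layer norms over hidden_size plus assumed per-head Q/K norms.
    norms = 2 * cfg.hidden_size + 2 * cfg.head_dim
    per_layer = q_proj + kv_proj + o_proj + mlp + norms

    embed = cfg.vocab_size * cfg.hidden_size
    lm_head = 0 if cfg.tie_embeddings else embed
    final_norm = cfg.hidden_size
    return cfg.num_layers * per_layer + final_norm + embed + lm_head


cfg = Qwen3Config()
total = count_params(cfg)
size_bytes = total * 2  # BF16 stores each weight in 2 bytes

print(f"{total / 1e9:.1f}B parameters")        # 4.4B parameters
print(f"{size_bytes / 2**30:.1f} GiB in BF16")  # 8.2 GiB in BF16
```

Under these assumptions the estimate agrees with the listed 4.4B parameters, and the listed 8.2 GB matches the BF16 size when measured in binary gigabytes (GiB). With `tie_embeddings=True` the same function gives roughly 4.0B, so the listed figure is consistent with untied embeddings, although the listing itself does not confirm this.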
Files
7 items
- model.safetensors · 4c8b95acaf76 · 8.2 GB · safetensors
- tokenizer.json · be75606093db · 10.9 MB
- config.json · 9430f79acea2 · 1.6 KB
- tokenizer_config.json · 5f007d04324a · 696 B
- q3_train_state.json · f30d69d2d0eb · 595 B
- teutonic_model_store.json · c60072b60da5 · 527 B
- generation_config.json · 0af44c629f5c · 213 B
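Each file in the list carries a 12-character hexadecimal identifier. A downloaded file could be checked against it along the following lines. This is a sketch under a loud assumption: that the identifier is the leading 12 hex digits of the file's SHA-256 digest, which the listing does not state. The helper names `sha256_hex` and `matches_listed_digest` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_hex(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Chunked reads keep memory use flat for multi-gigabyte weight files.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_digest(path, listed_prefix):
    """ASSUMPTION: the listed identifier is a prefix of the SHA-256 hex digest."""
    return sha256_hex(path).startswith(listed_prefix.lower())


# Example: check a local copy of config.json against its listed identifier.
config_path = Path("config.json")
if config_path.exists():
    print(matches_listed_digest(config_path, "9430f79acea2"))
```

If the check fails for every file, the identifiers are probably derived some other way (for example from a different hash or from a content-addressed store key), and the prefix assumption should be dropped.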