The Zamia Brain project provides infrastructure for building natural language processing systems based on transformer networks (see https://arxiv.org/abs/1706.03762).

This project is still highly experimental; everything is subject to change without prior notice. The current approach is to generate training corpora for pre-training as well as for (multi-)domain refinement. The goal is to train networks that are very robust in their natural language processing capabilities (pre-training), i.e. that avoid the brittleness of traditional rule-based systems, while still allowing a certain amount of control over their behavior (refinement).

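As a rough illustration of the refinement-corpus idea described above, here is a minimal Python sketch that mixes domain-specific example sentences into a random sample of the general pre-training text. The file names, mixing ratio, and helper function are hypothetical and are not part of this project's tooling:

```python
import random

def build_refinement_corpus(general_path, domain_path, out_path,
                            general_fraction=0.1):
    """Write a mixed training file: all domain lines plus a random
    sample of general-corpus lines, shuffled together."""
    with open(domain_path, encoding="utf-8") as f:
        lines = [l.rstrip("\n") for l in f if l.strip()]
    with open(general_path, encoding="utf-8") as f:
        for l in f:
            if l.strip() and random.random() < general_fraction:
                lines.append(l.rstrip("\n"))
    random.shuffle(lines)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

# hypothetical file names for illustration only
build_refinement_corpus("general.txt", "domain_dialogues.txt", "refinement.txt")
```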
To this end, the project provides the following components:

Available Models

Downloads are available here:

https://goofy.zamia.org/zamia-speech/brain/

| Model                               | Size | Language | Training corpus                                     | Vocabulary        |
|-------------------------------------|------|----------|-----------------------------------------------------|-------------------|
| gpt2-german-345M-r20190906          | 345M | German   | 4.5 epochs on 27 GB twitter+wikipedia+heise+parole  | 50k sentencepiece |
| gpt2-german                         | 117M | German   | 3 epochs on 27 GB twitter+wikipedia+heise+parole    | 50k sentencepiece |
| transformerXL-german-163M-r20190928 | 163M | German   | 1 epoch on 27 GB twitter+wikipedia+heise+parole     | 50k sentencepiece |
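
All models above use a 50k sentencepiece vocabulary. The following sketch shows how such a vocabulary is typically trained and applied with the sentencepiece Python library; the corpus and model file names are placeholders, not the files shipped with these models:

```python
import sentencepiece as spm

# Train a 50k-piece vocabulary on a raw text corpus
# (one sentence per line). "corpus_de.txt" is a placeholder.
spm.SentencePieceTrainer.train(
    input="corpus_de.txt",
    model_prefix="german_50k",
    vocab_size=50000,
)

# Load the resulting model and encode/decode text.
sp = spm.SentencePieceProcessor()
sp.load("german_50k.model")

ids = sp.encode_as_ids("Das ist ein kurzer Testsatz.")
print(ids)                 # token ids as fed to the transformer
print(sp.decode_ids(ids))  # round-trips back to the original text
```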

Credits

Massive thanks to Konstantin Lopuhin (https://github.com/lopuhin) for his great code and support!