Suraj Patil | ca33278fdb | FlaxGPT2 (#11556)
* flax gpt2
* combine masks
* handle shared embeds
* add causal LM sample
* style
* add tests
* style
* fix imports, docs, quality
* don't use cache
* add cache
* add cache 1st version
* make use cache work
* start adding test for generation
* finish generation loop compilation
* rewrite test
* finish
* update
* update
* apply Sylvain's suggestions
* update
* refactor
* fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-05-18 22:50:51 +01:00