Ita Zaporozhets
|
e48e5f1f13
|
Support reading tiktoken tokenizer.model file (#31656)
* use existing TikTokenConverter to read tiktoken tokenizer.model file
* del test file
* create titktoken integration file
* adding tiktoken llama test
* ALTNATIVE IMPLEMENTATION: supports llama 405B
* fix one char
* remove redundant line
* small fix
* rm unused import
* flag for converting from tiktokeng
* remove unneeded file
* ruff
* remove llamatiktokenconverter, stick to general converter
* tiktoken support v2
* update test
* remove stale changes
* udpate doc
* protect import
* use is_protobuf_available
* add templateprocessor in tiktokenconverter
* reverting templateprocessor from tiktoken support
* update test
* add require_tiktoken
* dev-ci
* trigger build
* trigger build again
* dev-ci
* [build-ci-image] tiktoken
* dev-ci
* dev-ci
* dev-ci
* dev-ci
* change tiktoken file name
* feedback review
* feedback rev
* applying feedback, removing tiktoken converters
* conform test
* adding docs for review
* add doc file for review
* add doc file for review
* add doc file for review
* support loading model without config.json file
* Revert "support loading model without config.json file"
This reverts commit 2753602e51c34cef2f184eb11f36d2ad1b02babb.
* remove dev var
* updating docs
* safely import protobuf
* fix protobuf import error
* fix protobuf import error
* trying isort to fix ruff error
* fix ruff error
* try to fix ruff again
* try to fix ruff again
* try to fix ruff again
* doc table of contents
* add fix for consistency.dockerfile torchaudio
* ruff
* applying feedback
* minor typo
* merging with push-ci-image
* clean up imports
* revert dockerfile consistency
|
2024-09-06 14:24:02 +02:00 |
|