* only dir not even init
* init
* tokenizer removed and reference of codegen added
* modeling file updated a lot remaining app_rotary_emb
* conversion script done
* conversion script fixed, a lot of factoring done and most tests pass
* added token_clf and extractive_QA_head
* integration tests pass
* flash attn tests pass!
* config done
* more docs in modeling file
* some style fix
* style and others
* doc test error fix
* more doc fix
* some attention fixes
* most fixes
* style and other fixes
* docs fix and config
* doc fix
* some comments
* conversion script updated
* conversion script updated
* Revert "conversion script updated"
This reverts commit e92378c54084ec0747041b113083d1746ecb6c7f.
* final comments
* add Phi to language_modeling.md
* edit phi.md file
* rebase and fix
* removed phi-1.5 example
* changed model_type from 'phi'->'mixformer-sequential'
* small change
* small change
* revert \small change
* changed mixformer-sequential->phi
* small change
* added phi-1.5 example instead of phi-1
* doc test might pass now
* rebase and small change
* added the dropout layer
* more fixes
* modified .md file
* very very small doc change
* init commit
* attention arch done except rotary emb
* rotary emb done
* text encoder working
* outputs matching
* arch first pass done
* make commands done, tests and docs remaining
* all tests passed, only docs remaining
* docs done
* doc-builder fix
* convert script removed(not relevant)
* minor comments done
* added ckpt conversion script
* tokenizer done
* very minor fix of index.md 2
* mostly make fixup related
* all done except fe and rotary emb
* very small change
* removed unidecode dependency
* style changes
* tokenizer removed require_backends
* added require_inflect to tokenizer tests
* removed VOCAB_FILES in tokenizer test
* inflect dependency removed
* added rotary pos emb cache and simplified the apply method
* style
* little doc change
* more comments
* feature extractor added
* added processor
* auto-regressive config added
* added CLVPConditioningEncoder
* comments done except the test one
* weights added successfull(NOT tested)
* tokenizer fix with numbers
* generate outputs matching
* almost tests passing Integ tests not written
* Integ tests added
* major CUDA error fixed
* docs done
* rebase and multiple fixes
* fixed rebase overwrites
* generate code simplified and tests for AutoRegressive model added
* minor changes
* refectored gpt2 code in clvp file
* weights done and all code refactored
* mostly done except the fast_tokenizer
* doc test fix
* config file's doc fixes
* more config fix
* more comments
* tokenizer comments mostly done
* modeling file mostly refactored and can load modules
* ClvpEncoder tested
* ClvpDecoder, ClvpModel and ClvpForCausalLM tested
* integration and all tests passed
* more fixes
* docs almost done
* ckpt conversion refectored
* style and some failing tests fix
* comments
* temporary output fix but test_assisted_decoding_matches_greedy_search test fails
* majority changes done
* use_cache outputs same now! Along with the asisted_greedy_decoding test fix
* more comments
* more comments
* prepare_inputs_for_generation fixed and _prepare_model_inputs added
* style fix
* clvp.md change
* moved clvpconditionalencoder norms
* add model to new index
* added tokenizer input_ids_with_special_tokens
* small fix
* config mostly done
* added config-tester and changed conversion script
* more comments
* comments
* style fix
* some comments
* tokenizer changed back to prev state
* small commnets
* added output hidden states for the main model
* style fix
* comments
* small change
* revert small change
* .
* Update clvp.md
* Update test_modeling_clvp.py
* :)
* some minor change
* new fixes
* remove to_dict from FE
* Add index.md for tukish language
* Fix index.md (huggingface/transformers#27088)
* Add 'tr' to additional files
* Update docs/source/tr/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update index.md
---------
Co-authored-by: Mert Yanık <mert.yanik@lcwaikiki.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix error in convert_openai_to_hf.py: "_download() missing 1 required positional argument: root"
* Fix error in convert_openai_to_hf.py: "TypeError: byte indices must be integers or slices, not str"
* Fix decoder_attention_heads value in convert_openai_to_hf.py.
Correct the assignment for `decoder_attention_heads` in the conversion script for the Whisper model.
* Black reformat convert_openai_to_hf.py file.
* Fix Whisper model configuration defaults (for Tiny).
- Correct encoder/decoder layers and attention heads count.
- Update model width (`d_model`) to 384.
* Add docstring to the convert_openai_to_hf.py script with a doctest
* Add shebang and +x permission to the convert_openai_to_hf.py
* convert_openai_to_hf.py: reuse the read model_bytes in the _download() function
* Move convert_openai_to_hf.py doctest example to whisper.md
* whisper.md: Add an inference example to the Conversion section.
* whisper.md: remove `model.config.forced_decoder_ids` from examples (deprecated)
* whisper.md: Remove "## Format Conversion" section; not used by users
* whisper.md: Use librispeech_asr_dummy dataset and load_dataset()
I'm adding accelerate as one of the libraries to install because otherwise when running the Trainer, the model errorr out with the error.
ImportError: Using the `Trainer` with `PyTorch` requires `accelerate>=0.20.1`: Please run `pip install transformers[torch]` or `pip install accelerate -U`
Further context:
1. I've tried this across different environments so I believe that the environment is not the issue.
2. I had the latest transformers library version running.
3. Typically even after install accelerate and import it, it wouldn't resolve the issue until I restart the notebook and try again.
* first batch of structure improvements for model_docs
* second batch of structure improvements for model_docs
* more structure improvements for model_docs
* more structure improvements for model_docs
* structure improvements for cv model_docs
* more structural refactoring
* addressed feedback about image processors
* Add type annotations to TFConvNextDropPath
* Use tf.debugging.assert_equal for TFConvNextEmbeddings shape check
* Add TensorFlow implementation of ConvNeXTV2
* check_docstrings: add TFConvNextV2Model to exclusions
TFConvNextV2Model and TFConvNextV2ForImageClassification have docstrings
which are equivalent to their PyTorch cousins, but a parsing issue prevents them
from passing the test.
Adding exclusions for these two classes as discussed in #25558.
* docs(zh): translate tflite.md
* docs(zh): add space around links
* Update docs/source/zh/tflite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add support for loading GPTQ models on CPU
Right now, we can only load the GPTQ Quantized model on the CUDA
device. The attribute `gptq_supports_cpu` checks if the current
auto_gptq version is the one which has the cpu support for the
model or not.
The larger variants of the model are hard to load/run/trace on
the GPU and that's the rationale behind adding this attribute.
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>
* Update quantization.md
* Update quantization.md
* Update quantization.md
* add
* add
* add
* Add deepspeed.md
* Add
* add
* Update docs/source/ja/main_classes/callback.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/output.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/pipelines.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/text_generation.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/ja/main_classes/processors.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update logging.md
* Update toctree.yml
* Update docs/source/ja/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Add suggesitons
* m
* Update docs/source/ja/main_classes/trainer.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update toctree.yml
* Update Quantization.md
* Update docs/source/ja/_toctree.yml
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update toctree.yml
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* docs(zh): translate custom_models.md
* minor fix in customer_models
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* add `MaskGenerationPipeline` in docs
* Update __init__.py
* fix repo consistency and clarify docstring
* add on check docstirngs
* actually we do have a tf sam
* oops
* initial edits
* improvements for clarity and flow
* improvements for clarity and flow, removed the repetead section
* removed two docs that had no content
* Revert "removed two docs that had no content"
This reverts commit e98fa2fa0d.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feedback addressed
* more feedback addressed
* feedback addressed
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* adds agnostic decorators and availability fns
* renaming decorators and fixing imports
* updating some representative example tests
bloom, opt, and reformer for now
* wip device agnostic functions
* lru cache to device checking functions
* adds `TRANSFORMERS_TEST_DEVICE_SPEC`
if present, imports the target file and updates device to function
mappings
* comments `TRANSFORMERS_TEST_DEVICE_SPEC` code
* extra checks on device name
* `make style; make quality`
* updates default functions for agnostic calls
* applies suggestions from review
* adds `is_torch_available` guard
* Add spec file to docs, rename function dispatch names to backend_*
* add backend import to docs example for spec file
* change instances of to
* Move register backend to before device check as per @statelesshz changes
* make style
* make opt test require fp16 to run
---------
Co-authored-by: arsalanu <arsalanu@graphcore.ai>
Co-authored-by: arsalanu <hzji210@gmail.com>