* Initial commit
* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Cleanup config docstring
* Update src/transformers/models/falcon/configuration_falcon.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Convert to relative imports
* Remove torch < 1.8 warning
* Restructure cos_sin header
* qkv -> query, key, value
* Refactor attention calculation
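For context, a generic sketch of what a fused-QKV split looks like (dimensions and slicing are illustrative, not necessarily Falcon's exact layout):
```python
import torch
from torch import nn

hidden_size, num_heads = 64, 4
head_dim = hidden_size // num_heads
qkv_proj = nn.Linear(hidden_size, 3 * hidden_size, bias=False)  # fused projection

x = torch.randn(2, 10, hidden_size)                      # [batch, seq, hidden]
fused = qkv_proj(x).view(2, 10, num_heads, 3, head_dim)
query, key, value = fused.unbind(dim=3)                  # each [2, 10, 4, 16]
```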
* Add a couple of config variables to account for the different checkpoints
* Successful merging of the code paths!
* Fix misplaced line in the non-parallel attention path
* Update config and tests
* Add a pad_token_id when testing
* Support output_attentions when alibi is None
* make fixup
* Skip KV cache shape test
* No more _keys_to_ignore_on_load_missing
* Simplify self attention a bit
* Simplify self attention a bit
* make fixup
* stash commit
* Some more attention mask updates
* Should pass all tests except assisted generation!
* Add big model generation test
* make fixup
* Add temporary workaround for test
* Test overrides for assisted generation
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Test overrides for assisted generation
* Add generation demo
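A sketch of what such a demo can look like with the public checkpoints (checkpoint name and sampling arguments are illustrative):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b", device_map="auto"  # device_map needs `accelerate`
)

inputs = tokenizer("The Falcon models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```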
* Update copyright
* Make the docstring model actually small
* Add module-level docstring
* Remove all assertions
* Add copied from bloom
* Reformat the QKV layer
* Add copied from bloom
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove unused line and reformat
* No single letter variables
* Cleanup return names
* Add copied from line
* Remove the deprecated arguments blocks
* Change the embeddings test to an alibi on/off test
* Remove position_ids from FalconForQA
* Remove old check for token type IDs
* Fix the alibi path when multi_query is False
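For reference, the standard ALiBi slope construction from the ALiBi paper (Press et al.); the Falcon code may differ in details:
```python
import math
import torch

def build_alibi_slopes(num_heads: int) -> torch.Tensor:
    # Head-specific slopes form a geometric sequence; heads beyond the
    # nearest power of two interleave a second sequence (per the paper).
    closest_pow2 = 2 ** math.floor(math.log2(num_heads))
    base = 2.0 ** (-(2.0 ** -(math.log2(closest_pow2) - 3)))
    slopes = [base ** (i + 1) for i in range(closest_pow2)]
    if closest_pow2 != num_heads:
        extra_base = 2.0 ** (-(2.0 ** -(math.log2(2 * closest_pow2) - 3)))
        num_extra = num_heads - closest_pow2
        slopes += [extra_base ** (i + 1) for i in range(0, 2 * num_extra, 2)]
    return torch.tensor(slopes)
```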
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/falcon/test_modeling_falcon.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update config naming
* Fix typo for new_decoder_architecture
* Add some comments
* Fix docstring
* Fix docstring
* Create range in the right dtype from the start
* Review comment cleanup
* n_head_kv -> num_kv_heads
* self.alibi -> self.use_alibi
* self.num_kv -> self.num_kv_heads
* Reorder config args
* Made alibi arguments Optional
* Add all model docstrings
* Add extra checkpoints
* Add author info for Falcon
* Stop removing token_type_ids because our checkpoints shouldn't return it anymore
* Add one hopeful comment for the future
* Fix typo
* Update tests, fix cache issue for generation
* Use -1e9 instead of -inf to avoid float overflow
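A minimal sketch of the failure mode (illustrative, not the Falcon code): with -inf, a fully masked row softmaxes to NaN, while a large finite negative value degrades gracefully.
```python
import torch

scores = torch.randn(2, 4, 4)                        # [batch, q_len, kv_len]
fully_masked = torch.ones(2, 4, 4, dtype=torch.bool)

print(torch.softmax(scores.masked_fill(fully_masked, float("-inf")), dim=-1))  # all NaN
print(torch.softmax(scores.masked_fill(fully_masked, -1e9), dim=-1))           # uniform rows
```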
* Recompute the rotary embeddings much less often
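A generic sketch of the caching pattern (not the exact Falcon module): recompute cos/sin only when the sequence length grows past what is cached.
```python
import torch

class RotaryEmbedding(torch.nn.Module):
    def __init__(self, head_dim: int, base: float = 10000.0):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
        self.register_buffer("inv_freq", inv_freq)
        self.seq_len_cached = -1

    def forward(self, seq_len: int, device, dtype):
        if seq_len > self.seq_len_cached:  # only recompute on growth
            self.seq_len_cached = seq_len
            t = torch.arange(seq_len, device=device, dtype=self.inv_freq.dtype)
            freqs = torch.outer(t, self.inv_freq.to(device))
            emb = torch.cat([freqs, freqs], dim=-1)
            self.cos_cached = emb.cos().to(dtype)
            self.sin_cached = emb.sin().to(dtype)
        return self.cos_cached[:seq_len], self.sin_cached[:seq_len]
```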
* Re-enable disabled tests
* One final fix to attention mask calculation, and update tests
* Cleanup targeting falcon-40b equivalency
* Post-rebase docs update
* Update docstrings, especially in the config
* More descriptive variable names, and comments where we can't rename them
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* feat: Add `_build_conversation_input_ids` to GPT-SW3 tokenizer, adjust line length
* feat: Merge in PR https://github.com/huggingface/transformers/pull/24504.
This allows the GPT-SW3 models (and other GPT-2 based models) to be 4-bit quantised
using `load_in_4bit` with `bitsandbytes`.
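A sketch of the resulting usage (requires `bitsandbytes` and a CUDA device; the checkpoint name is illustrative):
```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "AI-Sweden-Models/gpt-sw3-126m", load_in_4bit=True, device_map="auto"
)
```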
* fix: F-string
* fix: F-string
* fix: Remove EOS token from all responses
* fix: Remove redundant newlines
* feat: Add `load_in_4bit` to `Pipeline`
* fix: Separate turns with `\n<s>\n` rather than `<s>`
* fix: Add missing newline in prompt
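A hypothetical sketch of the turn separation described above (labels and helper names are illustrative assumptions, not the actual tokenizer code):
```python
def build_chat_prompt(turns, bos="<s>"):
    # turns: list of (speaker, text) pairs, speaker in {"User", "Bot"}
    lines = [f"{speaker}:\n{text.strip()}" for speaker, text in turns]
    # Turns are separated with "\n<s>\n" rather than a bare "<s>",
    # and the prompt ends with the model's turn label.
    return f"\n{bos}\n".join(lines) + f"\n{bos}\nBot:"

print(build_chat_prompt([("User", "Hej!"), ("Bot", "Hej hej!"), ("User", "Vad heter du?")]))
```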
* tests: Add unit tests for the new `_build_conversation_input_ids` method
* style: Automatic style correction
* tests: Compare encodings rather than decodings
* fix: Remove `load_in_4bit` from pipeline arguments
* docs: Add description and references of the GPT-SW3 chat format
* style: Line breaks
* Apply suggestions from code review
Fix Conversation type hints
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix: Import TYPE_CHECKING
* style: Run automatic fixes
* tests: Remove `_build_conversation_input_ids` unit tests
* tests: Remove import of `Conversation` in GPT-SW3 unit test
* style: Revert formatting
* style: Move TYPE_CHECKING line after all imports
* style: Imports order
* fix: Change prompt to ensure that `sp_model.encode` and `encode` yield the same result
* docs: Add TODO comment related to the addition of whitespace during decoding
* style: Automatic style checks
* fix: Remove final whitespace in prompt, as prefix whitespace is used by sentencepiece
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add attention dropout, post attention dropout, post mlp dropout to gpt-neox
* fix typo
* add documentation
* fix too long line
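A hedged sketch of configuring the new probabilities (assuming the config fields are named `attention_dropout` and `hidden_dropout`):
```python
from transformers import GPTNeoXConfig, GPTNeoXForCausalLM

config = GPTNeoXConfig(
    num_hidden_layers=2, hidden_size=64, num_attention_heads=4,  # tiny, for illustration
    intermediate_size=256,
    attention_dropout=0.1,  # dropout on the attention probabilities
    hidden_dropout=0.1,     # dropout after attention and after the MLP
)
model = GPTNeoXForCausalLM(config)
```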
* Ran the repo style and consistency scripts on src/transformers/models/gpt_neox/configuration_gpt_neox.py and src/transformers/models/gpt_neox/modeling_gpt_neox.py: python utils/custom_init_isort.py, python utils/sort_auto_mappings.py, doc-builder style src/transformers docs/source --max_len 119 --path_to_docs docs/source, python utils/check_doc_toc.py --fix_and_overwrite, deps_table_update (updating src/transformers/dependency_versions_table.py), plus the repo checks (check_copies.py, check_table.py, check_dummies.py, check_repo.py, check_inits.py, check_config_docstrings.py, check_config_attributes.py, check_doctest_list.py, update_metadata.py --check-only, check_task_guides.py)
* Check for precompiled_charsmap before adding it to the normalizers list.
* Check for precompiled_charsmap in all SentencePiece tokenizer models
* Check for precompiled_charsmap in SPM tokenizer models - correct formatting
* Limit Pydantic to V1 in dependencies
Pydantic is about to release V2, which will break a lot of things. This change prevents `transformers` from being used with Pydantic V2, to avoid breakage.
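A sketch of the kind of pin involved (exact location in setup.py assumed):
```python
# In setup.py's dependency list: stay on Pydantic V1 until V2 support lands
_deps = [
    "pydantic<2",
]
```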
* more
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* hidden layers, huh, what are they good for (absolutely nothing)
* Some tests break with 1 hidden layer, use 2
* Use 1 hidden layer in a few slow models
* Use num_hidden_layers=2 everywhere
* Slightly higher tol for groupvit
* Slightly higher tol for groupvit
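A sketch of the convention (config values are illustrative): two layers keep test models tiny while still exercising layer-to-layer wiring that a single layer cannot.
```python
from transformers import BertConfig, BertModel

tiny = BertConfig(
    vocab_size=99, hidden_size=32, num_attention_heads=4,
    intermediate_size=37, num_hidden_layers=2,  # 2, not 1: covers inter-layer paths
)
model = BertModel(tiny)
```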
* Adding warning messages to BERT for missing attention masks
These warning messages are shown when there are pad tokens within the input_ids and
no attention mask is given. The warning message should only show up once.
* Adding warning messages to BERT for missing attention masks
These warning messages are shown when the pad_token_id is not None
and no attention masks are given. The warning message should only
show up once.
* Ran fix copies to copy over the changes to some of the other models
* Add logger.warning_once.cache_clear() to the test
* Show a warning when there is no attention mask and input_ids start/end with pad tokens
* Using warning_once() instead and fix indexing in input_ids check
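A hedged sketch of the check (helper name and message wording assumed):
```python
import torch
from transformers.utils import logging

logger = logging.get_logger(__name__)

def warn_if_padding_and_no_attention_mask(input_ids, attention_mask, pad_token_id):
    if attention_mask is not None or pad_token_id is None:
        return
    # Only the first/last positions are inspected, so the check stays cheap
    if pad_token_id in input_ids[:, [0, -1]]:
        # warning_once() ensures the message is emitted a single time
        logger.warning_once(
            "We strongly recommend passing an `attention_mask` since your "
            "input_ids may contain padding tokens."
        )
```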
---------
Co-authored-by: JB Lau <hckyn@voyager2.local>
* don't add space before single letter chars that don't have a merge
* fix the fix
* fixup
* add a test
* more testing
* fixup
* hack to make sure fast is also fixed
* update switch transformers test
* revert convert slow
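A hedged sketch of the regression being tested (checkpoint name illustrative):
```python
from transformers import T5Tokenizer  # slow tokenizer; requires sentencepiece

tok = T5Tokenizer.from_pretrained("t5-small")
ids = tok.encode("Hey <extra_id_0>I", add_special_tokens=False)
# The pieces around the special token should match how the plain text
# tokenizes on its own, with no extra space injected before "I".
print(tok.convert_ids_to_tokens(ids))
```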
* Update src/transformers/models/t5/tokenization_t5.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add typechecking
* quality
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>