transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-04 13:20:12 +06:00

Matt b3ab3fac1d Falcon port (#24523)	2023-07-11 13:36:31 +01:00

* Initial commit
* Update src/transformers/models/falcon/configuration_falcon.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/falcon/configuration_falcon.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Cleanup config docstring
* Update src/transformers/models/falcon/configuration_falcon.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Convert to relative imports
* Remove torch < 1.8 warning
* Restructure cos_sin header
* qkv -> query, key, value
* Refactor attention calculation
* Add a couple of config variables to account for the different checkpoints
* Successful merging of the code paths!
* Fix misplaced line in the non-parallel attention path
* Update config and tests
* Add a pad_token_id when testing
* Support output_attentions when alibi is None
* make fixup
* Skip KV cache shape test
* No more _keys_to_ignore_on_load_missing
* Simplify self attention a bit
* Simplify self attention a bit
* make fixup
* stash commit
* Some more attention mask updates
* Should pass all tests except assisted generation!
* Add big model generation test
* make fixup
* Add temporary workaround for test
* Test overrides for assisted generation
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/falcon/test_modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Test overrides for assisted generation
* Add generation demo
* Update copyright
* Make the docstring model actually small
* Add module-level docstring
* Remove all assertions
* Add copied from bloom
* Reformat the QKV layer
* Add copied from bloom
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove unused line and reformat
* No single letter variables
* Cleanup return names
* Add copied from line
* Remove the deprecated arguments blocks
* Change the embeddings test to an alibi on/off test
* Remove position_ids from FalconForQA
* Remove old check for token type IDs
* Fix the alibi path when multi_query is False
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/falcon/test_modeling_falcon.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update config naming
* Fix typo for new_decoder_architecture
* Add some comments
* Fix docstring
* Fix docstring
* Create range in the right dtype from the start
* Review comment cleanup
* n_head_kv -> num_kv_heads
* self.alibi -> self.use_alibi
* self.num_kv -> self.num_kv_heads
* Reorder config args
* Made alibi arguments Optional
* Add all model docstrings
* Add extra checkpoints
* Add author info for Falcon
* Stop removing token_type_ids because our checkpoints shouldn't return it anymore
* Add one hopeful comment for the future
* Fix typo
* Update tests, fix cache issue for generation
* Use -1e9 instead of -inf to avoid float overflow
* Recompute the rotary embeddings much less often
* Re-enable disabled tests
* One final fix to attention mask calculation, and update tests
* Cleanup targeting falcon-40b equivalency
* Post-rebase docs update
* Update docstrings, especially in the config
* More descriptive variable names, and comments where we can't rename them

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

asr.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
audio_classification.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
document_question_answering.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
image_captioning.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
image_classification.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
language_modeling.md	Falcon port (#24523)	2023-07-11 13:36:31 +01:00
masked_language_modeling.md	Add Multi Resolution Analysis (MRA) (New PR) (#24513)	2023-07-10 10:50:43 +01:00
monocular_depth_estimation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
multiple_choice.md	Add Multi Resolution Analysis (MRA) (New PR) (#24513)	2023-07-10 10:50:43 +01:00
object_detection.md	Fix model referenced and results in documentation. Model mentioned was inaccessible (#24609)	2023-07-05 13:25:36 -03:00
question_answering.md	Falcon port (#24523)	2023-07-11 13:36:31 +01:00
semantic_segmentation.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
sequence_classification.md	Falcon port (#24523)	2023-07-11 13:36:31 +01:00
summarization.md	[`Umt5`] Add google's umt5 to `transformers` (#24477)	2023-07-03 07:38:21 +02:00
text-to-speech.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
token_classification.md	Falcon port (#24523)	2023-07-11 13:36:31 +01:00
translation.md	[`Umt5`] Add google's umt5 to `transformers` (#24477)	2023-07-03 07:38:21 +02:00
video_classification.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
zero_shot_image_classification.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
zero_shot_object_detection.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00