transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-30 01:32:23 +06:00

Author	SHA1	Message	Date
baeseongsu	c1780ce7a4	fix head_mask for albert encoder part(`AlbertTransformer`) (#11596 ) * fix head mask for albert encoder part * fix head_mask for albert encoder part	2021-05-06 02:18:02 -04:00
Mats Sjöberg	864c1dfe34	Accept tensorflow-rocm package when checking TF availability (#11595 )	2021-05-05 14:44:29 -04:00
Patrick von Platen	3e3e41ae20	Pytorch - Lazy initialization of models (#11471 ) * lazy_init_weights * remove ipdb * save int * add necessary code * remove unnecessary utils * Update src/transformers/models/t5/modeling_t5.py * clean * add tests * correct * finish tests * finish tests * fix some more tests * fix xlnet & transfo-xl * fix more tests * make sure tests are independent * fix tests more * finist tests * final touches * Update src/transformers/modeling_utils.py * Apply suggestions from code review * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * clean tests * give arg positive name * add more mock weights to xlnet Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-05-05 17:22:20 +02:00
Lysandre	8fa8e19429	Skip Funnel test	2021-05-05 12:38:01 +02:00
Deepali	83e59d8e0b	add importlib_metadata and huggingface_hub as dependency in the conda recipe (#11591 ) * add importlib_metadata as dependency (#11490) Co-authored-by: Deepali Chourasia <deepch23@us.ibm.com> * add huggingface_hub dependency Co-authored-by: Deepali Chourasia <deepch23@us.ibm.com>	2021-05-05 03:36:18 -04:00
Stas Bekman	bf0dfa98d3	copies need to be fixed too (#11585 )	2021-05-05 03:35:15 -04:00
Stas Bekman	c065025c47	[trainer] document resume randomness (#11588 ) * document resume randomness * fix link * reword * fix * reword * style	2021-05-04 14:17:11 -07:00
Sylvain Gugger	6b241e0e3b	Reproducible checkpoint (#11582 ) * Set generator in dataloader * Use generator in all random samplers * Checkpoint all RNG states * Final version * Quality * Test * Address review comments * Quality * Remove debug util * Add python and numpy RNGs * Split states in different files in distributed * Quality * local_rank for TPUs * Only use generator when accepted * Add test * Set seed to avoid flakiness * Make test less flaky * Quality	2021-05-04 16:20:56 -04:00
Patrick Fernandes	0afe4a90f9	[Flax] Add Electra models (#11426 ) * add electra model to flax * Remove Electra Next Sentence Prediction model added by mistake * fix parameter sharing and loosen equality threshold * fix styling issues * add mistaken removen imports * fix electra table * Add FlaxElectra to automodels and fixe docs * fix issues pointed out the PR * fix flax electra to comply with latest changes * remove stale class * add copied from Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-04 20:56:09 +02:00
Philipp Schmid	226e74b610	Removes SageMakerTrainer code but keeps class as wrapper (#11587 ) * removed all old code * make quality	2021-05-04 14:31:18 -04:00
Patrick von Platen	084a187da3	[FlaxRoberta] Add FlaxRobertaModels & adapt run_mlm_flax.py (#11470 ) * add flax roberta * make style * correct initialiazation * modify model to save weights * fix copied from * fix copied from * correct some more code * add more roberta models * Apply suggestions from code review * merge from master * finish * finish docs Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-05-04 19:57:59 +02:00
Sylvain Gugger	2ce0fb84cc	Make quality scripts work when one backend is missing. (#11573 ) * Make quality scripts work when one backend is missing. * Check env variable is properly set * Add default * With print statements * Fix typo * Set env variable * Remove debug code	2021-05-04 09:53:44 -04:00
Lysandre Debut	09b0bcfea9	Enable added tokens (#11325 ) * Fix tests * Reorganize * Update tests/test_modeling_mobilebert.py * Remove unnecessary addition	2021-05-04 08:13:57 -04:00
abhishek thakur	c40c7e213b	Add multi-class, multi-label and regression to transformers (#11012 ) * add to bert * review comments * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * self.config.problem_type * fix style * fix * fin * fix * update doc * fix * test * Test more problem types * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix * remove * fix * quality * make fix-copies * remove test Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2021-05-04 02:23:40 -04:00
Stas Bekman	7c622482e8	fix resize_token_embeddings (#11572 )	2021-05-03 13:12:06 -07:00
Sylvain Gugger	fe82b1bfa0	Update training tutorial (#11533 ) * Update training tutorial * Apply suggestions from code review Co-authored-by: Hamel Husain <hamelsmu@github.com> * Address review comments * Update docs/source/training.rst Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * More review comments * Last review comments Co-authored-by: Hamel Husain <hamelsmu@github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-05-03 13:18:46 -04:00
Sylvain Gugger	f4c9a7e62e	Accumulate opt state dict on do_rank 0 (#11481 )	2021-05-03 13:18:27 -04:00
Nicolas Patry	1e8e06862f	Fixes a useless warning. (#11566 ) Fixes #11525	2021-05-03 18:48:13 +02:00
Sylvain Gugger	87dd1a00ef	Fix metric computation in `run_glue_no_trainer` (#11569 )	2021-05-03 11:42:55 -04:00
Muktan	a721a5eefd	[Wav2vec2] Fixed tokenization mistakes while adding single-char tokens to tokenizer (#11538 ) * Fixed tokenization mistakes while adding single-char tokens to tokenizer * Added tests and Removed unnecessary comments. * finalize wav2vec2 tok * add more aggressive tests * Apply suggestions from code review * fix useless import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-05-03 17:19:12 +02:00
NielsRogge	f3cf8ae7b3	Add LUKE (#11223 ) * Rebase with master * Minor bug fix in docs * Copy files from adding_luke_v2 and improve docs * change the default value of use_entity_aware_attention to True * remove word_hidden_states * fix head models * fix tests * fix the conversion script * add integration tests for the pretrained large model * improve docstring * Improve docs, make style * fix _init_weights for pytorch 1.8 * improve docs * fix tokenizer to construct entity sequence with [MASK] entity when entities=None * Make fix-copies * Make style & quality * Bug fixes * Add LukeTokenizer to init * Address most comments by @patil-suraj and @LysandreJik * rename _compute_extended_attention_mask to get_extended_attention_mask * add comments to LukeSelfAttention * fix the documentation of the tokenizer * address comments by @patil-suraj, @LysandreJik, and @sgugger * improve docs * Make style, quality and fix-copies * Improve docs * fix docs * add "entity_span_classification" task * update example code for LukeForEntitySpanClassification * improve docs * improve docs * improve the code example in luke.rst * rename the classification layer in LukeForEntityClassification from typing to classifier * add bias to the classifier in LukeForEntitySpanClassification * update docs to use fine-tuned hub models in code examples of the head models * update the example sentences * Make style & quality * Add require_torch to tokenizer tests * Add require_torch to tokenizer tests * Address comments by @sgugger and add community notebooks * Make fix-copies Co-authored-by: Ikuya Yamada <ikuya@ikuya.net>	2021-05-03 09:07:29 -04:00
Frederik Bode	6a11e4c2ad	fix the mlm longformer example by changing [MASK] to <mask> (#11559 )	2021-05-03 12:43:30 +01:00
Lysandre Debut	1c86157d9d	Remove `datasets` submodule. (#11563 )	2021-05-03 06:02:33 -04:00
Patrick von Platen	c448c01f25	[Wav2Vec2] Fix convert (#11562 ) * push * small change * correct other typo	2021-05-03 11:53:30 +02:00
Suraj Patil	623281aa12	[Flax BERT/Roberta] few small fixes (#11558 ) * small fixes * style	2021-05-03 10:35:06 +02:00
lewtun	a5d2967bd8	Fix examples in M2M100 docstrings (#11540 ) Replaces `tok` with `tokenizer` so examples can run with copy-paste	2021-05-03 10:56:31 +05:30
jingyihe	980208650a	Fixed docs for the shape of `scores` in `generate()` (#10057 ) * Fixed the doc for the shape of return scores tuples in generation_utils.py. * Fix the output shape of `scores` for `DecoderOnlyOutput`. * style fix	2021-05-02 10:10:47 +02:00
Stas Bekman	4e7bf94e72	[DeepSpeed] fp32 support (#11499 ) * prep for deepspeed==0.3.16 * new version * too soon * support and test fp32 mode * troubleshooting doc start * workaround no longer needed * add fp32 doc * style * cleanup, add tf32 note * clarify * release was made	2021-04-30 12:51:48 -07:00
Stas Bekman	282f3ac3ef	[debug utils] activation/weights underflow/overflow detector (#11274 ) * sync * add activation overflow debug utility * cleanup * document detect_overflow * import torch * add deprecation warning * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * convert to rst, add note * add class * fix docs * improve the doc * rework to dump a lot more info about each frame * complete expansion * cleanup * format * cleanup * doesn't have to be transformers * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * wrap long line * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-30 11:15:46 -07:00
Hamel Husain	804c2974d5	Improve task summary docs (#11513 ) * fix task summary docs * refactor to use model.config.id2label instead of list * fix nit * Update docs/source/task_summary.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-30 09:06:47 -04:00
Sylvain Gugger	bc80f8bc37	Add Stas and Suraj as authors (#11526 )	2021-04-30 09:03:13 -04:00
Bhadresh Savani	84326a28f8	[Examples] Added support for test-file in QA examples with no trainer (#11510 ) * added support for test-file * fixed typo * added suggested changes * reformatted code * modifed files * fix post processing error * Trigger CI * removed extra lines	2021-04-30 09:02:50 -04:00
Lysandre Debut	af0692a2ca	Run model templates on master (#11527 )	2021-04-30 08:47:12 -04:00
Suraj Patil	57c8e822f7	reszie token embeds (#11524 )	2021-04-30 08:47:01 -04:00
Matt	20d6931e32	Update TF text classification example (#11496 ) Big refactor, fixes and multi-GPU/TPU support	2021-04-30 13:45:33 +01:00
bonniehyeon	8b945ef03e	Fix do_eval default value in training_args.py (#11511 ) * Fix do_eval default value in training_args.py * Update PULL_REQUEST_TEMPLATE.md	2021-04-30 08:35:12 -04:00
Takuya Makino	c2cd02ac62	Accepts BatchEncoding in LengthSampler (#11431 )	2021-04-30 08:27:46 -04:00
Shubham Sanghavi	30ede8994e	Implement Fast Tokenization for Deberta (#11387 )	2021-04-30 08:08:15 -04:00
Nicolas Patry	db9dd09cf9	Adding `AutomaticSpeechRecognitionPipeline`. (#11337 ) * Adding `AutomaticSpeechRecognitionPipeline`. - Because we added everything to enable this pipeline, we probably should add it to `transformers`. - This PR tries to limit the scope and focuses only on the pipeline part (what should go in, and out). - The tests are very specific for S2T and Wav2vec2 to make sure both architectures are supported by the pipeline. We don't use the mixin for tests right now, because that requires more work in the `pipeline` function (will be done in a follow up PR). - Unsure about the "helper" function `ffmpeg_read`. It makes a lot of sense from a user perspective, it does not add any additional dependencies (as in hard dependency, because users can always use their own load mechanism). Meanwhile, it feels slightly clunky to have so much optional preprocessing. - The pipeline is not done to support streaming audio right now. Future work: - Add `automatic-speech-recognition` as a `task`. And add the FeatureExtractor.from_pretrained within `pipeline` function. - Add small models within tests - Add the Mixin to tests. - Make the logic between ForCTC vs ForConditionalGeneration better. * Update tests/test_pipelines_automatic_speech_recognition.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Adding docs + main import + type checking + LICENSE. * Doc style !. * Fixing TYPE_HINT. * Specifying waveform shape in the docs. * Adding asserts + specify in the documentation the shape of the input np.ndarray. * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Adding require to tests + move the `feature_extractor` doc. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-04-30 11:54:08 +02:00
CeShine Lee	76116f479b	T5 Gradient Checkpointing (#11353 ) * Implement gradient checkpoinging for T5Stack * A bit more robust type checking * Add `gradient_checkpointing` to T5Config * Formatting * Set requires_grad only when training * None return value will only cause problems when training * Change the output tuple according to `use_cache` * Enable gradient checkpointing for the decoder Squashed commit of the following: commit 658bdd0bd1215353a8770f558bda2ea69a0ad0c7 Author: Ceshine Lee <shuanck@gmail.com> Date: Sat Apr 24 14:08:17 2021 +0800 Only set `require_grad` for gradient checkpointing commit acaeee6b2e675045fb28ce2176444c1d63e908bd Author: Ceshine Lee <shuanck@gmail.com> Date: Sat Apr 24 13:59:35 2021 +0800 Make gradient checkpointing work with the decoder * Formatting	2021-04-30 14:13:55 +05:30
Manuel Romero	58c789e3d2	Update README.md (#11489 ) Add link to code	2021-04-30 04:29:59 -04:00
Patrick von Platen	022a1e9e67	make style (#11520 )	2021-04-30 09:54:58 +02:00
Philip May	e0db8276a6	add sp_model_kwargs to unpickle of xlm roberta tok (#11430 ) add test for pickle simplify test fix test code style add missing pickle import fix test fix test fix test	2021-04-30 03:44:58 -04:00
Frederik Bode	b43e3f93ac	correct the dimension comment of matrix multiplication (#11494 ) Co-authored-by: Frederik Bode <frederik@paperbox.ai>	2021-04-30 09:42:13 +02:00
Lysandre Debut	f37f2adb68	Pin HuggingFace Hub dependency (#11502 )	2021-04-30 02:57:50 -04:00
Lysandre	60d5bda4fd	Patch notification service	2021-04-30 08:56:18 +02:00
Sylvain Gugger	b29eb247d3	Split checkpoint from model_name_or_path in examples (#11492 ) * Split checkpoint from model_name_or_path in examples * Address review comments * Address review comments	2021-04-29 18:33:47 -04:00
Michael Benayoun	d6ec54ba36	solved coefficient issue for the TF version of gelu_fast (#11514 ) Co-authored-by: Michael Benayoun <michael@huggingface.co>	2021-04-29 21:47:26 +02:00
Sylvain Gugger	ad1f7bef13	Reformat to make code clearer in tokenizer call (#11497 ) * Reformat to make code clearer * Reformat to make code clearer	2021-04-29 07:51:09 -04:00
Patrick von Platen	f748bd4242	[Flax] Add docstrings & model outputs (#11498 ) * add attentions & hidden states * add model outputs + docs * finish docs * finish tests * finish impl * del @ * finish * finish * correct test * apply sylvains suggestions * Update src/transformers/models/bert/modeling_flax_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * simplify more Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-04-29 12:04:51 +02:00

1 2 3 4 5 ...

7133 Commits