Commit Graph

9530 Commits

Author SHA1 Message Date
Heerak Son
db3edd050b
Update run_translation_no_trainer.py (#16652)
Use `args.config_name` instead of `args.model_name_or_path` when loading the model config.
2022-04-12 08:55:12 -04:00
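A minimal sketch of the corrected branch, assuming the usual config-loading pattern of the `no_trainer` example scripts (the helper name and the fallback behaviour below are illustrative, not the script's exact code):

```python
from transformers import AutoConfig, AutoModelForSeq2SeqLM

def load_config_and_model(args):
    # Use --config_name when it is provided; only fall back to
    # --model_name_or_path when no explicit config name was passed.
    if args.config_name:
        config = AutoConfig.from_pretrained(args.config_name)
    elif args.model_name_or_path:
        config = AutoConfig.from_pretrained(args.model_name_or_path)
    else:
        raise ValueError("You need to pass either --config_name or --model_name_or_path.")
    model = AutoModelForSeq2SeqLM.from_pretrained(args.model_name_or_path, config=config)
    return config, model
```
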
smelm
b9f12bedd3
Only call get_output_embeddings when tie_word_embeddings is set (#16667)
This avoids an unnecessary call and avoids problems during
initialization of class hierarchies.

Co-authored-by: Samuel Melm <samuel.melm@stud.uni-heidelberg.de>
2022-04-12 07:55:44 -04:00
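As a rough illustration of the change (the helper name below is hypothetical; the real logic lives in the model's weight-tying code):

```python
def maybe_tie_word_embeddings(model):
    # Only look up the output embeddings when weight tying is actually requested;
    # this skips an unnecessary get_output_embeddings() call and avoids problems
    # while subclasses in a model hierarchy are still being initialized.
    if getattr(model.config, "tie_word_embeddings", True):
        output_embeddings = model.get_output_embeddings()
        if output_embeddings is not None:
            model._tie_or_clone_weights(output_embeddings, model.get_input_embeddings())
```
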
Michael Chung
924484ee4a
Add Doc Test GPT-2 (#16439)
* First Pass All Tests Pass

* WIP

* Adding file to documentation tests

* Change the base model for the example in the doc test.

* Fix Code Styling by running
make fixup

* Called Style

* Reverted to gpt2 model rather than distill gpt2
Then used a token classification model over a sequence model for an example.

* Fix Styling Issue

* Hopefully ignores the formatting issue.

Co-authored-by: ArEnSc <xx.mike.chung.xx@gmail.com>
2022-04-12 12:11:03 +02:00
Patrick von Platen
70851a6bf0
[Bart] correct doc test (#16722) 2022-04-12 10:19:49 +02:00
Zachary Mueller
69233cf03b
Fix example logs repeating themselves (#16669)
Move the declaration of the log streams before the tests, so that results don't get compounded on top of each other
2022-04-11 16:25:16 -04:00
Yih-Dar
dce33f2150
Improve PT/TF equivalence test (#16557)
* add error message

* Use names in the error message

* allow ModelOutput

* rename to check_pt_tf_outputs and move outside

* fix style

* skip past_key_values in a better way

* Add comments

* improve code for label/loss

* make the logic clear by moving the ignore keys out

* fix _postprocessing_to_ignore

* fix _postprocessing_to_ignore: create new outputs from the remaining fields

* ignore past_key_values in TFGPT2 models for now

* make check_pt_tf_outputs better regarding names

* move check_pt_tf_models outside

* rename methods

* remove test_pt_tf_model_equivalence in TFCLIPModelTest

* Reduce TFViTMAEModelTest.test_pt_tf_model_equivalence

* move prepare_pt_inputs_from_tf_inputs outside check_pt_tf_models

* Fix quality

* Clean-up TFLxmertModelTester.test_pt_tf_model_equivalence

* Fix quality

* fix

* fix style

* Clean-up TFLEDModelTest.test_pt_tf_model_equivalence

* Fix quality

* add docstring

* improve comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 22:19:12 +02:00
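A much-simplified stand-in for the name-aware recursive comparison described in this entry (not the test suite's actual `check_pt_tf_outputs`, just the general shape of it):

```python
import numpy as np

def check_pt_tf_outputs(pt_out, tf_out, name="outputs", tol=1e-5):
    # Walk tuples/lists in parallel and compare leaf tensors, carrying a
    # human-readable name so a failure points at the exact offending field.
    if isinstance(pt_out, (tuple, list)):
        assert len(pt_out) == len(tf_out), f"{name}: length mismatch"
        for i, (pt_item, tf_item) in enumerate(zip(pt_out, tf_out)):
            check_pt_tf_outputs(pt_item, tf_item, name=f"{name}[{i}]", tol=tol)
    else:
        pt_np = pt_out.detach().cpu().numpy()
        tf_np = tf_out.numpy()
        max_diff = float(np.max(np.abs(pt_np - tf_np)))
        assert max_diff <= tol, f"{name}: max difference {max_diff:.2e} exceeds {tol}"
```
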
Yih-Dar
7f7300856d
Handle image_embeds in ViltModel (#16696)
* update

* batch_size -> text_batch_size

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 22:16:20 +02:00
Nicholas Broad
161c0a2eec
Private repo TrainingArgument (#16707)
* private repo argument to trainer

* format

Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
2022-04-11 13:37:16 -04:00
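A hedged usage example, assuming the new argument is exposed as `hub_private_repo` on `TrainingArguments` (check the diff for the exact name):

```python
from transformers import TrainingArguments

# With push_to_hub enabled, the repo created by the Trainer is made private.
training_args = TrainingArguments(
    output_dir="my-finetuned-model",
    push_to_hub=True,
    hub_private_repo=True,  # assumed argument name
)
```
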
Zachary Mueller
d4b3e359aa
Don't push checkpoints to hub in no_trainer scripts (#16703)
Adds checkpoint prefixes to the gitignore if `push_to_hub` is used along with `checkpointing_steps`
2022-04-11 12:42:45 -04:00
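Roughly the idea, as a sketch (the prefix and file handling are illustrative; the real scripts wire this into their `push_to_hub` setup):

```python
import os

def ignore_checkpoints_in_repo(output_dir, checkpoint_prefix="step_"):
    # Keep intermediate checkpoint folders out of the repository pushed to the
    # Hub by listing their prefix in the repo's .gitignore.
    gitignore_path = os.path.join(output_dir, ".gitignore")
    existing = ""
    if os.path.exists(gitignore_path):
        with open(gitignore_path) as f:
            existing = f.read()
    if f"{checkpoint_prefix}*" not in existing:
        with open(gitignore_path, "a") as f:
            f.write(f"{checkpoint_prefix}*\n")
```
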
Yih-Dar
c04619ecf3
Enable more test_torchscript (#16679)
* update _create_and_check_torchscript

* Enable test_torchscript

* clear_class_registry

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:23:35 +02:00
Yih-Dar
3918d6a9d6
Reduce memory leak in _create_and_check_torchscript (#16691)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:22:28 +02:00
Yih-Dar
2109afae71
Rename the method test_torchscript (#16693)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:21:45 +02:00
Yih-Dar
40618ec29e
Fix TF_MASKED_LM_SAMPLE (#16698)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 18:19:28 +02:00
Suraj Patil
1471857f13
update decoder_vocab_size when resizing embeds (#16700) 2022-04-11 18:02:10 +02:00
Ahmed Elnaggar
5e68675755
Fix t5 shard on TPU Pods (#16527)
* Fix t5 shard on TPU Pods

The current script doesn't work properly on a TPU pod because the global batch is not divided correctly per host.
This pull request fixes the issue by splitting the global batch across hosts before it is sharded on each host.

* fix style

Co-authored-by: ahmed-elnaggar <ahmed.elnaggar@allianz.com>
2022-04-11 16:45:20 +02:00
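A rough sketch of the per-host split being described, using JAX's process info (shapes and the helper name are illustrative, not the script's exact code):

```python
import jax
import numpy as np

def shard_global_batch_per_host(global_batch: np.ndarray) -> np.ndarray:
    # Each host in the TPU pod first takes its own slice of the global batch...
    per_host = global_batch.shape[0] // jax.process_count()
    start = jax.process_index() * per_host
    host_batch = global_batch[start : start + per_host]
    # ...and only then reshapes it for sharding across the host's local devices.
    return host_batch.reshape((jax.local_device_count(), -1) + host_batch.shape[1:])
```
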
Minh Chien Vu
2831826bc6
Add Doc Test for BERT (#16523)
* Add doctest BERT

* make fixup

* fix typo

* change checkpoints

* make fixup

* define doctest output value, update doctest for mobilebert

* solve fix-copies

* update QA target start index and end index

* change checkpoint for docs and reuse defined variable

* Update src/transformers/models/bert/modeling_tf_bert.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* make fixup

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2022-04-11 15:51:28 +02:00
Patrick von Platen
098b002644
[Doctests] Correct task summary (#16644) 2022-04-11 14:59:35 +02:00
Sadra
6ef7186b5d
Fixed crash when deleting an older checkpoint while a file f"{checkpoint_prefix}-*" exists (#16686)
I create an archive of older checkpoints during training, named `f"{checkpoint_prefix}-*.zip/.tar"`.
Previously, `glob(f"{checkpoint_prefix}-*")` picked up every file and folder whose name starts with the checkpoint prefix, while `shutil.rmtree(checkpoint)` expects a folder, so as soon as it hit a zip file it crashed training. Adding `if os.path.isdir(x)` keeps only folders in `glob_checkpoints`.
2022-04-11 07:32:07 -04:00
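A minimal sketch of the fix, keeping only directories when collecting checkpoints to rotate (the prefix, paths and rotation policy here are illustrative):

```python
import os
import shutil
from glob import glob

def delete_older_checkpoints(output_dir, checkpoint_prefix="checkpoint", keep=2):
    # Only keep directories: an archive such as "checkpoint-500.zip" would
    # otherwise be handed to shutil.rmtree and crash training.
    glob_checkpoints = [
        path
        for path in glob(os.path.join(output_dir, f"{checkpoint_prefix}-*"))
        if os.path.isdir(path)
    ]
    # Sort by the step number encoded in the folder name, then drop the oldest.
    glob_checkpoints.sort(key=lambda path: int(path.rsplit("-", 1)[-1]))
    for checkpoint in glob_checkpoints[:-keep]:
        shutil.rmtree(checkpoint)
```
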
Joao Gante
b0bf3011c1
Generate: min length can't be larger than max length (#16668)
* min length must be smaller than max length

* Update min_length in tests
2022-04-11 11:55:30 +01:00
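The constraint as a standalone check — a sketch in the spirit of the validation added to generation (the exact error message in the library may differ):

```python
def check_length_constraints(min_length: int, max_length: int) -> None:
    # min_length is a lower bound on the length of the generated sequence,
    # so it can never exceed max_length without making generation unfeasible.
    if min_length is not None and max_length is not None and min_length > max_length:
        raise ValueError(
            f"Unfeasible length constraints: min_length ({min_length}) is larger than max_length ({max_length})."
        )
```
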
Jia LI
4868a830db
Jia multi gpu eval (#16428)
* add simple multi gpu completion

* add human_eval_multi_gpu

* use a copy strategy to distribute across GPUs, to avoid padding

* add doc string

* update code style

* use task id to arrange output

* truncate input to avoid zero pad

* Stop the copy mechanism

* update style

* restore copies to scale better in distributed mode

* update style

* replace human eval

* Apply suggestions from code review

1. Tokenize all input at the same time
2. use attention_mask to get the input length
3. other small fixes

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* correct typo and update docstring

* update code style

* remove num sample division constraint

* remove max len calculation

* use accelerator.gather once to speed up

* use accelerate set_seed; update accelerate version

* correct gather bug

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-04-11 11:24:32 +02:00
Yih-Dar
8e93dc7eaf
Fix some doc examples in task summary (#16666)
* Fix some doc examples

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-11 11:20:03 +02:00
SaulLu
1025a9b742
add a warning in SpmConverter for sentencepiece's model using the byte fallback feature (#16629)
* update proto sentencepiece model

* Revert "update proto sentencepiece model"

This reverts commit b07f671747.

* add check

* add test

* Revert "Revert "update proto sentencepiece model""

This reverts commit 46108257b8.

* test for log level

* test for log level 2

* warning at the warning level

* clean

* format

* add explanation in docstring
2022-04-11 11:06:10 +02:00
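A hedged sketch of the kind of check added in the converter (attribute names follow sentencepiece's `ModelProto`; the warning text here is illustrative):

```python
import warnings

def warn_if_byte_fallback(proto):
    # sentencepiece models trained with byte fallback decompose unknown pieces
    # into bytes, a behaviour the converted "fast" tokenizer does not reproduce.
    if getattr(proto.trainer_spec, "byte_fallback", False):
        warnings.warn(
            "The sentencepiece model you are converting uses the byte fallback option, "
            "which is not implemented in the fast tokenizers. The converted tokenizer "
            "will not behave identically to the original one on unknown characters."
        )
```
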
Steven Liu
7c5d79912a
Update audio examples with MInDS-14 (#16633)
*  update audio examples with minds dataset

* 🖍 make style

* 🖍 minor fixes for doctests
2022-04-08 15:55:42 -05:00
Stas Bekman
4d46106718
[Trainer] tf32 arg doc (#16674)
* [Trainer] tf32 arg doc

* Update src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-08 12:35:39 -07:00
Laura Hanu
f4d4f0a1ec
only load state dict when the checkpoint is not None (#16673) 2022-04-08 13:42:04 -04:00
Zachary Mueller
d57da99237
Add tests for no_trainer and fix existing examples (#16656)
* Fixed some bugs involving saving during epochs
* Added tests mimicking the existing examples tests
* Added JSON exporting to all `no_trainer` examples for consistency
2022-04-08 10:03:56 -04:00
Yih-Dar
ab229663b5
Fix QA sample (#16648)
* fix QA sample

* For TF_QUESTION_ANSWERING_SAMPLE

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-08 15:31:43 +02:00
Sylvain Gugger
9a24b97b7f Fix style 2022-04-08 08:07:16 -04:00
Alan Lee
5db2fcc61d
Fix error in doc of DataCollatorWithPadding (#16662)
The default value of `padding` in `DataCollatorWithPadding` is `True`, not `False`.
2022-04-08 07:58:02 -04:00
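For reference, a small usage example of the documented default (`padding=True`, i.e. dynamic padding to the longest sequence in each batch):

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorWithPadding(tokenizer=tokenizer)  # padding defaults to True

features = [tokenizer("a short sentence"), tokenizer("a somewhat longer example sentence")]
batch = collator(features)
print(batch["input_ids"].shape)  # both rows are padded to the same length
```
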
Johannes Kolbe
9db2eebbe2
add vit tf doctest with @add_code_sample_docstrings (#16636)
* add vit tf doctest with @add_code_sample_docstrings

* add labels string back in

Co-authored-by: Johannes Kolbe <johannes.kolbe@tech.better.team>
2022-04-08 07:31:38 -04:00
NielsRogge
4ef0abb738
Add TAPEX (#16473)
* Add TapexTokenizer

* Improve docstrings and provide option to provide answer

* Remove option for pretokenized inputs

* Add TAPEX to README

* Fix copies

* Remove option for pretokenized inputs

* Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.

* - Draft a README file for running the script and introducing some background.
- Remove unused code lines in tabfact script.
- Disable the default `pad_to_max_length` option, which is memory-consuming.

* * Support `as_target_tokenizer` function for TapexTokenizer.
* Fix the do_lower_case behaviour of TapexTokenizer.
* Add unit tests for target scenarios and cased/uncased scenarios for both source and target.

* * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
* Fix typos in tapex example README.

* * fix the evaluation script - remove the property `task_name`

* * Make the label space more clear for tabfact tasks

* * Using a new fine-tuning script for tapex-base on tabfact.

* Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether to lower-case
* Guarantee that the hyper-parameters can run without out-of-memory errors on a 16GB card and report the newly reproduced number on WikiSQL

* * Remove the default tokenizer_name option.
* Provide evaluation command.

* * Support for WikiTableQuestion dataset.

* Fix a typo in README.

* Fix the dataset's key name in WikiTableQuestions

* Run make fixup and move test to folder

* Fix quality

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply some more suggestions from code review

* Improve docstrings

* Overwrite failing test

* Improve comment in example scripts

* Fix rebase

* Add TAPEX to Auto mapping

* Add TAPEX to auto config mappings

* Put TAPEX higher than BART in auto mapping

* Add TAPEX to doc tests

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: SivilTaram <qianlxc@outlook.com>
Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-04-08 10:57:51 +02:00
Stefan Schweter
33cb21150c
bert: properly mention deprecation of TF2 conversion script (#16171) 2022-04-07 17:35:17 -04:00
Francesco Saverio Zuppichini
af14c61973
RegNet (#16188)
* base model done

* make style

* done

* added files

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Trigger doc build

* resolved conversations

* resolved conversations

* seer models

* minor changes

* minor changes

* make fixup

* glob variables

* minor changes

* fix copies

* config when possible

* resolved conflicts

* resolved conflicts

* resolved conflicts

* CI

* conversion script for 10b param

* fixed for 10b model

* minor updates in the doc + make style

* removed unused code

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* removed unused code

* removed unused code

* updated modeling_utils from main

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2022-04-07 21:58:00 +02:00
Britney Muller
3e26e78b3b
Update Support image on README.md (#16615)
* Update README.md Support Image

Updates the Support image linking to our EAP page (to give it a refresh + help avoid image fatigue).

Slack thread checking in with #open-source-internal on this update (https://huggingface.slack.com/archives/C021H1P1HKR/p1648838903316709)

* Compressed Updated Support image

* Improves Support Image Logo + Height

Updated the image based on logo + size feedback. Big thanks to Bibi for making quick edits to this image.
2022-04-07 15:06:50 -04:00
Francesco Saverio Zuppichini
4099817bd6
Updated _load_pretrained_model_low_mem to check if keys are in the state_dict (#16643)
* Updated _load_pretrained_model_low_mem to check if keys are in the stored state_dict

* update after conversions
2022-04-07 20:48:04 +02:00
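A simplified sketch of the guard described above (the real helper operates during low-memory loading of pretrained weights; names here are illustrative):

```python
import torch

def load_present_keys_only(model, state_dict):
    # Only copy parameters whose keys are actually present in the stored
    # state_dict; keys missing from the checkpoint keep their current values.
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in state_dict:
                param.copy_(state_dict[name])
```
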
Sylvain Gugger
389f66151d
Remove parent/child tests in auto model tests (#16653) 2022-04-07 11:05:10 -04:00
Stas Bekman
080e42d0ac
[megatron-bert-uncased-345m] fix conversion (#16639) 2022-04-07 07:56:34 -07:00
Laura Vasquez-Rodriguez
09a272b02a
Add inputs vector to calculate metric method (#16461)
* Add inputs vector to calculate metric method

* Include inputs for evaluation metrics with backwards compatibility

* Prevent inputs from creating OOM issues and add documentation details

* Update style and code documentation

* Fix style formatting issues

* Update files format with make style
2022-04-07 10:02:43 -04:00
NielsRogge
dc991805bf
Fix doc example (#16448)
* Fix doc

* Make fixup

Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
2022-04-07 10:48:24 +02:00
Zachary Mueller
febe42b5da
Update no_trainer scripts with new Accelerate functionalities (#16617)
Adds logging and save/loading to the Accelerate scripts

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-06 15:29:32 -04:00
Sylvain Gugger
10c15d2d1e
Allow the same config in the auto mapping (#16631) 2022-04-06 14:21:15 -04:00
Anmol Joshi
8ac9b82724
Added Annotations for PyTorch models (#16619)
* Update modeling_mpnet.py

* Update modeling_ctrl.py

* formatting

* Formatting

* Formatting

* annotated FSMT

* Added annotations for LED

* Added Annotations for M2M

* Added annotations for nystromformer

* Added annotations for OpenAI

* Added annotations for RAG

* Removed unused imports

* fix isort errors

* Removed inputs_embeds docstring, corrected original

* flake8 fixes

* doc-builder fixes
2022-04-06 14:12:01 -04:00
Joao Gante
3f43d824b9
TF generate refactor - Beam Search (#16374)
* refactor TF beam search

* refactored generate can now properly use attention masks

* add force bos/eos logit processors
2022-04-06 18:19:34 +01:00
Stas Bekman
4d10083539
[modeling_utils] rearrange text (#16632) 2022-04-06 09:35:42 -07:00
Lysandre Debut
a180efe7fd Dev version 2022-04-06 11:08:12 -04:00
Sylvain Gugger
b9bf91a970 Revert "Allow the same config in the auto mapping"
This reverts commit b1a7dfe099.
2022-04-06 09:58:13 -04:00
Sylvain Gugger
b1a7dfe099 Allow the same config in the auto mapping 2022-04-06 09:57:47 -04:00
Yih-Dar
2aef4cfe58
Fix TFTransfoXLLMHeadModel outputs (#16590)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-06 15:42:15 +02:00
Sanchit Gandhi
8d57c424e0
[FlaxSpeechEncoderDecoderModel] More Rigorous PT-Flax Equivalence Tests (#16589) 2022-04-06 15:33:32 +02:00
Patrick von Platen
c65633156b
[Speech2Text Doc] Fix docs (#16611)
* [Speech2Text Doc] Fix docs

* apply ydshiehs suggestions
2022-04-06 14:19:00 +02:00