Commit Graph

138 Commits

Author SHA1 Message Date
Yih-Dar
8b240a0661
Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)
* Add cross attentions to TFGPT2Model

* Add TFEncoderDecoderModel

* Add TFBaseModelOutputWithPoolingAndCrossAttentions

* Add cross attentions to TFBertModel

* Fix past or past_key_values argument issue

* Fix generation

* Fix save and load

* Add some checks and comments

* Clean the code that deals with past keys/values

* Add kwargs to processing_inputs

* Add serving_output to TFEncoderDecoderModel

* Some cleaning + fix use_cache value issue

* Fix tests + add bert2bert/bert2gpt2 tests

* Fix more tests

* Ignore crossattention.bias when loading GPT2 weights into TFGPT2

* Fix return_dict_in_generate in tf generation

* Fix is_token_logit_eos_token bug in tf generation

* Finalize the tests after fixing some bugs

* Fix another is_token_logit_eos_token bug in tf generation

* Add/Update docs

* Add TFBertEncoderDecoderModelTest

* Clean test script

* Add TFEncoderDecoderModel to the library

* Add cross attentions to TFRobertaModel

* Add TFRobertaEncoderDecoderModelTest

* make style

* Change the way of position_ids computation

* bug fix

* Fix copies in tf_albert

* Remove some copied from and apply some fix-copies

* Remove some copied

* Add cross attentions to some other TF models

* Remove encoder_hidden_states from TFLayoutLMModel.call for now

* Make style

* Fix TFRemBertForCausalLM

* Revert the change to longformer + Remove copies

* Revert the change to albert and convbert + Remove copies

* make quality

* make style

* Add TFRembertEncoderDecoderModelTest

* make quality and fix-copies

* test TFRobertaForCausalLM

* Fixes for failed tests

* Fixes for failed tests

* fix more tests

* Fixes for failed tests

* Fix Auto mapping order

* Fix TFRemBertEncoder return value

* fix tf_rembert

* Check copies are OK

* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined

* Add TFEncoderDecoderModelSaveLoadTests

* fix tf weight loading

* check the change of use_cache

* Revert the change

* Add missing test_for_causal_lm for TFRobertaModelTest

* Try cleaning past

* fix _reorder_cache

* Revert some files to original versions

* Keep as many copies as possible

* Apply suggested changes - Use raise ValueError instead of assert

* Move import to top

* Fix wrong require_torch

* Replace more assert by raise ValueError

* Add test_pt_tf_model_equivalence (the test won't pass for now)

* add test for loading/saving

* finish

* finish

* Remove test_pt_tf_model_equivalence

* Update tf modeling template

* Remove pooling, added in the prev. commit, from MainLayer

* Update tf modeling test template

* Move inputs["use_cache"] = False to modeling_tf_utils.py

* Fix torch.Tensor in the comment

* fix use_cache

* Fix missing use_cache in ElectraConfig

* Add a note to from_pretrained

* Fix style

* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt

* Fix TFMLP (in TFGPT2) activation issue

* Fix None past_key_values value in serving_output

* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub

* Apply review suggestions - style for cross_attns in serving_output

* Apply review suggestions - change assert + docstrings

* break the error message to respect the char limit

* deprecate the argument past

* fix docstring style

* Update the encoder-decoder rst file

* fix Unknown interpreted text role "method"

* fix typo

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-10-13 00:10:34 +02:00
Mishig Davaadorj
026866df92
Image Segmentation pipeline (#13828)
* Implement img seg pipeline

* Update src/transformers/pipelines/image_segmentation.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/image_segmentation.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update output shape with individual masks

* Rm dev change

* Remove loops in test

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2021-10-08 09:59:53 +02:00
Matt
3a8a8013ad
Keras callback to push to hub each epoch, or after N steps (#13773)
* Keras callback to push to hub each epoch, or after N steps

* Reworked the callback to use Repository

* Use an Enum for save_strategy

* Style pass

* Correct type for tokenizer

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding print message to the final upload

* Adding print message to the final upload

* Change how we wait for the last process to finish

* is_done is a property, not a method, derp

* Docstrings and documentation

* Style pass

* Style edit

* Docstring reformat

* Docstring rewrite

* Replacing print with internal logger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-09-29 12:47:35 +01:00
Sylvain Gugger
3081d3868e
Push to hub when saving checkpoints (#13503)
* Push to hub when saving checkpoints

* Add model card

* Revert partial model card

* Small fix for checkpoint

* Add tests

* Add documentation

* Fix tests

* Bump huggingface_hub

* Fix test
2021-09-14 08:02:15 -04:00
Nicolas Patry
c63fcabfe9
[Large PR] Entire rework of pipelines. (#13308)
* Enabling dataset iteration on pipelines.

Enabling dataset iteration on pipelines.

Unifying parameters under `set_parameters` function.

Small fix.

Last fixes after rebase

Remove print.

Fixing text2text `generate_kwargs`

No more `self.max_length`.

Fixing tf only conversational.

Consistency in start/stop index over TF/PT.

Speeding up drastically on TF (nasty bug where max_length would increase
a ton.)

Adding test for support for non fast tokenizers.

Fixign GPU usage on zero-shot.

Fix working on Tf.

Update src/transformers/pipelines/base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Update src/transformers/pipelines/base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Small cleanup.

Remove all asserts + simple format.

* Fixing audio-classification for large PR.

* Overly explicity null checking.

* Encapsulating GPU/CPU pytorch manipulation directly within `base.py`.

* Removed internal state for parameters of the  pipeline.

Instead of overriding implicitly internal state, we moved
to real named arguments on every `preprocess`, `_forward`,
`postprocess` function.

Instead `_sanitize_parameters` will be used to split all kwargs
of both __init__ and __call__ into the 3 kinds of named parameters.

* Move import warnings.

* Small fixes.

* Quality.

* Another small fix, using the CI to debug faster.

* Last fixes.

* Last fix.

* Small cleanup of tensor moving.

* is not None.

* Adding a bunch of docs + a iteration test.

* Fixing doc style.

* KeyDataset = None guard.

* RRemoving the Cuda test for pipelines (was testing).

* Even more simple iteration test.

* Correct import .

* Long day.

* Fixes in docs.

* [WIP] migrating object detection.

* Fixed the target_size bug.

* Fixup.

* Bad variable name.

* Fixing `ensure_on_device` respects original ModelOutput.
2021-09-10 14:47:48 +02:00
Aleksander Smywiński-Pohl
c37573806a
Fix typo in deepspeed documentation (#13482)
* Fix typo in deepspeed documentation

* Add missing import in deepspeed configuration
2021-09-08 11:24:10 -07:00
Mohan Zhang
41cd52a768
fixed document (#13414) 2021-09-08 11:48:00 -04:00
Mishig Davaadorj
2a15e8ccfb
Object detection pipeline (#12886)
* Implement object-detection pipeline

* Define threshold const

* Add `threshold` argument

* Refactor

* Uncomment test inputs

* `rm

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better doc

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm unnecessary lines

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Chore better naming

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix typo

* Add `detr-tiny` for tests

* Add `ObjectDetectionPipeline` to `trnsfrmrs/init`

* Implement new bbox format

* Update detr post_process

* Update `load_img` method obj det pipeline

* make style

* Implement new testing format for obj det pipeln

* Add guard pytorch specific code in pipeline

* Add doc

* Make pipeline_obj_tet tests deterministic

* Revert some changes to `post_process` COCO api

* Chore

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/pipelines/object_detection.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Rm timm requirement

* make fixup

* Add timm requirement to test

* Make fixup

* Guard torch.Tensor

* Chore

* Delete unnecessary comment

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2021-09-08 17:17:32 +02:00
Nils Reimers
c8be8a9adb
Update model configs - Allow setters for common properties (#13026)
* refactor GPT Config to allow dyn. properties

* make attribute_map a class attribute

* remove old code

* update unit test to test config: Add test for common properties setter

* update unit test to test config: Add test for common properties passed as parameters to __init__

* update to black code format

* Allow that setters are not defined for certain config classes

* update config classes to implement attribute_map

* bugfix lxmert config - id2labels was not defined when num_labels was set

* update broken configs - add attribute_maps

* update bart config

* update black codestyle

* update documentation on common config attributes

* update GPTJ config to new attribute map

* update docs on common attributes

* gptj config: add max_position_embeddings

* gptj config: format with black

* update speech to text 2 config

* format doc file to max_len 119

* update config template
2021-09-06 16:30:13 +02:00
Anton Lozhkov
b9c6a97694
Add the AudioClassificationPipeline (#13342)
* Add the audio classification pipeline

* Remove autoconfig exception

* Mark ffmpeg test as slow

* Rearrange pipeline tests

* Add small test

* Replace asserts with ValueError
2021-09-01 11:03:48 +03:00
Matt
854260ca44
TF/Numpy variants for all DataCollator classes (#13105)
* Adding a TF variant of the DataCollatorForTokenClassification to get feedback

* Added a Numpy variant and a post_init check to fail early if a missing import is found

* Fixed call to Numpy variant

* Added a couple more of the collators

* Update src/transformers/data/data_collator.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fixes, style pass, finished DataCollatorForSeqToSeq

* Added all the LanguageModeling DataCollators, except SOP and PermutationLanguageModeling

* Adding DataCollatorForPermutationLanguageModeling

* Style pass

* Add missing `__call__` for PLM

* Remove `post_init` checks for frameworks because the imports inside them were making us fail code quality checks

* Remove unused imports

* First attempt at some TF tests

* A second attempt to make any of those tests actually work

* TF tests, round three

* TF tests, round four

* TF tests, round five

* TF tests, all enabled!

* Style pass

* Merging tests into `test_data_collator.py`

* Merging tests into `test_data_collator.py`

* Fixing up test imports

* Fixing up test imports

* Trying shuffling the conditionals around

* Commenting out non-functional old tests

* Completed all tests for all three frameworks

* Style pass

* Fixed test typo

* Style pass

* Move standard `__call__` method to mixin

* Rearranged imports for `test_data_collator`

* Fix data collator typo "torch" -> "pt"

* Fixed the most embarrassingly obvious bug

* Update src/transformers/data/data_collator.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Renaming mixin

* Updating docs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Dalton Walker <dalton_walker@icloud.com>
Co-authored-by: Andrew Romans <andrew.romans@hotmail.com>
2021-08-31 13:06:48 +01:00
Serhiy-Shekhovtsov
11fbc32e3e
Fixing a typo in the data_collator documentation (#13309) 2021-08-31 06:01:12 -04:00
arfy slowy
01977466f4
fix: typo spelling grammar (#13212)
* fix: typo spelling grammar

* fix: make fixup
2021-08-30 08:09:14 -04:00
Patrick von Platen
fbf468b057
[Flax] Correct flax docs (#12782)
* fix_torch_device_generate_test

* remove @

* fix flax docs

* correct more docs in flax

* another correction

* fix flax docs

* Apply suggestions from code review
2021-08-04 16:31:23 +02:00
Stas Bekman
807b6bd160
[Deepspeed] warmup_ratio docs (#12830)
* [Deepspeed] warmup_ratio docs

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-21 10:49:29 -07:00
Sylvain Gugger
da72ac6e26
Fix push_to_hub docstring and make it appear in doc (#12770) 2021-07-17 15:52:33 +02:00
Stas Bekman
5dd0c956a8
non-native optimizers are mostly ok with zero-offload (#12690) 2021-07-13 20:18:51 -07:00
Stas Bekman
78f5fe1416
[Deepspeed] adapt multiple models, add zero_to_fp32 tests (#12477)
* zero_to_fp32 tests

* args change

* remove unnecessary work

* use transformers.trainer_utils.get_last_checkpoint

* document the new features

* cleanup

* wip

* fix fsmt

* add bert

* cleanup

* add xlm-roberta

* electra works

* cleanup

* sync

* split off the model zoo tests

* cleanup

* cleanup

* cleanup

* cleanup

* reformat

* cleanup

* casing

* deepspeed>=0.4.3

* adjust distilbert

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-07-13 12:07:32 -07:00
Stas Bekman
7682e97702
[models] respect dtype of the model when instantiating it (#12316)
* [models] respect dtype of the model when instantiating it

* cleanup

* cleanup

* rework to handle non-float dtype

* fix

* switch to fp32 tiny model

* improve

* use dtype.is_floating_point

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix the doc

* recode to use explicit torch_dtype_auto_detect, torch_dtype args

* docs and tweaks

* docs and tweaks

* docs and tweaks

* merge 2 args, add docs

* fix

* fix

* better doc

* better doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-28 20:11:21 -07:00
Stas Bekman
4a872caef4
remove extra white space from log format (#12360) 2021-06-25 13:20:14 -07:00
Stas Bekman
07ae6103c3
[Deepspeed] new docs (#12077)
* document sub_group_size

* style

* install + issues reporting

* style

* style

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* indent 4

* restore

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-23 11:07:37 -07:00
Stas Bekman
ebe5413589
[trainer] 2 bug fixes and a rename (#12309)
* bug fixes and a rename

* add extended DDP test
2021-06-22 11:13:23 -07:00
Stas Bekman
dad414d5f9
[trainer + examples] set log level from CLI (#12276)
* set log level from CLI

* add log_level_replica + test + extended docs

* cleanup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* rename datasets objects to allow datasets module

* improve the doc

* style

* doc improve

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-21 19:30:50 -07:00
Stas Bekman
040283170c
consistent nn. and nn.functional: part 5 docs (#12161) 2021-06-14 13:34:32 -07:00
Stas Bekman
0e82f0cbc2
typo 2021-06-08 12:55:17 -07:00
Stas Bekman
32290d87f6
[Deepspeed] various fixes (#12058)
* replace deprecated config

* sub_group_size was too big

* complete deprecation removal
2021-06-08 08:36:15 -07:00
Stas Bekman
2c73b93099
[Deepspeed] Assert on mismatches between ds and hf args (#12021)
* wip

* add mismatch validation + test

* renames

* Update docs/source/main_classes/deepspeed.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* renames

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-04 08:58:23 -07:00
Stas Bekman
640318befa
[deepspeed] Move code and doc into standalone files (#11984)
* move code and docs

* style

* moved

* restore
2021-06-02 09:56:00 -07:00
Stas Bekman
7ec596ecda
[DeepSpeed] decouple DeepSpeedConfigHF from Trainer (#11966)
* decouple DeepSpeedConfigHF from Trainer

* add LoggingLevel ctx manager; add new test

* cleanup

* add docs

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* implemented suggested renames

* formatter workaround

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-01 13:24:52 -07:00
Stas Bekman
79712e7e7a
[deepspeed] docs (#11940)
* deepspeed docs

* cleanup

* cleanup
2021-06-01 09:21:21 -07:00
Patrick von Platen
996a315e76
Flax Generate (#11777)
* fix_torch_device_generate_test

* remove @

* add

* indexing

* correct a couple of tests

* fix tests

* add logits processor

* finish top_k, top_p, temp

* add docs

* correct flax prng key default

* improve generate

* add generation docs

* add docs

* make style

* revert model outputs change

* make style

* correct typo

* fix tests

* fix slow test

* add raise

* finish generation

Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-05-27 00:18:17 +01:00
Sylvain Gugger
cbbf49f644 Fix doc deployment 2021-05-13 10:34:14 -04:00
Lysandre Debut
39084ca663
Add the ImageClassificationPipeline (#11598)
* Add the ImageClassificationPipeline

* Code review

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>

* Have `load_image` at the module level

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-05-07 08:08:40 -04:00
Stas Bekman
c065025c47
[trainer] document resume randomness (#11588)
* document resume randomness

* fix link

* reword

* fix

* reword

* style
2021-05-04 14:17:11 -07:00
Stas Bekman
4e7bf94e72
[DeepSpeed] fp32 support (#11499)
* prep for deepspeed==0.3.16

* new version

* too soon

* support and test fp32 mode

* troubleshooting doc start

* workaround no longer needed

* add fp32 doc

* style

* cleanup, add tf32 note

* clarify

* release was made
2021-04-30 12:51:48 -07:00
Nicolas Patry
db9dd09cf9
Adding AutomaticSpeechRecognitionPipeline. (#11337)
* Adding `AutomaticSpeechRecognitionPipeline`.

- Because we added everything to enable this pipeline, we probably
should add it to `transformers`.
- This PR tries to limit the scope and focuses only on the pipeline part
(what should go in, and out).
- The tests are very specific for S2T and Wav2vec2 to make sure both
architectures are supported by the pipeline. We don't use the mixin for
tests right now, because that requires more work in the `pipeline`
function (will be done in a follow up PR).
- Unsure about the "helper" function `ffmpeg_read`. It makes a lot of
  sense from a user perspective, it does not add any additional
dependencies (as in hard dependency, because users can always use their
own load mechanism). Meanwhile, it feels slightly clunky to have so much
optional preprocessing.
- The pipeline is not done to support streaming audio right now.

Future work:

- Add `automatic-speech-recognition` as a `task`. And add the
FeatureExtractor.from_pretrained within `pipeline` function.
- Add small models within tests
- Add the Mixin to tests.
- Make the logic between ForCTC vs ForConditionalGeneration better.

* Update tests/test_pipelines_automatic_speech_recognition.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Adding docs + main import + type checking + LICENSE.

* Doc style !.

* Fixing TYPE_HINT.

* Specifying waveform shape in the docs.

* Adding asserts + specify in the documentation the shape of the input
np.ndarray.

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adding require to tests + move the `feature_extractor` doc.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-30 11:54:08 +02:00
Hamel Husain
88ac60f7b5
update QuickTour docs to reflect model output object (#11462)
* update docs to reflect model output object

* run make style`
2021-04-26 22:18:37 -04:00
Stas Bekman
bc2571e61c
[Deepspeed] ZeRO-Infinity integration plus config revamp (#11418)
* adding Z-inf

* revamp config process

* up version requirement

* wip

* massive rewrite

* cleanup

* cleanup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* consistent json commas

* act on suggestions

* leave this feature for 0.3.16

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-26 10:40:32 -07:00
Sylvain Gugger
bf2e0cf70b
Trainer push to hub (#11328)
* Initial support for upload to hub

* push -> upload

* Fixes + examples

* Fix torchhub test

* Torchhub test I hate you

* push_model_to_hub -> push_to_hub

* Apply mixin to other pretrained models

* Remove ABC inheritance

* Add tests

* Typo

* Run tests

* Install git-lfs

* Change approach

* Add push_to_hub to all

* Staging test suite

* Typo

* Maybe like this?

* More deps

* Cache

* Adapt name

* Quality

* MOAR tests

* Put it in testing_utils

* Docs + torchhub last hope

* Styling

* Wrong method

* Typos

* Update src/transformers/file_utils.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address review comments

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-23 09:17:37 -04:00
Sylvain Gugger
dabeb15292
Examples reorg (#11350)
* Base move

* Examples reorganization

* Update references

* Put back test data

* Move conftest

* More fixes

* Move test data to test fixtures

* Update path

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Address review comments and clean

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-21 11:11:20 -04:00
Sylvain Gugger
f38cd4373f
Indent code block in the documentation (#11233)
* Indent code block

* Indent code blocks version 2

* Quality
2021-04-13 15:36:36 -04:00
Sylvain Gugger
3312e96bfb
Doc check: a bit of clean up (#11224) 2021-04-13 12:14:25 -04:00
fghuman
0c6fcd3034
Added documentation for data collator. (#10941)
* Added documentation for data collator.

* Update docs/source/data_collator.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Added documentation for data collator.

* Added documentation for the data collator.

* Merge branch 'doc_DataCollator' of C:\Users\mahii\PycharmProjects\transformers with conflicts.

* Update documentation for the data collator.

* Update documentation for the data collator.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Amna <A.A.Ahmad@student.tudelft.nl>
2021-04-12 11:59:46 -04:00
Stas Bekman
0311ba2153
typo (#11152)
* typo

* style
2021-04-08 19:47:31 -07:00
Stas Bekman
c2e0fd5283
[setup] make fairscale and deepspeed setup extras (#11151)
* make fairscale and deepspeed setup extras

* fix default

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* no reason not to ask for the good version

* update the CIs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-08 15:46:54 -07:00
Stas Bekman
66446909b2
[tests] relocate core integration tests (#11146)
* relocate core integration tests

* add sys.path context manager

* cleanup

* try

* try2

* fix path

* doc

* style

* add dep

* add 2 more deps
2021-04-08 13:13:17 -07:00
Stas Bekman
c6d664849b
[DeepSpeed] ZeRO Stage 3 (#10753)
* synced gpus

* fix

* fix

* need to use t5-small for quality tests

* notes

* complete merge

* fix a disappearing std stream problem

* start zero3 tests

* wip

* tune params

* sorting out the pre-trained model loading

* reworking generate loop wip

* wip

* style

* fix tests

* split the tests

* refactor tests

* wip

* parameterized

* fix

* workout the resume from non-ds checkpoint pass + test

* cleanup

* remove no longer needed code

* split getter/setter functions

* complete the docs

* suggestions

* gpus and their compute capabilities link

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* style

* remove invalid paramgd

* automatically configure zero3 params that rely on hidden size

* make _get_resized_embeddings zero3-aware

* add test exercising resize_token_embeddings()

* add docstring

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-04-08 09:53:01 -07:00
Amala Deshmukh
e1c02e018c
Add example for registering callbacks with trainers (#10928)
* Add example for callback registry

Resolves: #9036

* Update callback registry documentation

* Added comments for other ways to register callback
2021-04-05 12:27:23 -04:00
Lysandre Debut
9f4e0c23d6
Documentation about loading a fast tokenizer within Transformers (#11029)
* Documentation about loading a fast tokenizer within Transformers

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-05 10:51:16 -04:00
Sylvain Gugger
b0595d33c1
Add ImageFeatureExtractionMixin (#10905)
* Add ImageFeatureExtractionMixin

* Add dummy vision objects

* Add require_vision

* Add tests

* Fix test
2021-03-26 11:23:56 -04:00