Commit Graph

8821 Commits

Author SHA1 Message Date
Nicolas Patry
2e12d90b9e
Fixing Dataset for TQA + token-classification. (#14658)
* Fixing Dataset for TQA + token-classification.

* Fixing the tests.

* Making sure `offset_mappings` is a valid argument.
2021-12-08 09:54:24 +01:00
Stas Bekman
fae0b9faef
[trainer] conditional ctx managers into one wrapper (#14663)
* [trainer] conditional ctx managers into one wrapper

* workaround for contextlib.nullcontext for py<3.7

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* one more autocast

* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-07 13:04:18 -08:00
TranSirius
39f1dff5a0
Fix a Bug, trainer_seq2seq.py, in the else branch at Line 172, generation_inputs should be a dict (#14546)
* fix bug, trainer_seq2seq.py, Line 172, generation_inputs must be a dict before feeding into self.model.generation()

* fix bug, trainer_seq2seq.py, Line 172, generation_inputs must be a dict before feeding into self.model.generation()
2021-12-07 12:09:18 -05:00
Nouamane Tazi
2171695cc2
quick fix SummarizationPipeline error messages (#14618)
* quick fix SummarizationPipeline error messages

Fix error messages to avoid spam errors, and errors of type:
`Your max_length is set to 50, but you input_length is only 46. You might consider decreasing max_length manually, e.g. summarizer('...', max_length=50)`

* correcto SummarizationPipeline error messages fixes
2021-12-07 16:44:28 +01:00
Stas Bekman
b66c5ab20c
[deepspeed] fix --load_best_model_at_end (#14652)
* [deepspeed] fix load_best_model_at_end

* try with pull_request_target

* revert: try with pull_request_target

* style

* add test

* cleanup
2021-12-06 21:57:47 -08:00
Ryokan RI
30646a0a3c
Add mLUKE (#14640)
* implement MLukeTokenizer and LukeForMaskedLM

* update tests

* update docs

* add LukeForMaskedLM to check_repo.py

* update README

* fix test and specify the entity pad id in tokenization_(m)luke

* fix EntityPredictionHeadTransform
2021-12-07 00:25:28 -05:00
Yih-Dar
4cdb67caba
Use cross_attention_hidden_size in Encoder-Decoder models (#14378)
* add cross_attention_hidden_size to text-2-text encoder-decoder models (PT/Flax)

* for TFEncoderDecoderModel

* add equivalence test for TFEncoderDecoderModel

* fix

* fix failed equivalence tests

* remove unused import

* add detailed comment

* Fix check_equivalence_tf_to_pt by using encoder/decoder

* cleaning

* Use cross_attention_hidden_size in speech-to-text

* clean fast init logging msg in encoder decoder models

* increase tol from 1e-5 to 1e-3 for tf test

* style

* style

* make sure projection layer can run

* remove type conversion + add check

* fix conflict (config.output_hidden_size)

* Remove TF -> PT in check_pt_tf_equivalence for TFEncoderDecoderModel

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-12-07 00:27:32 +01:00
Sylvain Gugger
381b05a3f5 Remove nonworking workflow for now 2021-12-06 17:25:28 -05:00
Suraj Patil
75ae287aec
fix flax examples tests (#14646)
* make tensorboard optional

* update test_fetcher for flax examples

* make the tests slow
2021-12-07 00:34:27 +05:30
Sylvain Gugger
03fda7b743
Add a job to test the documentation build (#14645)
* Add a job to the documentation build

* Add caching

* Test cache
2021-12-06 13:55:59 -05:00
Sylvain Gugger
e513c16e82
Fix syntax for class references (#14644) 2021-12-06 13:31:27 -05:00
Lysandre Debut
e9688875bf
Auto processor fix (#14623)
* Add AutoProcessor class
Init and tests
Add doc
Fix init
Update src/transformers/models/auto/processing_auto.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Reverts to tokenizer or feature extractor when available
Adapt test

* Revert "Adapt test"

This reverts commit bbdde5fab0.

* Revert "Reverts to tokenizer or feature extractor when available"

This reverts commit 77659ff5d2.

* Don't revert everything Lysandre!

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-12-06 12:49:50 -05:00
Suraj Patil
cbe6026536
fix flax example tests (#14643) 2021-12-06 23:14:37 +05:30
guhur
df085d8ea8
doc: mismatch between pooler/d_output (#14641)
The model outputs a pooler_output whereas the doctype examples were using a pooled_output.
2021-12-06 11:51:53 -05:00
tucan9389
0f3f045ebd
Add GPTJForQuestionAnswering (#14503)
* Add GPTJForQuestionAnswering

* Reformat for GPTJForQuestionAnswering

* Fix isort error

* make style for GPTJForQA

* Add _keys_to_ignore_on_load_missing

* Change the sequence of qa and classification

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-12-06 11:44:10 -05:00
Jay Zhang
1ccc033c56
Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. (#14310)
* Update code to resolve comments left in previous PR.

* Add README.md file for this example.

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update README.md file to resolve comments.

* Add a section name.

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: Gary Miguel <garymm@garymm.org>

* Add more comments for _convert_past_list_to_tuple().

* Change the default file name to a consistent one.

* Fix a format issue.

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: Gary Miguel <garymm@garymm.org>

* Update examples/onnx/pytorch/translation/run_onnx_exporter.py

Co-authored-by: Gary Miguel <garymm@garymm.org>

* Update examples/onnx/pytorch/translation/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Change the folder to summarization and address some other coments.

* Update the torch version.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Gary Miguel <garymm@garymm.org>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2021-12-06 14:01:51 +01:00
Julien Chaumond
6cdc3a7844
[urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617)
* Replace outdated model tags with their now-canonical pipeline types

* spam the CI till it's green
2021-12-06 04:35:01 -05:00
Suraj Patil
c824d7ed48
add flax example tests in CI workflow (#14637) 2021-12-06 14:50:43 +05:30
Suraj Patil
bc8a9f415b
fix typo (#14635) 2021-12-06 10:52:43 +05:30
Suraj Patil
c5bd732ac6
Add Flax example tests (#14599)
* add test for glue

* add tests for clm

* fix clm test

* add summrization tests

* more tests

* fix few tests

* add test for t5 mlm

* fix t5 mlm test

* fix tests for multi device

* cleanup

* ci job

* fix metric file name

* make t5 more robust
2021-12-06 10:48:58 +05:30
Kamal Raj
803a8cd18f
updated readme with proper arguments (#14624) 2021-12-05 22:12:51 -05:00
(Bill) Yuchen Lin
3977b58437
fix a typo (#14626) 2021-12-05 11:31:23 +05:30
Matt
73ec4340ec
Make DefaultDataCollator importable from root (#14588)
* Make DefaultDataCollator importable from root

* Add documentation for DefaultDataCollator and add return_tensors argument to all class docstrings

* make style

* Add DefaultDataCollator to data_collator.rst

* Add DefaultDataCollator to data_collator.rst
2021-12-03 15:15:09 -05:00
Stas Bekman
71b1bf7ea8
[trainer] add tf32-mode control (#14606)
* [trainer] add --tf32 support

* it's pt>=.17

* it's pt>=.17

* flip the default to True

* add experimental note

* simplify logic

* style

* switch to 3-state logic

* doc

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* re-style code

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-03 10:08:58 -08:00
Lysandre Debut
aada989ad5
Fix doc builder (#14616)
* Fix doc builder

* Fix doc builder

* Fix doc builder
2021-12-03 12:09:25 -05:00
Lysandre Debut
ec47baeba2
2022 is the year of multi-modality (#14610)
* 2022 is the year of multi-modality

* Small fix

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review

* Apply to documentation index

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2021-12-03 11:35:44 -05:00
Stas Bekman
e62091d5a7
[CI] move env print to util, add pt, nccl versions (#14607)
* move env print to util, add pt, nccl versions

* style

* version

* align
2021-12-03 08:18:36 -05:00
Li-Huai (Allan) Lin
66ea739168
Improve tokenizer tests (#13594)
* Use new method to acquire tokenizers

* Resolve TODOs.

* Style

* Fix

* Enable do_lower_case in test_tokenize_special_tokens

* Apply suggestion from code review

* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291.

* Fix FNet mask token tokenization

* Complete everything

* Apply suggestions from code review
2021-12-03 08:39:10 +01:00
Nik
6645eb61fa
fix #14524 (IndexError when mask prob is too low) (#14525)
* fix #14524 (IndexError when mask prob is too low)

* fix formatting

* correct documentation, add option for setting min_num_masks

* change the semantic meaning of `mask_prob` in _compute_mask_indices

With this commit the meaing of `mask_prob` actually adhered to the probability for each
vector to be the start of a masked span of length.

* fix check_copies test

* fix documentation to semantic meaning of `upper bound of overall masking percentage`, revert changes to _compute_mask_indices

* fix typo
2021-12-02 17:05:31 +03:00
yis11178
96cc02b51b
change tf.math.divide with int(/) to remove dim_per_head from the TF graph (#14600)
Co-authored-by: yis <yis@graphcore.ai>
2021-12-02 13:13:42 +00:00
Leandro von Werra
43f953cc2e
Add CodeParrot 🦜 codebase (#14536)
* add readme skeleton

* update readme

* add initialization script

* add deduplication script

* add codeparrot training script

* add code generation evaluation

* add validation loss script

* add requirements

* update readme

* tweak readme

* make style

* add highlights to readme

* add CLIs to scripts

* add tokenizer training script

* add docstring to constant length dataset

* fix defaults in arguments

* update readme with cli

* move image to hub

* tweaks of readme

* fix cli commands

* add author

* explain env variables

* fix formatting

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Apply suggestions from code review

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* replace generic with gpt2 tokenizer

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2021-12-02 10:41:35 +01:00
Lysandre Debut
e4c67d60ec
Python 3.6 -> Python 3.7 for TF runs (#14598) 2021-12-02 04:09:17 -05:00
Daniel Stancl
50d909be28
[Flax] Add FlaxBlenderbotSmall (#14576)
* [WIP] Add FlaxBlenderbotSmall

* Revert some unintentionally changed files

Revert some unintentionally files changed by improperly filled cookiecutter instructions.

* Fix repo consistency

* Fix Flax-PT equivalence

* Apply suggestions from code review

* Update index.mdx

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-12-02 14:21:48 +05:30
Lysandre Debut
77d87e732e
Adds a git pull instruction to the documentation builder (#14597)
* Adds a git pull instruction

* master -> main
2021-12-02 03:32:38 -05:00
Mishig Davaadorj
275402bf2b
Update doc img links (#14593)
* Update doc img links

* Rename toctree.yml -> _toctree.yml (#14594)

* Update doc img links

* Update performance.md img link
2021-12-02 09:01:35 +01:00
Mishig Davaadorj
4f68de625c
Rename toctree.yml -> _toctree.yml (#14594) 2021-12-02 08:58:39 +01:00
Stas Bekman
fbe278c76c
[doc] bf16/tf32 guide (#14579)
* [doc] bf16/tf32 guide

* expand

* expand

* Update docs/source/performance.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-12-01 14:18:58 -08:00
Li-Huai (Allan) Lin
934e2799da
Fix mask token handling (#14364)
* Fix mask token handling

* Revert "Fix mask token handling"

This reverts commit daaa3f5291.

* Fix FNet mask token tokenization
2021-12-01 20:16:52 +01:00
Sylvain Gugger
4df7d05a87
Doc new front (#14590)
* Convert PretrainedConfig doc to Markdown

* Use syntax

* Add necessary doc files (#14496)

* Doc fixes (#14499)

* Fixes for the new front

* Convert DETR file for table

* Title is needed

* Simplify a bit

* Even simpler

* Remove imports

* Fix typo in toctree (#14516)

* Fix checkpoints badge

* Update versions.yml format (#14517)

* Doc new front github actions (#14512)

* Doc new front github actions

* Fix docstring

* Fix feature extraction utils import (#14515)

* Address Julien's comments

* Push to doc-builder

* Ready for merge

* Remove old build and deploy

* Doc misc fixes (#14583)

* Rm versions.yml from doc

* Fix converting.rst

* Rm pretrained_models from toctree

* Fix index links (#14567)

* Fix links in README

* Localized READMEs

* Fix copy script

* Fix find doc script

* Update README_ko.md

Co-authored-by: Julien Chaumond <julien@huggingface.co>

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Adapt build command to new CLI tools (#14578)

* Fix typo

* Fix doc interlinks (#14589)

* Convert PretrainedConfig doc to Markdown

* Use syntax

* Rm pattern <[a-z]+(.html).*>

* Rm huggingface.co/transformers/master

* Rm .html

* Rm .html from index.mdx

* Rm .html from model_summary.rst

* Update index.mdx rm html

* Update remove .html

* Fix inner doc links

* Fix interlink in preprocssing.rst

* Update pr_checks

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Convert PretrainedConfig doc to Markdown

* Use syntax

* Add necessary doc files (#14496)

* Doc fixes (#14499)

* Fixes for the new front

* Convert DETR file for table

* Title is needed

* Simplify a bit

* Even simpler

* Remove imports

* Fix checkpoints badge

* Fix typo in toctree (#14516)

* Update versions.yml format (#14517)

* Doc new front github actions (#14512)

* Doc new front github actions

* Fix docstring

* Fix feature extraction utils import (#14515)

* Address Julien's comments

* Push to doc-builder

* Ready for merge

* Remove old build and deploy

* Doc misc fixes (#14583)

* Rm versions.yml from doc

* Fix converting.rst

* Rm pretrained_models from toctree

* Fix index links (#14567)

* Fix links in README

* Localized READMEs

* Fix copy script

* Fix find doc script

* Update README_ko.md

Co-authored-by: Julien Chaumond <julien@huggingface.co>

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Adapt build command to new CLI tools (#14578)

* Fix typo

* Fix doc interlinks (#14589)

* Convert PretrainedConfig doc to Markdown

* Use syntax

* Rm pattern <[a-z]+(.html).*>

* Rm huggingface.co/transformers/master

* Rm .html

* Rm .html from index.mdx

* Rm .html from model_summary.rst

* Update index.mdx rm html

* Update remove .html

* Fix inner doc links

* Fix interlink in preprocssing.rst

* Update pr_checks

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Styling

Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
2021-12-01 14:13:02 -05:00
Stas Bekman
14cc50d081 fix autocast for older pytorch 2021-12-01 09:32:52 -08:00
Suraj Patil
4c0dd199c8
FlaxGPTJ (#14396)
* add flax gptj

* no bias in attention dense

* no wpe

* fix rotary embeddings

* fix rotary embeds

* fix rotray embeds

* quality

* doc and quality

* fix equivalence tests
2021-12-01 10:57:39 +05:30
Jamie DeAntonis
70996a5420
WIP: Support for Training with BF16 (#13207)
* started bf16 integration

* minor changes

* code now runs

* style

* lay foundation for bf16 testing

* lay foundation for bf16 testing

* start the tests

* better bf16 check

* style

* 2 separate checkers - one for bf16 support, another for bf16+autocast

* Update src/transformers/training_args.py

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* a couple of comment resolutions

* more comment resolutions

* resolved a small bug

* just some print statemtns

* added todo marking

* added a todo

* adjust for API change s/fast_dtype/dtype/

* fix style

* merge 2 bf16 util functions

* bf16 now does scaling too

* Add support for bfloat16

* Revert T5 layernorm to float32

This is based on the comment at https://github.com/huggingface/transformers/pull/14448/files#r752660929 and the PyTorch PR https://github.com/pytorch/pytorch/pull/66920 .

* Add comment about conversion to float32 before returning the numpy data

* Add comment about AMP-bfloat16 incompatibility

* Fix formatting

* typo

* reformer / bf16

* cleanup

* require at least pt-1.10

* fix

* will deal with deepspeed separately

* cleanup

* revert

* cleanup

* fp16_full_eval and bf16_full_eval are separate modes

* proper deprecation

* cleanup

* test and fixes

* spelling

* cleanup

* add a note that this API is experimental

Co-authored-by: jamie <jamie@cortx.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: suriya <suriya@cortx.com>
Co-authored-by: Manuel R. Ciosici <manuelrciosici@gmail.com>
2021-11-30 18:00:47 -08:00
Suraj Patil
fc1d97f29d
VisionTextDualEncoder (#13511)
* init vision_text_dual_encoder

* fix merge

* remove extra heads

* fix tests

* remove VISION_TEXT_DUAL_ENCODER_PRETRAINED_CONFIG_ARCHIVE_MAP

* remove archive map

* fix imports

* fix more imports

* fix init

* delete tokenizers

* fix imports

* clean

* support clip's vision model

* handle None config

* begin tests

* more test and few fixes

* warn about newly init weights

* more tests

* add loss to model

* remove extra classes from doc

* add processor

* doc and small fixes

* add start docstr

* update flax model

* flax tests

* more flax tests

* doc

* quality

* doc and quality

* fix doc

* doc

* remove comments

* update warning

* quality

* fix docs

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* replace asserts, fix imports

* update imports

* fix import

* address some review comments

* fix check

* reduce tolerance

* fix test

* add flax integration test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address Sylvain's comments

* fix style

* add pt_flax_equivalence test in PT tests

* add pt integration test

* update test

* use pre-trained checkpoint in examples

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-30 22:21:48 +05:30
Thomas Viehmann
6ed9882ddb
use functional interface for softmax in attention (#14198)
* use functional interface instead of instantiating module and immediately calling it

* fix torch.nn.functional to nn.functional. Thank you Stas!
2021-11-30 11:47:33 -05:00
giacomo snidero
4176bc161c
Add documentation for multi-label classification (#14168)
* "update example docstring multilabel example

* update example docstring multilabel example
2021-11-30 11:34:41 -05:00
Daniel Stancl
faacd74729
[Flax] Add FlaxBlenderbot (#13633)
* Init Flax implementation for Blenderbot

* Add a majority of stuff except for tests

* make style quality

* Add tests and fix some bugs

* Add tests

* Clean source code and fix some bugs

* Fix copies and docs

* Fix jax device condition for tests

* Fix layer norm in the encoder

* Fix a few typos in the test file

* make fix-copies

* make fix-copies

* fix layer norm

* Fix Flax params dtype (#13090)

* Fix PR reference (#13098)

* make fix-copies

* Update tests/test_modeling_flax_blenderbot.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-11-30 17:36:54 +05:30
Sylvain Gugger
254fef67cf
Fix backend regex (#14566) 2021-11-30 05:32:20 -05:00
Kamal Raj
c468a87a69
Tapas tf (#13393)
* TF Tapas first commit

* updated docs

* updated logger message

* updated pytorch weight conversion
script to support scalar array

* added use_cache to tapas model config to
work properly with tf input_processing

* 1. rm embeddings_sum
2. added # Copied
3. + TFTapasMLMHead
4. and lot other small fixes

* updated docs

* + test for tapas

* updated testing_utils to check
is_tensorflow_probability_available

* converted model logits post processing using
numpy to work with both PT and TF models

* + TFAutoModelForTableQuestionAnswering

* added TF support

* added test for
TFAutoModelForTableQuestionAnswering

* added test for
TFAutoModelForTableQuestionAnswering pipeline

* updated auto model docs

* fixed typo in import

* added tensorflow_probability to run tests

* updated MLM head

* updated tapas.rst with TF  model docs

* fixed optimizer import in docs

* updated convert to np
data from pt model is not
`transformers.tokenization_utils_base.BatchEncoding`
after pipeline upgrade

* updated pipeline:
1. with torch.no_gard removed, pipeline forward handles
2. token_type_ids converted to numpy

* updated docs.

* removed `use_cache` from config

* removed floats_tensor

* updated code comment

* updated Copyright Year and
logits_aggregation Optional

* updated docs and comments

* updated docstring

* fixed model weight loading

* make fixup

* fix indentation

* added tf slow pipeline test

* pip upgrade

* upgrade python to 3.7

* removed from_pt from tests

* revert commit f18cfa9
2021-11-30 11:07:55 +01:00
Matt
6fc38adff2
Add model checkpointing to push_to_hub and PushToHubCallback (#14492)
* Add checkpointing to push_to_hub and PushToHubCallback

* Add checkpoint loading

* Add missing default value

* Correct method name

* make style

* Moving everything to the right location

* make style

* Revert changes to file_utils.py

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Adding docstrings and comments to clarify code

* make style

* Fix organization positional arg

* Fix load_repo_checkpoint to no longer accidentally create empty repos

* make style

* Remove unnecessary 'organization' argument in load_repo_checkpoint

* Avoid private `_create_or_get_repo` method

* make style

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-11-29 17:36:19 +00:00
Rahul Nadkarni
8332327dca
Fix sentinel token IDs in data collator for Flax T5 pretraining script (#14477) 2021-11-29 17:30:17 +01:00