As discussed at https://github.com/huggingface/transformers/issues/6317 codecov currently sends an invalid report when it fails to find a code coverage report for the base it checks against, so this gets fixed by:
- require_base: yes # don't report if there is no base coverage report
let's add this for clarity, this supposedly is already the default.
- require_head: yes # don't report if there is no head coverage report
and perhaps no point reporting on doc changes as they don't make any difference and it just generates noise:
- require_changes: true # only comment if there was change in coverage
* [doc] multiple corrections to "Summary of the tasks"
* fix indentation
* correction
* fix links, add links to examples/seq2seq/README.md instead of non-existing script
* [doc] make the text more readable, fix some typos, add some disambiguation
* Update docs/source/glossary.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [testing] switch to a new TestCasePlus + get_auto_remove_tmp_dir() for auto-removal of tmp dirs
* respect after=True for tempfile, simplify code
* comments
* comment fix
* put `before` last in args, so can make debug even faster
Currently with the bug introduced we're taking two optimizer steps per
batch: one global one, where `xm.optimizer_step` injects a CRS between
all cores in training, and one without. This has been affecting training
accuracy (for example, XLNet GLUE on MNLI is not converging, etc.).
* Generation doc
* MBartForConditionalGeneration (#6441)
* add MBartForConditionalGeneration
* style
* rebase and fixes
* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
* fix docs
* don't ignore mbart
* doc
* fix mbart fairseq link
* put mbart before bart
* apply doc suggestions
* Use hash to clean the test dirs (#6475)
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* fix
* [EncoderDecoder] Add Cross Attention for GPT2 (#6415)
* add cross attention layers for gpt2
* make gpt2 cross attention work
* finish bert2gpt2
* add explicit comments
* remove attention mask since not yet supported
* revert attn mask in pipeline
* Update src/transformers/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_encoder_decoder.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Sort unique_no_split_tokens to make it deterministic (#6461)
* change unique_no_split_tokens's type to set
* use sorted list instead of set
* style
* Import accuracy_score (#6480)
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address comments
* Styling
* Generation doc
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address comments
* Styling
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: gijswijnholds <gijswijnholds@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>