* Fix doc examples: cannot import name
* remove copy because of some necessary minor changes (maybe add copy to the individual methods instead)
* Keep copy with some modifications
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- Do not run the image-classification pipeline (_CHECKPOINT_FOR_DOC uses the language
checkpoint, which cannot load a FeatureExtractor, so the current logic fails).
- Add a safeguard to not run tests when `tokenizer_class` or
`feature_extractor_class` **are** defined, but cannot be loaded.
This happens for Perceiver with the "FastTokenizer" (which doesn't exist,
so it is None) and the FeatureExtractor (which does exist but cannot be
loaded because the checkpoint doesn't define one, which is reasonable for
that checkpoint).
- Added a `get_vocab` function to `PerceiverTokenizer` since it is used by
the `fill-mask` pipeline when the `targets` argument is used to narrow the
predictions to a subset of possible values.
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
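To illustrate the `get_vocab` item above, here is a minimal sketch of what such a method can look like for a byte-level tokenizer. It is not the actual `PerceiverTokenizer` code: the class `ToyByteTokenizer`, its special-token list, and the offset scheme are assumptions made for illustration only.

```python
from typing import Dict, List


class ToyByteTokenizer:
    """Hypothetical byte-level tokenizer (NOT the real PerceiverTokenizer)."""

    # Assumed special tokens; they occupy the first ids of the vocabulary.
    SPECIAL_TOKENS: List[str] = ["[PAD]", "[BOS]", "[EOS]", "[MASK]", "[CLS]", "[SEP]"]
    BYTE_VOCAB_SIZE = 256  # one entry per possible byte value

    def get_vocab(self) -> Dict[str, int]:
        # Special tokens come first, in declaration order.
        vocab = {token: idx for idx, token in enumerate(self.SPECIAL_TOKENS)}
        offset = len(self.SPECIAL_TOKENS)
        # Each byte value is exposed as a one-character token, shifted by the offset.
        for byte_value in range(self.BYTE_VOCAB_SIZE):
            vocab[chr(byte_value)] = byte_value + offset
        return vocab


vocab = ToyByteTokenizer().get_vocab()
```

A caller that needs to restrict predictions to a set of target strings can then look each target up in the returned mapping to obtain its id.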
* Add some nicety flags for better controlling evaluation.
* Fix dependency issue with outdated requirement
* Add additional flag to example to ensure eval is done
* Wrap code into main function for accelerate launcher to find
* Fix valid batch size flag in readme
* Add note to install git-lfs when initializing/training the model
* Update examples/research_projects/codeparrot/scripts/arguments.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Revert "Wrap code into main function for accelerate launcher to find"
This reverts commit ff11df1c81.
* Fix formatting issue
* Move git-lfs instructions to installation section
* Add a quick check before code generation for code evaluation
* Fix styling issue
* Update examples/research_projects/codeparrot/scripts/human_eval.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Make iterable dataset use passed in tokenizer rather than globally defined one
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: ncoop57 <nac33@students.uwf.edu>
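The "use passed in tokenizer rather than globally defined one" change above can be sketched as follows. This is a simplified, framework-free illustration, not the codeparrot script itself: the class name `ConstantLengthChunks`, the callable tokenizer interface, and the chunking logic are all assumptions.

```python
from typing import Callable, Iterable, Iterator, List


class ConstantLengthChunks:
    """Hypothetical iterable dataset that receives its tokenizer explicitly."""

    def __init__(
        self,
        tokenizer: Callable[[str], List[int]],
        texts: Iterable[str],
        seq_length: int,
    ) -> None:
        # The tokenizer is stored on the instance instead of being read
        # from a module-level global.
        self.tokenizer = tokenizer
        self.texts = texts
        self.seq_length = seq_length

    def __iter__(self) -> Iterator[List[int]]:
        buffer: List[int] = []
        for text in self.texts:
            buffer.extend(self.tokenizer(text))
            # Emit fixed-length chunks; a trailing partial chunk is dropped.
            while len(buffer) >= self.seq_length:
                yield buffer[: self.seq_length]
                buffer = buffer[self.seq_length :]


def toy_tokenizer(text: str) -> List[int]:
    # Stand-in tokenizer for the example: one id per character.
    return [ord(char) for char in text]


chunks = list(ConstantLengthChunks(toy_tokenizer, ["abc", "de"], seq_length=2))
```

Passing the tokenizer in makes the dataset usable with any tokenizer and easier to test in isolation.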
* Test workflow
* Build doc
* Make a clean build
* Add doc config
* Restore other workflows
* Final job
* Print something in else statements
* Pull before making changes
* Fix doc examples: name '...' is not defined
* remove >>> and ... in some docstrings in visual_bert
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* change args to address overwriting issue
* remove project name from args
* remove passing args as kwargs to experiment object
* remove passing args as kwargs to offline experiment
* fix offline directory assignment in experiment kwargs
* log checkpoint folder on training end
* log entire output_dir as asset folder
* log asset folder recursively
* end experiment at the end of training
* clean up
* clean up
* Default to always log training assets to Comet when using CometCallback
* change logging training assets to be true when running callback setup
* fix so that experiment always ends when training ends
* styling and quality fixes
* update docstring for COMET_LOG_ASSETS environment variable
* run styling and quality checks
* clean up docstring
* remove merge markers
* change asset logging to false to avoid hitting max assets per experiment limit
* update training asset description
* fix styling
* fix: verify jsonl in run_translation (#14660)
* fix(run_translation.py): json/jsonl validation
Both `json` and `jsonl` are accepted as valid JSON Lines file extensions
* fix(run_translation.py): make black happy
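The json/jsonl validation described above amounts to an extension check along these lines. This is a minimal sketch under assumptions: the helper name `has_valid_jsonlines_extension` is hypothetical, and the case-insensitive comparison is an addition for illustration, not something the commit states.

```python
# Extensions treated as JSON Lines files, per the commit message.
VALID_EXTENSIONS = ("json", "jsonl")


def has_valid_jsonlines_extension(path: str) -> bool:
    """Return True if the file name ends in .json or .jsonl."""
    if "." not in path:
        return False  # no extension at all
    extension = path.rsplit(".", 1)[-1].lower()  # lower-casing is an assumption
    return extension in VALID_EXTENSIONS
```

For example, `has_valid_jsonlines_extension("train.jsonl")` and `has_valid_jsonlines_extension("train.json")` both pass, while a `.csv` path is rejected.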
* Ran make style
* Convert a few docs
* And another
* Last tutorials
* New syntax for colab links
* Added support for other features for already supported models
* Partial support for causal and seq2seq models
* OnnxSeq2SeqConfigWithPast to support seq2seq models
* Parameterized the onnx tests
* Restored run_mlm.py
* [WIP] BART update
* BART and MBART
* Added comments
* Another sequence length of the past_key_values