transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-23 22:38:58 +06:00

Author	SHA1	Message	Date
dependabot[bot]	e49c71fc4c	Bump nbconvert from 6.3.0 to 6.5.1 in /examples/research_projects/lxmert (#18742 ) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.3.0 to 6.5.1. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.3.0...6.5.1) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-24 06:12:56 -04:00
dependabot[bot]	5b24949669	Bump nbconvert in /examples/research_projects/visual_bert (#18741 ) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.3.0 to 6.5.1. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.3.0...6.5.1) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-24 06:12:48 -04:00
Atharva Ingle	d90a36d192	remove check for main process for trackers initialization (#18706 )	2022-08-22 11:16:27 -04:00
Atharva Ingle	e54a1b49aa	`model.tie_weights()` should be applied after `accelerator.prepare()` (#18676 ) * `model.tie_weights()` should be applied after `accelerator.prepare` Weight tying should be done after the model has been moved to XLA device as mentioned on PyTorch/XLA Troubleshooting guide [here](https://github.com/pytorch/xla/blob/master/TROUBLESHOOTING.md#xla-tensor-quirks) * format code	2022-08-18 13:46:57 -04:00
Loubna Ben Allal	bbbb453e58	Add an examples folder for code downstream tasks (#18679 ) * add examples subfolder * mention examples in codeparrot readme * use Trainer optimizer and scheduler type and add output_dir as argument * add example of text-to-python and python-to-text models * mention the downstream examples in the readme * fix typo	2022-08-18 18:24:24 +02:00
Zachary Mueller	358fc18613	Add evaluate to examples requirements (#18666 )	2022-08-18 10:57:39 -04:00
Stefan Schweter	358478e729	Examples: add Bloom support for token classification (#18632 ) * examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow) * examples: remove support for Bloom in token classication (FLAX and TensorFlow currently have no support for it)	2022-08-17 09:50:57 +02:00
zhoutang776	25e651a2de	Update run_translation_no_trainer.py (#18637 ) * Update run_translation_no_trainer.py found an error in selecting `no_decay` parameters and some small modifications when the user continues to train from a checkpoint * fixs `no_decay` and `resume_step` issue 1. change `no_decay` list 2. if use continue to train their model from provided checkpoint, the `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1`	2022-08-16 13:25:57 -04:00
Karim Foda	d6eeb87170	Flax Remat for LongT5 (#17994 ) * [Flax] Add remat (gradient checkpointing) * fix variable naming in test * flip: checkpoint using a method * fix naming * fix class naming * apply PVP's suggestions from code review * add gradient_checkpointing to examples * Add gradient_checkpointing to run_mlm_flax * Add remat to longt5 * Add gradient checkpointing test longt5 * Fix args errors * Fix remaining tests * Make fixup & quality fixes * replace kwargs * remove unecessary kwargs * Make fixup changes * revert long_t5_flax changes * Remove return_dict and copy to LongT5 * Remove test_gradient_checkpointing Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>	2022-08-14 16:27:13 +01:00
dependabot[bot]	05d3a43c59	Bump nbconvert in /examples/research_projects/visual_bert (#18566 ) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-11 10:47:31 -04:00
dependabot[bot]	713ab6fde5	Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565 ) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-11 10:47:19 -04:00
Matt	6eb51450fa	TF Examples Rewrite (#18451 ) * Finished QA example * Dodge a merge conflict * Update text classification and LM examples * Update NER example * New Keras metrics WIP, fix NER example * Update NER example * Update MC, summarization and translation examples * Add XLA warnings when shapes are variable * Make sure batch_size is consistently scaled by num_replicas * Add PushToHubCallback to all models * Add docs links for KerasMetricCallback * Add docs links for prepare_tf_dataset and jit_compile * Correct inferred model names * Don't assume the dataset has 'lang' * Don't assume the dataset has 'lang' * Write metrics in text classification * Add 'framework' to TrainingArguments and TFTrainingArguments * Export metrics in all examples and add tests * Fix training args for Flax * Update command line args for translation test * make fixup * Fix accidentally running other tests in fp16 * Remove do_train/do_eval from run_clm.py * Remove do_train/do_eval from run_mlm.py * Add tensorflow tests to circleci * Fix circleci * Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fix save path for tests * Fix some model card kwargs * Explain the magical -1000 * Actually enable tests this time * Skip text classification PR until we fix shape inference * make fixup Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2022-08-10 16:49:51 +01:00
Rasmus Arpe Fogh Jensen	a765b68aa6	Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473 ) * Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script * make fixup changes * PR comments * changed input to Acceletor based on PR comment, ran make fixup * Added comment explaining the sync_gradients statement * Fixed lr scheduler max steps * Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper * Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper * Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script * make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py * removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script	2022-08-08 15:52:47 -04:00
Sylvain Gugger	70b0d4e193	Fix compatibility with 1.12 (#17925 ) * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * Fix compatibility with 1.12 * Remove pin from examples requirements * Update torch scatter version * fix torch.onnx.symbolic_opset12 import * Reject bad version Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-08 09:53:08 -04:00
regisss	88a0ce57bb	Add seed setting to image classification example (#18519 )	2022-08-08 08:08:11 -04:00
Julien Chaumond	9129fd0377	`transformers-cli login` => `huggingface-cli login` (#18490 ) * zero chance anyone's using that constant no? * `transformers-cli login` => `huggingface-cli login` * `transformers-cli repo create` => `huggingface-cli repo create` * `make style`	2022-08-06 09:42:55 +02:00
Julien Chaumond	8d1f9039d0	Just re-reading the whole doc every couple of months 😬 (#18489 ) * Delete valohai.yaml * NLP => ML * typo * website supports https * datasets * 60k + modalities * unrelated link fixing for accelerate * Ok those links were actually broken * Fix link * Make `AutoTokenizer` auto-link * wording tweak * add at least one non-nlp task	2022-08-06 09:38:55 +02:00
Kian Sierra McGettigan	0bf1e1aca4	Update no trainer examples for QA and Semantic Segmentation (#18474 ) * swag_no_trainer updated for with gather_metrics * Removed unused variable samples_seen * updated examples with gather_for_metrics	2022-08-04 13:22:19 -04:00
Kian Sierra McGettigan	330247ede2	Update no trainer scripts for multiple-choice (#18468 ) * swag_no_trainer updated for with gather_metrics * Removed unused variable samples_seen	2022-08-04 07:29:32 -04:00
LSinev	02b176c4ce	Fix torch version comparisons (#18460 ) Comparisons like version.parse(torch.__version__) > version.parse("1.6") are True for torch==1.6.0+cu101 or torch==1.6.0+cpu version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py	2022-08-03 13:37:18 -04:00
Ritik Nandwal	3db4378bd7	Update no trainer scripts for language modeling and image classification examples (#18443 ) * Update no_trainer script for image-classification * Update no_trainer scripts for language-modeling examples * Remove unused variable * Removing truncation from losses array for language modeling examples	2022-08-03 08:33:18 -04:00
Yih-Dar	5546fb61ab	fix run_clip README (#18332 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-02 19:14:46 +02:00
Duong A. Nguyen	3909d7f139	Add Flax BART pretraining script (#18297 ) * add bart pretraining flax script * fixup * add bart pretraining flax script * add BART to README * add BART to README * add BART to README * add BART to README * add BART to README * add bos eos document * Update README.md * Update README.md * Update examples/flax/language-modeling/run_bart_dlm_flax.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * final * final * final * remove use_auth_token ing from_config Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2022-08-01 12:06:30 -04:00
Sylvain Gugger	941d233153	Fix ROUGE add example check and update README (#18398 ) * Fix ROUGE add example check and update README * Stay consistent in values	2022-08-01 11:14:49 -04:00
Ogundepo Odunayo	679d68a11b	Correct the spelling of bleu metric (#18375 )	2022-08-01 07:51:27 -04:00
atturaioe	1f84399171	Migrate metric to Evaluate in Pytorch examples (#18369 ) * Migrate metric to Evaluate in pytorch examples * Remove unused imports	2022-08-01 07:40:25 -04:00
dependabot[bot]	25ec12eaf7	Bump mistune from 0.8.4 to 2.0.3 in /examples/research_projects/lxmert (#18370 ) Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3. - [Release notes](https://github.com/lepture/mistune/releases) - [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst) - [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3) --- updated-dependencies: - dependency-name: mistune dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-01 04:46:57 -04:00
dependabot[bot]	a7360385f4	Bump mistune in /examples/research_projects/visual_bert (#18371 ) Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3. - [Release notes](https://github.com/lepture/mistune/releases) - [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst) - [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3) --- updated-dependencies: - dependency-name: mistune dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-01 04:46:31 -04:00
Sylvain Gugger	986526a0e4	Replace `as_target` context managers by direct calls (#18325 ) * Preliminary work on tokenizers * Quality + fix tests * Treat processors * Fix pad * Remove all uses of in tests, docs and examples * Replace all as_target_tokenizer * Fix tests * Fix quality * Update examples/flax/image-captioning/run_image_captioning_flax.py Co-authored-by: amyeroberts <amy@huggingface.co> * Style Co-authored-by: amyeroberts <amy@huggingface.co>	2022-07-29 08:09:09 -04:00
Vijay S Kalmath	da503ea02f	Migrate metrics used in flax examples to Evaluate (#18348 ) Currently, tensorflow examples use the `load_metric` function from Datasets library, commit migrates function call to `load` function from Evaluate library.	2022-07-28 15:06:23 -04:00
Vijay S Kalmath	a2586795e5	Migrate metric to Evaluate library for tensorflow examples (#18327 ) * Migrate metric to Evaluate library in tf examples Currently tensorflow examples use `load_metric` function from Datasets library , commit migrates function call to `load` function to Evaluate library. Fix for #18306 * Migrate metric to Evaluate library in tf examples Currently tensorflow examples use `load_metric` function from Datasets library , commit migrates function call to `load` function to Evaluate library. Fix for #18306 * Migrate `metric` to Evaluate for all tf examples Currently tensorflow examples use `load_metric` function from Datasets library , commit migrates function call to `load` function to Evaluate library.	2022-07-28 14:24:27 -04:00
Loubna Ben Allal	286a18fa00	Fix codeparrot deduplication - ignore whitespaces (#18023 ) * ignore whitspaces for hash * reformat code * Update README.md	2022-07-28 15:58:26 +02:00
Lysandre	c89a592e87	Dev version	2022-07-27 17:13:57 +02:00
Sanchit Gandhi	7490a97cac	[Flax] Fix incomplete batches in example scripts (#17863 ) * [Flax] Fix incomplete batches in example scripts * fix dataloader batching * convert jnp batch idxs to np array * add missing `pad_shard_unpad` to final prediction generate step * only `pad_shard_unpad` at inference time * merge conflicts * remove incomplete batch step from eval * fix run_qa.py * add `pad_shard_unpad` to run_flax_ner.py * add `pad_shard_unpad` to run_flax_glue.py * add `pad_shard_unpad` to run_image_classification.py * make style * fix mlm flax eval batches * remove redundant imports	2022-07-27 15:50:47 +01:00
Sylvain Gugger	cf32b2ee42	Remove all uses of six (#18318 ) * Remove all uses of six * fix quality	2022-07-27 08:39:09 -04:00
Duong A. Nguyen	170fcaa604	Generalize decay_mask_fn to apply mask to all LayerNorm params (#18273 ) * generalize decay_mask_fn to find all layernorm params * fixup * generalising decay_mask_fn	2022-07-27 12:23:57 +01:00
Loubna Ben Allal	1d71ad8905	Update CodeParrot readme to include training in Megatron (#17798 ) * add info about megatron training * upload models and datasets from CodeParrot organization * upload models and datasets from CodeParrot organization * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * fix typo and add comment about codeparrot vs megatron Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2022-07-27 11:59:08 +02:00
Zachary Mueller	99eb9b523f	Fix `no_trainer` CI (#18242 ) * Fix all tests	2022-07-21 14:44:57 -04:00
Duong A. Nguyen	4bea6584e3	Remove use_auth_token from the from_config method (#18192 ) * remove use_auth_token from from_config * restore use_auth_token from_pretrained run_t5_mlm_flax	2022-07-19 08:13:20 +02:00
John Giorgi	a4f97e6ce0	Fix incorrect type hint for lang (#18161 )	2022-07-18 09:53:18 +02:00
John Giorgi	c46d39f390	Fix check for falsey inputs in run_summarization (#18155 )	2022-07-18 09:50:32 +02:00
John Giorgi	fde22c75a1	Add summarization name mapping for MultiNews (#18117 ) * Add summarization name mapping for MultiNews * Add summarization name mapping for MultiNews	2022-07-13 08:19:20 -04:00
Duong A. Nguyen	1e8140caad	Fix RESOURCE_EXHAUSTED error when dealing with large datasets in Flax example scripts (#18069 ) * Fix RESOURCE_EXHAUSTED error for large datasets on Flax example scripts * using np.permutation for creating batch_idx * train_samples_idx -> training_samples_idx * fix type hints	2022-07-11 15:59:08 +02:00
Yulv-git	95113d1365	Fix some typos. (#17560 ) * Fix some typos. Signed-off-by: Yulv-git <yulvchi@qq.com> * Fix typo. Signed-off-by: Yulv-git <yulvchi@qq.com> * make fixup.	2022-07-11 05:00:13 -04:00
ADAning	bf37e5c7f6	Fix T5 incorrect weight decay in Trainer and official summarization example (#18002 ) * Add ALL_LAYERNORM_LAYERS for LayerNorm * fix bug of appending layer norm	2022-07-06 09:44:19 -04:00
Zachary Mueller	7c4c6f6084	Fix all is_torch_tpu_available issues (#17936 ) * Fix all is_torch_tpu_available	2022-06-29 11:03:33 -04:00
Sylvain Gugger	5f1e67a566	Pin PyTorch in requirements as well	2022-06-28 15:56:10 -04:00
Zachary Mueller	75259b44bf	Properly calculate the total train iterations and recalculate num epochs in no_trainer scripts (#17856 )	2022-06-23 15:46:01 -04:00
Zachary Mueller	acb709d551	Change no trainer image_classification test (#17635 ) * Adjust test arguments and use a new example test	2022-06-23 11:11:16 -04:00
dependabot[bot]	c366ce1011	Bump numpy from 1.21.0 to 1.22.0 in /examples/research_projects/lxmert (#17817 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst) - [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-22 09:29:40 -04:00
dependabot[bot]	af0d21e741	Bump numpy in /examples/research_projects/visual_bert (#17816 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst) - [Commits](https://github.com/numpy/numpy/compare/v1.21.0...v1.22.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-22 09:29:28 -04:00
Eran Hirsch	1357038164	Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` (#17805 ) * Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict` * Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it * Remove `self._num_beams` from trainer classes * - Run fixup - Fix "Constraint" not exposed - Fix synced_gpus to actually read from param * Use kwargs * Copy kwargs before making changes to it * Fix style issues unused imports	2022-06-22 08:11:39 -04:00
Jia LI	da2bd2ae96	[CodeParrot] Near-deduplication with jaccard similarity (#17054 ) * deduplication draft * update style * update style test * dummy test main * rename modules * rename functions * return extremes in deduplicate_clusters * update style * cast str for gzip * update doc string * time processing * use dataset map to compute minhash * fill value for short token * remove da map method * update style * use share object to multiprocess * update style * use f-string and minor fix Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * update style * use module parameters * change ds_dedup to ds_filter * save ds_dedup * mv test to script tests * make jaccard threshold a parameter of deduplicate_dataset * update style * add doc strings * update style * add doc string for DuplicationIndex * save files into data dir * update readme * Update examples/research_projects/codeparrot/README.md Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * make near deduplication optional * move near deduplication in README * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * use f string Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>	2022-06-21 14:23:36 +02:00
dependabot[bot]	e44a569fef	Bump notebook in /examples/research_projects/lxmert (#17743 ) Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 12:10:33 -04:00
dependabot[bot]	5089a2d412	Bump notebook in /examples/research_projects/visual_bert (#17742 ) Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 12:10:17 -04:00
Sylvain Gugger	7c6ec195ad	v4.21.0.dev0	2022-06-16 12:20:53 -04:00
Jeff Rasley	6ebeeeef81	Update requirements.txt (#17719 )	2022-06-15 13:51:41 -04:00
Shamane Siri	9068fa6c57	Rag end2end new (#17650 ) * check * update the RAG-end2end with new PL and RAY * removed unwanted comments	2022-06-14 14:56:32 +02:00
Loubna Ben Allal	3114df41f4	update README.md (#17657 ) - use CodeParrot scores of v1.1 - change evaluation command to use accelerate	2022-06-10 15:55:24 +02:00
Simon Brandeis	c99ddcc441	🐛 Properly raise `RepoNotFoundError` when not authenticated (#17651 ) * Raise RepoNotFoundError in case of 401 * Include changes from revert-17646-skip_repo_not_found * Add a comment * 💄 Code quality * 💚 Update `get_from_cache` test * 💚 Code quality & skip failing test	2022-06-10 15:41:53 +02:00
dependabot[bot]	1d463303fe	Bump cookiecutter in /examples/research_projects/decision_transformer (#17645 ) Bumps [cookiecutter](https://github.com/cookiecutter/cookiecutter) from 1.7.2 to 2.1.1. - [Release notes](https://github.com/cookiecutter/cookiecutter/releases) - [Changelog](https://github.com/cookiecutter/cookiecutter/blob/master/HISTORY.md) - [Commits](https://github.com/cookiecutter/cookiecutter/compare/1.7.2...2.1.1) --- updated-dependencies: - dependency-name: cookiecutter dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-10 04:27:51 -04:00
Sylvain Gugger	3cab90279f	Add examples telemetry (#17552 ) * Add examples telemetry * Alternative approach * Add to all other examples * Add to templates as well * Put framework separately * Same for TensorFlow	2022-06-07 11:57:52 -04:00
bhuang	254d9c068e	Update run_glue_no_trainer.py (#17546 )	2022-06-03 12:29:37 -04:00
Zachary Mueller	3766df4fe1	Fix flakey no-trainer test (#17515 )	2022-06-01 13:40:49 -04:00
fireindark707	028d4b7c8b	Deal with the error when task is regression (#16330 )	2022-06-01 11:15:53 -04:00
Sourab Mangrulkar	d156898f3b	Improve notrainer examples (#17449 ) * improve no-trainer examples * Trigger CI * adding comment to clarify tracker init on main process * Trigger CI * Trigger CI * Trigger CI	2022-05-28 00:06:31 +05:30
Patrick von Platen	a9eca74372	Wav2vec2 finetuning shared file system (#17423 ) * fix_torch_device_generate_test * remove @ * [Fix shared file system] Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2022-05-25 22:04:43 +02:00
dependabot[bot]	1ef9a1ed4a	Bump tensorflow in /examples/research_projects/decision_transformer (#17400 ) Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.8.0 to 2.8.1. - [Release notes](https://github.com/tensorflow/tensorflow/releases) - [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md) - [Commits](https://github.com/tensorflow/tensorflow/compare/v2.8.0...v2.8.1) --- updated-dependencies: - dependency-name: tensorflow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-05-24 19:36:55 -04:00
NielsRogge	31ee80d556	Add LayoutLMv3 (#17060 ) * Make forward pass work * More improvements * Remove unused imports * Remove timm dependency * Improve loss calculation of token classifier * Fix most tests * Add docs * Add model integration test * Make all tests pass * Add LayoutLMv3FeatureExtractor * Improve integration test + make fixup * Add example script * Fix style * Add LayoutLMv3Processor * Fix style * Add option to add visual labels * Make more tokenizer tests pass * Fix more tests * Make more tests pass * Fix bug and improve docs * Fix import of processors * Improve docstrings * Fix toctree and improve docs * Fix auto tokenizer * Move tests to model folder * Move tests to model folder * change default behavior add_prefix_space * add prefix space for fast * add_prefix_spcae set to True for Fast * no space before `unique_no_split` token * add test to hightligh special treatment of added tokens * fix `test_batch_encode_dynamic_overflowing` by building a long enough example * fix `test_full_tokenizer` with add_prefix_token * Fix tokenizer integration test * Make the code more readable * Add tests for LayoutLMv3Processor * Fix style * Add model to README and update init * Apply suggestions from code review * Replace asserts by value errors * Add suggestion by @ducviet00 * Add model to doc tests * Simplify script * Improve README * a step ahead to fix * Update pair_input_test * Make all tokenizer tests pass - phew * Make style * Add LayoutLMv3 to CI job * Fix auto mapping * Fix CI job name * Make all processor tests pass * Make tests of LayoutLMv2 and LayoutXLM consistent * Add copied from statements to fast tokenizer * Add copied from statements to slow tokenizer * Remove add_visual_labels attribute * Fix tests * Add link to notebooks * Improve docs of LayoutLMv3Processor * Fix reference to section Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-05-24 09:53:45 +02:00
Loubna Ben Allal	b48ac1a094	Fix CodeParrot training script (#17291 ) * average loss over batches and accumulated steps for tracking * fix layernorm weight decay * use AdamW from Pytorch instead of Transformers * add shuffling of sequences inside the batches * add shuffling of sequences inside the batches * add logging dir and reformat code * fix lr tracking * remove Mistral scaling * keep Mistral scaling * reformat code * fix error * fix error * use shuffling function from Pytorch * remove argument for shuffling batch sequences as it isn't optional * update package versions and install accelerate from source * remove unused package * Update loss average over accumulated steps Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update loss average over accumulated steps Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * use one shuffle buffer argument * compute avg_loss in one line Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com> Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2022-05-23 12:55:35 +02:00
ddobokki	48c22691e3	Fix bug in Wav2Vec2 pretrain example (#17326 )	2022-05-19 22:42:44 +02:00
Zachary Mueller	1762ded30a	Fix metric calculation in examples and setup tests to run on multi-gpu for no_trainer scripts (#17331 ) * Fix length in no_trainer examples * Add setup and teardown * Use new accelerator config generator to automatically make tests able to run based on environment	2022-05-18 14:17:40 -04:00
Sylvain Gugger	4710702837	Fix style	2022-05-18 10:46:40 -04:00
mraunak	5fdb54ece7	Add Information Gain Filtration algorithm (#16953 ) * Add information gain filtration algorithm * Complying with black requirements * Added author * Fixed import order * flake8 corrections Co-authored-by: Javier Turek <javier.turek@intel.com>	2022-05-18 10:39:02 -04:00
regisss	28a0811652	Improve mismatched sizes management when loading a pretrained model (#17257 ) - Add --ignore_mismatched_sizes argument to classification examples - Expand the error message when loading a model whose head dimensions are different from expected dimensions	2022-05-17 17:58:14 +02:00
Loubna Ben Allal	05a90579a8	CodeParrot data pretokenization (#16932 ) * add pretokenization arguments * add pretokenization script * add support for pretokenized data * reformat code * fix run command for training * fix model call from config * remove a package * add comments on pretokenization in the readme * remove explicit parallelization Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * update readme Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * update readme -remove username Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * update readme -remove username Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * keep data parallelization * reformat code * reformat code * update readme * reformat code * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>	2022-05-16 15:32:16 +02:00
Loubna Ben Allal	e730e12567	Update codeparrot data preprocessing (#16944 ) * add new preprocessing arguments * add new filters * add new filters to readme * fix config and test count, update function names and docstrings * reformat code * update readme * Update readme * rename config_test filter Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * rename few_assignments filter Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * rename tokenizer in arguments Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * rename functions and add limit_line argument for config_test filter * update threshold for config_test filter Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>	2022-05-16 14:43:25 +02:00
Kenneth Enevoldsen	71d18d0831	fixed bug in run_mlm_flax_stream.py (#17203 ) * fixed bug run_mlm_flax_stream.py Fixed bug caused by an update to tokenizer keys introduced in recent transformers versions (between `4.6.2` and `4.18.0`) where additional keys were introduced to the tokenizer output. * Update run_mlm_flax_stream.py * adding missing paranthesis * formatted to black * remove cols from dataset instead * reformat to black * moved rem. columns to map * formatted to black Co-authored-by: KennethEnevoldsen <kennethcenevolsen@gmail.com>	2022-05-16 13:40:27 +02:00
Sylvain Gugger	afe5d42d8d	Black preview (#17217 ) * Black preview * Fixup too! * Fix check copies * Use the same version as the CI * Bump black	2022-05-12 16:25:55 -04:00
Lysandre Debut	5294fa12ee	Dev version	2022-05-12 11:04:23 -04:00
Zachary Mueller	d719bcd46a	Fix all docs for accelerate install directions (#17145 )	2022-05-09 15:45:18 -04:00
Zachary Mueller	ef20390291	Update to build via git for accelerate (#17084 )	2022-05-04 09:42:36 -04:00
dependabot[bot]	2bf95e2b09	Bump notebook from 6.4.1 to 6.4.10 in /examples/research_projects/lxmert (#16634 ) Bumps [notebook](http://jupyter.org) from 6.4.1 to 6.4.10. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-05-04 08:27:40 -04:00
dependabot[bot]	7a229ef446	Bump notebook in /examples/research_projects/visual_bert (#16635 ) Bumps [notebook](http://jupyter.org) from 6.4.1 to 6.4.10. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-05-04 08:27:27 -04:00
Thomas Wang	db034660fb	Fix hashing for deduplication (#17048 )	2022-05-04 08:40:24 +02:00
Pavel Belevich	39f8eafc1b	Remove device parameter from create_extended_attention_mask_for_decoder (#16894 )	2022-05-03 11:06:11 -04:00
Zachary Mueller	f275e593bf	Fix no_trainer examples to properly calculate the number of samples (#17046 ) * Update all examples to properly calculate progress bar	2022-05-02 11:56:25 -04:00
Zachary Mueller	35d48db881	Update no_trainer examples to use new logger (#17044 ) * Propagate and fix imports	2022-05-02 11:56:15 -04:00
yujun	bdd690a74d	add torch.no_grad when in eval mode (#17020 ) * add torch.no_grad when in eval mode * make style quality	2022-05-02 07:49:19 -04:00
Zachary Mueller	3486a92a57	Fix savedir for by epoch (#16996 )	2022-04-28 13:49:45 -04:00
conan1024hao	1be8d56ec6	Add parameter --config_overrides for run_mlm_wwm.py (#16961 ) * dd parameter --config_overrides for run_mlm_wwm.py * linter	2022-04-28 10:44:55 -04:00
Zachary Mueller	60e1d883f1	Fixup no_trainer save logic (#16968 ) * Fixup all examples	2022-04-27 14:46:49 -04:00
Sylvain Gugger	c79bbc3ba5	Fix multiple deletions of the same files in save_pretrained (#16947 ) * Fix multiple deletions of the same files in save_pretrained * Add is_main_process argument	2022-04-27 12:28:42 -04:00
Leonid Boytsov	c82e017aa9	Misc. fixes for Pytorch QA examples: (#16958 ) 1. Fixes evaluation errors popping up when you train/eval on squad v2 (one was newly encountered and one that was previously reported Running SQuAD 1.0 sample command raises IndexError #15401 but not completely fixed). 2. Removes boolean arguments that don't use store_true. Please, don't use these: *ANY non-empty string is being converted to True in this case and this clearly is not the desired behavior (and it creates a LOT of confusion). 3. All no-trainer test scripts are now saving metric values in the same way (with the right prefix eval_), which is consistent with the trainer-based versions. 4. Adds forgotten model.eval() in the no-trainer versions. This improved some results, but not everything (see the discussion in the end). Please, see the F1 scores and the discussion below.	2022-04-27 08:51:39 -04:00
NielsRogge	479fdc4925	Add semantic script, trainer (#16834 ) * Add first draft * Improve script and README * Improve README * Apply suggestions from code review * Improve script, add link to resulting model * Add corresponding test * Adjust learning rate	2022-04-27 10:12:18 +02:00
Anton Lozhkov	a4a88fa09f	[Research] Speed up evaluation for XTREME-S (#16785 ) * Avoid repeated per-lang filtering * Language groups and logits preprocessing * Style	2022-04-27 08:34:21 +02:00
code-review-doctor	6568752039	Fix issue probably-meant-fstring found at https://codereview.doctor (#16913 )	2022-04-25 15:15:00 -04:00
Sanchit Gandhi	fea94d6790	Replace deprecated logger.warn with warning (#16876 )	2022-04-25 15:12:51 -04:00
Loubna Ben Allal	d91841315a	New features for CodeParrot training script (#16851 ) * add tflops logging and fix grad accumulation * add accelerate tracking and checkpointing * scale loss of last batch correctly * fix typo * compress loss computation Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * add resume from checkpoint argument * add load_state accelerate from checkpoint, register lr scheduler and add tflops function * reformat code * reformat code * add condition on path for resume checkpoint * combine if conditions Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * add source for tflops formula Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2022-04-21 18:43:46 +02:00
Zachary Mueller	705d65368f	Fix multiproc metrics in no_trainer examples (#16865 )	2022-04-20 17:26:27 -04:00

1 2 3 4 5 ...

2143 Commits