transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-23 22:38:58 +06:00

Author	SHA1	Message	Date
Suraj Patil	d3bd9ac728	[Flax] improve large model init and loading (#16148 ) * begin do_init * add params_shape_tree * raise error if params are accessed when do_init is False * don't allow do_init=False when keys are missing * make shape tree a property * assign self._params at the end * add test for do_init * add do_init arg to all flax models * fix param setting * disbale do_init for composite models * update test * add do_init in FlaxBigBirdForMultipleChoice * better names and errors * improve test * style * add a warning when do_init=False * remove extra if * set params after _required_params * add test for from_pretrained * do_init => _do_init * chage warning to info * fix typo * add params in init_weights * add params to gpt neo init * add params to init_weights * update do_init test * Trigger CI * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update template * trigger CI * style * style * fix template Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-04-19 14:19:55 +02:00
NielsRogge	7db7aab439	Add semantic script no trainer, v2 (#16788 ) * Add first draft from previous PR * First draft * Improve README and remove num_labels * Make script more aligned with other scripts * Improve README and apply suggestion from code review	2022-04-19 09:07:29 +02:00
NielsRogge	78f346c2b5	Update README.md (#16797 )	2022-04-15 14:10:16 +02:00
NielsRogge	048443db86	Improve image classification example (#16585 ) * Improve README * Make dataset_name argument optional * Improve local data * Fix bug * Improve README some more * Apply suggestions from code review * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-14 18:10:52 +02:00
Zachary Mueller	be752d12f8	Fixup no_trainer examples scripts and add more tests (#16765 ) * Change tracking to store_true * Remove step param and use it in the log dictionary directly * use vars(args) when passing args to init_trackers * Include tracking tests since tensorboard is already a dep	2022-04-13 14:40:48 -04:00
Tu Vu	34ef029dc0	Add self training code for text classification (#16738 ) * Add self-training code for text-classification * Add self-training code for text-classification * Add self-training code for text-classification * Add self-training code for text-classification * Add self-training code for text-classification * Delete strata	2022-04-13 12:03:24 -04:00
Shang Zhang	14daa6102a	Qdqbert example add benchmark script with ORT-TRT (#16592 ) * add ort-trt benchmark script * Update README.md * ort version can be newer * formatting * specify ORT version	2022-04-12 11:13:59 -04:00
Heerak Son	db3edd050b	Update run_translation_no_trainer.py (#16652 ) args.model_name_or_path -> args.config_name fix it	2022-04-12 08:55:12 -04:00
Zachary Mueller	69233cf03b	Fix example logs repeating themselves (#16669 ) Move declaration of log streams to before tests, so that results won't get compounded on top of each other	2022-04-11 16:25:16 -04:00
Zachary Mueller	d4b3e359aa	Don't push checkpoints to hub in `no_trainer` scripts (#16703 ) Adds checkpoint prefixes to the gitignore if `push_to_hub` is used along with `checkpointint_steps`	2022-04-11 12:42:45 -04:00
Ahmed Elnaggar	5e68675755	Fix t5 shard on TPU Pods (#16527 ) * Fix t5 shard on TPU Pods The current script doesn't work properly on a TPU pod because the global batch is not divided correctly per host. This pull request fixes this issue by dividing the global batch to each host before it is shared on each host. * fix style Co-authored-by: ahmed-elnaggar <ahmed.elnaggar@allianz.com>	2022-04-11 16:45:20 +02:00
Jia LI	4868a830db	Jia multi gpu eval (#16428 ) * add simple multi gpu complet * add human_eval_multi_gpu * use copy strategy to distribute across gpu, to avoid padding * add doc string * update code style * use task id to arrange output * truncate input to avoid zero pad * Stop the copy mechanism * update style * restore copies to scale better in distributed mode * update style * replace human eval * Apply suggestions from code review 1. Tokenize all input at the same time 2. use attention_mask to get the input length 3. other small fixes Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * correct typo and update docstring * update code style * remove num sample division constraint * remove max len calculation * use accelerator.gather once to speed up * use accelerate set_seed; update accelerate version * correct gather bug Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2022-04-11 11:24:32 +02:00
Zachary Mueller	d57da99237	Add tests for no_trainer and fix existing examples (#16656 ) * Fixed some bugs involving saving during epochs * Added tests mimicking the existing examples tests * Added in json exporting to all `no_trainer` examples for consistency	2022-04-08 10:03:56 -04:00
NielsRogge	4ef0abb738	Add TAPEX (#16473 ) * Add TapexTokenizer * Improve docstrings and provide option to provide answer * Remove option for pretokenized inputs * Add TAPEX to README * Fix copies * Remove option for pretokenized inputs * Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification. * - Draft a README file for running the script and introducing some background. - Remove unused code lines in tabfact script. - Disable the deafult `pad_to_max_length` option which is memory-consuming. * * Support `as_target_tokenizer` function for TapexTokenizer. * Fix the do_lower_case behaviour of TapexTokenizer. * Add unit tests for target scenarios and cased/uncased scenarios for both source and target. * * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function. * Fix typos in tapex example README. * * fix the evaluation script - remove the property `task_name` * * Make the label space more clear for tabfact tasks * * Using a new fine-tuning script for tapex-base on tabfact. * * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case * Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql * * Remove the default tokenizer_name option. * Provide evaluation command. * * Support for WikiTableQuestion dataset. * Fix a typo in README. * * Fix the datasets's key name in WikiTableQuestions * Run make fixup and move test to folder * Fix quality * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply some more suggestions from code review * Improve docstrings * Overwrite failing test * Improve comment in example scripts * Fix rebase * Add TAPEX to Auto mapping * Add TAPEX to auto config mappings * Put TAPEX higher than BART in auto mapping * Add TAPEX to doc tests Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: SivilTaram <qianlxc@outlook.com> Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home> Co-authored-by: Suraj Patil <surajp815@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-04-08 10:57:51 +02:00
Zachary Mueller	febe42b5da	Update no_trainer scripts with new Accelerate functionalities (#16617 ) Adds logging and save/loading to the Accelerate scripts Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-04-06 15:29:32 -04:00
Lysandre Debut	a180efe7fd	Dev version	2022-04-06 11:08:12 -04:00
Karim Foda	24a85cca61	Add use_auth to load_datasets for private datasets to PT and TF examples (#16521 ) * fix formatting and remove use_auth * Add use_auth_token to Flax examples	2022-04-04 10:27:45 -04:00
Cathy	bfeff6cc6a	Fixed a typo in legacy seq2seq_trainer.py (#16531 )	2022-04-01 09:17:31 +02:00
Anton Lozhkov	5807054bd3	[research] link to the XTREME-S paper (#16519 ) * [research] link to the XTREME-S paper * Update examples/research_projects/xtreme-s/README.md Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-03-31 23:26:50 +04:00
Bhadresh Savani	05b4c32908	fixed a typo (#16508 )	2022-03-31 07:49:02 -04:00
Stas Bekman	a73281e3e4	[examples] max samples can't be bigger than the len of dataset (#16501 ) * [examples] max samples can't be bigger than then len of dataset * do tf and flax	2022-03-30 12:33:16 -07:00
Sylvain Gugger	b62ac4d240	Fix example test and test_fetcher for examples (#16478 )	2022-03-29 12:21:19 -04:00
Eldar Kurtic	5216607f8a	[MNLI example] Prevent overwriting matched with mismatched metrics (#16475 ) * Prevent overwriting matched with mismatched metrics * Fix style	2022-03-29 10:38:14 -04:00
Yongrae Jo	8049dfa427	Update run_t5_mlm_flax.py (#16421 ) Fix typo in comment: proprocessed -> preprocessed	2022-03-28 06:00:53 -04:00
Shang Zhang	7ecbb9c5e4	QDQBert example update (#16395 ) * update Dockerfile and utils_qa * Update README.md	2022-03-28 05:47:52 -04:00
Sylvain Gugger	867f3950fa	Rename master to main for notebooks links and leftovers (#16397 )	2022-03-25 09:12:23 -04:00
Sylvain Gugger	088c1880b7	Big file_utils cleanup (#16396 ) * Big file_utils cleanup * This one still needs to be treated separately	2022-03-25 07:25:20 -04:00
Nathan Cooper	f5e8c9bdea	Update readme with how to train offline and fix BPE command (#15897 ) * Update readme with how to train offline and fix BPE command * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>	2022-03-24 11:00:46 +01:00
Edward Beeching	aff9bc405a	Decision transformer gym (#15845 ) * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Add type hints for Pegasus (#16324) * Funnel type hints (#16323) * add pt funnel type hints * add tf funnel type hints * Add type hints for ProphetNet PyTorch (#16272) * [GLPN] Improve docs (#16331) * Add link to notebook * Add link * Fix bug Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> * Added type hints for Pytorch Marian calls (#16200) * Added type hinting for forward functions in pytorch marian * typo correction * Removed type hints on functions from BART per Suraj Patil request * fix import pb * fix typo * corrected tuple call * ran black * after fix-copies Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List * Fixing copies to roformer and pegasus Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * done (#16340) * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add type annotations for Rembert/Splinter and copies (#16338) * undo black autoformat * minor fix to rembert forward with default * make fix-copies, make quality * Adding types to template model * Removing List from the template types * Remove `Optional` from a couple of types that don't accept `None` Co-authored-by: matt <rocketknight1@gmail.com> * [Bug template] Shift responsibilities for long-range (#16344) * Fix code repetition in serialization guide (#16346) * Adopt framework-specific blocks for content (#16342) * ✨ refactor code samples with framework-specific blocks * ✨ update training.mdx * 🖍 apply feedback * Updates the default branch from master to main (#16326) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Created the Decision Transformer Modle * updating tests, copy to other machine * Added last hidden size to Decision Transformer modelling outputs * Removed copy of original DT file * made a temporary change to gpt2 to have it conform with the Decision Transformer version * Updated tests * Ignoring a file used to test the DT model * added comments to config file * added comments and argument descriptions to decision transformer file * Updated doc * Ran "make style" * Remove old model imports * Removed unused imports, cleaned up init file * Update docs/source/model_doc/decision_transformer.mdx added my username Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Reverted changes made to gpt2 * Removed datasets submodule * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states * Added support for return of hidden states, attentions and return dict of gpt2 model. * Updated tests to include many of the ModelTesterMixin tests. The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes * Added missing line to the end of gpt2 file * Added an integration test for the Decision Transformer Test performs and autoregressive evaluation for two time steps * Set done and info to _ to fix failing test * Updated integration test to be deterministic and check expected outputs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unnecessary config options * Cleaned up commented code and old comments. * Cleaned up commented code. * Changed DecisionTransformer to Decision Transformer * Added Decision Transformer to the main README file * Added copy of GTP2 called DecisionTranformerGPT2Model * isorted imports * isorted imports * Added model to non-English README files * Ran make fix-copies and corrected some cases. * Updated index file to include Decision Transformer * Added gpt2 model as copy inside the Decision Transformer model file * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS * Deleted redundant checkpoint files (I don't know how these got committed) * Removed testing files. (These should have never been committed) * Removed accidentally committed files * Moved the Decision Transformer test to its own directory * Moved DecisionTransformOutput to modeling_decision_transformer * Moved the example usage to research project and cleaned comments * Made tests ignore the copy of gpt2 in Decision Transformer * Added module output to modelling decision transformer * removed copied gpt2 model from list of transformers models * Updated tests and created __init__ file for new test location * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Removed unneeded summary type from config file * Fixed copies * Updated pretrained config map to refer to hopper-medium checkpoint * Added Decision transformer to model docs * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updated model with custom docstring example * Updated copies, config auto, and readme files. Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com> Co-authored-by: Adam Montgomerie <adam@avanssion.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com> Co-authored-by: Clementine Fourrier <cfourrie@inria.fr> Co-authored-by: matt <rocketknight1@gmail.com> Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com> Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-03-23 16:18:43 -04:00
Sylvain Gugger	4975002df5	Reorganize file utils (#16264 ) * Split file_utils in several submodules * Fixes * Add back more objects * More fixes * Who exactly decided to import that from there? * Second suggestion to code with code review * Revert wront move * Fix imports * Adapt all imports * Adapt all imports everywhere * Revert this import, will fix in a separate commit	2022-03-23 10:26:33 -04:00
Lysandre Debut	eca77f4719	Updates the default branch from master to main (#16326 ) * Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-23 03:46:59 -04:00
Anton Lozhkov	e226a24f84	[xtreme-s] Update Minds14 results (#16241 ) * update results * per-language metrics * Format the per-language metrics	2022-03-21 19:33:59 +01:00
Suraj Patil	93d3fd8645	remove jax.ops.index (#16220 )	2022-03-17 17:51:43 +01:00
Anton Lozhkov	d35e0c6247	Minor fixes to XTREME-S (#16193 ) * Minor fixes * Fix vocab union * Update examples/research_projects/xtreme-s/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update README * unused import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-03-16 17:23:00 +04:00
Sanchit Gandhi	ee27b3d7df	Replace all deprecated `jax.ops` operations with jnp's `at` (#16078 ) * Replace all deprecated `jax.ops` operations with jnp's `at` * np to jnp scores * suggested changes	2022-03-16 09:08:55 +00:00
Patrick von Platen	c2dc89be62	[Xtreme-S] fix some namings (#16183 )	2022-03-16 01:21:31 +01:00
Anton Lozhkov	99fd3eb4a5	Add the XTREME-S fine-tuning example (#15985 ) * CTC+classification draft * CTC+classification draft * style * multilingual runs * Fix race condition during processor.from_reatrained * Merge covost experiments * Add README * Quality * Switch to .all configs * Fix typos	2022-03-16 00:21:06 +01:00
Stas Bekman	580dd87c55	[Deepspeed] add support for bf16 mode (#14569 ) * [WIP] add support for bf16 mode * prep for bf16 * prep for bf16 * fix; zero2/bf16 is ok * check bf16 is available * test fixes * enable zero3_bf16 * config files * docs * split stage_dtype; merge back to non-dtype-specific config file * fix doc * cleanup * cleanup * bfloat16 => bf16 to match the PR changes * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/ * test fixes/skipping * move * fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * backticks * cleanup * cleanup * cleanup * new version * add note about grad accum in bf16 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-03-11 17:53:53 -08:00
Sylvain Gugger	19597998f6	Don't compute metrics in LM examples on TPU (#16029 )	2022-03-10 07:44:51 -05:00
Sanchit Gandhi	6c9010ef63	Update README.md	2022-03-10 10:20:37 +01:00
Shotaro Ishihara	8feede229c	Fix broken code blocks in README.md (#15967 ) at transformers/examples/pytorch/contrastive-image-text	2022-03-09 17:07:52 +01:00
Joao Gante	e7f34ccd4f	Swag example: Update doc format (#16014 )	2022-03-09 13:25:34 +00:00
Joao Gante	62d847602a	Update TF multiple choice example (#15868 )	2022-03-08 13:16:34 +00:00
Yeb Havinga	91fb62d01c	Speedup training by using numpy instead of jnp for batch shuffling (#15963 ) Speedup training by using numpy instead of jnp for batch shuffling Co-authored-by: Yeb Havinga <y.t.havinga@mgrid.net>	2022-03-08 12:18:38 +01:00
Patrick von Platen	10b76987fc	[FlaxT5 Example] fix flax t5 example pretraining (#15835 )	2022-03-04 17:04:43 +01:00
Sanchit Gandhi	b71474895d	Update README.md	2022-03-04 09:58:45 +01:00
davidleonfdez	c0281feb50	Fix #15898 (#15928 )	2022-03-03 14:41:03 -05:00
Sylvain Gugger	79d28e80b6	v4.18.0.dev.0	2022-03-03 10:19:58 -05:00
Ross Johnstone	e535c389aa	Fix tiny typo (#15884 )	2022-03-02 15:37:05 +01:00
Joao Gante	05c237ea94	Update TF QA example (#15870 )	2022-03-02 10:38:13 +00:00
Joao Gante	3f2e636850	Update TF LM examples (#15855 )	2022-03-01 14:12:58 +00:00
Suraj Patil	bf1fe32824	[examples/summarization and translation] fix readme (#15833 )	2022-02-25 17:28:16 +01:00
Lysandre Debut	29c10a41d0	[Test refactor 1/5] Per-folder tests reorganization (#15725 ) * Per-folder tests reorganization Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Stas Bekman <stas@stason.org>	2022-02-23 15:46:28 -05:00
Yongrae Jo	3db2e8f92b	Fix typo on examples/pytorch/question-answering (#15644 ) cna -> can	2022-02-22 13:51:07 -05:00
Joao Gante	3956b133b6	TF text classification examples (#15704 ) * Working example with to_tf_dataset * updated text_classification * more comments	2022-02-21 17:17:59 +00:00
Suraj Patil	86119c1154	add VisionTextDualEncoder and CLIP fine-tuning script (#15701 ) * begin script * update script * fix features and data args * main * add requirements * add column name args * fix captions * don't jit transforms * fix caption * fix labels, handle attention mask * convert pixel values to numpy * labels => input_ids * transform images on the fly * use AutoModel class, create the hybird model outside of the script * fix version message * add readme * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * adderss review comments * add more comments * allow freezing vision and text models Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-21 16:10:59 +01:00
Ivan Agarský	5444687f0f	Fix minor comment typos (#15740 )	2022-02-21 12:41:27 +01:00
Simon Sardorf	a63bd3675f	Remove input and target reset after preprocessing (#15741 ) Remove input and target reset after preprocessing	2022-02-21 11:10:15 +01:00
NielsRogge	57882177be	Add SimMIM (#15586 ) * Add first draft * Make model importable * Make SwinForMaskedImageModeling importable * Fix imports * Add missing inits * Add support for Swin * Fix bug * Fix bug * Fix another bug * Fix Swin MIM implementation * Fix default encoder stride * Fix Swin * Add print statements for debugging * Add image_size data argument * Fix Swin * Fix image_size * Add print statements for debugging * Fix print statement * Remove print statements * Improve reshaping of bool_masked_pos * Add support for DeiT, fix tests * Improve docstrings * Apply new black version * Improve script * Fix bug * Improve README * Apply suggestions from code review * Remove DS_Store and add to gitignore * Apply suggestions from code review + fix BEiT Flax * Revert BEiT changes * Improve README * Fix code quality * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain> Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-02-17 19:44:55 +01:00
NielsRogge	0e91f885c3	Add image classification notebook (#15667 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-02-17 13:14:01 +01:00
Shamane Siri	80f1a59168	updated with latest PL and Ray (#15653 )	2022-02-15 16:53:05 +01:00
Stas Bekman	fcb0f74397	[research_projects] deal with security alerts (#15594 ) * [research_projects] deal with security alerts * add a note of the original PL ver and warning	2022-02-11 14:31:09 -05:00
Patrick von Platen	3d5dea9bf0	Add example batch size to all commands (#15596 )	2022-02-10 08:52:07 -05:00
Lysandre Debut	7732d0fe7a	Upgrade black to version ~=22.0 (#15565 ) * Upgrade black to version ~=22.0 * Check copies * Fix code	2022-02-09 09:28:57 -05:00
Anton Lozhkov	a459f7f97d	Add ASR CTC streaming example (#15309 ) * Single-epoch run * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Infinite dataset * Trainer fix + distributed benchmark * Benchmark fix * unused import * interleaved splits * interleaved splits * has_length util * Move to research projects * Leftover Sized checks * Bump min version * Unused import * Revert trainer changes Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-02-07 18:35:37 +03:00
davidleonfdez	f1a4c4ead5	[WIP] Add preprocess_logits_for_metrics Trainer param (#15473 ) * Add preprocess_logits_for_metrics Trainer param * Compute accuracy in LM examples * Improve comments	2022-02-03 12:07:20 -05:00
Sylvain Gugger	45cac3fade	Fix labels stored in model config for token classification examples (#15482 ) * Playing * Properly set labels in model config for token classification example * Port to run_ner_no_trainer * Quality	2022-02-02 14:23:43 -05:00
Sylvain Gugger	d0b5ed110a	Harder check for IndexErrors in QA scripts (#15438 ) * Harder check for IndexErrors in QA scripts * Make test stronger	2022-02-01 15:49:13 -05:00
Kamal Raj	d2749cf72e	Update README.md (#15462 ) fix typo	2022-02-01 10:04:30 -05:00
Suraj Patil	87918d3221	[examples/Flax] add a section about GPUs (#15198 ) * add a section about GPUs * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-31 19:20:53 +01:00
Jonatas Grosman	f624249d8b	[Robust Speech Challenge] Add missing LR parameter (#15428 )	2022-01-31 15:50:56 +01:00
Julien Plu	aa19f478ac	Add (M)Luke model training for Token Classification in the examples (#14880 ) * Add Luke training * Fix true label tags * Fix true label tags * Fix true label tags * Update the data collator for Luke * Some training refactor for Luke * Improve data collator for Luke * Fix import * Fix datasets concatenation * Add the --max_entity_length argument for Luke models * Remove unused code * Fix style issues * Fix style issues * Move the Luke training into a separate folder * Fix style * Fix naming * Fix filtering * Fix filtering * Fix filter * Update some preprocessing * Move luke to research_projects * Checkstyle * Address comments * Fix style	2022-01-31 07:58:18 -05:00
François REMY	0094eba363	Fix additional DataTrainingArguments documentation (#15408 ) (This is an editorial change only)	2022-01-31 07:45:11 -05:00
Sylvain Gugger	c98a6ac211	Use argument for preprocessing workers in run_summairzation (#15394 )	2022-01-28 18:34:10 -05:00
Matt	b6b79faa7e	Make links explicit (#15395 ) * Make links explicit * Removing reference to compute_metrics() since it's kind of PyTorch-specific	2022-01-28 17:31:22 +00:00
dependabot[bot]	628b59e51d	Bump numpy from 1.19.2 to 1.21.0 in /examples/research_projects/lxmert (#15369 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-27 14:46:15 -05:00
dependabot[bot]	ca0848b2ff	Bump notebook in /examples/research_projects/visual_bert (#15368 ) Bumps [notebook](http://jupyter.org) from 6.1.5 to 6.4.1. --- updated-dependencies: - dependency-name: notebook dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2022-01-27 14:45:58 -05:00
dependabot[bot]	7d45a2e81c	Bump numpy in /examples/research_projects/visual_bert (#15367 ) Bumps [numpy](https://github.com/numpy/numpy) from 1.19.2 to 1.21.0. - [Release notes](https://github.com/numpy/numpy/releases) - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst.txt) - [Commits](https://github.com/numpy/numpy/compare/v1.19.2...v1.21.0) --- updated-dependencies: - dependency-name: numpy dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-27 14:45:18 -05:00
Lysandre	eab338104d	Docs for version v4.16.0	2022-01-27 13:11:51 -05:00
Lysandre	f87db5e412	Release: v4.16.0	2022-01-27 13:06:33 -05:00
Anton Lozhkov	196cce6e9b	Add a device argument to the eval script (#15371 ) * Device argument for the eval script * Default to none * isort	2022-01-27 15:58:55 +01:00
François REMY	19732cc07a	Fix 'eval_split_name' described as defaulting to 'train' (#15348 ) The default is correct (`test`) but the description is not.	2022-01-26 10:19:38 -05:00
Patrick von Platen	457dd4392b	[Examples] Correct run ner label2id for fine-tuned models (#15017 ) * up * up * make style * apply sylvains suggestions * apply changes to accelerate as well * more changes * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-01-24 21:18:04 +01:00
Patrick von Platen	4bf97415a4	Update eval.py (#15310 )	2022-01-24 11:46:38 +01:00
Sylvain Gugger	4cff3fae11	Second failing test	2022-01-21 12:19:28 -05:00
Sylvain Gugger	f6253147df	Skip failing test	2022-01-21 12:03:21 -05:00
Patrick von Platen	11afb709ec	[Robust Speech Challenge] Add timeline (#15274 )	2022-01-21 17:12:09 +01:00
lewtun	833635e259	Move BART + ONNX example to research_projects (#15271 ) * Move BART + ONNX example to research_projects * Add author information	2022-01-21 14:47:34 +01:00
NielsRogge	6c7b68d414	[ViTMAE] Add image pretraining script (#15242 ) * Add script * Improve script * Fix data collator * Update README * Add label_names argument * Apply suggestions from code review * Add config parameters * Update script * Fix bug * Improve README * Improve README and add test * Fix import * Add image_column_name	2022-01-21 12:11:08 +01:00
Anton Lozhkov	85ea462c08	Update README.md (#15246 ) Clarify OVH instruction	2022-01-20 13:40:26 +03:00
Anton Lozhkov	e57468b8a8	Update README.md (#15239 ) Add an OVHcloud tutorial URL for the Robust Speech Challenge	2022-01-20 11:46:50 +03:00
Patrick von Platen	691878ee2f	Update README.md (#15233 )	2022-01-19 18:03:17 +01:00
Suraj Patil	2a5a384970	fix speech event readme (#15227 )	2022-01-19 15:30:03 +01:00
Patrick von Platen	6d92c429c7	Update README.md (#15226 )	2022-01-19 15:23:00 +01:00
Patrick von Platen	19c217b4b7	Update README.md	2022-01-19 15:21:03 +01:00
Patrick von Platen	5439cda7f0	Update README.md	2022-01-19 15:19:57 +01:00
Kamal Raj	d1f5ca1afd	[FLAX] glue training example refactor (#13815 ) * refactor run_flax_glue.py * updated readme * rm unused import and args typo fix * refactor * make consistent arg name across task * has_tensorboard check * argparse -> argument dataclasses * refactor according to review * fix	2022-01-19 12:04:51 +01:00
Patrick von Platen	e118e085ea	[Robust Speech Event] Add guides (#15155 ) * up * improve readme * up * up * more info * up * up * Apply suggestions from code review Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * add more stuff for eval * update * up * Update README.md * Update examples/research_projects/xls_r/README.md Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com> * apply omar's suggestions Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>	2022-01-18 18:44:48 +01:00
Sylvain Gugger	6f0a9b41ef	Remove dependency to quiet Dependabot (#15205 )	2022-01-18 09:44:35 -05:00
Sylvain Gugger	531336bbfd	Fix deprecation warnings for int div (#15180 ) * Fix deprecation warnings for int div Co-authored-by: mgoldey <matthew.goldey@gmail.com> * Fix import * ensure that tensor output is python scalar * make backward compatible * make code more readable * adapt test functions Co-authored-by: mgoldey <matthew.goldey@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2022-01-18 07:28:53 -05:00
Sylvain Gugger	96881729ce	Remove assert on optional arg	2022-01-13 17:34:41 -05:00
Stas Bekman	762416ffa8	[examples/flax/language-modeling] set loglevel (#15129 )	2022-01-13 15:17:28 +01:00
Edoardo Federici	9a94bb8e21	mBART support for run_summarization.py (#15125 ) * Update run_summarization.py * Fixed languages and added missing code * fixed obj, docs, removed source_lang and target_lang * make style, run_summarization.py reformatted	2022-01-12 16:39:33 -05:00
Leandro von Werra	aa0135f2e0	fix: switch from slow to generic tokenizer class (#15122 )	2022-01-12 09:12:43 -05:00
Russell Klopfer	27b819b0e3	use block_size instead of max_seq_length in tf run_clm example (#15036 ) * use block_size instead of max_seq_length * fixup * remove pad_to_block_size Co-authored-by: Russell Klopfer <russell@kloper.us>	2022-01-12 08:57:00 -05:00
Patrick von Platen	d72343d2b8	[Wav2Vec2 Speech Event] Add speech event v2 (#15083 ) * up * up * up * up * up * up * improve * up * up * Update src/transformers/trainer.py * up * up * up	2022-01-10 10:46:21 +01:00
flozi00	b67f345d00	Update run_speech_recognition_seq2seq.py (#14967 )	2022-01-06 19:26:45 +03:00
Yih-Dar	9f89fa02ed	Add Flax image captioning example (#14864 ) * add image captioning example * update README * fix style & quality * simplify * apply review suggestions * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by: Suraj Patil <surajp815@gmail.com> * Apply review suggestions * add comments about using np instead jax array * remove unused lines * add model creation script * only support from_pretrained * fix style * fix * not use cache_dir when creating model * fix tokenizer creation * update README * fix quality * apply suggestion * simplify some blocks * Update examples/flax/image-captioning/README.md * Update examples/flax/image-captioning/run_image_captioning_flax.py Co-authored-by: Suraj Patil <surajp815@gmail.com> * apply suggestion Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2022-01-06 14:00:54 +01:00
flozi00	774ed4a027	Fix Code block (#14983 )	2022-01-04 12:59:20 +01:00
Patrick von Platen	600496fa50	[Wav2Vec2] Rename model's feature extractor to feature encoder (#14959 ) * rename classes * clean up more namings * remove bogus file * Apply suggestions from code review * Apply suggestions from code review * replace more names * more regex replace * make style * correct * correct more * make style * finish * correct more in wav2vec2 * make style * improve freeze_extractor * add aliases * add tf aliases	2021-12-28 20:33:23 +01:00
Patrick von Platen	f80775df2b	Update README.md (#14965 )	2021-12-28 13:41:27 +01:00
Patrick von Platen	1c121916f3	Add Speech Seq2Seq Training script (#14792 ) * start * add gradient checkpointing and feature extractor freezing * Apply suggestions from code review * up * up * up * correct * up * more changes * up * up * up * remove rst	2021-12-28 10:20:51 +01:00
Leandro von Werra	1d651868d6	add custom stopping criteria to human eval script (#14897 )	2021-12-23 14:59:11 +01:00
lewtun	355dc0ce67	Fix installation instructions for BART ONNX example (#14885 )	2021-12-23 04:05:32 -05:00
Patrick von Platen	fa39ff9fc4	Docs for v4.16.0dev0	2021-12-22 20:39:44 +01:00
Patrick von Platen	05fa1a7ac1	Release: v4.15.0	2021-12-22 18:43:15 +01:00
Mario Šaško	1045a36c1f	Fix pytorch image classification example (#14883 ) * Update example * Remove skip in tests	2021-12-22 14:42:19 +01:00
Sylvain Gugger	e51c7b5872	Skip failing test	2021-12-21 15:15:17 -05:00
Stas Bekman	033c3ed95a	[examples/summarization] deal with None in data records (#14816 ) * [examples/summarization] deal with None in data records * rewrite to use a simpler (slower) variant	2021-12-21 09:17:28 -08:00
Patrick von Platen	7ae6f07004	[ASR example] Improve example + add more examples (#14848 ) * up * load up * up	2021-12-21 13:12:22 +01:00
Patrick von Platen	c4a96cecbc	Wav2Vec2 meets phonemes (#14353 ) * up * add tokenizer * improve more * finish tokenizer * finish * adapt speech recognition script * adapt convert * more fixes * more fixes * update phonemizer wav2vec2 * better naming * fix more tests * more fixes swedish * correct tests * finish * improve script * remove file * up * lets get those 100 model architectures until the end of the month * make fix-copies * correct more * correct script * more fixes * more fixes * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * replace assert * fix copies * fix docs * new try docs * boom boom * update * add phonemizer to audio tests * make fix-copies * up * upload models * some changes * Update tests/test_tokenization_wav2vec2_phoneme.py Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * more fixes * remove @ Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>	2021-12-17 19:56:44 +01:00
Lysandre	7c9c41f43c	Docs for v4.14.0	2021-12-15 18:29:53 +01:00
Lysandre	960d8cb41d	Release: v4.14.0	2021-12-15 18:20:35 +01:00
Yih-Dar	a94105f95f	Fix preprocess_function in run_summarization_flax.py (#14769 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2021-12-15 11:36:28 +01:00
Benjamin Minixhofer	2a606f9974	Make data shuffling in `run_clm_flax.py` respect global seed (#13410 ) * use jax and jnp instead of numpy in data_loader * return batches as np.ndarray	2021-12-14 11:04:43 +01:00
Josué Nascimento	971e36667a	Change how to load config of XLNetLMHeadModel (#14746 )	2021-12-13 12:34:26 -05:00
Nathan Cooper	48bf7e47a0	Code parrot minor fixes/niceties (#14666 ) * Add some nicety flags for better controlling evaluation. * Fix dependency issue with outdated requirement * Add additional flag to example to ensure eval is done * Wrap code into main function for accelerate launcher to find * Fix valid batch size flag in readme * Add note to install git-lfs when initializing/training the model * Update examples/research_projects/codeparrot/scripts/arguments.py Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Revert "Wrap code into main function for accelerate launcher to find" This reverts commit `ff11df1c81`. * Fix formatting issue * Move git-lfs instructions to installation section * Add a quick check before code generation for code evaluation * Fix styling issue * Update examples/research_projects/codeparrot/scripts/human_eval.py Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> * Make iterable dataset use passed in tokenizer rather than globally defined one Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by: ncoop57 <nac33@students.uwf.edu>	2021-12-13 09:30:50 +01:00
Suraj Patil	6a025487a6	[Flax examples] remove dependancy on pytorch training args (#14636 ) * use custom training arguments * update tests	2021-12-12 09:19:12 +05:30
Lysandre	ab31b3e41b	Docs for v4.14.0dev0	2021-12-09 17:09:23 +01:00
Lysandre	4da3a696e4	Release: v4.13.0	2021-12-09 16:55:21 +01:00
Gaurang Tandon	4ea19de80c	fix: verify jsonlines file in run_translation (#14660 ) (#14661 ) * fix: verify jsonl in run_translation (#14660) * fix(run_translation.py): json/jsonl validation Both json and jsonl are to be accepted as valid jsonlines file extension * fix(run_translation.py): make black happy * Ran make style	2021-12-08 13:25:30 -05:00
Suraj Patil	75ae287aec	fix flax examples tests (#14646 ) * make tensorboard optional * update test_fetcher for flax examples * make the tests slow	2021-12-07 00:34:27 +05:30
Suraj Patil	cbe6026536	fix flax example tests (#14643 )	2021-12-06 23:14:37 +05:30
Jay Zhang	1ccc033c56	Update the example of exporting Bart + BeamSearch to ONNX module to resolve comments. (#14310 ) * Update code to resolve comments left in previous PR. * Add README.md file for this example. * Update examples/onnx/pytorch/translation/README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/onnx/pytorch/translation/README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update examples/onnx/pytorch/translation/README.md Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update README.md file to resolve comments. * Add a section name. * Update examples/onnx/pytorch/translation/README.md Co-authored-by: Gary Miguel <garymm@garymm.org> * Add more comments for _convert_past_list_to_tuple(). * Change the default file name to a consistent one. * Fix a format issue. * Update examples/onnx/pytorch/translation/README.md Co-authored-by: Gary Miguel <garymm@garymm.org> * Update examples/onnx/pytorch/translation/run_onnx_exporter.py Co-authored-by: Gary Miguel <garymm@garymm.org> * Update examples/onnx/pytorch/translation/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Change the folder to summarization and address some other coments. * Update the torch version. Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Gary Miguel <garymm@garymm.org> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-06 14:01:51 +01:00
Julien Chaumond	6cdc3a7844	[urls to hub] Replace outdated model tags with their now-canonical pipeline types (#14617 ) * Replace outdated model tags with their now-canonical pipeline types * spam the CI till it's green	2021-12-06 04:35:01 -05:00
Suraj Patil	c5bd732ac6	Add Flax example tests (#14599 ) * add test for glue * add tests for clm * fix clm test * add summrization tests * more tests * fix few tests * add test for t5 mlm * fix t5 mlm test * fix tests for multi device * cleanup * ci job * fix metric file name * make t5 more robust	2021-12-06 10:48:58 +05:30
Kamal Raj	803a8cd18f	updated readme with proper arguments (#14624 )	2021-12-05 22:12:51 -05:00
(Bill) Yuchen Lin	3977b58437	fix a typo (#14626 )	2021-12-05 11:31:23 +05:30
Leandro von Werra	43f953cc2e	Add CodeParrot 🦜 codebase (#14536 ) * add readme skeleton * update readme * add initialization script * add deduplication script * add codeparrot training script * add code generation evaluation * add validation loss script * add requirements * update readme * tweak readme * make style * add highlights to readme * add CLIs to scripts * add tokenizer training script * add docstring to constant length dataset * fix defaults in arguments * update readme with cli * move image to hub * tweaks of readme * fix cli commands * add author * explain env variables * fix formatting * Update examples/research_projects/codeparrot/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * replace generic with gpt2 tokenizer Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2021-12-02 10:41:35 +01:00
Sylvain Gugger	4df7d05a87	Doc new front (#14590 ) * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix typo in toctree (#14516) * Fix checkpoints badge * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Convert PretrainedConfig doc to Markdown * Use syntax * Add necessary doc files (#14496) * Doc fixes (#14499) * Fixes for the new front * Convert DETR file for table * Title is needed * Simplify a bit * Even simpler * Remove imports * Fix checkpoints badge * Fix typo in toctree (#14516) * Update versions.yml format (#14517) * Doc new front github actions (#14512) * Doc new front github actions * Fix docstring * Fix feature extraction utils import (#14515) * Address Julien's comments * Push to doc-builder * Ready for merge * Remove old build and deploy * Doc misc fixes (#14583) * Rm versions.yml from doc * Fix converting.rst * Rm pretrained_models from toctree * Fix index links (#14567) * Fix links in README * Localized READMEs * Fix copy script * Fix find doc script * Update README_ko.md Co-authored-by: Julien Chaumond <julien@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co> * Adapt build command to new CLI tools (#14578) * Fix typo * Fix doc interlinks (#14589) * Convert PretrainedConfig doc to Markdown * Use syntax * Rm pattern <[a-z]+(.html).> Rm huggingface.co/transformers/master * Rm .html * Rm .html from index.mdx * Rm .html from model_summary.rst * Update index.mdx rm html * Update remove .html * Fix inner doc links * Fix interlink in preprocssing.rst * Update pr_checks Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Styling Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <julien@huggingface.co>	2021-12-01 14:13:02 -05:00
Thomas Viehmann	6ed9882ddb	use functional interface for softmax in attention (#14198 ) * use functional interface instead of instantiating module and immediately calling it * fix torch.nn.functional to nn.functional. Thank you Stas!	2021-11-30 11:47:33 -05:00
Rahul Nadkarni	8332327dca	Fix sentinel token IDs in data collator for Flax T5 pretraining script (#14477 )	2021-11-29 17:30:17 +01:00
Kamal Raj	2bd950ca47	[Flax] token-classification model steps enumerate start from 1 (#14547 ) * step start from 1 * Updated cur_step calcualtion	2021-11-29 21:55:59 +05:30
Nicholas Broad	69e16abf98	Switch from using sum for flattening lists of lists in group_texts (#14472 ) * remove sum for list flattening * change to chain() make chain object a list * delete empty lines per sgugger's suggestions Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Nicholas Broad <nicholas@nmbroad.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-11-22 16:17:26 -05:00
Stas Bekman	11f65d4158	[test] add test for --config_overrides (#14466 ) * add test for --config_overrides * remove unneeded parts of the test	2021-11-22 11:33:43 -05:00
Shang Zhang	a59e7c1ed4	Add QDQBert model and quantization examples of SQUAD task (#14066 ) * clean up branch for add-qdqbert-model * README update for QAT example; update docstrings in modeling_qdqbert.py * Update qdqbert.rst * Update README.md * Update README.md * calibration data using traning set; QAT example runs in fp32 * re-use BERTtokenizer for qdqbert * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/model_doc/qdqbert.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove qdqbert tokenizer * Update qdqbert.rst * update evaluate-hf-trt-qa.py * update configuration_qdqbert.py * update modeling_qdqbert.py: add copied statement; replace assert with ValueError * update copied from statement * add is_quantization_available; run make fix-copies * unittest add require_quantization * add backend dependency to qdqbert model * update README; update evaluate script; make style * lint * docs qdqbert update * circleci build_doc add pytorch-quantization for qdqbert * update README * update example readme with instructions to upgrade TensorRT to 8.2 * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/models/qdqbert/configuration_qdqbert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * change quantization to pytorch_quantization for backend requirement * feed_forward_chunking not supported in QDQBert * make style * update model docstrings and comments in testing scripts * rename example to quantization-qdqbert; rename example scripts from qat to quant * Update src/transformers/models/qdqbert/modeling_qdqbert.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * rm experimental functions in quant_trainer * qa cleanup * make fix-copies for docs index.rst * fix doctree; use post_init() for qdqbert * fix early device assignment for qdqbert * fix CI:Model templates runner Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-19 13:33:39 -05:00
Patrick von Platen	efea0f868b	[Speech Recognition] More examples Add more XLS-R training runs to the official examples	2021-11-18 23:42:02 +01:00
William Held	01f8e639d3	Recover Deleted XNLI Instructions (#14437 )	2021-11-17 20:16:47 -05:00
Antonio Carlos Falcão Petri	7544efc92e	[Gradient checkpoining] Update Wav2Vec scripts (#14036 ) Co-authored-by: Stas Bekman <stas@stason.org>	2021-11-17 18:37:21 +01:00
Eldar Kurtic	9fd937ead1	Replace BertLayerNorm with LayerNorm (#14385 ) Running Movement pruning experiments with the newest HuggingFace would crash due to non-existing BertLayerNorm.	2021-11-15 13:25:10 -05:00
Matt	267867e851	Quick fix to TF summarization example (#14401 )	2021-11-15 13:45:51 +00:00
Patrick von Platen	55f49c5f4b	[Wav2Vec2 Example] Improve fine-tuning script (#14373 ) * improve some stuff * finish * correct last	2021-11-12 16:35:57 +01:00
Stas Bekman	77262ef750	fix --gradient_checkpointing (#13964 )	2021-11-11 17:50:21 +01:00
Matt	7f20bf0d43	Fixing requirements for TF LM models and use correct model mappings (#14372 ) * Fixing requirements for TF LM models and use correct model mappings * make style	2021-11-11 15:34:00 +00:00
Suraj Patil	e92190c0f8	Fix Flax params dtype (#13098 ) * fix inits * fix embed dtype * fix embed dtype * add test to check default dtype * quality * add type conversion methods for flax models * more robust casting * cast sinusoidal positions * update pegasus * update albert * update test * make sure dtype is passed to every module * style * fix electra dense * fix t5 * quality * add more tests * better name * use the dtype for lm head computation * fix albert * style * fix albert embed dtype * more tests * fix vision enc-dec * cleanup * fix embed dtype pegasus * fix default param test * doc * update template * fix final_logits_bias dtype * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * fix doc * fix doc * add detailed docstring for dtype parameter * remove un-necessary import Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-11-11 14:45:20 +05:30
Suraj Patil	85a4bda4f4	bump flax version (#14343 )	2021-11-09 22:15:22 +05:30
karthikrangasai	4f24058c58	Update Seq2Seq QA example script to use SQuAD metric. (#14335 ) * Update postporcessing accordingly to use SQuAD metric. * Update assets accordingly based on SQuAD metrics. * Fix function naming error.	2021-11-09 08:04:23 -05:00
Junbum Lee	c016dbdbda	Fix execution PATH for PPLM Example (#14287 )	2021-11-06 10:33:47 -04:00
Sylvain Gugger	08a5f57567	Add new LFS prune API (#14294 )	2021-11-05 18:58:51 -04:00
Sylvain Gugger	558f8543ba	Update Transformers to huggingface_hub >= 0.1.0 (#14251 ) * Update Transformers to huggingface_hub >= 0.1.0 * Forgot to save... * Style * Fix test	2021-11-02 18:58:42 -04:00
NielsRogge	7396095af7	Update README of QA examples (#14172 )	2021-11-01 12:52:22 +01:00
Thomas Wang	5b45422b58	Remove n_ctx from configs (#14165 ) * Remove n_ctx from configs * Fix GPTJ and OpenAIGPT, both are acceptable breaking changes as there are no configs such that it breaks * Remove unecessary n_positions from TFOpenAIGPT	2021-10-29 11:50:25 +02:00
Patrick von Platen	ba71f1b57f	Update README.md	2021-10-28 19:43:05 +02:00
Lysandre	b8fad022a0	v4.13.0.dev0	2021-10-28 12:56:46 -04:00
Lysandre	62bf536631	Release v4.12.0	2021-10-28 12:09:49 -04:00
Anton Lozhkov	78b6a2ecbd	Add audio-classification benchmarking results (#14192 )	2021-10-28 15:59:18 +03:00
Patrick von Platen	88cd82e801	Update README.md	2021-10-28 02:35:01 +02:00
Patrick von Platen	e118db15d6	Update README.md	2021-10-28 01:59:27 +02:00
Patrick von Platen	01b1466983	[TPU tests] Enable first TPU examples pytorch (#14121 ) * up * up * fix * up * Update examples/pytorch/test_xla_examples.py * correct labels * up * up * up * up * up * up	2021-10-28 01:22:28 +02:00
Emanuel Huber	ebd48c6de5	Replace assertions with ValueError exception (#14142 ) Updated masked-language modeling examples in pytorch with convention defined by #12789	2021-10-26 17:14:29 -04:00
Matthew Goldey	42bfb83d74	fix typos in error messages in speech recognition example and modelcard.py (#14166 ) * specify the text column name in the error message * pluralize the word fields	2021-10-26 16:36:26 -04:00
Jangwon Park	41dad89f70	chore: typo on ner accelerate example code (#14150 )	2021-10-26 16:23:41 -04:00
Patrick von Platen	9799f4e150	Update README.md	2021-10-26 18:59:25 +02:00
Patrick von Platen	f5ed19f57d	[Speech Recognition] - Distributed training: Make sure vocab file removal and creation don't interfer (#14161 ) * up * better	2021-10-26 15:59:33 +02:00
Patrick von Platen	e248e9b042	up (#14154 )	2021-10-26 13:08:18 +02:00
Patrick von Platen	c99a2832ed	Update README.md	2021-10-25 19:50:36 +02:00
Patrick von Platen	1a9381c60d	Update README.md	2021-10-25 19:49:51 +02:00
Reza Gharibi	2ac65551ea	Fix rendering of examples version links (#14134 )	2021-10-25 07:45:44 -04:00
karthikrangasai	1b871e091b	Supporting Seq2Seq model for question answering task (#13432 ) * Add seq2seq example for QnA on SQuAD Dataset. * Changes from review - Fixing styling mistakes. * Added how to example in README, simplified the access to dataset's preprocess function. * Added tests for the seq2seq QA example. * Change dataset column name to fix tests. * Fix test command mistake. * Add missing argument 'ignore_pad_token_for_loss' from DataTrainingArguments. * Add missing argument 'num_beams' from DataTrainingArguments. * Fix processing of output predicted token ids so that tokenizer decode gets appropriate input. Updated assertion conditions on the tests.	2021-10-25 07:42:53 -04:00
Antonio Carlos Falcão Petri	05a2afc252	Add missing --validation_split_percentage data args (#14119 )	2021-10-22 19:04:54 +02:00
lee1jun	d432a654f6	fix typo in license docstring (#14094 ) last line: "# limitations under the License." is missing	2021-10-21 15:31:32 -04:00
Anton Lozhkov	e03544a138	[Examples] Add audio classification notebooks (#14099 ) * Update SEW integration test tolerance * Add audio classification notebooks	2021-10-21 19:15:46 +03:00
Christopher Akiki	f9c16b02e3	Replace "Masked" with "Causal" in TF CLM example (#14014 )	2021-10-21 16:19:30 +01:00
Patrick von Platen	e9d2a639f4	up (#14093 )	2021-10-21 10:30:02 +02:00
Sylvain Gugger	f875fb0e5f	Fix label attribution in token classification examples (#14055 )	2021-10-20 07:55:14 -04:00
Patrick von Platen	53dc39d821	up (#14079 )	2021-10-20 13:01:42 +02:00
Patrick von Platen	0bc2e54f00	Add ASR colabs (#14067 ) * up * Update notebooks/README.md	2021-10-20 11:51:41 +02:00
Anton Lozhkov	dbaf49203e	[Examples] Use Audio feature in speech classification (#14052 ) * Update SEW integration test tolerance * Update audio classification * Update test * Remove torchaudio * Add dataset revision * Hub branch naming * Revert dataset revisions * Update datasets	2021-10-20 12:22:43 +03:00
Weizhe Yuan	7a3147e9b8	fix typo (#14049 )	2021-10-18 18:03:11 -04:00
Patrick von Platen	bdf31d6e0a	[Speech] Move all examples to new audio feature (#14045 ) * up * up * up * finish	2021-10-18 12:52:40 +02:00
Patrick von Platen	37c5759cbe	[Speech Examples] Add new audio feature (#14027 ) * finish * up * finish all * up	2021-10-17 23:01:03 +02:00
jacksukk	d5b82bb70c	Fixed horizon_length for PPLM (#13886 ) * fixed horizon_length * fixed horizon_length * fix style	2021-10-14 21:46:09 -04:00
Patrick von Platen	7fb2a8b3d9	up (#14008 )	2021-10-14 15:46:22 +02:00
Sylvain Gugger	0ef61d392c	Revert "Skip faulty test" This reverts commit `5b6bd4e788`.	2021-10-14 09:02:41 -04:00
Sylvain Gugger	5b6bd4e788	Skip faulty test	2021-10-13 22:04:40 -04:00
Patrick von Platen	d45fc7da3d	[Speech Examples] Add pytorch speech pretraining (#13877 ) * adapt wav2vec2 * add example * add files * adapt * remove bogus file * Apply suggestions from code review * adapt files more * upload changes * del old files * up * up * up * up * up * correct gradient checkpoitning * add readme * finish * finish * up * more fixes * up * up * add demo run to readme * up	2021-10-12 00:46:32 +02:00
Chungman Lee	46dfe99e44	Fix typo in README.md (#13883 )	2021-10-08 14:25:32 -04:00
Dhananjay Shettigar	319beb64eb	#12789 Replace assert statements with exceptions (#13909 ) * #12789 Replace assert statements with exceptions * fix-copies: made copy changes to utils_qa.py in examples/pytorch/question-answering and examples/tensorflow/question-answering * minor refactor for clarity	2021-10-07 09:09:01 -04:00
Jay Zhang	279ce5b705	Add an example of exporting BartModel + BeamSearch to ONNX module. (#13765 ) * Add all example files. * Reformat files by black. * Style. * Remove unused imports. Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>	2021-10-07 12:07:02 +02:00
Akul Agrawal	dac7798144	Update run_qa.py (#13857 )	2021-10-05 23:10:24 -04:00

... 2 3 4 5 6 ...

2140 Commits