transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
Lysandre Debut	d51302cca0	Fix slow dpr test (#10059 ) * Correct cast to device * Comment back the slow test	2021-02-08 04:43:25 -05:00
sandip	12e44af5d3	Integration test for FlauBert (#10022 )	2021-02-08 04:36:50 -05:00
Stas Bekman	24db8cc329	Can't mix --fp16 and --device cpu (#10041 )	2021-02-07 17:54:20 -08:00
Stas Bekman	769948fad2	json to jsonlines, and doc, and typo (#10043 )	2021-02-07 17:51:34 -08:00
Stas Bekman	8ea412a86f	[examples] make run scripts executable (#10037 ) * make executable * make executable * same for the template * cleanup	2021-02-05 15:51:18 -08:00
Suraj Patil	1cd16512dc	[examples/seq2seq] support label smoothing (#9844 ) * add prepare_decoder_input_ids_from_labels in s2s models * support lbl smoothing and enc/emb freezing * fix freezing * use pad_token_id from config * remove embed freezing and add warning * prepare decoder_input_ids inside DataCollatorForSeq2Seq	2021-02-05 23:21:57 +05:30
Patrick von Platen	b9720dd6f2	Bump minimum Jax requirement to 2.8.0 (#10027 ) * Bump minimum Jax requirement to 2.8.0 * update table	2021-02-05 16:20:26 +03:00
Patrick von Platen	89be094e29	[Templates] Add template "call-for-model" markdown and "call-for-big-bird" markdown (#9921 ) * add big bird * change teacher to mentor * add proposal template * adapt template * delete old template * correct some links * finish template * create big bird from template * add big bird * improve boxes * finish boxes * add pointers for BigBird * finish big bird * up * up * up * up * apply lysandres and sylvains suggestions * delete bogus file * correct markdown * try different style * try different style * finalize	2021-02-05 15:47:54 +03:00
Lysandre Debut	4bbad604eb	Clarify QA pipeline output based on character (#10021 ) * Clarify QA pipeline output based on character * Style	2021-02-05 05:40:30 -05:00
Lysandre	ad2c431097	Update doc deployment script path	2021-02-05 11:18:59 +01:00
Lysandre	95a5f271e5	Update doc deployment script	2021-02-05 11:10:29 +01:00
Sylvain Gugger	3be965c5db	Update doc for pre-release (#10014 ) * Update doc for pre-release * Use stable as default * Use the right commit :facepalms:	2021-02-04 16:52:27 -05:00
Sylvain Gugger	ba607db180	Bump version	2021-02-04 16:23:05 -05:00
Sylvain Gugger	4cd22512de	Release: 4.3.0.rc1	2021-02-04 15:41:19 -05:00
Sylvain Gugger	4739ce177d	Fix test for sagemaker and TPU integrations	2021-02-04 15:06:58 -05:00
Sylvain Gugger	21b3922e35	Authorize last version of tokenizer (#9799 ) * Authorize last version of tokenizer * Update version table * Fix conversion of spm tokenizers and fix some hub links * Bump tokenizers version to 0.10.1rc1 * Add script to check tokenizers conversion with XNLI * Add some more mask_token lstrip support * Must modify mask_token in slow tokenizers too * Keep using the old method for Pegasus * add missing import Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>	2021-02-04 14:18:33 -05:00
Nicolas Patry	d5888ef0ab	Hotfixing tests (blenderbot decoderonly tests, also need to remove (#10003 ) `encoder_no_repeat_ngram_size` from their config.	2021-02-04 11:41:34 -05:00
Stas Bekman	8c3b1fcb67	[trainer] a few fixes (#9993 ) * trainer fixes * don't switch the model just for deepspeed and mp * correct the fix	2021-02-04 07:44:56 -08:00
Daniel Stancl	714855bd8f	Remove "double" assignment in TF-BART like models (#9997 ) * Replace `attn_weights = attn_wegihts = tf.reshape(...)` with `attn_weights = tf.reshape(...)` and thus remove unintentionally used "double" assignment.	2021-02-04 10:24:47 -05:00
Sylvain Gugger	b72f16b3ec	Fix doc for TFConverBertModel	2021-02-04 10:14:46 -05:00
Nicolas Patry	aeb18b9224	Adding new `encoder_no_repeat_ngram_size` to `generate`. (#9984 ) Adding new `encoder_no_repeat_ngram_size` to `generate`. Blenderbot results seemed off compared to original ParlAI script: `https://parl.ai/projects/recipes/`. Notably the model seems to repeat a lot what was said during the conversation. The actual problem was that `no_repeat_ngram_size` actually applies to the `encoder_input_ids` but HF's `no_repeat_ngram_size` applies to the previously generated ids (within the decoder). The history conversation of blenderbot is within the `encoder` part so that explains why HF's implementation had the repetitions. This fix was focused on blenderbot not small and added tests for those because they are quite different in configuration. This change includes: - Adding a new EncoderNoRepeatLogitProcessor. - Adding 1 new arg to `generate` (`encoder_no_repeat_ngram_size`) - Adding 1 new config parameter `encoder_no_repeat_ngram_size`. - Adding 2 tests, one for the pipeline (high level, inputs exhibited repeat behavior, one low level for EncoderNoRepeatLogitProcessor) - Factored NoRepeatLogitProcessor so that logic could be reused. Further work: - Blenderbot conversational pipeline still does not behave correctly as they way input is prepared within the pipeline is still incorrect (follow up PR) - Blenderbot allows the bot to have personas, which is done by prepending "your personna: XXXX" to the input, this could be explored too in a follow up PR. @patrickvonplaten @LysandreJik * Update src/transformers/generation_logits_process.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/generation_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/configuration_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Doc quality. 
* Fixing test. * Last fixes. * Fixing to account for batch_size. * Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/generation_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-04 15:00:18 +01:00
Lysandre Debut	e89c959af9	Fix model templates (#9999 )	2021-02-04 07:47:26 -05:00
Daniel Hug	804cd185d8	Added Integration testing for DistilBert model from issue #9948 ' (#9995 )	2021-02-04 04:24:59 -05:00
demSd	00031785a8	BartForCausalLM analogs to `ProphetNetForCausalLM` (#9128 ) * initiliaze bart4causalLM * create BartDecoderWrapper, setters/getters * delete spaces * forward and additional methods * update cache function, loss function, remove ngram* params in data class. * add bartcausallm, bartdecoder testing * correct bart for causal lm * remove at * add mbart as well * up * fix typo * up * correct * add pegasusforcausallm * add blenderbotforcausallm * add blenderbotsmallforcausallm * add marianforcausallm * add test for MarianForCausalLM * add Pegasus test * add BlenderbotSmall test * add blenderbot test * fix a fail * fix an import fail * a fix * fix * Update modeling_pegasus.py * fix models * fix inputs_embeds setting getter * adapt tests * correct repo utils check * finish test improvement * fix tf models as well * make style * make fix-copies * fix copies * run all tests * last changes * fix all tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-02-04 11:56:12 +03:00
Sylvain Gugger	7898fc03b1	Add `from_slow` in fast tokenizers build and fixes some bugs (#9987 )	2021-02-04 03:34:23 -05:00
Stefan Schweter	6244727e05	distilbert: fix creation of sinusoidal embeddings when using PyTorch 1.8+ (#9917 )	2021-02-03 11:42:16 -05:00
sandip	2f06f2bcd6	Alber model integration testing added (#9980 )	2021-02-03 11:41:10 -05:00
sandip	75fd00fb25	Integration test added for TF MPnet (#9979 )	2021-02-03 11:39:40 -05:00
sandip	ce08043f7a	Integration test for mobilebert (#9978 )	2021-02-03 11:36:45 -05:00
sandip	1486205d23	TF DistilBERT integration tests (#9975 ) * TF DistilBERT integration test * Update test_modeling_tf_distilbert.py	2021-02-03 09:51:00 -05:00
sandip	f2d5c04e1f	Added integration tests for TensorFlow implementation of the ALBERT model (#9976 ) * TF Albert integration test * TF Alber integration test added	2021-02-03 09:49:18 -05:00
Suraj Patil	bca0dd5ee3	[run_clm.py] fix getting extention	2021-02-03 20:14:42 +05:30
yylun	5442a11f5f	fix steps_in_epoch variable in trainer when using max_steps (#9969 ) * fix steps_in_epoch variable when using max_steps * redundant sentence * Revert "redundant sentence" This reverts commit `ad5c0e9b6e`. * remove redundant sentence Co-authored-by: wujindou <wujindou@sogou-inc.com>	2021-02-03 09:30:37 -05:00
Julien Plu	3f77c26d74	Fix Longformer and LED (#9942 ) * Fix Longformer and LED * Add a test for graph execution with inputs_embeds * Apply style	2021-02-03 12:26:32 +01:00
Stas Bekman	d55e10beab	[research proj] [lxmert] rm bleach dependency (#9970 ) Looks like a vulnerability and it's not really used anywhere in the code, so just as well remove it completely from deps. https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/bleach/open	2021-02-03 05:24:40 -05:00
abhishek thakur	a1a67a3ced	Fix GroupedLinearLayer in TF ConvBERT (#9972 )	2021-02-03 04:49:07 -05:00
Daniel Stancl	71bdc076dd	Add head_mask and decoder_head_mask to PyTorch LED (#9856 ) * Add {decoder_,}head_mask to LED * Fix create_custom_forward signatue in encoder * Add head_mask to longformer * Add head_mask to longformer to fix dependencies of LED on Longformer. * Not working yet * Add mising one input in longofrmer_modeling.py * make fix-copies	2021-02-02 11:06:52 -08:00
Patrick von Platen	d6217fb30c	Wav2Vec2 (#9659 ) * add raw scaffold * implement feat extract layers * make style * remove + * correctly convert weights * make feat extractor work * make feature extraction proj work * run forward pass * finish forward pass * Succesful decoding example * remove unused files * more changes * add wav2vec tokenizer * add new structure * fix run forward * add other layer norm architecture * finish 2nd structure * add model tests * finish tests for tok and model * clean-up * make style * finish docstring for model and config * make style * correct docstring * correct tests * change checkpoints to fairseq * fix examples * finish wav2vec2 * make style * apply sylvains suggestions * apply lysandres suggestions * change print to log.info * re-add assert statement * add input_values as required input name * finish wav2vec2 tokenizer * Update tests/test_tokenization_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * apply sylvains suggestions Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-02-02 15:52:10 +03:00
Sylvain Gugger	d996024af7	Use compute_loss in prediction_step (#9935 )	2021-02-02 07:00:17 -05:00
Stefan Schweter	aa438a4265	convbert: minor fixes for conversion script (#9937 )	2021-02-02 06:09:24 -05:00
Sylvain Gugger	62024453c3	Bump numpy (#9934 )	2021-02-02 05:46:33 -05:00
Sylvain Gugger	de38a6e4d2	Fix 9918 (#9932 ) * Initial work * Fix doc styler and other models	2021-02-02 05:22:20 -05:00
Lysandre Debut	1809de5165	ALBERT Tokenizer integration test (#9943 ) * ALBERT Tokenizer integration test * Batching * Style	2021-02-02 04:39:33 -05:00
Patrick von Platen	0f4dc5d864	fix typo in naming (#9944 )	2021-02-02 12:22:42 +03:00
Patrick von Platen	538b3b4607	[Tokenizer Utils Base] Make pad function more flexible (#9928 ) * change tokenizer requirement * split line * Correct typo from list to str * improve style * make other function pretty as well * add comment * correct typo * add new test * pass tests for tok without padding token * Apply suggestions from code review	2021-02-02 10:35:27 +03:00
Jan Jitse Venselaar	d1b14c9b54	Tensorflow doc changes on loss output size (#9922 ) * Change documentation to correctly specify loss tensor size * Change documentation to correct input format for labels * Corrected output size of loss tensor for sequence classifier, multiple choice model and question answering	2021-02-01 11:17:50 -05:00
Suraj Patil	343057e141	Fix bart conversion script (#9923 ) * fix conversion script * typo * import nn	2021-02-01 19:17:14 +03:00
Patrick von Platen	0e3be1ac8f	Add new model docs (#9667 ) * add new model logic * fix docs * change structure * improve add_new_model * push new changes * up * up * correct spelling * improve docstring * correct line length * update readme * correct links * correct typos * only add rst file for now * Apply suggestions from code review 1 Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> * Apply suggestions from code review Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> * finish adding all suggestions * make style * apply Niels feedback * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply sylvains suggestions Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Bram Vanroy <Bram.Vanroy@UGent.be> Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-02-01 17:55:10 +03:00
Suraj Patil	0842c33edd	fix typos (#9924 )	2021-02-01 08:17:45 -05:00
CeShine Lee	8672bcda1f	Adafactor: avoid updating group["lr"] attributes (#9751 ) This affects Adafactor with relative_step=False and scale_parameter=True. Updating group["lr"] makes the result of ._get_lr() depends on the previous call, i.e., on the scale of other parameters. This isn't supposed to happen.	2021-02-01 08:07:33 -05:00
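
Note on 8672bcda1f: the message says that writing the scaled value back into group["lr"] makes each ._get_lr() result depend on the previous call. The sketch below is a simplified, hypothetical model of that behaviour in plain Python, not the library's optimizer code. The names `get_lr`, `step_with_writeback`, `step_without_writeback`, and the `eps2` floor are assumptions made for illustration only.

```python
# Simplified model (assumption): with relative_step=False and
# scale_parameter=True, the step size for a parameter is the group's base
# lr multiplied by that parameter's RMS, floored at a small epsilon.

def get_lr(group, rms, eps2=1e-3):
    """Return the step size for one parameter without touching `group`."""
    lr = group["lr"]
    if group["scale_parameter"]:
        lr = lr * max(eps2, rms)
    return lr


def step_with_writeback(group, rms_values):
    """Problematic pattern: store the scaled value back in group["lr"]."""
    used = []
    for rms in rms_values:
        # The write-back means the next parameter starts from an lr that
        # was already scaled by this parameter's RMS.
        group["lr"] = get_lr(group, rms)
        used.append(group["lr"])
    return used


def step_without_writeback(group, rms_values):
    """Pattern the commit describes: keep the scaled lr in a local value."""
    return [get_lr(group, rms) for rms in rms_values]
```

With a base lr of 0.1 and two parameters whose RMS values are 2.0 and 0.5, the write-back variant uses the step sizes 0.2 and 0.1, so the second parameter inherits the first parameter's scale. The variant without write-back uses 0.2 and 0.05, each derived from the unchanged base lr.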

6494 Commits
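
Note on aeb18b9224: the message distinguishes `no_repeat_ngram_size`, which looks at previously generated ids, from the new `encoder_no_repeat_ngram_size`, which looks at the encoder input ids. The sketch below illustrates the second idea on plain Python lists: it collects the tokens that would complete an n-gram already present in the encoder input. It is not the EncoderNoRepeatLogitProcessor named in the commit. The function name `banned_next_tokens` and its signature are hypothetical, and the real processor operates on batched tensors and logits.

```python
def banned_next_tokens(encoder_ids, generated_ids, ngram_size):
    """Return the set of token ids that would complete an n-gram of length
    `ngram_size` that already occurs in `encoder_ids`.

    Illustrative only: works on a single sequence of Python ints.
    """
    # Nothing can be banned until the decoder has produced enough tokens
    # to form the first (ngram_size - 1) positions of an n-gram.
    if ngram_size <= 0 or len(generated_ids) + 1 < ngram_size:
        return set()

    # The last (ngram_size - 1) generated tokens form the prefix to match.
    start = len(generated_ids) - (ngram_size - 1)
    prefix = tuple(generated_ids[start:])

    banned = set()
    for i in range(len(encoder_ids) - ngram_size + 1):
        ngram = tuple(encoder_ids[i:i + ngram_size])
        # If the encoder n-gram starts with the current prefix, its last
        # token would reproduce that n-gram in the output.
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned
```

A caller would typically set the scores of the returned ids to negative infinity before sampling the next token. For example, with encoder ids [5, 6, 7, 5, 6, 8], generated ids [1, 5, 6], and an n-gram size of 3, the tokens 7 and 8 are returned, because both (5, 6, 7) and (5, 6, 8) occur in the encoder input.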