transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-18 03:58:25 +06:00

Yih-Dar 8b240a0661 Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)	2021-10-13 00:10:34 +02:00

* Add cross attentions to TFGPT2Model
* Add TFEncoderDecoderModel
* Add TFBaseModelOutputWithPoolingAndCrossAttentions
* Add cross attentions to TFBertModel
* Fix past or past_key_values argument issue
* Fix generation
* Fix save and load
* Add some checks and comments
* Clean the code that deals with past keys/values
* Add kwargs to processing_inputs
* Add serving_output to TFEncoderDecoderModel
* Some cleaning + fix use_cache value issue
* Fix tests + add bert2bert/bert2gpt2 tests
* Fix more tests
* Ignore crossattention.bias when loading GPT2 weights into TFGPT2
* Fix return_dict_in_generate in tf generation
* Fix is_token_logit_eos_token bug in tf generation
* Finalize the tests after fixing some bugs
* Fix another is_token_logit_eos_token bug in tf generation
* Add/Update docs
* Add TFBertEncoderDecoderModelTest
* Clean test script
* Add TFEncoderDecoderModel to the library
* Add cross attentions to TFRobertaModel
* Add TFRobertaEncoderDecoderModelTest
* make style
* Change the way of position_ids computation
* bug fix
* Fix copies in tf_albert
* Remove some copied from and apply some fix-copies
* Remove some copied
* Add cross attentions to some other TF models
* Remove encoder_hidden_states from TFLayoutLMModel.call for now
* Make style
* Fix TFRemBertForCausalLM
* Revert the change to longformer + Remove copies
* Revert the change to albert and convbert + Remove copies
* make quality
* make style
* Add TFRembertEncoderDecoderModelTest
* make quality and fix-copies
* test TFRobertaForCausalLM
* Fixes for failed tests
* Fixes for failed tests
* fix more tests
* Fixes for failed tests
* Fix Auto mapping order
* Fix TFRemBertEncoder return value
* fix tf_rembert
* Check copies are OK
* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
* Add TFEncoderDecoderModelSaveLoadTests
* fix tf weight loading
* check the change of use_cache
* Revert the change
* Add missing test_for_causal_lm for TFRobertaModelTest
* Try cleaning past
* fix _reorder_cache
* Revert some files to original versions
* Keep as many copies as possible
* Apply suggested changes - Use raise ValueError instead of assert
* Move import to top
* Fix wrong require_torch
* Replace more assert by raise ValueError
* Add test_pt_tf_model_equivalence (the test won't pass for now)
* add test for loading/saving
* finish
* finish
* Remove test_pt_tf_model_equivalence
* Update tf modeling template
* Remove pooling, added in the prev. commit, from MainLayer
* Update tf modeling test template
* Move inputs["use_cache"] = False to modeling_tf_utils.py
* Fix torch.Tensor in the comment
* fix use_cache
* Fix missing use_cache in ElectraConfig
* Add a note to from_pretrained
* Fix style
* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
* Fix TFMLP (in TFGPT2) activation issue
* Fix None past_key_values value in serving_output
* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
* Apply review suggestions - style for cross_attns in serving_output
* Apply review suggestions - change assert + docstrings
* break the error message to respect the char limit
* deprecate the argument past
* fix docstring style
* Update the encoder-decoder rst file
* fix Unknown interpreted text role "method"
* fix typo

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

albert.rst	albert flax (#13294)	2021-08-30 17:29:27 +02:00
auto.rst	Image Segmentation pipeline (#13828)	2021-10-08 09:59:53 +02:00
bart.rst	FlaxBart (#11537)	2021-06-14 15:16:08 +05:30
barthez.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
beit.rst	beit-flax (#13515)	2021-09-21 13:34:19 +02:00
bert_japanese.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
bert.rst	[Flax] Correct flax docs (#12782)	2021-08-04 16:31:23 +02:00
bertgeneration.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
bertweet.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
bigbird_pegasus.rst	Add BigBirdPegasus (#10991)	2021-05-07 09:27:43 +02:00
bigbird.rst	Flax Big Bird (#11967)	2021-06-14 20:01:03 +01:00
blenderbot_small.rst	Add BlenderBot small tokenizer to the init (#13367)	2021-09-22 19:00:47 -04:00
blenderbot.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
bort.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
byt5.rst	Improve T5 docs (#13240)	2021-09-01 15:05:40 +02:00
camembert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
canine.rst	Wrong model is used in example, should be character instead of subword model (#12676)	2021-07-13 08:40:27 -04:00
clip.rst	add and fix examples (#12810)	2021-07-20 09:28:50 -04:00
convbert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
cpm.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
ctrl.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
deberta_v2.rst	Deberta_v2 tf (#13120)	2021-08-31 06:32:47 -04:00
deberta.rst	Deberta tf (#12972)	2021-08-12 05:01:26 -04:00
deit.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
detr.rst	Improve detr (#12147)	2021-06-17 10:37:54 -04:00
dialogpt.rst	ADD BORT (#9813)	2021-01-27 21:25:11 +03:00
distilbert.rst	Fix typo distilbert doc (#13643)	2021-09-20 15:10:33 -04:00
dpr.rst	[DPR] Correct init (#13796)	2021-09-30 18:55:20 +02:00
electra.rst	[Flax] Add Electra models (#11426)	2021-05-04 20:56:09 +02:00
encoderdecoder.rst	Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)	2021-10-13 00:10:34 +02:00
flaubert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
fnet.rst	Add FNet (#13045)	2021-09-20 13:24:30 +02:00
fsmt.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
funnel.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
gpt_neo.rst	FlaxGPTNeo (#12493)	2021-07-06 18:55:18 +05:30
gpt.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
gpt2.rst	Add Mistral GPT-2 Stability Tweaks (#13573)	2021-10-04 07:37:09 -04:00
gptj.rst	[docs/gpt-j] fix typo (#13851)	2021-10-04 12:30:50 +02:00
herbert.rst	Fixed typo: herBERT -> HerBERT (#13936)	2021-10-08 10:27:32 -04:00
hubert.rst	Add Wav2Vec2 & Hubert ForSequenceClassification (#13153)	2021-08-27 20:52:51 +03:00
ibert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
layoutlm.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
layoutlmv2.rst	Add LayoutLMv2 + LayoutXLM (#12604)	2021-08-30 12:35:42 +02:00
layoutxlm.rst	Add tokenizer docs (#13373)	2021-09-02 09:46:05 +02:00
led.rst	Make gradient_checkpointing a training argument (#13657)	2021-09-22 07:51:38 -04:00
longformer.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
luke.rst	Add LUKE (#11223)	2021-05-03 09:07:29 -04:00
lxmert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
m2m_100.rst	replace tgt_lang by tgt_text (#13061)	2021-08-09 22:47:05 +05:30
marian.rst	Rely on huggingface_hub for common tools (#13100)	2021-08-12 14:59:02 +02:00
mbart.rst	fix example (#13387)	2021-09-02 11:32:18 +02:00
megatron_bert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
megatron_gpt2.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
mobilebert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
mpnet.rst	MPNet copyright files (#9015)	2020-12-10 09:29:38 -05:00
mt5.rst	Fix mT5 documentation (#13639)	2021-09-20 07:53:31 -04:00
pegasus.rst	[Flax] Addition of FlaxPegasus (#13420)	2021-09-14 17:15:19 +02:00
phobert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
prophetnet.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
rag.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
reformer.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
rembert.rst	Add RemBERT model code to huggingface (#10692)	2021-07-24 11:31:42 -04:00
retribert.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
roberta.rst	Add TFEncoderDecoderModel + Add cross-attention to some TF models (#13222)	2021-10-13 00:10:34 +02:00
roformer.rst	[RoFormer] Fix some issues (#12397)	2021-07-06 03:31:57 -04:00
speech_to_text_2.rst	Add SpeechEncoderDecoder & Speech2Text2 (#13186)	2021-09-01 13:33:31 +02:00
speech_to_text.rst	fix: typo spelling grammar (#13212)	2021-08-30 08:09:14 -04:00
speechencoderdecoder.rst	Add SpeechEncoderDecoder & Speech2Text2 (#13186)	2021-09-01 13:33:31 +02:00
splinter.rst	Add splinter (#12955)	2021-08-17 08:29:01 -04:00
squeezebert.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
t5.rst	Improve T5 docs (#13240)	2021-09-01 15:05:40 +02:00
t5v1.1.rst	Improve T5 docs (#13240)	2021-09-01 15:05:40 +02:00
tapas.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
transformerxl.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
visual_bert.rst	Fix VisualBERT docs (#13106)	2021-08-13 11:44:04 +05:30
vit.rst	Add DINO conversion script (#13265)	2021-08-26 17:25:20 +02:00
wav2vec2.rst	Add Wav2Vec2 & Hubert ForSequenceClassification (#13153)	2021-08-27 20:52:51 +03:00
xlm.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
xlmprophetnet.rst	Copyright (#8970)	2020-12-07 18:36:34 -05:00
xlmroberta.rst	Honor contributors to models (#11329)	2021-04-21 09:47:27 -04:00
xlnet.rst	Examples reorg (#11350)	2021-04-21 11:11:20 -04:00
xlsr_wav2vec2.rst	[XLSR-Wav2Vec2] Add multi-lingual Wav2Vec2 models (#10648)	2021-03-11 17:44:18 +03:00