transformers/tests/models
Latest commit: abc400b06a by Thomas Wang, 2022-06-21 20:26:36 +02:00

Add final_layer_norm to OPT model (#17785)

* Add final_layer_norm to OPT model
* Add JAX and TF version
* Fix Keras name
* Woops
* Allow for non breaking change
* Apply suggestions from code review
* add tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
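The bullets above only name the change, so here is a minimal sketch of the idea: a layer norm applied once after the decoder stack, kept optional so checkpoints trained without it still load with their old behaviour (which is what "Allow for non breaking change" hints at). All class and argument names below are invented for the illustration; this is not the actual diff from PR #17785.

```python
import torch
from torch import nn


class ToyDecoder(nn.Module):
    """Illustrative decoder-only stack; names and the flag are assumptions
    for this sketch, not the code added in PR #17785."""

    def __init__(self, hidden_size: int, num_layers: int, apply_final_layer_norm: bool = True):
        super().__init__()
        # Stand-in for the decoder blocks (causal masking omitted for brevity).
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model=hidden_size, nhead=8, batch_first=True)
            for _ in range(num_layers)
        )
        # The gist of the change: one extra LayerNorm on the output of the last
        # block. Keeping it optional preserves compatibility with checkpoints
        # that were trained without it.
        self.final_layer_norm = nn.LayerNorm(hidden_size) if apply_final_layer_norm else None

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            hidden_states = layer(hidden_states)
        if self.final_layer_norm is not None:
            hidden_states = self.final_layer_norm(hidden_states)
        return hidden_states


# Quick shape check of the sketch.
x = torch.randn(2, 5, 64)
print(ToyDecoder(hidden_size=64, num_layers=2)(x).shape)  # torch.Size([2, 5, 64])
```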
Name    Latest commit message    Commit date
albert
auto Prepare transformers for v0.8.0 huggingface-hub release (#17716) 2022-06-21 11:51:18 -04:00
bart TF: BART compatible with XLA generation (#17479) 2022-06-20 11:07:46 +01:00
barthez
bartpho
beit
bert Black preview (#17217) 2022-05-12 16:25:55 -04:00
bert_generation Black preview (#17217) 2022-05-12 16:25:55 -04:00
bert_japanese Black preview (#17217) 2022-05-12 16:25:55 -04:00
bertweet
big_bird Use 5e-5 For BigBird PT/Flax equivalence tests (#17780) 2022-06-21 17:55:26 +02:00
bigbird_pegasus Black preview (#17217) 2022-05-12 16:25:55 -04:00
blenderbot fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
blenderbot_small Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
bloom fix tolerance for a bloom slow test (#17634) 2022-06-14 18:14:12 +02:00
bort
byt5
camembert
canine Black preview (#17217) 2022-05-12 16:25:55 -04:00
clip Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
convbert
convnext has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
cpm
ctrl Fix CTRL tests (#17508) 2022-06-01 16:27:23 +02:00
cvt has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
data2vec [Data2Vec] Speed up test (#17660) 2022-06-10 18:48:58 +02:00
deberta fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
deberta_v2 Fx support for Deberta-v[1-2], Hubert and LXMERT (#17539) 2022-06-07 18:05:20 +02:00
decision_transformer
deit
detr
distilbert
dit
dpr
dpt
electra
encoder_decoder
flaubert
flava has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
fnet Black preview (#17217) 2022-05-12 16:25:55 -04:00
fsmt Not use -1e4 as attn mask (#17306) 2022-06-20 16:16:16 +02:00
funnel
glpn
gpt_neo fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
gpt_neox Fix cache for GPT-Neo-X (#17764) 2022-06-20 08:43:36 -04:00
gpt2 TF: BART compatible with XLA generation (#17479) 2022-06-20 11:07:46 +01:00
gptj fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
herbert
hubert Fx support for Deberta-v[1-2], Hubert and LXMERT (#17539) 2022-06-07 18:05:20 +02:00
ibert fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
imagegpt Enabling imageGPT auto feature extractor. (#16871) 2022-05-24 12:30:46 +02:00
layoutlm Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
layoutlmv2 Add LayoutLMv3 (#17060) 2022-05-24 09:53:45 +02:00
layoutlmv3 Add LayoutLMv3 (#17060) 2022-05-24 09:53:45 +02:00
layoutxlm Fix LayoutXLMProcessorTest (#17506) 2022-06-01 16:26:37 +02:00
led fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
levit Add skip logic for attentions test - Levit (#17633) 2022-06-10 12:46:30 +02:00
longformer fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
longt5 [LongT5] disable model parallel test (#17702) 2022-06-14 17:27:39 +02:00
luke Debug LukeForMaskedLM (#17499) 2022-06-01 10:03:06 -04:00
lxmert Fx support for Deberta-v[1-2], Hubert and LXMERT (#17539) 2022-06-07 18:05:20 +02:00
m2m_100 Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
marian Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
maskformer
mbart Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
mbart50 Black preview (#17217) 2022-05-12 16:25:55 -04:00
mctct M-CTC-T Model (#16402) 2022-06-08 00:33:07 +02:00
megatron_bert
megatron_gpt2
mluke Black preview (#17217) 2022-05-12 16:25:55 -04:00
mobilebert Black preview (#17217) 2022-05-12 16:25:55 -04:00
mpnet
mt5
nystromformer
openai
opt Add final_layer_norm to OPT model (#17785) 2022-06-21 20:26:36 +02:00
pegasus Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
perceiver Black preview (#17217) 2022-05-12 16:25:55 -04:00
phobert
plbart Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
poolformer has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
prophetnet Black preview (#17217) 2022-05-12 16:25:55 -04:00
qdqbert
rag Avoid GPU OOM for a TF Rag test (#17638) 2022-06-10 18:50:29 +02:00
realm Black preview (#17217) 2022-05-12 16:25:55 -04:00
reformer Black preview (#17217) 2022-05-12 16:25:55 -04:00
regnet has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
rembert
resnet has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
retribert fix retribert's test_torch_encode_plus_sent_to_model (#17231) 2022-05-17 14:33:13 +02:00
roberta fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
roformer
segformer
sew
sew_d
speech_encoder_decoder
speech_to_text Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
speech_to_text_2 Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
splinter Add support for pretraining recurring span selection to Splinter (#17247) 2022-05-17 23:42:14 +02:00
squeezebert
swin Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
t5 TF: BART compatible with XLA generation (#17479) 2022-06-20 11:07:46 +01:00
tapas Add magic method to our TF models to convert datasets with column inference (#17160) 2022-06-06 15:53:49 +01:00
tapex
trajectory_transformer Add trajectory transformer (#17141) 2022-05-17 19:07:43 -04:00
transfo_xl Add magic method to our TF models to convert datasets with column inference (#17160) 2022-06-06 15:53:49 +01:00
trocr Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
unispeech
unispeech_sat
van has_attentions - consistent test skipping logic and tf tests (#17495) 2022-06-09 09:50:03 +02:00
vilt Black preview (#17217) 2022-05-12 16:25:55 -04:00
vision_encoder_decoder
vision_text_dual_encoder
visual_bert
vit ViT and Swin symbolic tracing with torch.fx (#17182) 2022-05-12 10:42:27 +02:00
vit_mae [ViTMAE] Fix docstrings and variable names (#17710) 2022-06-21 15:56:00 +02:00
wav2vec2 Black preview (#17217) 2022-05-12 16:25:55 -04:00
wav2vec2_conformer [Test] Fix W2V-Conformer integration test (#17303) 2022-05-17 18:20:36 +02:00
wav2vec2_phoneme
wav2vec2_with_lm
wavlm
xglm Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
xlm
xlm_prophetnet Black preview (#17217) 2022-05-12 16:25:55 -04:00
xlm_roberta Black preview (#17217) 2022-05-12 16:25:55 -04:00
xlm_roberta_xl
xlnet Fx support for multiple model architectures (#17393) 2022-05-31 10:02:55 +02:00
yolos
yoso fix train_new_from_iterator in the case of byte-level tokenizers (#17549) 2022-06-08 15:30:41 +02:00
__init__.py