transformers/docs/source/en
Daniel Stancl (commit a72f1c9f5b)
Add LongT5 model (#16792)
* Initial commit

* Make some fixes

* Make PT model full forward pass

* Drop TF & Flax implementation, fix copies etc

* Add Flax model and update some corresponding stuff

* Drop some TF things

* Update config and flax local attn

* Add encoder_attention_type to config
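
For context, the option this commit introduces can be exercised directly on the config; a minimal sketch, assuming the merged `LongT5Config` exposes `encoder_attention_type` with the values `"local"` and `"transient-global"`:

```python
# Minimal sketch: pick the encoder attention variant via the config.
# Assumption: `encoder_attention_type` accepts "local" or "transient-global".
from transformers import LongT5Config, LongT5ForConditionalGeneration

config = LongT5Config(encoder_attention_type="transient-global")  # or "local"
model = LongT5ForConditionalGeneration(config)  # randomly initialized model
print(model.config.encoder_attention_type)
```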

* .

* Update docs

* Do some cleansing

* Fix some issues -> make style; add some docs

* Fix position_bias + mask addition + Update tests

* Fix repo consistency

* Fix model consistency by removing flax operation over attn_mask

* [WIP] Add PT TGlobal LongT5

* .

* [WIP] Add flax tglobal model

* [WIP] Update flax model to use the right attention type in the encoder

* Fix flax tglobal model forward pass

* Make use of global_relative_attention_bias

* Add test suites for TGlobal model

* Fix minor bugs, clean code

* Fix PT-Flax equivalence, though not yet convinced of correctness

* Fix LocalAttn implementation to match the original impl. + update READMEs

* Few updates

* Update: [Flax] improve large model init and loading #16148

* Add ckpt conversion script according to #16853 + handle torch device placement
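
The PR's dedicated conversion script is not reproduced here; as a hedged illustration of the same idea, a Flax checkpoint can be moved into PyTorch with the generic cross-framework loading path (the local paths below are hypothetical):

```python
# Generic Flax -> PyTorch conversion sketch (not the PR's script).
# Assumption: a Flax LongT5 checkpoint lives at the hypothetical path below.
from transformers import LongT5ForConditionalGeneration

pt_model = LongT5ForConditionalGeneration.from_pretrained(
    "path/to/flax_checkpoint", from_flax=True
)
pt_model.to("cpu")  # explicit device placement before saving
pt_model.save_pretrained("path/to/pytorch_checkpoint")
```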

* Minor updates to conversion script.

* Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
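
The class named in this fix is the Flax auto class; a minimal loading and generation sketch, assuming one of the released LongT5 checkpoints on the Hub as an example:

```python
# Sketch: load a converted LongT5 checkpoint with the Flax auto class.
# Assumption: "google/long-t5-tglobal-base" is used as an example checkpoint.
from transformers import AutoTokenizer, FlaxAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google/long-t5-tglobal-base")

inputs = tokenizer("summarize: studies have shown that ...", return_tensors="np")
outputs = model.generate(**inputs, max_length=32)
print(tokenizer.decode(outputs.sequences[0], skip_special_tokens=True))
```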

* gpu support + dtype fix

* Apply some suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Remove (de)parallelize stuff
* Edit shape comments
* Update README.md
* make fix-copies

* Remove caching logic for local & tglobal attention

* Apply another batch of suggestions from code review

* Add missing checkpoints
* Format converting scripts
* Drop (de)parallelize links from LongT5 mdx

* Fix converting script + revert config file change

* Revert "Remove caching logic for local & tglobal attention"

This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.

* Stash caching logic in Flax model

* Always use side relative bias

* Drop caching logic in PT model

* Return side bias as it was

* Drop all remaining model parallel logic

* Remove clamp statements

* Move test files to the proper place

* Update docs with new version of hf-doc-builder

* Fix test imports

* Make some minor improvements

* Add missing checkpoints to docs
* Make TGlobal model compatible with torch.onnx.export (see the export sketch below)
* Replace some np.ndarray with jnp.ndarray
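
As referenced above, ONNX compatibility of the TGlobal model can be smoke-tested with a plain `torch.onnx.export` call; this is only a sketch, assuming dummy inputs suffice and that caching is disabled so the traced forward pass returns a flat tuple:

```python
# Sketch: trace the full encoder-decoder forward pass with torch.onnx.export.
# Assumptions: dummy inputs are enough for a smoke test, and disabling
# use_cache/return_dict keeps the traced outputs ONNX-friendly.
import torch
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")
tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model.eval()
model.config.use_cache = False    # no past_key_values in the traced graph
model.config.return_dict = False  # tuple outputs are easier to name

enc = tokenizer("summarize: " + "long input " * 256, return_tensors="pt")
decoder_input_ids = torch.full((1, 1), model.config.decoder_start_token_id, dtype=torch.long)

torch.onnx.export(
    model,
    (enc["input_ids"], enc["attention_mask"], decoder_input_ids),
    "longt5_tglobal.onnx",
    input_names=["input_ids", "attention_mask", "decoder_input_ids"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "encoder_sequence"},
        "attention_mask": {0: "batch", 1: "encoder_sequence"},
        "decoder_input_ids": {0: "batch", 1: "decoder_sequence"},
    },
    opset_version=13,
)
```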

* Fix TGlobal for ONNX conversion + update docs

* fix _make_global_fixed_block_ids and masked negative value

* update flax model

* style and quality

* fix imports

* remove load_tf_weights_in_longt5 from init and fix copies

* add slow test for TGlobal model

* typo fix

* Drop obsolete is_parallelizable and one warning

* Update __init__ files to fix repo-consistency

* fix pipeline test

* Fix some device placements

* [WIP] Update tests -- need to generate summaries to update expected_summary

* Fix quality

* Update LongT5 model card

* Update (slow) summarization tests
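
The slow summarization tests exercise long-input generation; a rough stand-in for that usage, with placeholder text and the base TGlobal checkpoint assumed as an example:

```python
# Long-document summarization sketch (placeholder input; the checkpoint choice
# is an assumption -- any LongT5 seq2seq checkpoint follows the same pattern).
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")

long_document = "replace this with a multi-thousand-token article " * 400
inputs = tokenizer(long_document, max_length=16384, truncation=True, return_tensors="pt")

summary_ids = model.generate(**inputs, max_length=128, num_beams=2, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```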

* make style

* rename checkpoints

* finish

* fix flax tests

Co-authored-by: phungvanduy <pvduy23@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patil-suraj <surajp815@gmail.com>
2022-06-13 22:36:58 +02:00
Name | Last commit | Last updated
internal | Allow from transformers import TypicalLogitsWarper (#17477) | 2022-06-03 11:08:35 +02:00
main_classes | Add Visual Question Answering (VQA) pipeline (#17286) | 2022-06-13 07:49:44 -04:00
model_doc | Add LongT5 model (#16792) | 2022-06-13 22:36:58 +02:00
tasks | Update audio examples with MInDS-14 (#16633) | 2022-04-08 15:55:42 -05:00
_config.py | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
_toctree.yml | Add LongT5 model (#16792) | 2022-06-13 22:36:58 +02:00
accelerate.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
add_new_model.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
add_new_pipeline.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
autoclass_tutorial.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
benchmarks.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
bertology.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
big_models.mdx | Make Trainer compatible with sharded checkpoints (#17053) | 2022-05-03 09:55:10 -04:00
community.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
contributing.md | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
converting_tensorflow_models.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
create_a_model.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
custom_models.mdx | Update custom_models.mdx (#16964) | 2022-04-27 16:46:55 +02:00
debugging.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
fast_tokenizers.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
glossary.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
index.mdx | Add LongT5 model (#16792) | 2022-06-13 22:36:58 +02:00
installation.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
migration.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
model_sharing.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
model_summary.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
multilingual.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
notebooks.md | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
pad_truncation.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
perf_hardware.mdx | [WIP] [doc] performance/scalability revamp (#15723) | 2022-05-16 13:36:41 +02:00
perf_train_cpu.mdx | Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138) | 2022-06-08 09:41:57 -04:00
perf_train_gpu_many.mdx | [WIP] [doc] performance/scalability revamp (#15723) | 2022-05-16 13:36:41 +02:00
perf_train_gpu_one.mdx | [WIP] [doc] performance/scalability revamp (#15723) | 2022-05-16 13:36:41 +02:00
performance.mdx | Extend Transformers Trainer Class to Enable CPU AMP and Integrate Intel Extension for PyTorch (#17138) | 2022-06-08 09:41:57 -04:00
perplexity.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
philosophy.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
pipeline_tutorial.mdx | docs(transformers): fix typo (#17263) | 2022-05-16 17:04:30 -04:00
pr_checks.mdx | Add a check on config classes docstring checkpoints (#17012) | 2022-04-30 10:40:46 +02:00
preprocessing.mdx | Fixing the output of code examples in the preprocessing chapter (#17162) | 2022-05-10 12:16:28 -04:00
quicktour.mdx | Fix doc test quicktour dataset (#16929) | 2022-04-25 16:26:59 +02:00
run_scripts.mdx | Fix all docs for accelerate install directions (#17145) | 2022-05-09 15:45:18 -04:00
sagemaker.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
serialization.mdx | Add LongT5 model (#16792) | 2022-06-13 22:36:58 +02:00
task_summary.mdx | [Doctests] Correct task summary (#16644) | 2022-04-11 14:59:35 +02:00
testing.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
tokenizer_summary.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
training.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00
troubleshooting.mdx | Enable doc in Spanish (#16518) | 2022-04-04 10:25:46 -04:00