# 🤗 Transformers Notebooks
You can find here a list of the official notebooks provided by Hugging Face.
We would also like to list interesting content created by the community. If you wrote a notebook leveraging 🤗 Transformers and would like it to be listed here, please open a Pull Request so it can be included under the Community notebooks.
## Hugging Face's notebooks 🤗
| Notebook | Description |
|:---|:---|
| Getting Started Tokenizers | How to train and use your very own tokenizer |
| Getting Started Transformers | How to easily start using transformers |
| How to use Pipelines | A simple and efficient way to use state-of-the-art models on downstream tasks through transformers (see the short sketch after this table) |
| How to train a language model | Highlights all the steps needed to train a Transformer model on custom data |
| How to generate text | How to use different decoding methods for language generation with transformers |
| How to export model to ONNX | Highlights how to export a model and run inference workloads through ONNX |
| How to use Benchmarks | How to benchmark models with transformers |
| Reformer | How Reformer pushes the limits of language modeling |
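The pipeline and text-generation notebooks above share the same high-level API. As a quick taste, here is a minimal sketch (the `gpt2` checkpoint and the example inputs are illustrative assumptions, not taken from any specific notebook) of running a sentiment-analysis pipeline and generating text:

```python
from transformers import pipeline

# Run a downstream task through the pipeline API; the default
# sentiment-analysis checkpoint is downloaded automatically on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers notebooks make it easy to get started."))

# Generate text with a GPT-2 checkpoint, as explored in the
# "How to generate text" notebook.
generator = pipeline("text-generation", model="gpt2")
print(generator("In this notebook we will", max_length=30, num_return_sequences=1))
```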
## Community notebooks
| Notebook | Description | Author |
|:---|:---|:---|
| Train T5 in TensorFlow 2 | How to train T5 for any task using TensorFlow 2. This notebook demonstrates a question-answering task implemented in TensorFlow 2 using SQuAD | Muhammad Harris |
| Train T5 on TPU | How to train T5 on SQuAD with Transformers and nlp | Suraj Patil |
| Fine-tune T5 for Classification and Multiple Choice | How to fine-tune T5 for classification and multiple choice tasks using a text-to-text format with PyTorch Lightning | Suraj Patil |
| Fine-tune DialoGPT on New Datasets and Languages | How to fine-tune the DialoGPT model on a new dataset for open-dialog conversational chatbots | Nathan Cooper |
| Long Sequence Modeling with Reformer | How to train on sequences as long as 500,000 tokens with Reformer | Patrick von Platen |
| Fine-tune BART for Summarization | How to fine-tune BART for summarization with fastai using blurr | Wayde Gilliam |
| Fine-tune a pre-trained Transformer on anyone's tweets | How to generate tweets in the style of your favorite Twitter account by fine-tuning a GPT-2 model | Boris Dayma |
| A Step by Step Guide to Tracking Hugging Face Model Performance | A quick tutorial for training NLP models with Hugging Face and visualizing their performance with Weights & Biases | Jack Morris |
| Pretrain Longformer | How to build a "long" version of existing pretrained models | Iz Beltagy |
| Fine-tune Longformer for QA | How to fine-tune a Longformer model for the QA task | Suraj Patil |
| Evaluate Model with 🤗nlp | How to evaluate Longformer on TriviaQA with nlp | Patrick von Platen |
| Fine-tune T5 for Sentiment Span Extraction | How to fine-tune T5 for sentiment span extraction using a text-to-text format with PyTorch Lightning | Lorenzo Ampil |
| Fine-tune DistilBert for Multiclass Classification | How to fine-tune DistilBERT for multiclass classification with PyTorch | Abhishek Kumar Mishra |
| Fine-tune BERT for Multi-label Classification | How to fine-tune BERT for multi-label classification using PyTorch | Abhishek Kumar Mishra |
| Fine-tune T5 for Summarization | How to fine-tune T5 for summarization in PyTorch and track experiments with WandB | Abhishek Kumar Mishra |
| Speed up Fine-Tuning in Transformers with Dynamic Padding / Bucketing | How to speed up fine-tuning by a factor of 2 using dynamic padding/bucketing | Michael Benesty |
| Pretrain Reformer for Masked Language Modeling | How to train a Reformer model with bi-directional self-attention layers | Patrick von Platen |
| Expand and Fine Tune Sci-BERT | How to increase the vocabulary of a pretrained SciBERT model from AllenAI on the CORD dataset and build a pipeline for it | Tanmay Thakur |
| Fine-tune Electra and interpret with Integrated Gradients | How to fine-tune ELECTRA for sentiment analysis and interpret predictions with Captum Integrated Gradients | Eliza Szczechla |
| Fine-tune a non-English GPT-2 Model with Trainer class | How to fine-tune a non-English GPT-2 model with the Trainer class | Philipp Schmid |
| Fine-tune a DistilBERT Model for Multi Label Classification task | How to fine-tune a DistilBERT model for a multi-label classification task | Dhaval Taunk |
| Fine-tune ALBERT for sentence-pair classification | How to fine-tune an ALBERT model or another BERT-based model for the sentence-pair classification task | Nadir El Manouzi |
| Fine-tune Roberta for sentiment analysis | How to fine-tune a RoBERTa model for sentiment analysis | Dhaval Taunk |
| Evaluating Question Generation Models | How accurate are the answers to questions generated by your seq2seq transformer model? | Pascal Zoleko |
| Classify text with DistilBERT and Tensorflow | How to fine-tune DistilBERT for text classification in TensorFlow | Peter Bayerle |
| Leverage BERT for Encoder-Decoder Summarization on CNN/Dailymail | How to warm-start an EncoderDecoderModel with a bert-base-uncased checkpoint for summarization on CNN/Dailymail | Patrick von Platen |
| Leverage RoBERTa for Encoder-Decoder Summarization on BBC XSum | How to warm-start a shared EncoderDecoderModel with a roberta-base checkpoint for summarization on BBC/XSum | Patrick von Platen |
| Fine-tuning TAPAS on Sequential Question Answering (SQA) | How to fine-tune TapasForQuestionAnswering with a tapas-base checkpoint on the Sequential Question Answering (SQA) dataset (see the sketch after this table) | Niels Rogge |
| Evaluating TAPAS on Table Fact Checking (TabFact) | How to evaluate a fine-tuned TapasForSequenceClassification with a tapas-base-finetuned-tabfact checkpoint using a combination of the 🤗 datasets and 🤗 transformers libraries | Niels Rogge |
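For the TAPAS notebooks listed last, the snippet below is a minimal sketch of table question answering with a fine-tuned checkpoint. The `google/tapas-base-finetuned-wtq` checkpoint, the toy table, and the extra `torch-scatter` dependency are assumptions for illustration and may differ from what the notebooks above use.

```python
# Assumes `pip install torch-scatter` in addition to transformers and pandas.
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

# Assumption: a TAPAS checkpoint fine-tuned on WikiTableQuestions (WTQ).
model_name = "google/tapas-base-finetuned-wtq"
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForQuestionAnswering.from_pretrained(model_name)

# The table must be a pandas DataFrame with string-valued cells.
table = pd.DataFrame(
    {"Actor": ["Brad Pitt", "Leonardo DiCaprio"], "Number of movies": ["87", "53"]}
)
queries = ["How many movies does Leonardo DiCaprio have?"]

inputs = tokenizer(table=table, queries=queries, padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# Convert the logits into cell coordinates and an aggregation index
# (the step tested via TapasTokenizer.convert_logits_to_predictions).
predicted_coordinates, predicted_aggregation = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
print(predicted_coordinates, predicted_aggregation)
```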