Commit Graph

15053 Commits

Author SHA1 Message Date
Sanchit Gandhi
7490a97cac
[Flax] Fix incomplete batches in example scripts (#17863)
* [Flax] Fix incomplete batches in example scripts

* fix dataloader batching

* convert jnp batch idxs to np array

* add missing `pad_shard_unpad` to final prediction generate step

* only `pad_shard_unpad` at inference time

* merge conflicts

* remove incomplete batch step from eval

* fix run_qa.py

* add `pad_shard_unpad` to run_flax_ner.py

* add `pad_shard_unpad` to run_flax_glue.py

* add `pad_shard_unpad` to run_image_classification.py

* make style

* fix mlm flax eval batches

* remove redundant imports
2022-07-27 15:50:47 +01:00
Alara Dirik
9caf68a638
Owlvit test fixes (#18303)
* fix owlvit test assertion errors

* fix gpu test error

* remove redundant lines

* fix styling
2022-07-27 17:26:27 +03:00
Sylvain Gugger
0077360d67
Fix sacremoses soft dependency for Transformers XL (#18321)
* Fix sacremoses soft dependency for Transformers XL

* Add function to the submodule init
2022-07-27 09:37:02 -04:00
Lysandre Debut
5c5676cdf9
sentencepiece shouldn't be required for the fast LayoutXLM tokenizer (#18320)
2022-07-27 09:09:32 -04:00
Sylvain Gugger
cf32b2ee42
Remove all uses of six (#18318)
* Remove all uses of six

* fix quality
2022-07-27 08:39:09 -04:00
Duong A. Nguyen
170fcaa604
Generalize decay_mask_fn to apply mask to all LayerNorm params (#18273)
* generalize decay_mask_fn to find all layernorm params

* fixup

* generalising decay_mask_fn
2022-07-27 12:23:57 +01:00
Nouamane Tazi
83d2d74509
fix loading from pretrained for sharded model with `torch_dtype="auto"` (#18061)
2022-07-27 07:20:35 -04:00
Younes Belkada
7996ef74dd
fix module order (#18312)
- put gelu before 4h to h
2022-07-27 07:06:01 -04:00
Mikkel Denker
70e7d1d656
Fixes torch jit tracing for LayoutLMv2 model (re-open) (#18313)
* Fixes torch jit tracing for LayoutLMv2 model.
PyTorch seems to reuse memory for input_shape, which caused a mismatch in shapes later in the forward pass.

* Fixed code quality

* avoid unneeded allocation of vector for shape
2022-07-27 06:38:40 -04:00
Loubna Ben Allal
1d71ad8905
Update CodeParrot readme to include training in Megatron (#17798)
* add info about megatron training

* upload models and datasets from CodeParrot organization

* upload models and datasets from CodeParrot organization

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* fix typo and add comment about codeparrot vs megatron

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-07-27 11:59:08 +02:00
Yanming Wang
d5610b53fa
[XLA] Improve t5 model performance (#18288)
2022-07-27 10:44:14 +02:00
Seunghwan Hong
e318cda9ee
Apply type correction to TFSwinModelOutput (#18295)
Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
2022-07-27 04:35:56 -04:00
NielsRogge
ccd4180f8a
[EncoderDecoder] Improve docs (#18271)
* Improve docs

* Improve docs of speech one as well

* Apply suggestions from code review

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-07-27 10:08:59 +02:00
Manuel R. Ciosici
5dfec704da
Remove duplicated line (#18310)
Removes a duplicated instantiation of device. I removed the second instance of the line to maintain code alignment with the GPT-J implementation of forward.
2022-07-27 04:00:47 -04:00
NielsRogge
47c2af0951
[DETR] Improve code examples (#18262)
* Improve doc test

* Improve code example of segmentation model

* Apply suggestion

* Update src/transformers/models/detr/modeling_detr.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-27 09:54:41 +02:00
Carolyn Wang
ee67e7ad4f
patch for smddp import (#18244)
* add import

* format
2022-07-26 16:00:24 -04:00
Matt
68097dcce0
Fix Sylvain's nits on the original KerasMetricCallback PR (#18300)
* Fix Sylvain's nits on the original PR

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Re-add "optional" to docstring

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-26 17:08:16 +01:00
Yih-Dar
6649133124
Add PYTEST_TIMEOUT for CircleCI test jobs (#18251)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-26 17:57:59 +02:00
Ian Castillo
a5d504834d
Add Spanish translation of custom_models.mdx (#17807)
* Update index

* Translate to Spanish two sections from custom_models

* Translate to Spanish custom models documentation

* Fixing typos and grammatical errors

* Add requested changes from reviewer
2022-07-26 10:10:37 -04:00
Federico Panero
7ea7eba39d
Add Italian translation of sharing_custom_models.mdx (#17631)
* work in progress: custom_models

* Update custom_models.mdx

* Update custom_models.mdx

* Update _toctree.yml

* Update _toctree.yml

* Update custom_models.mdx

* Update custom_models.mdx

* Update _toctree.yml

* Update _toctree.yml

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-26 09:48:58 -04:00
Yih-Dar
c4c6b4dbda
Add PyTorch 1.11 to past CI (#18302)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-26 15:47:23 +02:00
Federico Panero
bbc28106e0
Add Italian translation of converting_tensorflow_models.mdx (#18283)
* Add Italian translation of converting_tensorflow_models.mdx

* Update _toctree.yml

* Update converting_tensorflow_models.mdx

* Update docs/source/it/_toctree.yml

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-26 08:37:34 -04:00
Matt
a649de5551
Raise a TF-specific error when importing Torch classes (#18280)
* Raise a TF-specific error when importing Torch classes

* Update src/transformers/utils/import_utils.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Add an inverse error for PyTorch users

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-07-26 13:28:59 +01:00
Fellip Silva Alves
5e0ffd9183
[ create_a_model.mdx ] translate to pt (#18098)
* [ fast_tokenizers.mdx ] - Added translation to Portuguese to tutorial

* Delete docs/source/pt-br directory

* [ fast_tokenizers.mdx ] - Continuing work on file

* [ fast_tokenizers.mdx ] - Continuing work on file

* Add fast tokenizers to _toctree.yml

* Eliminated config and toctree.yml

* Nits in fast_tokenizers.mdx

* Finishing create_a_model

* [ create_a_model.mdx ] finishing create a model in pt-br

* [ Changing _toctree.yml ] adding create a model in pt

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-07-26 08:01:08 -04:00
Gorkem Ozkaya
f58b9c0522
Update translation.mdx (#18169)
* Update translation.mdx

* update translation.mdx by running make style
2022-07-26 07:56:40 -04:00
Yih-Dar
b51695274a
Add TFAutoModelForImageClassification to pipelines.py (#18292)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-26 13:44:54 +02:00
Tom Mathews
f374d3918f
Adding type hints of TF:OpenAIGPT (#18263)
2022-07-26 12:30:06 +01:00
Tom Mathews
5bb211be6e
Adding type hints of TF:CTRL (#18264)
2022-07-26 12:27:02 +01:00
Sylvain Gugger
c8ed1b8b59
Replace false parameter by a buffer (#18259)
2022-07-26 13:02:58 +02:00
Jingya HUANG
2844c5de10
Fix ORTTrainer failure on gpt2 fp16 training (#18017)
* Ensure value and attn weights have the same dtype

* Remove prints

* Modify decision transformers copied from gpt2

* Nit device

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Fix style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2022-07-26 04:14:08 -04:00
gilad19
2b09650885
Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER) (#17924)
* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* Add ViltForTokenClassification e.g. for Named-Entity-Recognition (NER)

* provide classifier only text hidden states

* add test_for_token_classification

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/vilt/modeling_vilt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add test_for_token_classification

Co-authored-by: gfuchs <gfuchs@ebay.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2022-07-26 10:11:32 +02:00
Alara Dirik
002915aa2a
Owlvit docs test (#18257)
* fix docs and add owlvit docs test

* fix minor bug in post_process, add to processor

* improve owlvit code examples

* fix hardcoded image size
2022-07-26 10:55:14 +03:00
Lysandre Debut
d32558cc7a
Good difficult issue override for the stalebot (#18094)
2022-07-26 03:39:14 -04:00
Yih-Dar
f65307e498
Fix dtype of input_features in docstring (#18258)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-26 09:34:06 +02:00
Raghavan
bd87480d20
Fix command of doc tests for local testing (#18236)
* Fix command of doc tests for local testing

* Fix command for after running doc tests locally
2022-07-26 03:07:11 -04:00
Matt
45a1475462
Fix TF bad words filter with XLA (#18286)
* Fix bad words filter in XLA generation

* Remove my cool debug breakpoints (again)
2022-07-25 20:19:39 +01:00
Matt
f4e172716b
Allows KerasMetricCallback to use XLA generation (#18265)
* Allows `KerasMetricCallback` to use XLA generation

* make fixup

* Slightly reword docstring
2022-07-25 12:51:37 +01:00
Yih-Dar
bbb62f2924
Skip passes report for --make-reports (#18250)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-25 11:09:23 +02:00
Joao Gante
7e44226fc7
Generate: deprecate default max_length (#18018)
2022-07-23 18:02:03 +01:00
amyeroberts
8e8384663d
Update serving code to enable saved_model=True (#18153)
* Add serving_output and serving methods to some vision models

* Add serving outputs for DeiT

* Don't convert hidden states - differing shapes

* Make saveable

* Fix up

* Make swin saveable

* Add in tests

* Fix funnel tests (can't convert to tensor)

* Fix numpy call

* Tidy up a bit

* Add in hidden states - resnet

* Remove numpy

* Fix failing tests - tensor shape and skipping tests

* Remove duplicated function

* PR comments - formatting and var names

* PR comments
Add suggestions made by Joao Gante:
* Use tf.shape instead of shape_list
* Use @tooslow decorator on tests
* Simplify some of the logic

* PR comments
Address Yih-Dar Shieh's comments - make tensor names consistent and make types float

* Types consistent with docs; disable test on swin (slow)

* CI trigger

* Change input_features to float32

* Add serving_output for segformer

* Fixup

Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
2022-07-22 18:05:38 +01:00
Matt
07505358ba
Change how take_along_axis is computed in DeBERTa to stop confusing XLA (#18256)
* Change how `take_along_axis` is computed in DeBERTa to stop confusing XLA

* Greatly simplify take_along_axis() since the code wasn't using most of it
2022-07-22 17:01:30 +01:00
Yih-Dar
d95a32cc60
Fix torch version check in Vilt (#18260)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-22 16:24:49 +02:00
Muhammad Ahmed
7cb4da13fe
change bloom parameters to 176B (#18235)
2022-07-22 10:17:48 -04:00
Joao Gante
1fc4b2a132
TF: use the correct config with (...)EncoderDecoder models (#18097)
2022-07-22 13:31:45 +01:00
Fx039482
4935409757
Add Italian translation of create_model.mdx and serialization.mdx (#17640)
* First commit

* final changes

* Changed create_model to create_a_model
Translated into crea un'architettura personalizzata in the file it/_toctree.yml

* Added _toctree.yml in the Italian translation (local: serialization, title: Esporta modelli transformers)

* Edit translation for create_model.mdx

* Added file serialization for translation in Italian

* Fix toctree serialization position

I checked the eng toctree and realized I made a mistake.

* Update _toctree.yml

Correct spacing

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-22 13:53:54 +02:00
Sylvain Gugger
06d98e272e
Fix OwlViT tests (#18253)
* Fix OwlViT tests

* Forgot one
2022-07-22 13:32:19 +02:00
Alara Dirik
12d66b4701
Add OWL-ViT model for zero-shot object detection (#17938)
* add owlvit model skeleton

* add class and box predictor heads

* convert modified flax clip to pytorch

* fix box and class predictors

* add OwlViTImageTextEmbedder

* convert class and box head checkpoints

* convert image text embedder checkpoints

* add object detection head

* fix bugs

* update conversion script

* update conversion script

* fix q,v,k,out weight conversion

* add owlvit object detection output

* fix bug in image embedder

* fix bugs in text embedder

* fix positional embeddings

* fix bug in inference mode vision pooling

* update docs, init tokenizer and processor files

* support batch processing

* add OwlViTProcessor

* remove merge conflicts

* readd owlvit imports

* fix bug in OwlViTProcessor imports

* fix bugs in processor

* update docs

* fix bugs in processor

* update owlvit docs

* add OwlViTFeatureExtractor

* style changes, add postprocess method to feature extractor

* add feature extractor and processor tests

* add object detection tests

* update conversion script

* update config paths

* update config paths

* fix configuration paths and bugs

* fix bugs in OwlViT tests

* add import checks to processor

* fix docs and minor issues

* fix docs and minor issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* fix bugs and issues

* update docs and examples

* fix bugs and issues

* update conversion script, fix positional embeddings

* process 2D input ids, update tests

* fix style and quality issues

* update docs

* update docs and imports

* update OWL-ViT index.md

* fix bug in OwlViT feature ext tests

* fix code examples, return_dict by default

* return_dict by default

* minor fixes, add tests to processor

* small fixes

* add output_attentions arg to main model

* fix bugs

* remove output_hidden_states arg from main model

* update self.config variables

* add option to return last_hidden_states

* fix bug in config variables

* fix copied from statements

* fix small issues and bugs

* fix bugs

* fix bugs, support greyscale images

* run fixup

* update repo name

* merge OwlViTImageTextEmbedder with obj detection head

* fix merge conflict

* fix merge conflict

* make fixup

* fix bugs

* fix bugs

* add additional processor test
2022-07-22 13:35:32 +03:00
Zachary Mueller
99eb9b523f
Fix no_trainer CI (#18242)
* Fix all tests
2022-07-21 14:44:57 -04:00
Sayak Paul
561b9a8c00
[SegFormer] TensorFlow port (#17910)
* add: segformer utils and img. classification.

* add: segmentation layer.

* feat: working implementation of segformer.

* chore: remove unused variable.

* add test, remaining modifications.

* remove: unnecessary files.

* add: rest of the files.

Co-authored-by: matt <rocketknight1@gmail.com>

* chore: remove ModuleList comment.

* chore: apply make style.

* chore: apply make fixup-copies.

* add  to check_repo.py

* add decode head to IGNORE_NON_TESTED

* chore: run make style.

* chore: PR comments.

* chore: minor changes to model doc.

* tests: reduction across samples.

* add a note on the space.

* sort imports.

* fix: reduction in loss computation.

* chore: align loss function with that of NER.

* chore: correct utils/documentation_tests.txt

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* chore: simplify the interpolation of logits in loss computation.

* chore: return transposed logits when return_dict=False.

* chore: add link to the tf fine-tuning repo.

* address pr comments.

* address niels's comments.

* remove from_pt=True since tf weights are in.

* remove comment from pt model.

* address niels's comments.

Co-authored-by: matt <rocketknight1@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-07-21 18:22:37 +01:00
Yih-Dar
2c5747edfe
Update notification service (#17921)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-21 15:03:50 +02:00