Commit Graph

12581 Commits

Author SHA1 Message Date
Yih-Dar
32b08742a5
DocumentQuestionAnsweringPipeline only for fast tokenizers (#22745)
* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-13 17:22:59 +02:00
Gabriel Yang
4def2fe969
🌐 [i18n-KO] Translated training.mdx to Korean (#22670)
translate training doc to Korean
2023-04-13 11:04:47 -04:00
Yih-Dar
7df1343292
Change torch_dtype to str when saved_model=True in save_pretrained for TF models (#22740)
* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-13 15:52:16 +02:00
NielsRogge
8eb38f638d
[Pix2struct] Simplify generation (#22527)
* Add model to doc tests

* Remove generate and replace by prepare_inputs_for_generation

* More fixes

* Remove print statements

* Update integration tests

* Fix generate

* Remove model from auto mapping

* Use auto processor

* Fix integration tests

* Fix test

* Add inference code snippet

* Remove is_encoder_decoder

* Update docs

* Remove notebook link
2023-04-13 09:01:14 -04:00
Rinat
95e7057507
Make vilt, switch_transformers compatible with model parallelism (#22703)
* Update modeling_vilt.py

Vilt compatible with model parallelism

* Update modeling_switch_transformers.py

switch_transformers compatible with model parallelism
2023-04-13 06:50:30 -04:00
Joel Lamy-Poirier
89087597ba
Indexing fix for gpt_bigcode (#22737)
Fix indexing
2023-04-13 11:00:37 +01:00
Elabonga Atuo
7ade6ef7d4
[Doctest] Add configuration_mvp.py (#22735)
* added configuration file for mvp model

* added configuration_mvp.py line to file
2023-04-13 08:19:18 +02:00
Elabonga Atuo
51007976ec
[Doctest] Add configuration_m2m_100.py (#22733)
m2m-100-config for doctest
2023-04-13 08:17:07 +02:00
Sylvain Gugger
888c4a2ae0
v4.29.0.dev0 2023-04-12 20:04:29 -04:00
Matt
50f82e1282
Fix docstrings for TF BLIP (#22618)
* Fix docstrings for TFBLIP

* Fix missing line in TF port!

* Use values from torch tests now other bugs fixed

* Use values from torch tests now other bugs fixed

* Fix doctest string
2023-04-12 17:46:41 +01:00
NielsRogge
ce06e4780e
Update warning levels (#22727)
* Use different level

* Remove futurewarning

* Use warning_once

* Update copies
2023-04-12 17:25:24 +01:00
Arthur
9858195481
add fast support and option (#22724)
* add fast support and option

* update based on review

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/llama/convert_llama_weights_to_hf.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* nit

* add print

* fixup

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-12 18:10:04 +02:00
Michael Benayoun
10fab90fe2
torch.distributed group initialization for torch_neuron disabled when optimum-neuron is installed (#22728)
* Make the process group initialization not happen if optimum_neuron is installed

* Add warning

* Remove list and added warning
2023-04-12 17:42:50 +02:00
Stas Bekman
1306b7d3ae
[tests] switch to torchrun (#22712) 2023-04-12 08:25:45 -07:00
ARKA1112
d87ef00c31
Modify pipeline_tutorial.mdx (#22726)
Calling generator(model="openai/whisper-large") always returns an error. As the error says, the generator expects an input, such as the .flac file above. The generator object also has no parameter called model. Some parameters, such as 'batch_size', can be passed to the generator, but I believe the model has to be passed when instantiating the pipeline, not as a parameter to the instance.

I believe the correct call should be:

generator = pipeline(model="openai/whisper-large", device=0)
2023-04-12 15:20:25 +01:00
Younes Belkada
370f0ca18c
[bnb] Let's make serialization of int8 models possible (#22177)
* make serialization of int8 models possible

* make fixup

* add docs

* add ability to push to hub and save pretrained

* fixes

* more addition

* more tests

* fix issues

* change variable

* clearer message

* adapt from suggestions

* few fixes

* remove unused function

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* address last comments

* last warning

* clarify doc

* protect import

* Update src/transformers/modeling_utils.py

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-12 08:01:18 -04:00
pioliverse
523ca4e016
add model resources for CPMAnt (new) (#20906)
* resolve conflicts

* rebase and make style

* test

* test

* test

* rebase and make style

* rebase and make style

* tests

* tests

* rewrite some functions

* rebase and make style

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* add models and tests

* solve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* save resolution

* make style

* delete redefinition code

* reformat function

* reformat

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* make style

* fix bugs and refactor

* modify docstrings and make style

* unify import format in __init__.py

* fix import-altclp bug

* fix copies to update index.md

* fix unused config parameters

* fix unused config parameters

* fix unused config parameters

* update README_ja.md

* dummy commit for unit test

* fix attention mask

* add CPMAntTokenizer&-Fast to auto-mapping

* drop redundant changes in README_ko

* fix defaults in docstring

* fix use_cache and some docstring

* add missing args in tokenizer

* modify tester inheritance

* add is_jieba_available

* fix some bugs

* make style and fix-copies

* add doctests

* skip integration tests

* add is_jieba_available

* fix bugs in common tests

* adjust docstrings and make style

* add argument docstring

* adjust code to some specifications

* make style and fix-copies

* add fast tokenization test

* dummy commit for unit test

* dummy commit for unit test

* dummy commit for unit test

* normalize some comments and names

* Bert->CPMAnt

* camel names and drop redundant codes

* make style and fix-copies

* add CpmTokenizerFast _import_structure

* drop cpmanttokenizerfast in model_doc

* fix some problems

* fix CPMAnt tokenization for common test

* make style and fixup

* fix copies and fixup

* fix bugs in tokenization test

* dummy commit for connection failure in unittest

* fix copies

* drop trailing comma

* fix decorator in tests

* dummy commit for connection failure in unittest

---------

Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>
2023-04-12 07:33:20 -04:00
jprivera44
17503b00ea
Added parallel device usage for GPT-J (#22713) 2023-04-12 07:31:27 -04:00
Arthur
b76e6ebd44
remove wrong doc in readme (#22723) 2023-04-12 07:11:12 -04:00
amyeroberts
5a71977b8b
Update input values for docstring (#22631) 2023-04-12 11:44:29 +01:00
Yih-Dar
fe1f5a639d
Fix decorator order (#22708)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-11 17:59:15 +02:00
Sylvain Gugger
1b1867d86b
Replace -100s in predictions by the pad token (#22693)
* Replace -100s in predictions by the pad token

* Style

* Try to catch them all
2023-04-11 09:32:20 -04:00
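To illustrate the idea named in the commit title above, here is a minimal sketch of swapping the ignore index -100 for a pad token id before decoding. This is not the actual change made in #22693, whose diff is not shown in this log; the function name `replace_ignore_index` and the pad id used in the example are hypothetical.

```python
from typing import List

IGNORE_INDEX = -100  # value conventionally used to mask positions out of the loss


def replace_ignore_index(
    predictions: List[List[int]],
    pad_token_id: int,
    ignore_index: int = IGNORE_INDEX,
) -> List[List[int]]:
    """Return a copy of `predictions` with every `ignore_index` replaced by `pad_token_id`."""
    return [
        [pad_token_id if token == ignore_index else token for token in row]
        for row in predictions
    ]


if __name__ == "__main__":
    # Hypothetical batch of predicted token ids, with -100 marking ignored positions.
    batch = [[5, 17, -100, -100], [8, -100, 3, 9]]
    print(replace_ignore_index(batch, pad_token_id=0))  # [[5, 17, 0, 0], [8, 0, 3, 9]]
```

Since -100 is not a valid token id, substituting a real id such as the pad token lets the sequences be decoded afterwards.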
Yih-Dar
ff73deeb0e
Remove 2 failing ONNX conversion tests (#22660)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-11 15:26:32 +02:00
Luc CAILLIAU
06b05d4575
Clarify stride option (#22684)
* Clarify stride option

* formatting
2023-04-11 14:06:54 +01:00
Mayank Agarwal
0224aaf67f
Enable naive Pipeline Parallelism training for Gpt neox japanese and san japanese (#22702)
Move labels to same device as logits
2023-04-11 09:06:17 -04:00
Sylvain Gugger
28c19ab58d
Make it easier to develop without a dev install (#22697)
* Make it easier to develop without a dev install

* Remove ugly hack that doesn't work anyway
2023-04-11 08:41:53 -04:00
Yih-Dar
4c01231e67
Update some MarkupLM tests' expected values (#22667)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-11 10:00:34 +02:00
Shahad Mahmud
151425ddb2
Model parallelism: Moving labels to same devices as the logits are (#22691)
Model parallelism correct labels device
2023-04-10 12:22:53 -04:00
Sugawara
6daa9cb515
add GPTNeoXForSequenceClassification (#22671)
* add GPTNeoXForSequenceClassification

* move the labels to logits.device (ref: #22561)

* fix
2023-04-10 11:52:23 -04:00
xinhe
f74b40208d
use __func__ to check can_generate (#22643) 2023-04-10 09:06:52 -04:00
Kirill
14fc1a2467
Fix quantization docs typo (#22666) 2023-04-10 08:53:53 -04:00
Sylvain Gugger
3876fc6839
Make dynamic code work with offline mode (#22661)
* Make dynamic code work with offline mode

* Clean up

* Quality
2023-04-10 08:49:42 -04:00
Shikhar Chauhan
98597725f1
(feat): Moving labels to same device as logits for Deit (#22679) 2023-04-10 08:04:57 -04:00
Shahad Mahmud
870d91fb89
Model parallelism: Moving labels to the same device as logits for BridgeTower models (#22676)
BridgeTower model parallelism: logits device for loss calculation
2023-04-10 08:04:14 -04:00
Joel Lamy-Poirier
e0921c6b53
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-04-10 10:57:21 +02:00
Arun Brahma
656e869a45
moved labels to the same device as logits for BLOOM, GPT Neo, GPT NeoX, RoBERTa and VIT models (#22663)
moved labels to the same device as logits
2023-04-07 17:04:54 -04:00
Sylvain Gugger
6db23af50c
Revert migration of setup to pyproject.toml (#22658) 2023-04-07 15:08:44 -04:00
Joao Gante
3f96e0b4e4
Generate: add API warning to streamers (#22659)
add API warning
2023-04-07 14:15:20 -04:00
Arthur
f33419261a
[OPT] Fix default attention mask size (#22649)
* Fix default attention mask size

* fixup

* add a test to make sure it works even if the attention mask is not provided

* style
2023-04-07 20:12:57 +02:00
Arthur
b1b3dc3e52
[tokenization] do not push special file (#22657)
* do not push special file

* Update src/transformers/tokenization_utils_base.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-04-07 20:12:36 +02:00
Arthur
117a0f6afa
Small nit, (#22653)
* Small nit,
Fixes #21986

* Update src/transformers/pipelines/__init__.py
2023-04-07 17:29:23 +02:00
Wonhyeong Seo
fc1ba6fd11
🌐 [i18n-KO] Translated pipeline_tutorial.mdx to Korean (#22508)
docs: feat: Korean pipeline_tutorial

Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Hyeonseo Yun <0525_hhgus@naver.com>
Co-authored-by: gabrielwithappy <102908949+gabrielwithappy@users.noreply.github.com>
Co-authored-by: Na Yeon Han <nayeon2.han@gmail.com>
2023-04-07 11:27:59 -04:00
Yih-Dar
14d5b2b645
Fix MegaModel CI (#22652)
* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-07 17:13:04 +02:00
Seung-Moo Yang
f2cc8ffdaa
Fix typo (#22650) 2023-04-07 08:46:23 -04:00
Shikhar Chauhan
1de8ce9ee1
Move labels to the same device as logits for LlamaForSequenceClassification and Blip2 (#22596)
* (feat): Move labels to the same device as logits

* Trigger CI

* Trigger CI

* Trigger CI

* (feat): Making changes for Blip2
2023-04-07 08:23:55 -04:00
gabrielwithappy
d59034ff6f
🌐[i18n-KO] Translate autoclass_tutorial to Korean and Fix the typo of quicktour (#22533)
translate the autoclass_tutorial and fix the typo of the quicktour
2023-04-07 08:12:35 -04:00
Sourab Mangrulkar
ee8e80a060
fix FSDP version related issues (#22489)
fix fsdp
2023-04-07 04:25:19 +05:30
Yih-Dar
c7ec71baf5
Update tiny model summary file for recent models (#22637)
* Update tiny model summary file for recent models

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-04-06 22:52:59 +02:00
Younes Belkada
ed67286465
[Blip] Fix slow tests and doctests with correct values (#22632)
fix slow tests and doctests
2023-04-06 19:12:51 +02:00
Nicolas Patry
6a02e98074
LlamaTokenizerFast Fix (.., from_slow=True). (#22630) 2023-04-06 18:52:59 +02:00