Alara Dirik
269b054939
Add ALIGN to transformers ( #21741 )
...
Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
2023-03-01 21:23:31 +03:00
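For context on the ALIGN addition above, a minimal usage sketch for zero-shot image-text matching. The `kakaobrain/align-base` checkpoint name and the exact output attribute are assumptions based on the CLIP-style dual-encoder API the model follows.

```python
import requests
from PIL import Image
from transformers import AlignProcessor, AlignModel

# Checkpoint name assumed; ALIGN is a CLIP-style image-text dual encoder.
processor = AlignProcessor.from_pretrained("kakaobrain/align-base")
model = AlignModel.from_pretrained("kakaobrain/align-base")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = ["a photo of two cats", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Image-text similarity, normalized over the candidate captions.
probs = outputs.logits_per_image.softmax(dim=1)
print(probs)
```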
Matt
f7c618e3b0
Add TFVisionTextDualEncoder ( #21873 )
...
* Temporary commit to stash everything so far
* Temporary commit to stash everything so far
* stash commit
* Refactor from_pretrained
* Fix final test, make fixup
* Update dummies
* Add model to TEST_FILES_WITH_NO_COMMON_TESTS
* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/vision_text_dual_encoder/modeling_tf_vision_text_dual_encoder.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Add TFVisionTextDualEncoder to utils/documentation_tests.txt
* make fixup
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-03-01 18:00:48 +00:00
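A hedged sketch of how the new TF class can be assembled from separate backbones, assuming it mirrors the PyTorch `VisionTextDualEncoderModel.from_vision_text_pretrained` helper; the backbone checkpoints are illustrative.

```python
from transformers import (
    AutoImageProcessor,
    AutoTokenizer,
    TFVisionTextDualEncoderModel,
    VisionTextDualEncoderProcessor,
)

# Build a CLIP-like dual encoder in TensorFlow from a ViT vision backbone
# and a BERT text backbone (checkpoint names are illustrative).
model = TFVisionTextDualEncoderModel.from_vision_text_pretrained(
    "google/vit-base-patch16-224", "bert-base-uncased"
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")
processor = VisionTextDualEncoderProcessor(image_processor, tokenizer)
```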
Yih-Dar
53735d7c3b
Add a utility file to get information from test files ( #21856 )
...
* Add a utility file to get information from test files
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-03-01 17:53:29 +01:00
Arthur
b599b19289
[ConvBert] Fix #21523 ( #21849 )
...
* fix reshaping
Fixes #21523
* add test
* styling
* last fixes
* Update src/transformers/models/convbert/modeling_convbert.py
* code quality
2023-03-01 11:11:04 +01:00
Arthur
44e3e3fb49
prepare for "__floordiv__ is deprecated and its behavior will change in a future version of pytorch" ( #20211 )
...
* rounding_mode = "floor" instead of // to prevent behavioral change
* add other TODO
* use `torch_int_div` from pytorch_utils
* same for tests
* fix copies
* style
* use relative imports when needed
* Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2023-03-01 10:49:21 +01:00
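The substance of the change above: tensor floor division written as `//` triggers the `__floordiv__` deprecation warning on the affected PyTorch versions, so the code switches to an explicit rounding mode (wrapped as `torch_int_div` in `transformers.pytorch_utils`). A small illustration:

```python
import torch

x = torch.tensor([7, 8, 9])
n = 4

# Old pattern: emits the __floordiv__ deprecation warning on affected PyTorch versions.
old = x // n

# Replacement: explicit floor rounding, equivalent result.
new = torch.div(x, n, rounding_mode="floor")

assert torch.equal(old, new)
```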
Sylvain Gugger
b29e2dcaff
Fix flaky test for log level ( #21776 )
...
* Fix flaky test for log level
* Fix other flaky test
2023-02-28 16:24:14 -05:00
Matt
acfb714bdf
Improve TF weight loading, especially PT crossloading ( #21792 )
...
* First commit for the improved PT-TF weight loading
* Remove workarounds from TFEncoderDecoder tests
* Allow a custom weight renaming function in from_pretrained and use that to clean up EncoderDecoder
* make fixup
* First attempt at visionencoderdecoder
* Disable tensorfloat32 in tests to get consistent outputs
* Quick fix to tf_vision_encoder_decoder tests
* make fixup
* Update Blenderbot tests
* Remove unused arg in modeling_tf_opt
* load_tf_sharded_weights had strict=True! This meant transfer learning was impossible, so I'm setting it to False.
* Support prefixes when loading sharded TF checkpoints
* make fixup
* Add test to load sharded models with a weight prefix
* Fix sharded weight loading test
* Add a test for transfer from a sharded checkpoint
* make fixup
* Add test to check that crossloading from PT with a prefix works
* Refactor from_pretrained in the encoderdecoder classes
* Refactor from_pretrained in the encoderdecoder classes
* missmatched -> mismatched
* Explicitly check for None
* No comments showing my very impressive and attractive knowledge of Py3.9+
* Disable TF32 across all TF tests
2023-02-28 18:41:34 +00:00
Yih-Dar
871c31a6f1
🔥 Rework pipeline testing by removing PipelineTestCaseMeta 🚀 ( #21516 )
...
* Add PipelineTesterMixin
* remove class PipelineTestCaseMeta
* move validate_test_components
* Add for ViT
* Add to SPECIAL_MODULE_TO_TEST_MAP
* style and quality
* Add feature-extraction
* update
* raise instead of skip
* add tiny_model_summary.json
* more explicit
* skip tasks not in mapping
* add availability check
* Add Copyright
* A way to disable irrelevant tests
* update with main
* remove disable_irrelevant_tests
* skip tests
* better skip message
* better skip message
* Add all pipeline task tests
* revert
* Import PipelineTesterMixin
* subclass test classes with PipelineTesterMixin
* Add pipeline_model_mapping
* Fix import after adding pipeline_model_mapping
* Fix style and quality after adding pipeline_model_mapping
* Fix one more import after adding pipeline_model_mapping
* Fix style and quality after adding pipeline_model_mapping
* Fix test issues
* Fix import requirements
* Fix mapping for MobileViTModelTest
* Update
* Better skip message
* pipeline_model_mapping could not be None
* Remove some PipelineTesterMixin
* Fix typo
* revert tests_fetcher.py
* update
* rename
* revert
* Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests
* style and quality
* test fetcher for all pipeline/model tests
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-28 19:40:57 +01:00
Anahita Bhiwandiwalla
4cb5ffa93d
Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval ( #21684 )
...
* Add loss for BridgeTowerForMaskedLM and BridgeTowerForImageAndTextRetrieval
* minor fix return_dict
* implement test for loss computation
---------
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
2023-02-28 12:21:48 -05:00
Younes Belkada
7f4f8b97d0
[Blip2] Fix Blip-2 multi gpu ( #21707 )
...
* fix blip multi gpu
* fix
* final changes
* adapt suggestions
* fix failing slow test
* forward contrib credits from testing and suggestions
* reformat
---------
Co-authored-by: akkikiki <akkikiki@users.noreply.github.com>
2023-02-28 17:28:58 +01:00
raghavanone
eec76042f4
Fix the issue of blip model returning loss even when the label is not provided. ( #21811 )
...
* Fix the issue of blip model returning loss even when the label is not provided
* Fix ruff failure
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
* Incorporate PR feedbacks
2023-02-28 09:54:08 -05:00
Younes Belkada
b8de7e448e
[Blip2] Add Blip2Model ( #21817 )
...
* add v1
* add `Blip2Model`
- add relevant functions
- add tests
- add on automapping
* fix docs
* fix doctest
2023-02-28 15:42:55 +01:00
Younes Belkada
ae9230af40
[T5] Fix torchquant issue ( #21843 )
...
* fix torchquant issue
* add tests
2023-02-28 15:09:44 +01:00
Yih-Dar
a9dd124346
Rename MobileViTModelTest to TFMobileViTModelTest ( #21825 )
...
Let's give TF a bit more love ❤️ 🙏
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-28 08:10:29 +01:00
Joao Gante
92dfceb124
Inheritance-based framework detection ( #21784 )
2023-02-27 15:31:55 +00:00
Younes Belkada
831f3144a6
[tests] add accelerate marker ( #21743 )
...
* add `accelerate` marker
* add to docs
* Update docs/source/en/testing.mdx
2023-02-27 12:33:34 +01:00
Arthur
c51dc4f927
[torch] remove deprecated uint8 in favor of bool ( #21384 )
...
* uint8 -> bool
* fix copies
* style
* update test modeling common when checking attention buffers
* style
* use logical not on random mask instead of subtraction with 1
* remove torch uint8
* quality
* remove modified modeling utils
* Update based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
---------
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2023-02-27 11:46:02 +01:00
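The pattern behind the uint8-to-bool change, sketched on a toy mask (not the exact code touched by the PR): mask buffers become `torch.bool`, and the `1 - mask` inversion becomes a logical not.

```python
import torch

mask = torch.tensor([1, 1, 0, 0])

# Old pattern: uint8 mask buffer inverted with arithmetic.
old_buffer = mask.to(torch.uint8)
old_inverted = 1 - old_buffer

# New pattern: bool mask buffer inverted with a logical not.
new_buffer = mask.to(torch.bool)
new_inverted = torch.logical_not(new_buffer)

assert torch.equal(old_inverted.bool(), new_inverted)
```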
Arthur
cc44e72d14
[Pipeline] Add zero shot audio classification pipeline ( #21600 )
...
* add pipeline
* update init
* add zero shot to init
* update inits and correct checkpoints
* update base to support input features
* add tests
* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/pipelines/zero_shot_audio_classification.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* update pipeline code
* use tiny checkpoint
* nits and expected value with tiny model
* style
* last nit on tests values
* fix styling
* fix collate fn that was casting to float
* update
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-02-27 11:43:44 +01:00
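A minimal sketch of the new pipeline; the CLAP checkpoint name and the candidate labels are illustrative, and a silent dummy waveform stands in for a real recording.

```python
import numpy as np
from transformers import pipeline

classifier = pipeline(
    "zero-shot-audio-classification",
    model="laion/clap-htsat-unfused",  # checkpoint name is illustrative
)

# 5 seconds of silence at 48 kHz as a stand-in for real audio.
waveform = np.zeros(48_000 * 5, dtype=np.float32)

result = classifier(
    waveform,
    candidate_labels=["dog barking", "vacuum cleaner", "rain"],
)
print(result)  # list of {"label": ..., "score": ...}, best match first
```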
Sanchit Gandhi
3dae0d7b4f
[SpeechT5] Fix HiFiGAN tests ( #21788 )
2023-02-24 16:55:38 +01:00
Kashif Rasul
ba0e370dc1
[time series] updated expected values for integration test. ( #21762 )
...
* updated expected
* prediction_length fix
* prediction_length default value
* default prediction_length 24
* revert back prediction_length default
* move prediction_length test
2023-02-24 12:36:54 +01:00
Arthur
087436c98e
Fix-ci-whisper ( #21767 )
...
* fix history
* input_features instead of input_ids for TFWhisper doctest
* use translate instead of transcribe
2023-02-24 11:39:25 +01:00
bofeng huang
c8545d2a9c
[Whisper] Add SpecAugment ( #21298 )
...
* Return and rescale attention_mask
* Add SpecAugment to Whisper modeling
* Fix test
* Update docstring
* Add SpecAug related parameters to model config
* Add the _mask_input_features function to doc
* Fix quality
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Remove dev comments
* Add test
* Resolve conflict
* feat: mask {feature, time} prob fast tests
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-24 11:07:52 +01:00
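A sketch of enabling the new SpecAugment options from the config; parameter names follow the commit description, and the probabilities are illustrative.

```python
from transformers import WhisperConfig, WhisperForConditionalGeneration

config = WhisperConfig(
    apply_spec_augment=True,
    mask_time_prob=0.05,     # chance of masking a span along the time axis
    mask_feature_prob=0.05,  # chance of masking a span along the mel-feature axis
)
model = WhisperForConditionalGeneration(config)
# In training mode, the model masks the input features (via _mask_input_features)
# before the encoder; masking is skipped in eval mode.
```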
Shubhamai
f7ca656f07
[Flax] adding support for batch norm layers ( #21581 )
...
* [flax] adding support for batch norm layers
* fixing bugs related to pt+flax integration
* cleanup, batchnorm support in sharded pt to flax
* support for batchnorm tests in pt+flax integration
* simplifying checking batch norm layer
2023-02-24 08:47:33 +01:00
Connor Henderson
279008adc3
fix: Change is_last chunk calc and add conditional break in chunk_iter ( #21612 )
...
* fix: Change is_last chunk calc and add conditional break
* format fix
* account for 0 and full stride_rights, add comment
* add new test
* make style
* update slow whisper asr test timestamps
* use nested_simplify on output and round timestamp to hundredths place
2023-02-24 08:30:32 +01:00
Stas Bekman
633062639b
[deepspeed tests] fix issues introduced by #21700 ( #21769 )
...
* [deepspeed tests] fix issues introduced by #21700
* fix
* fix
2023-02-23 13:22:25 -08:00
ydshieh
aa3787c8f0
Skip test_log_level for now
2023-02-23 12:11:20 +01:00
Joao Gante
1d4b797852
Generate: Fix GIT batched captioning ( #21738 )
2023-02-23 09:50:37 +00:00
Naga Sai Abhinay
448e050b0d
Make ImageProcessorMixin compatible with subfolder kwarg ( #21725 )
...
* Add subfolder support
* Add kwarg docstring
* formatting fix
* Add test
2023-02-23 09:28:18 +01:00
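What the change enables, sketched with a hypothetical repo layout: image processors can now be loaded from a subdirectory of a Hub repo, matching models and tokenizers.

```python
from transformers import AutoImageProcessor

# "username/multi-component-repo" and "image_processor" are hypothetical;
# the point is that the subfolder kwarg is now honored.
image_processor = AutoImageProcessor.from_pretrained(
    "username/multi-component-repo", subfolder="image_processor"
)
```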
Sanchit Gandhi
82e61f3445
[SpeechT5HifiGan] Handle batched inputs ( #21702 )
...
* [SpeechT5HifiGan] Handle batched inputs
* fix docstring
* rebase and new ruff style
2023-02-22 11:16:56 +01:00
Yih-Dar
09127c5713
Fix GPTSanJapaneseModel ( #21731 )
...
* fix
* skip test_model_parallelism
* skip test_model_parallelism
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-22 11:09:04 +01:00
Sylvain Gugger
b19d64d852
Respect documentation on passive log level ( #21700 )
...
* Respect documentation on passive log level
* Fix test and set log level in examples
* Add doc
2023-02-22 09:39:18 +01:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions ( #21694 )
2023-02-22 09:14:54 +01:00
Kashif Rasul
df06fb1f0b
Time series transformer: input projection and Std scaler ( #21020 )
...
* added loc and scale outputs from scalers
* fix typo
* fix tests
* fixed formatting
* initial StdScaler
* move scaling to optional str
* calculate std feature for scalers
* undid change as it does not help
* added StdScaler with weights
* added input projection layer and d_model hyperparam
* use linear proj
* add back layernorm_embedding
* add sin-cos pos embeddings
* updated scalers
* formatting
* fix type
* fixed test
* fix repeated_past_values cal.
* fix when keepdim=false
* fix default_scale
* backward compatibility of scaling config
* update integration test expected output
* fix style
* fix docs
* use the actual num_static_real_features in feature_dim cal
* clarified docs
* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* prediction_length is not optional
* fix for reviewer
* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* get rid of un-needed new lines
* fix doc
* remove unneeded new lines
* fix style
* static_categorical_features and static_real_features are optional
* fix integration test
* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fixing docs for multivariate setting
* documentation for generate
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-22 07:50:13 +01:00
Yih-Dar
03aaac3502
Fix TVLT (torch device issue) ( #21710 )
...
* fix tvlt ci
* fix tvlt ci
* fix tvlt ci
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-21 11:37:49 +01:00
Jonatan Kłosko
deafc24388
Add WhisperTokenizerFast ( #21222 )
...
* Add WhisperTokenizerFast
* Fixup
* Up
* Up
* Improve tests
* Update src/transformers/models/whisper/tokenization_whisper_fast.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Keep stride in whisper pipeline test
* Remove unknown token special case
* Reduce vocabulary size in tests
* Fix vocab size assertion
* Sync copied changes from WhisperTokenizer
* Skip pipeline tests
* Update assertion
* Remove Whisper tokenizer dependency on sentencepiece
* Format
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-02-21 06:58:54 +01:00
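A short usage sketch for the new fast tokenizer (the checkpoint name is illustrative):

```python
from transformers import WhisperTokenizerFast

tokenizer = WhisperTokenizerFast.from_pretrained("openai/whisper-tiny")

ids = tokenizer("The quick brown fox.").input_ids
print(tokenizer.decode(ids, skip_special_tokens=True))
```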
Alara Dirik
49ab16239c
Add EfficientNet ( #21563 )
...
* Add EfficientNet to transformers
2023-02-20 16:37:11 +03:00
Younes Belkada
c9a0671477
[bnb] fix bnb decoders bug ( #21688 )
...
* fix `bnb` decoders bug
* make fixup
2023-02-20 12:21:58 +00:00
tanreinama
f56174ac5b
add GPTSAN model (reopen) ( #21291 )
...
* add GPTSAN-Japanese
* add GPTSAN
* add GPTSAN (update for review)
* fix typo in comment text
* fix document and comments
* fix class name GPTSAN->GPTSan
* fix import and test for tokenizer
2023-02-20 11:25:27 +01:00
Sylvain Gugger
c87bbe1ff0
Fix quality
2023-02-20 03:27:09 -05:00
Andy Ehrenberg
2840272c5f
add flax whisper implementation ( #20479 )
...
* add flax whisper implementation
* revert change to setup
* remove unused imports
* revert generation changes
* flax whisper docs
* docs
* import order
* import sorting
* isort
* add dummy objects
* doc formatting
* formatting
* remove trailing whitespaces
* fix flax whisper docs
* add generation logic to unlock flax whisper
* remove scans
* give credits to Flax Bart implementation
* remove unused imports
* add license
* remove assert
* more credits to Bart
* fix style
* formatting
* support left padding
* add flax whisper generation test
* remove copied from comments whenever not a full copy
* fix docstrings for logits processors
* revert change to FlaxForceTokensLogitsProcessor
* revert doc changes
* improve generation docs
* reorganize
* formatting
* cleanup docs
* add tests
* handle empty list case
* fix forced decoder ids in flax tests
* add flax whisper to inits
* update dummy objects
* docs for FlaxAutoModelForSpeechSeq2Seq
* fix decoder_position_ids computation in pretrained model decode/__call__ fns
* add Copied from statements as necessary
* compute position_ids only in __call__ and decode methods of pretrained model subclasses
* improve readability of compute positional embeddings
* check dimensionality of input_features instead of hidden_states
* copied from statement for init_cache
* formatting
* fix copies
* fix copies
* pass attention mask to encoder layers
* fix decoder module outputs
* set dtype
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* smaller flax model for whisper test
* Update src/transformers/generation/flax_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_flax_whisper.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/models/whisper/test_modeling_flax_whisper.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* cleanup
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_flax_whisper.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* bias cleanup
* doc fix
* align style for force tokens processor
* readability
* fix input shape in tests
* revert FlaxGenerationMixin docstring
* formatting
* fix tests
* fix imports
* consistent encoder hidden states
* consistent hidden states
* input shapes
* typo
* partial class trick
* partial class for input shape
* base_class with correct input shape
* partial base classes
* match by name
* set main_input_name
* compare on names
* formatting
* remove unused import
* safer position ids computation
* safer position id computation
* Update src/transformers/models/whisper/modeling_flax_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/whisper/modeling_flax_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove identical inherited tests
* fix prompt ids in tests
* use generation config
* use jnp array
* better var names
* more explicit bias use
* import transformers
* formatting
* test formatting
* remove unused imports
* remove unused imports
* formatting
* isort
* docs
* fix ln orders for encoder hidden states
* whisper unique generation stuff
* flake
* use finfo for attention bias
* docs
* Update src/transformers/generation/flax_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* docs
* add timestamp flax test
* jit for timestamps
* formatting
* clean up timestamps processor
* formatting
* remove if_true
* cleanup
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-02-20 09:17:40 +01:00
Younes Belkada
8a4c319d33
[BLIP] update blip path on slow tests ( #21476 )
...
* update blip path
* Update tests/models/blip/test_modeling_blip.py
2023-02-17 18:26:36 +00:00
Younes Belkada
a8eb4f79f9
[CLAP] Fix a few broken things ( #21670 )
...
* add `is_longer`
* fix docstring
* fix config class
* fix loss
* fix all doctests
* fix order
* fix last failing tests
---------
Co-authored-by: arthur.zucker@gmail.com <arthur.zucker@gmail.com>
2023-02-17 11:32:14 +01:00
Younes Belkada
3668ec1716
[bnb] Introducing BitsAndBytesConfig ( #21579 )
...
* v1 `BitsandbytesConfig`
- add v1
- add tests
- more user-friendly API
- add docs
* change to `BitsAndBytesConfig`
* replace logic
* changes
* make fixup
* quality
* make fixup
* fix doc
* fix test
* update toctree
* fix slow test
* add tips
* add warning
* change title
* oops
* Update docs/source/en/main_classes/quantization.mdx
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/utils/bitsandbytes.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* remove unused file
* adapt suggestion
- add also tests
- change logic
* update docs
* adapt suggestions
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-17 09:44:01 +01:00
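A hedged example of the new config object; it requires the `bitsandbytes` package and a CUDA GPU, and the model name is illustrative.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=6.0,  # outlier threshold for the int8 matmul
)

# Needs bitsandbytes installed and a CUDA device; checkpoint is illustrative.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    device_map="auto",
    quantization_config=quantization_config,
)
```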
Arthur
c236a62172
[CLAP] Add CLAP to the library ( #21370 )
...
* add model like clip
* update
* text model ok
* clap text works
* some refactor
- `CLAPVision` to `CLAPAudio`
- refactor kwargs of audio modules
* more refactor
* more refactor
* more refactor
* correct fusion
* more refactor
* new modules
* add basic processor
* fixup
* remove whisper copied from
* audio logits match
* add doc
* correct filters mel and add maxlength
* style
* few fixes
* forward passes
* fixup
* fixup
* some clean up
* remove mels from the dictionary
* pad after the repeat
* update padding when smaller
* fix padding
* style
* use swin patch merging
* use copied from swin
* processor with any tokenizer
* more copied from
* some clean up
* more refactor
* fix mel when rand_trunc
* style
* remove unused imports
* update processing
* remove image processing tests
* add testing file
* fix modeling issues
* replace with `is_longer`
* clap in serialization
* more refactor
* `make fixup`
* make fixup
* fix feature extractor
* update test feature extractor
* `make fixup`
* clean up config
* more clean up
* more cleanup
* update tests
* refactor tests and inits
* removeCLAP vision config
* remove CLAP from image procssing auto and dummy vision objects
* update inits
* style
* re order classes in modeling clap
* Use roberta tokenizer as the other weights are not open sourced
* small cleanup
* remove tokenization CLAP
* processor tokenizer is roberta
* update feature extraction doc
* remove vclap from model zero shot
* update f_min and f_max to frequency_xx
* some changes
- fix modeling keys
- add `is_longer` in the forward pass
- make fixup
* make fixup
* consistent behavior between rand_crop and fusion
* add numpy resize and bilinear and documentation
* move resizing to image utils
* clean feature extraction
* import resize from correct file
* resize in image transforms
* update
* style
* style
* nit
* remove unused arguments form the feature extractor
* style
* few fixes + make fixup
* oops
* fix more tests
* add zero shot audio classification pipeline
* update zeroshot classification pipeline
* fixup
* fix copies
* all CI tests pass
* make fixup + fix docs
* fix docs
* fix docs
* update tests pipeline
* update zero shot pipeline
* update feature extraction clap
* update tokenization auto
* use nested simplify
* update pipeline tests
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* split in two lines
* fixes
* refactor
* clean up
* add integration tests
* update config docstring
* style
* update processor
* fix processor test
* fix feat extractor tests
* update docs
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix readmes
* fix tips
* Update src/transformers/models/auto/configuration_auto.py
* update doc and remove todo -> properly explained
* fix idx and typo
* typo
* cleanup config
* cleanup tests, styles and doc
* ignore docstyle on image transform
* add conversion script
* remove the `clap` indx in favor of `CLAP`
* update __init
* nits
* Update src/transformers/pipelines/__init__.py
* fix bug
* clarifiy config
* fix copy
* fix init
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix model output
* fix comment
* make fixup
* make fixup
* rename to `Clap`
* replace to `Clap`
* replace to `Clap`
* repo consistency
* again repo-consistency
* make fixup
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* add config
* changes
* update conversion
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unused function
* update based on code reviews
* style
* more comments
* cleanup
* clean up
* style
* apply suggestions
* Empty commit
* pipeline will be added in a different PR
* update calls to audio utils functions
* update pipeline init
* style
* style
* styling again
* use pad
* fix repo-consistency
* update utils and add doc for audio utils
* clean up resize by using torch. update inits accordingly
* style
* Clap's tokenizer is RoBERTa
* add audio utils to internal toctree
* update toctree
* style
* update documentation and normalize naming across audio utils and feature extraction clap
* style
* clean up
* update doc and typos
* fix doctest
* update modeling code, got rid of a lot of reshaping
* style on added doc audio utils
* update modeling clap
* style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* docstringvariables with CLAP
* rename key
* update modeling CLAP
* update audio utils docstring
* update processing clap
* fix readmes
* fix toctree
* update configuration clap
* fix init
* make fixup
* fix
* fix
* update naming
* update
* update checkpoint path
* Apply suggestions from code review
* Major refactoring
* Update src/transformers/models/clap/configuration_clap.py
* merge
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-02-16 20:59:27 +01:00
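A minimal zero-shot audio-text matching sketch with the new CLAP classes; the checkpoint name is illustrative and a dummy waveform replaces real audio.

```python
import numpy as np
import torch
from transformers import ClapModel, ClapProcessor

model = ClapModel.from_pretrained("laion/clap-htsat-unfused")         # illustrative checkpoint
processor = ClapProcessor.from_pretrained("laion/clap-htsat-unfused")

texts = ["a dog barking", "a vacuum cleaner"]
waveform = np.zeros(48_000 * 5, dtype=np.float32)  # 5 s of silence at 48 kHz

inputs = processor(
    text=texts, audios=waveform, sampling_rate=48_000, return_tensors="pt", padding=True
)

with torch.no_grad():
    outputs = model(**inputs)

# Similarity of the audio clip to each candidate text.
probs = outputs.logits_per_audio.softmax(dim=-1)
print(probs)
```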
Connor Henderson
0f96c26de6
refactor: Make direct_transformers_import util ( #21652 )
...
* refactor: Make direct_import util
* edit direct import fn
* add docstring
* make import function specific to transformers only
* edit doc string
2023-02-16 11:32:32 -05:00
Jannis Vamvas
61abe3290b
[WIP] Move X-MOD models to facebook organization ( #21640 )
...
Move X-MOD models to facebook org
2023-02-16 09:18:25 -05:00
Sylvain Gugger
9d1116e995
Update deprecated load_module ( #21651 )
2023-02-15 15:57:24 -05:00
Zineng Tang
a0e69a9375
Add TVLT ( #20725 )
...
* Update image_processing_tvlt.py
* Update modeling_tvlt.py
* Update
* Update modeling_tvlt.py
* Create tvlt.mdx
* Update configuration_tvlt.py
* Update modeling_tvlt.py
* Update test_modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update image_processing_tvlt.py
* Update feature_extraction_tvlt.py
* Update tvlt models
* Update tests
* Update
* Update
* Update tests
* Update README_ko.md
* Update README_ja.md
* Update README_ko.md
* Update README_zh-hans.md
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update tvlt.mdx
* Update modeling_tvlt.py
* Update configuration_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Add files via upload
* Update model
* Update modeling_tvlt.py
* Update tvlt models
* Update src/transformers/models/tvlt/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/__init__.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add files via upload
* Add files via upload
* Delete modeling_tvlt.py
* Delete feature_extraction_tvlt.py
* Delete configuration_tvlt.py
* Delete image_processing_tvlt.py
* Delete processing_tvlt.py
* Update tvlt
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update README_es.md
* Update README_hd.md
* Update README_ja.md
* Update README_ko.md
* Update README_zh-hans.md
* Update README_zh-hant.md
* Update index.mdx
* Update tvlt.mdx
* Update tvlt.mdx
* Update configuration_tvlt.py
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update modeling_tvlt.py
* Add files via upload
* Update tvlt.mdx
* Update modeling_auto.py
* Add files via upload
* Add files via upload
* Update dummy_pt_objects.py
* Update __init__.py
* Update feature_extraction_tvlt.py
* Update feature_extraction_tvlt.py
* Update image_processing_tvlt.py
* Update modeling_auto.py
* Update test_feature_extraction_tvlt.py
* Update test_processor_tvlt.py
* Update test_feature_extraction_tvlt.py
* Add files via upload
* Update test_image_processor_tvlt.py
* Update tests/models/tvlt/test_processor_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_image_processor_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update tests/models/tvlt/test_image_processor_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_image_processor_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_image_processor_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_feature_extraction_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/feature_extraction_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/feature_extraction_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/feature_extraction_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/feature_extraction_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update feature_extraction_tvlt.py
* Update feature_extraction_tvlt.py
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update image_processing_tvlt.py
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update test_image_processor_tvlt.py
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/tvlt/test_modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add files via upload
* Add files via upload
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Add files via upload
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update image_processing_tvlt.py
* Add files via upload
* Add files via upload
* Update tvlt.mdx
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update docs/source/en/model_doc/tvlt.mdx
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Add files via upload
* Add files via upload
* Add files via upload
* Add files via upload
* Update modeling_auto.py
* Update tvlt.mdx
* Update dummy_pt_objects.py
* Update feature_extraction_tvlt.py
* Update modeling_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update test_image_processor_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update modeling_tvlt.py
* Update dummy_pt_objects.py
* Update dummy_speech_objects.py
* Add files via upload
* Update README_hd.md
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update test_modeling_tvlt.py
* Update src/transformers/models/tvlt/configuration_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/feature_extraction_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/image_processing_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update MAE processing
* Update modeling_tvlt.py
* Update modeling_tvlt.py
* Update modeling
* Update style
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/tvlt/modeling_tvlt.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update check_repo.py
* Update tvlt.mdx
* Update __init__.py
* Update tests
* Update tvlt models
* Update configuration_tvlt.py
* Update configuration_tvlt.py
* Update image_processing_tvlt.py
* Update dummy_pt_objects.py
* Add files via upload
* Update test_modeling_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update test_feature_extraction_tvlt.py
* Update test_feature_extraction_tvlt.py
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-02-15 18:10:30 +00:00
amyeroberts
3499c49c17
Skipping more high mem tests - Wav2Vec2 Hubert ( #21647 )
...
Skipping more tests
2023-02-15 16:00:50 +00:00
Susnato Dhar
0c9c8472e6
Add Ernie-M Model to huggingface ( #21349 )
...
* config and tokenization(fast too) changed and ErnieEncoder added
* Slow Tokenization Added
* Tokenizer(slow) is now working and Fast Tokenizer removed
* Added Config code
* Added Base Model and utils
* ErnieMModel is now working
* All added except tests
* All tests passed except ErnieUIEM
* All tests passed
* all fixes done
* all fixes done
* fixed MAP
* fixed check_code_quality
* fixed Build PR Documentation issue
* Added changes(comments) and also updated to the latest upstream/main
* Added fixup
* Added # Copied comments
* Added fixup
* Added more comments and some nits
* Added fixup
* Fixed README_hd.md
* Added more fixes
* ErnieMTokenizer (being sentencepiece) protected and other docs edited
* Added code_quality fix
* Fixed for
* Added more fix
* modified AZ
* ernie-m tokenization test added!
* attention mask part fixed(with 0->self.config.pad_token_id)
* applied make fixup
2023-02-15 09:24:56 -05:00
amyeroberts
fc28c006a6
Skip wav2vec2 hubert high mem tests ( #21643 )
...
* Skip high memory tests
* Skip high memory tests
* Remove unused import
2023-02-15 14:17:26 +00:00
Yih-Dar
e3d832ff87
Fix Blip-2 CI again ( #21637 )
...
* fix blip-2 ci
* fix blip-2 ci
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-15 10:59:42 +01:00
Sylvain Gugger
d4ba6e1a0e
Fix generation config for empty state dict ( #21630 )
2023-02-14 10:57:28 -05:00
Sylvain Gugger
317282927d
Fix the real failing test
2023-02-14 10:52:23 -05:00
Sylvain Gugger
c6f163c786
Skip failing test
2023-02-14 09:20:47 -05:00
Joao Gante
a81fe4e1df
Generate: input expansion for any model input ( #21624 )
2023-02-14 14:16:22 +00:00
Joao Gante
13e03e619d
Generate: filter encoder inputs when its signature does not accept wildcards ( #21603 )
2023-02-14 10:46:46 +00:00
Joao Gante
56b03c96b8
Fix TF CTC tests ( #21606 )
2023-02-13 21:23:00 +00:00
Yih-Dar
cbecf121cd
Fix env. variable type issue in testing ( #21609 )
...
* fix env issue
* fix env issue
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-13 20:53:26 +01:00
Joao Gante
fa4bdb0a40
Generate: correct default model input creation for decoder-only models ( #21580 )
2023-02-13 17:04:49 +00:00
Yih-Dar
edc1e734bf
Fix Blip-2 CI ( #21595 )
...
* use fp16
* use fp16
* use fp16
* use fp16
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-13 16:44:27 +01:00
Younes Belkada
1666c42f0b
[bnb] Let's make the daily CI green 🍏 ( #21597 )
...
* fix bnb slow test
* make fixup
2023-02-13 16:18:50 +01:00
Joao Gante
24273268b7
Generate: Fix flaky indexing error in test_constrained_beam_search_generate_dict_output ( #21561 )
2023-02-13 15:12:07 +00:00
Joao Gante
4be75e9728
CI: skip failing TF hubert test ( #21601 )
...
skip test
2023-02-13 09:34:23 -05:00
Joao Gante
eb6c59bc78
Generate: TF supports multiple eos tokens ( #21571 )
2023-02-13 12:24:22 +00:00
amyeroberts
cb56590111
Replace input_values_processing with unpack_inputs ( #21502 )
...
* Replace input_values_processing with unpack_inputs
* Skip test failing with OOM
* Update tests
2023-02-10 18:19:39 +00:00
Stas Bekman
2f5507580b
[from_pretrained] extend torch_dtype="auto" to look up config.torch_dtype first, expand docs ( #21524 )
...
* [from_pretrained] expand on torch_dtype entry
* fold 4 into 1
* style
* support torch_dtype='config' plus tests
* style
* oops
* fold config into auto, fix bug
* fix check
* better log
* better log
* clean up
2023-02-10 09:09:21 -08:00
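The behavior described above, in one line: with `torch_dtype="auto"`, `from_pretrained` first consults the `torch_dtype` entry in the checkpoint's `config.json` and only then falls back to inspecting the weights. The model name is illustrative.

```python
from transformers import AutoModelForCausalLM

# Picks the dtype recorded in config.json if present, otherwise the dtype
# of the saved weights, instead of always defaulting to float32.
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype="auto")
print(model.dtype)
```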
Shubhamai
9e40bba6ba
[Tests] Improve flax test_attention_outputs ( #21486 )
...
improving flax tests
2023-02-10 11:31:49 -05:00
Patrick von Platen
b20147a3c8
[Variant] Make sure variant files are not incorrectly deleted ( #21562 )
...
* [Variant] Make sure variant files are not incorrectly deleted
* Apply suggestions from code review
* fix
2023-02-10 15:44:51 +01:00
Jannis Vamvas
b0d539ccad
Add X-MOD ( #20939 )
...
* Add X-MOD to Readme
* Add documentation for X-MOD
* Implement X-MOD
* Fix formatting of X-MOD docs
* Change signature of X-MOD forward methods to use lang_ids
* Minor changes
* Rebase with main and run make fix-copies
* Make suggested changes to docstrings
* Improve code readability
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Fix code style
* Conversion script: Remove asserts and type annotations
* Remove _TOKENIZER_FOR_DOC
* XMOD -> Xmod
* Update copyright note
* Fix doctests
* Fix docstring
* Add integration test for FillMaskPipeline
* Revert "Add integration test for FillMaskPipeline"
This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f.
* Add end-to-end integration test for mask fill
* make style
* Rebase with main and make fix-copies
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-02-10 15:32:06 +01:00
Quentin Meeus
5b72b3412b
Remove CLI spams with Whisper FeatureExtractor ( #21267 )
...
* Remove CLI spams with Whisper FeatureExtractor
Whisper feature extractor representation includes the MEL filters, a list of lists that is represented as ~16,000 lines. This needlessly spams the command line. I added a `__repr__` method that replaces this list with a string "<array of shape (80, 201)>"
* Remove mel_filters from to_dict output
Credits to @ArthurZucker
* remove unused import
* update feature extraction tests for the changes in to_dict
2023-02-10 09:15:16 -05:00
Katie Le
21a2d900ec
Added with torch.no_grad() to Camembert integration test ( #21544 )
...
add with torch.no_grad() to Camembert integration test
Co-authored-by: Bibi <Bibi@katies-mac.local>
2023-02-10 10:58:29 +01:00
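The pattern applied by the test change, sketched (the model and input are illustrative): wrapping the forward pass in `torch.no_grad()` skips building the autograd graph and trims memory in inference-only integration tests.

```python
import torch
from transformers import CamembertModel, CamembertTokenizer

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
model = CamembertModel.from_pretrained("camembert-base")

inputs = tokenizer("J'aime le camembert !", return_tensors="pt")

with torch.no_grad():  # no gradients needed when only checking outputs
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)
```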
Younes Belkada
f83942684d
[pipeline] A simple fix for half-precision & 8bit models ( #21479 )
...
* v1 fix
* adapt from suggestions
* make style
* fix tests
* add gpu tests
* update docs
* fix other tests
* Apply suggestions from code review
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* better fix
* make fixup
* better example
* revert changes
* proposal
* more elegant solution
* Update src/transformers/pipelines/automatic_speech_recognition.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
---------
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-10 10:26:17 +01:00
Sylvain Gugger
97d3390fc8
Skip failing test for now
2023-02-09 20:11:26 -05:00
Katie Le
23c146c38b
Added with torch.no_grad() to XLM-Roberta integration test ( #21547 )
...
* added with torch.no_grad() to the integration tests and applied make style
* added with torch.no_grad() to xlm roberta forward pass
---------
Co-authored-by: Bibi <Bibi@katies-mac.local>
2023-02-09 21:49:54 +01:00
Sylvain Gugger
04b2f13c37
🚨 🚨 🚨 Enforce single model initialization ( #21431 )
...
* Enforce single model initialization
* Add OneFormer example for problem 3
* Do it the Stas way
* Actually rename the uses...
* Rewrite test
* Try to change the test this way
* Fix all init slow/fast tests
* Break connection
* Fix more tests
* Fix test for initialization
* Remove custom test
* Quality
* Fix last failing tests
* The end?
2023-02-09 15:46:26 -05:00
Sylvain Gugger
2020ac4bd6
Fix from_pretrained API with config and state_dict ( #21542 )
2023-02-09 15:44:02 -05:00
NielsRogge
d7f1e7c009
Add BLIP-2 ( #21441 )
...
* First draft
* More improvements
* More improvements
* Improve conversion script
* Convert all weights
* Make forward pass work
* Make logits match
* More improvements
* More improvements
* More improvements
* Use get_input_embeddings
* Improve some more
* Improve model tests
* Improve model tests
* More improvements
* Fix processor
* Update files
* Update prepare_inputs_for_generation
* More improvements
* Fix copies
* More fixes
* Make fixup
* More improvements
* Add support for seq2seq language model
* More improvements
* Fix test
* More improvements
* Improve conversion script
* Remove some todo's
* Fix README's
* Improve conversion script
* Fix generation
* Fix style and remove Blip2Model
* Fix model outputs
* More improvements
* Set eos_token_id in config
* Fix quality
* Small improvements
* Add processor tests
* More improvements
* Apply suggestions
* Apply suggestions
* Add integration test
* Update image URL
* Add integration test
* Fix model_type
* Update style
* Improve docs
* Add doc tests
* Fix copies
* Remove tests which are passing
* Improve some more
* Add tests for seq2seq language models
* Minor fix
* Convert more checkpoints
* finalize CI
* Fix blip and blip2 processors
* add `accelerate` support for `blip2`
* clean up
* make style
* Update conversion script
* Update conversion script some more
* Update organization
* revert toc file
* add blip-2 to toc file
* Some more improvements
* Fix docstring
* Improve docs
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-02-09 16:52:11 +01:00
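A minimal visual question answering sketch with the newly added BLIP-2 classes; the checkpoint and prompt are illustrative, and the model is large, so expect a sizeable download.

```python
import requests
from PIL import Image
from transformers import Blip2ForConditionalGeneration, Blip2Processor

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")   # illustrative checkpoint
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

prompt = "Question: how many cats are there? Answer:"
inputs = processor(images=image, text=prompt, return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=20)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```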
Joao Gante
0d33381fad
Tag tests as slow ⌛ ( #21537 )
...
begone slow tests
2023-02-09 14:46:15 +00:00
Joao Gante
2edf9a857b
Generate: TF .generate() can now be exported with dynamic length ( #21474 )
2023-02-09 12:52:30 +00:00
Joao Gante
e69f9715eb
Generate: make TF .generate() signature == PT .generate() signature ( #21525 )
2023-02-09 11:10:13 +00:00
Motoki Wu
9960506cbe
Fix multiple eos_token_ids in model.generate(...) ( #21461 )
...
* add tests with multiple eos_token_ids
* make math.prod instead of sum
* make fixup
* fix long and also use np.prod since math.prod does not exist <python 3.8
* make fixup
* add prod util
* use prod util instead of np.prod
* make fixup
* previous .long location
* use tensor ops
* remove prod
* remove prod
* update device
* make fixup
* fix none
2023-02-08 13:48:46 -05:00
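What the fix enables, sketched with GPT-2: `eos_token_id` may now be a list, and generation stops as soon as any of the listed ids is produced. The second stop token here is an illustrative choice.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The quick brown fox", return_tensors="pt")

# Stop on either the usual EOS token or a period (the second id is illustrative).
stop_ids = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids(".")]
outputs = model.generate(**inputs, max_new_tokens=20, eos_token_id=stop_ids)

print(tokenizer.decode(outputs[0]))
```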
Stas Bekman
8ea994d3c5
[tests] add missing report_to none ( #21505 )
...
[tests] report_to none
2023-02-08 09:32:40 -08:00
Joao Gante
1d9c26a4b8
Generate: TF compute_transition_scores ( #21341 )
2023-02-08 16:36:43 +00:00
Guillaume Klein
ca905ba28e
Exclude the madeup words from M2M100Tokenizer.vocab_size ( #20976 )
2023-02-08 09:19:06 -05:00
Katie Le
cc1d0685b3
Wrap RemBert integration test forward passes with torch.no_grad() ( #21503 )
...
added with torch.no_grad() to the integration tests and applied make style
Co-authored-by: Bibi <Bibi@katies-mac.local>
2023-02-08 14:00:52 +01:00
Adrian Sager La Ganga
a3034c7004
Add inverse sqrt learning rate scheduler ( #21495 )
...
* added inverse sqrt lr scheduler
* Updated get_scheduler in src/transformers/optimization.py
* Updated src/transformers/__init__.py
* Added inverse sqrt lr scheduler test
* Updated docs/source/en/main_classes/optimizer_schedules.mdx
* Ran style and quality scripts
* Fix get_inverse_sqrt_schedule docstring
* Comment implementation URL
2023-02-07 15:00:50 -05:00
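A small sketch of the new schedule: after `num_warmup_steps` of linear warmup, the learning rate decays proportionally to 1/sqrt(step). The toy optimizer is only there to drive the scheduler.

```python
import torch
from transformers import get_inverse_sqrt_schedule

params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.AdamW(params, lr=1e-3)

scheduler = get_inverse_sqrt_schedule(optimizer, num_warmup_steps=100)

lrs = []
for _ in range(1000):
    optimizer.step()
    scheduler.step()
    lrs.append(scheduler.get_last_lr()[0])
# lrs ramps up linearly for 100 steps, then decays roughly as 1/sqrt(step).
```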
Stas Bekman
b9af152efb
[tokenizer] sanitize saved config ( #21483 )
...
* [tokenizer] sanitize saved config
* rm config["name_or_path"] test
2023-02-07 10:51:45 -08:00
Sylvain Gugger
67d074874d
Cleanup quality ( #21493 )
...
* Remove mentions of flake8/isort
* Clean up inits
* Deall with all other inits
* Last special rule for dummy files
2023-02-07 12:27:31 -05:00
Arthur
9e7f84a556
[OPT] Adds GPT2TokenizerFast to the list of tokenizers to use for OPT. ( #20823 )
...
* Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)),
* skip failing test
* Add ("opt", ("GPT2Tokenizer", "GPT2TokenizerFast" if is_tokenizers_available() else None)),
* skip failing test
2023-02-07 17:35:28 +01:00
Joao Gante
1e4cf8bb44
Generate: TF can now generate from embeddings in encoder-decoder models ( #21475 )
2023-02-07 11:18:23 +00:00
Arthur
12eb528b5a
[CI] Remove past in favor of past_key_values ( #21443 )
...
* fix past renamed to past_key_value
* update more `past` that were skipped
* fixup
* remove changes made to rag
* refactor `_reorder_cache` to use `past_key_values`
* fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache
2023-02-07 09:51:35 +01:00
Sylvain Gugger
cc8407522a
Fix epoch number when resuming training ( #21478 )
2023-02-06 19:34:34 -05:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting ( #21480 )
...
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
2023-02-06 18:10:56 -05:00
Joao Gante
4943331015
Generate: TF can now accept custom logits processors ( #21454 )
2023-02-06 15:44:47 +00:00
Yih-Dar
0db5d911fc
Fix SpeechT5ForSpeechToSpeechIntegrationTests device issue ( #21460 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-06 10:43:07 +01:00
Yih-Dar
59d5edef34
Avoid flaky generation sampling tests ( #21445 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-03 22:01:25 +01:00
Matthijs Hollemans
e4bacf6614
[WIP] add SpeechT5 model ( #18922 )
...
* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* further improve tokenizer
* add CTC tokenizer
* fix decode and batch_decode in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash
2023-02-03 12:43:46 -05:00
Yih-Dar
197e7ce911
Fix device issue in a ConvBertModelTest test ( #21438 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-03 15:12:28 +01:00
Joao Gante
f21af26279
🚨 🚨 Generate: standardize beam search behavior across frameworks ( #21368 )
2023-02-03 10:24:02 +00:00
Yih-Dar
a6d8a149a8
Fix some pipeline tests ( #21401 )
...
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-02 19:03:31 +01:00
Younes Belkada
8298e4ec02
[bnb] Fine-tuning HF 8-bit models ( #21290 )
...
* force `memory_efficient_backward=True`
* enhancements
- trainer support
- add new flag
* some changes
- internal changes in `Trainer`
- small refactor
* make quality
* Fixes
- add new testing util
- add new test
- change test in Trainer
* fix CI test
* educate users on how to ft 8bit models
* more checks
* fix `logger` error
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* adapt from review
* fix
* add comment
* use return instead
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-02 16:39:23 +01:00
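A hedged sketch of the 8-bit fine-tuning workflow enabled above: load the model with bitsandbytes int8 quantization so it can be handed to Trainer (typically with small trainable adapter layers on top). Requires bitsandbytes and a CUDA device; the checkpoint is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in 8-bit; device_map="auto" places weights on the GPU.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
```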
Clémentine Fourrier
67a3920d85
Fix Graphormer test suite ( #21419 )
...
* [FIX] path for Graphormer checkpoint
* [FIX] Test suite for graphormer
* [FIX] Update graphormer default num_classes
2023-02-02 16:29:13 +01:00
Joel Lamy-Poirier
e006ab51ac
Add the GeLU activation from pytorch with the tanh approximation ( #21345 )
...
* gelu_python_tanh
* rename
* Version check, add test
* Pr comment
2023-02-02 09:33:04 -05:00
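A small sketch of the tanh-approximated GeLU added above; the registration name "gelu_pytorch_tanh" is how the new activation appears to be exposed, and the native PyTorch call is shown for comparison (PyTorch >= 1.12).

```python
import torch
from transformers.activations import get_activation

# Tanh approximation of GeLU as registered in transformers' activation map.
act = get_activation("gelu_pytorch_tanh")
x = torch.randn(4)
print(act(x))

# Equivalent call using PyTorch's built-in approximation.
print(torch.nn.functional.gelu(x, approximate="tanh"))
```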
Joao Gante
92ce53aab8
Generate: decoder-only models can generate with inputs_embeds ( #21405 )
2023-02-01 21:50:38 +00:00
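A minimal sketch of the feature above: a decoder-only model generating directly from input embeddings instead of token ids.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Hello world", return_tensors="pt").input_ids
# Turn the prompt into embeddings; any (batch, seq, hidden) tensor works here.
inputs_embeds = model.get_input_embeddings()(input_ids)

# After this change, decoder-only models accept `inputs_embeds` in .generate().
outputs = model.generate(inputs_embeds=inputs_embeds, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```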
raghavanone
77db257e2a
Fix the issue of using only inputs_embeds in convbert model ( #21398 )
...
* Fix the input embeds issue with tests
* Fix black and isort issue
* Clean up tests
* Add slow tag to the test introduced
* Incorporate PR feedbacks
2023-02-01 09:47:25 -05:00
Patrick von Platen
90cddfa824
Add variant to transformers ( #21332 )
...
* Bump onnx in /examples/research_projects/decision_transformer
Bumps [onnx](https://github.com/onnx/onnx ) from 1.11.0 to 1.13.0.
- [Release notes](https://github.com/onnx/onnx/releases )
- [Changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog.md )
- [Commits](https://github.com/onnx/onnx/compare/v1.11.0...v1.13.0 )
---
updated-dependencies:
- dependency-name: onnx
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* adapt
* finish
* Update examples/research_projects/decision_transformer/requirements.txt
* up
* add tests
* Apply suggestions from code review
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* fix test
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-02-01 09:21:52 +01:00
Yih-Dar
bc44e947f3
Update Graphormer and fix its torchscript test failures ( #21380 )
...
* fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-31 17:32:25 +01:00
Joao Gante
19d67bfecb
Generate: fix TF XLA tests on models with max_position_embeddings or max_target_positions ( #21389 )
2023-01-31 15:49:34 +00:00
Joao Gante
623346ab18
Template for framework-agnostic tests ( #21348 )
2023-01-31 11:33:18 +00:00
NielsRogge
5451f8896c
Add DETA ( #20983 )
...
* First draft
* Add initial draft of conversion script
* Convert all weights
* Fix config
* Add image processor
* Fix DetaImageProcessor
* Run make fix copies
* Remove timm dependency
* Fix dummy objects
* Improve loss function
* Remove conv_encoder attribute
* Update conversion scripts
* Improve postprocessing + docs
* Fix copied from statements
* Add tests
* Improve postprocessing
* Improve postprocessing
* Update READMEs
* More improvements
* Fix rebase
* Add is_torchvision_available
* Add torchvision dependency
* Fix typo and README
* Fix bug
* Add copied from
* Fix style
* Apply suggestions
* Fix thanks to @ydshieh
* Fix another dependency check
* Simplify image processor
* Add scipy
* Improve code
* Add threshold argument
* Fix bug
* Set default threshold
* Improve integration test
* Add another integration test
* Update setup.py
* Address review
* Improve deformable attention function
* Improve copied from
* Use relative imports
* Address review
* Replace assertions
* Address review
* Update dummies
* Remove dummies
* Address comments, update READMEs
* Remove custom kernel code
* Add image processor tests
* Add requires_backends
* Add minor comment
* Update scripts
* Update organization name
* Fix defaults, add doc tests
* Add id2label for object 365
* Fix tests
* Update task guide
2023-01-31 10:43:10 +01:00
Clémentine Fourrier
14d989a91d
Fixes path for Graphormer checkpoint ( #21367 )
...
[FIX] path for Graphormer checkpoint
2023-01-30 21:48:04 +01:00
Joao Gante
42b60f8b02
Generate: Relaxed max_length and max_new_tokens coexistence ( #21347 )
...
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-01-30 17:53:54 +00:00
Yih-Dar
c749bd405e
Pipeline testing - using tiny models on Hub ( #20426 )
...
* rework pipeline tests
* run pipeline tests
* fix
* fix
* fix
* revert the changes in get_test_pipeline() parameter list
* fix expected error message
* skip a test
* clean up
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-30 10:39:43 +01:00
Yih-Dar
a582cfce3c
Fix GitModelIntegrationTest.test_batched_generation device issue ( #21362 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-30 10:37:56 +01:00
Arthur
0dff407d71
[Whisper] another patch ( #21324 )
...
* another patch
* fix timestamp test modeling
* let it be negative when the token is None
2023-01-27 16:35:16 +01:00
Yih-Dar
449df41f01
Fix TFEncoderDecoder tests ( #21301 )
...
remove max_length=None
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-26 16:56:42 +01:00
Yih-Dar
4e41b87e3d
Use model_class.__name__ and compare against XXX_MAPPING_NAMES ( #21304 )
...
* update
* update all
* clean up
* make quality
* clean up
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-26 11:31:31 +01:00
amyeroberts
d18a1cba24
Accept batched tensor of images as input to image processor ( #21144 )
...
* Accept a batched tensor of images as input
* Add to all image processors
* Update oneformer
2023-01-26 10:15:26 +00:00
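A hedged sketch of the new input format described above: image processors now accept a batched (B, C, H, W) tensor directly rather than only a list of individual images. The checkpoint is illustrative.

```python
import torch
from transformers import AutoImageProcessor

image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

# A batched tensor of 4 RGB images can now be passed as-is.
batch = torch.randint(0, 256, (4, 3, 224, 224), dtype=torch.uint8)
inputs = image_processor(images=batch, return_tensors="pt")
print(inputs["pixel_values"].shape)  # e.g. torch.Size([4, 3, 224, 224])
```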
Arthur
6f3faf3863
[WHISPER] Small patch ( #21307 )
...
* add small patch
* update tests, forced decoder ids are not prioritized over the generation config
* fix two new tests
2023-01-25 22:49:23 +01:00
Anahita Bhiwandiwalla
3a6e4a221c
Add BridgeTower model ( #20775 )
...
* Commit with BTModel and latest HF code
* Placeholder classes for BTForMLM and BTForITR
* Importing Bert classes from transformers
* Removed objectives.py and dist_utils.py
* Removed swin_transformer.py
* Add image normalization, BridgeTowerForImageAndTextRetrieval
* Add center_crop
* Removing bert tokenizer and LCI references
* Tested config loading from HF transformers hub
* Removed state_dict updates and added path to hub
* Enable center crop
* Getting image_size from config, renaming num_heads and num_layers
* Handling max_length in BridgeTowerProcessor
* Add BridgeTowerForMaskedLM
* Add doc string for BridgeTowerConfig
* Add doc strings for BT config, processor, image processor
* Adding docs, removed swin
* Removed convert_bridgetower_original_to_pytorch.py
* Added doc files for bridgetower, removed is_vision
* Add support attention_mask=None and BridgeTowerModelOutput
* Fix formatting
* Fixes with 'make style', 'make quality', 'make fixup'
* Remove downstream tasks from BridgeTowerModel
* Formatting fixes, add return_dict to BT models
* Clean up after doc_test
* Update BTModelOutput return type, fix todo in doc
* Remove loss_names from init
* implement tests and update tuples returned by models
* Add image reference to bridgetower.mdx
* after make fix-copies, make fixup, make style, make quality, make repo-consistency
* Rename class names with BridgeTower prefix
* Fix for image_size in BTImageProcessor
* implement feature extraction bridgetower tests
* Update image_mean and image_std to be list
* remove unused import
* Removed old comments
* Rework CLIP
* update config in tests followed config update
* Formatting fixes
* Add copied from for BridgeTowerPredictionHeadTransform
* Update bridgetower.mdx
* Update test_feature_extraction_bridgetower.py
* Update bridgetower.mdx
* BridgeTowerForMaskedLM is conditioned on image too
* Add BridgeTowerForMaskedLM
* Fixes
* Call post_init to init weights
* Move freeze layers into method
* Remove BTFeatureExtractor, add BT under multimodal models
* Remove BTFeatureExtractor, add BT under multimodal models
* Code review feedback - cleanup
* Rename variables
* Formatting and style to PR review feedback
* Move center crop after resize
* Use named parameters
* Style fix for modeling_bridgetower.py
* Update docs/source/en/model_doc/bridgetower.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/bridgetower.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/bridgetower.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/bridgetower/modeling_bridgetower.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/bridgetower/modeling_bridgetower.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/bridgetower.mdx
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/bridgetower/modeling_bridgetower.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Rename config params, copy BERT classes, clean comments
* Cleanup irtr
* Replace Roberta imports, add BTTextConfig and Model
* Update docs, add visionconfig, consistent arg names
* make fixup
* Comments for forward in BTModel and make fixup
* correct tests
* Remove inconsistent roberta copied from
* Add BridgeTowerTextModel to dummy_pt_objects.py
* Add BridgeTowerTextModel to IGNORE_NON_TESTED
* Update docs for BT Text and Vision Configs
* Treat BridgeTowerTextModel as a private model
* BridgeTowerTextModel as private
* Run make fix-copies
* Adding BTTextModel to PRIVATE_MODELS
* Fix for issue with BT Text and Image configs
* make style changes
* Update README_ja.md
Add から to BridgeTower's description
* Clean up config, .mdx and arg names
* Fix init_weights. Remove nn.Sequential
* Formatting and style fixes
* Re-add tie_word_embeddings in config
* update test implementation
* update style
* remove commented out
* fix style
* Update README with abs for BridgeTower
* fix style
* fix mdx file
* Update bridgetower.mdx
* Update img src in bridgetower.mdx
* Update README.md
* Update README.md
* resolve style failed
* Update _toctree.yml
* Update README_ja.md
* Removed mlp_ratio, rename feats, rename BTCLIPModel
* Replace BTCLIP with BTVisionModel,pass in vision_config to BTVisionModel
* Add test_initialization support
* Add support for output_hidden_states
* Update support for output_hidden_states
* Add support for output_attentions
* Add docstring for output_hidden_states
* update tests
* add bridgetowervisionmodel as private model
* rerun the PR test
* Remove model_type, pass configs to classes, renames
* Change self.device to use weight device
* Remove image_size
* Style check fixes
* Add hidden_size and num_hidden_layers to BridgeTowerTransformer
* Update device setting
* cosmetic update
* trigger test again
* trigger tests again
* Update test_modeling_bridgetower.py
trigger tests again
* Update test_modeling_bridgetower.py
* minor update
* re-trigger tests
* Update docs/source/en/model_doc/bridgetower.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Remove pad, update max_text_len, doc cleanup, pass eps to LayerNorm
* Added copied to, some more review feedback
* make fixup
* Use BridgeTowerVisionEmbeddings
* Code cleanup
* Fixes for BridgeTowerVisionEmbeddings
* style checks
* re-tests
* fix embedding
* address comment on init file
* retrigger tests
* update import prepare_image_inputs
* update test_image_processing_bridgetower.py to reflect test_image_processing_common.py
* retrigger tests
Co-authored-by: Shaoyen Tseng <shao-yen.tseng@intel.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
2023-01-25 14:04:32 -05:00
Yih-Dar
cc714d74c4
Update OneFormerModelIntegrationTest expected values ( #21295 )
...
* update values
* update values
* update values
* Update tests/models/oneformer/test_modeling_oneformer.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-25 17:27:02 +01:00
Nicolas Patry
8788fd0ceb
Moving to cleaner tokenizer version for oneformer. ( #21292 )
...
Moving to cleaner tokenizer version.
2023-01-25 15:46:10 +01:00
Arthur
255257f3ea
[Whisper] Refactor whisper ( #21252 )
...
* update whisper logit processor
* add generate for whisper
* remove part of the whisper specific code from pipeline
* update logit processes
* major update
* enforce first timestamp
* update generate
* add more tests
* update new decoding strategy
* Apply suggestions from code review
* update docstring
* fixup
* default config will not have multilingual ar
* update expected tokenizer size, see pull on the hub for whisper-tiny
2023-01-25 13:09:43 +01:00
Nicolas Patry
99e7905422
Supporting ImageProcessor in place of FeatureExtractor for pipelines ( #20851 )
...
* Fixing the pipeline with image processor.
* Update the slow test.
* Using only the first image processor.
* Include exclusion mechanism for Image processor.
* Do not handle Gitconfig, deemed as a bug.
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Remove `conversational` changes. They are not supposed to be here.
* Address first row of comments.
* Remove OneFormer modifications.
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-01-25 10:16:31 +01:00
NielsRogge
efdbad56ab
[GIT] Add test for batched generation ( #21282 )
...
* Add test
* Apply suggestions
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2023-01-25 10:14:18 +01:00
Sanchit Gandhi
14d058b940
[W2V2 with LM] Fix decoder test with params ( #21277 )
2023-01-24 19:27:56 +01:00
Arthur
94a7edd938
[GenerationConfig] add additional kwargs handling ( #21269 )
...
* add additional kwargs handling
* fix issue when serializing
* correct order of kwargs removal for serialization in from dict
* add `dict_torch_dtype_to_str` in case a dtype is needed for generation
* add condition when adding the kwargs : not from config
* Add comment based on review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* add test function
* default None when poping arg
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-01-24 19:04:42 +01:00
Stas Bekman
9286039c2a
[examples/deepspeed] fix renamed api ( #21283 )
2023-01-24 09:54:33 -08:00
Younes Belkada
e2e393c6f2
[t5] Fix T5 inference in float16 + bnb error ( #21281 )
...
* attempts to fix:
- upcast input for `T5DenseActDense`
- add the condition `self.wo.weight.dtype != torch.int8`
- added tests on `test/mixed_int8`
- `make fixup`
* fix ci test
2023-01-24 18:14:38 +01:00
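A hedged sketch of the inference setup the fix above targets: running T5 in float16 on GPU (internally, the dense layers upcast where needed so the half-precision forward pass stays stable).

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained(
    "t5-small", torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer(
    "translate English to German: Hello, how are you?", return_tensors="pt"
).to("cuda")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```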
Alara Dirik
f424b09410
Fix MaskFormerImageProcessor.post_process_instance_segmentation ( #21256 )
...
* fix instance segmentation post processing
* add Mask2FormerImageProcessor
2023-01-24 18:49:29 +03:00
Yih-Dar
bde7378bf0
Skip test_multi_gpu_data_parallel_forward for UperNetModelTest ( #21216 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-24 10:41:16 +01:00
amyeroberts
c18b4fbe9f
Add class properties with warnings ( #21195 )
...
* Replace reduce_labels with do_reduce_labels
* Replace only for __init__ and preprocess
* Add class properties with warnings
* Update tests
2023-01-23 18:45:27 +00:00
Arthur
b80b2218b5
[ci-daily] Fix pipeline tests ( #21257 )
...
* use streaming dataset
* fix whisper's test
* add rescale argument to chunk_iter
2023-01-23 19:32:49 +01:00
amyeroberts
e2bd7f80d0
Update tests: replace feature extractor tests with image processor ( #20768 )
...
* Update imports and test fetcher
* Revert but keep test fetcher update
* Fix imports
* Fix all imports
* Replace fe with ip names
* Add generate kwargs to `AutomaticSpeechRecognitionPipeline` (#20952 )
* Add generate kwargs to AutomaticSpeechRecognitionPipeline
* Add test for generation kwargs
* Update image processor parameters if creating with kwargs (#20866 )
* Update parameters if creating with kwargs
* Shallow copy to prevent mutating input
* Pass all args in constructor dict - warnings in init
* Fix typo
* Rename tester class
* Rebase and tidy up
* Fixup
* Use ImageProcessingSavingTestMixin
* Update property ref in tests
* Update property ref in tests
* Update recently merged in models
* Small fix
Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
2023-01-23 17:25:41 +00:00
amyeroberts
354ea44340
Replace reduce_labels with do_reduce_labels ( #21218 )
...
* Replace reduce_labels with do_reduce_labels
* Replace only for __init__ and preprocess
* Update tests
2023-01-23 17:21:33 +00:00
Joao Gante
1eda4a4102
Generate: save generation config with the models' .save_pretrained() ( #21264 )
2023-01-23 16:21:44 +00:00
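A small sketch of the behavior above: generation defaults attached to the model are written out (as generation_config.json) alongside the weights when calling .save_pretrained(), and can be reloaded on their own.

```python
from transformers import AutoModelForCausalLM, GenerationConfig

model = AutoModelForCausalLM.from_pretrained("gpt2")
# Attach custom generation defaults to the model.
model.generation_config = GenerationConfig(max_new_tokens=64, do_sample=True, top_p=0.9)
model.save_pretrained("./gpt2-with-generation-config")

# The generation config is saved next to the weights and can be reloaded.
reloaded = GenerationConfig.from_pretrained("./gpt2-with-generation-config")
print(reloaded.max_new_tokens)  # 64
```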
amyeroberts
66459ce319
Add test_image_processing_common.py ( #20785 )
...
* Add test_image_processing_common.py
* Fix typo
* Update imports and test fetcher
* Revert but keep test fetcher update
* Fix imports
* Fix all imports
* Formatting fix
* Update tests/test_image_processing_common.py
2023-01-23 13:48:30 +00:00
NielsRogge
91ff7efeeb
[DETR and friends] Use AutoBackbone as alternative to timm ( #20833 )
...
* First draft
* More improvements
* Add conversion script
* More improvements
* Add docs
* Address review
* Rename class to ConvEncoder
* Address review
* Apply suggestion
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update all DETR friends
* Add corresponding test
* Improve test
* Fix bug
* Add more tests
* Set out_features to last stage by default
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-23 12:15:47 +01:00
Sylvain Gugger
4e730b3873
Skip failing test for now ( #21226 )
...
skip failing test for now
2023-01-20 20:46:11 -05:00
Joao Gante
af37d183b3
Generate: documented function to compute the transition scores ( #21191 )
...
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-01-20 12:50:01 +00:00
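A minimal sketch of the documented helper above: compute per-token transition scores (log-probabilities) for a generated sequence.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Today is", return_tensors="pt")
outputs = model.generate(
    **inputs, max_new_tokens=5, return_dict_in_generate=True, output_scores=True
)
# Log-probability of each generated token under the model.
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)
print(transition_scores)
```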
Arthur
5d3cb760a0
[Whisper] Fix pipeline after timestamp merges ( #21198 )
...
* pass return_timestamps to pre-process
* add a test to test it
* test does not need device 0
* remove failing bit
* update test
2023-01-20 10:31:40 +01:00
Bartosz Szmelczynski
1b37fb5e17
Efficientformer ( #20459 )
...
- Adds EfficientFormer V1 to transformers
- PR co-authored by @novice03 and @Bearnardd
Co-authored-by: novice <pranavpulijala@gmail.com>
Co-authored-by: novice <44259234+novice03@users.noreply.github.com>
2023-01-20 11:35:42 +03:00
Clémentine Fourrier
87208a05af
Graphormer model for Graph Classification ( #20968 )
...
* [FT] First commit for graphormer architecture.
The model has no tokenizer, as it uses a collator and preprocessing function for its input management.
Architecture to be tested against original one.
The arch might need to be changed to fit the checkpoint, but a revert to the original arch will make the code less nice to read.
TODO: doc
* [FIX] removed test model
* [FIX] import error
* [FIX] black and flake
* [DOC] added paper refs
* [FIX] [DOC]
* [FIX] black
* [DOC] Updated READMEs
* [FIX] Order of imports + rm Tokenizer calls
* [FIX] Moved assert in class to prevent doc build failure
* [FIX] make fix-copies
* [Doc] update from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [FIX] Removed Graphormer from Sequence classification model list
* [DOC] Added HF copyright to Cython file
* [DOC] Fixed comments
* [FIX] typos in class doc + removed config classes.
Todo: update doc from paper definitions
* [FIX] Removed dependency to fairseq, and replaced all asserts with Exception management
* [FIX] Homogenized initialization of weights to pretrained constructor
* [FIX] [CP] Updated multi_hop parameter to get same results as in original implementation
* [DOC] Relevant parameter description in the configuration file
* [DOC] Updated doc and comments in main graphormer file
* [FIX] make style and quality checks
* [DOC] Fix doc format
* [FIX] [WIP] Updated part of the tests, though still a wip
* [FIX] [WIP]
* [FIX] repo consistency
* [FIX] Changed input names for more understandability
* [FIX] [BUG] updated num_classes params for propagation in the model
* simplified collator
* [FIX] Updated tests to follow new naming pattern
* [TESTS] Updated test suite along with model
* [FIX] rm tokenizer import
* [DOC] add link to graphormerdoc
* Changed section in doc from text model to graph model
* Apply suggestions from code review
Spacing, inits
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* [DOC] Explain algos_graphormer functions
* Cython soft import protection
* Rm call to Callable in configuration graphormer
* [FIX] replaced asserts with Exceptions
* Add org to graphormer checkpoints
* Prefixed classes with Graphormer
* Management of init functions
* format
* fixes
* fix length file
* update indent
* relaunching ci
* Errors for missing cython imports
* fix style
* fix style doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-19 13:05:59 -05:00
Karim Foda
b9403e9516
Add hallucination filter ( #18675 )
...
* Add hallucination penalty
* Make quality changes
* Inverse penalty
* Fix imports & quality
* Fix name spelling issue
* set encoder_repetition_penalty and fix quality
* Fix failing test
* Add to config_common_kwargs
* Fix modelling_rag error
* Update src/transformers/generation_logits_process.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove breakpoint
* Make style fixes
* Update encoder_repetition_penalty default value
* Merge latest main changes
* Make fixup changes
* Add EncoderRepetitionPenaltyLogitsProcessor to generation/__init__.py
* Fix repo-inconsistency
* Remove venv
* Remove tensorflow-macos & add tests
* Add documentation
* Fix quality issues
* move encoder_repetition_penalty to config
* Update src/transformers/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove encoder_repetition_penalty from tests
* Fix type error
* Fix format error
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-01-19 11:20:25 -05:00
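A hedged sketch of the "hallucination filter" knob added above: encoder_repetition_penalty values greater than 1.0 bias generation toward tokens that appear in the source (encoder) input. The checkpoint is illustrative.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

inputs = tokenizer("The tower is 324 metres tall.", return_tensors="pt")
# encoder_repetition_penalty > 1.0 favors tokens present in the input text.
summary_ids = model.generate(
    **inputs, max_new_tokens=30, encoder_repetition_penalty=2.0
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```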
Arthur
e9b4800dda
[Whisper] Fix timestamp processor ( #21187 )
...
* add draft logit processor
* add template functions
* update timestamp processor parameters
* draft script
* simplify code
* cleanup
* fixup and clean
* update pipeline
* style
* clean up previous idea
* add tokenization utils
* update tokenizer and asr output
* fit whisper type
* style and update test
* clean test
* style test
* update tests
* update error test
* update code (not based on review yet)
* update tokenization
* update asr pipeline
* update code
* cleanup and update test
* fmt
* remove text verification
* cleanup
* cleanup
* add model test
* update tests
* update code add docstring
* update code and add docstring
* fix pipeline tests
* Small update.
* Fixup.
* Tmp.
* More support.
* Making `forced_decoder_ids` non mandatory for users to set.
* update and fix first bug
* properly process sequence right after merge if last
* todo
* allow list inputs + compute begin index better
* start adding tests
* add the 3 edge cases
* style
* format sequences
* fixup
* update
* update
* style
* test passes, edge cases should be good
* update last value
* remove Trie
* update tests and expected values
* handle bigger chunk_length
* clean tests a bit
* refactor chunk iter and clean pipeline
* update tests
* style
* refactor chunk iter and clean pipeline
* update
* resolve comments
* Apply suggestions from code review
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* take stride right into account
* update test expected values
* Update code based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* major refactor
* add correct strides for tests
* Update src/transformers/pipelines/automatic_speech_recognition.py
* fix whisper timestamp test
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2023-01-19 16:25:56 +01:00
amyeroberts
fc8a93507c
Rename GLPN image processor tests ( #21194 )
2023-01-19 14:46:07 +00:00
Yih-Dar
5761ceb35a
Fix device issue in UperNetModelIntegrationTest ( #21192 )
...
fix device
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-19 14:26:14 +01:00
Jitesh Jain
5b949623c7
Add OneFormer Model ( #20577 )
...
* Add Oneformer Model
* Add OneFormer Tests
* Add UNIVERSAL_SEGMENTATION_MAPPING
* Fix config
* 🐛 Fix error encountered while writing tests
* 🔨 Fix instance segmentation post processing
* Format Files and Add Documentation
* Add Documentation mdx file
* Run make fixup
* Run make fix-copies
* Remove unnecessary code
* Format modeling_oneformer.py
* Add OneFormer to ImageSegmentationPipeline
* Format files
* Add Demo link to Readme
* Fix formatting errors
* Fix test failures
* Update Table in index.mdx
* Fix version
* Fix style
* Remove OneFormer from TF
* Fix Imports
* Fix dummy objects
* Fix tests
* Add newline
* Remove OneFormerFeatureExtractor
* Remove CUDA Kernels
* Use AutoBackbone for Swin
* Fix description
* Use Image Processor
* Fix copies
* Fix formatting
* Fix import order
* Fix flake8 errors
* Fix doc errors
* Add Hindi Readme entry
* Update supported backbones
* Update supported backbones
* Undo Changes
* Fix type of config
* Fix isort
* Fix auto.mdx
* Fix swin config
* Replace DinatBackbone with AutoBackbone
* Use SwinBackbone
* Use SwinBackbone
* Fix conversion script
* Fix arguments
* Add argument description
* Fix style
* Add OneFormerProcessor
* Fix OneFormerProcessor Tests
* Fix mapping
* Fix imports
* Fix inits
* Fix style
* Fix comment
* Fix docstring
* Move OneFormer to MultiModal
* Fix Copies
* Remove size divisor
* Fix check_repo.py
* Fix copies
* Add Processor for Testing Pipeline
* Fix padding for tokens
* Fix variables
* Fix formatting with correct black version
* Add Image Processor Test
* Apply suggestions
* Revert common modeling
* Add check for task
* Fix conversion script
* Fix initialization order
* Fix tests
* Undo Pipeline Changes
* Fix layers in MLP
* Fix copies
* Update image paths
* Fix copies
* Apply suggestions
2023-01-19 09:31:07 +01:00
jeffhataws
c59d71b282
Add AWS Neuron torchrun support ( #20806 )
...
* Add XLA torchrun support
* Clarify that currently DDP doesn't work with torch.distributed XLA backend yet
* Enable DDP with torchrun and XLA (now available in PT-XLA 1.13)
* Add check for AWS Neuron availability and AWS Neuron specific compiler flag
* Change the new test's name to TestTrainerDistributedNeuronCore
* Remove "assert" and replace raised exception
* Remove compiler flag as it is optional. If needed, will be another PR.
* Use TORCHELASTIC_RUN_ID to determine whether torchrun is used
2023-01-18 11:21:19 -05:00
Sylvain Gugger
05e72aa0c4
Adapt repository creation to latest hf_hub ( #21158 )
...
* Adapt repository creation to latest hf_hub
* Update all examples
* Fix other tests, add Flax examples
* Address review comments
2023-01-18 11:14:00 -05:00
Pengfei Liu
8ad06b7c13
using raw string for regex to search <extra_id> ( #21162 )
...
* using raw string for regex to search <extra_id>
* fix the same issue in test file:`tokenization_t5.py`
2023-01-18 09:43:54 -05:00
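A small sketch of the raw-string fix above: without the r-prefix, "\d" in the pattern is an invalid escape sequence; the raw string keeps the regex for T5 sentinel tokens intact.

```python
import re

text = "Translate: <extra_id_0> walks in <extra_id_1> park"
# Raw string so the \d escape reaches the regex engine unchanged.
sentinel_pattern = re.compile(r"<extra_id_\d+>")
print(sentinel_pattern.findall(text))  # ['<extra_id_0>', '<extra_id_1>']
```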
Peter Lin
e1ad188641
Fix git model for generate with beam search. ( #21071 )
...
* Fix git model for generate with beam search.
* Update comment
* Fix bug on multi batch
* Add generate tests
* Clean up tests
* Fix style
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2023-01-18 09:40:24 -05:00
Joao Gante
e15f0d73db
OPT: Fix batched generation with FLAX ( #21150 )
...
* Fix Flax OPT numerical masking
* re-enable test
* add fix to bart and reintroduce copied from in opt
2023-01-18 14:24:53 +00:00
Younes Belkada
023f51fe16
blip support for training ( #21021 )
...
* `blip` support for training
* remove labels creation
* remove unneeded `decoder_input_ids` creation
* final changes
- add colab link to documentation
- reduction = mean for loss
* fix nits
* update link
* clearer error message
2023-01-18 11:24:37 +01:00
Yih-Dar
c8849583ad
Make test_save_pretrained_signatures a slow test ( #21105 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-18 10:43:05 +01:00
Sherman Siu
865da84abb
Add Epsilon- and Eta-Sampling ( #21121 )
...
* Add epsilon- and eta-sampling.
Add epsilon- and eta-sampling, following the official code from https://github.com/john-hewitt/truncation-sampling and adapting to be more configurable, as required by Huggingface transformers.
* Add unit tests for epsilon- and eta-sampling.
* Black: fix code formatting.
* Fix docstring spacing.
* Clean up newlines.
* Fix implementation bugs and their associated tests.
* Remove epsilon- and eta-sampling parameters from PretrainedConfig.
* Clarify and clean up the documentation.
* Remove parameters for PretrainedConfig test.
2023-01-17 13:04:32 -05:00
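A hedged sketch of the new truncation-sampling strategies above, exposed through the epsilon_cutoff and eta_cutoff generation parameters.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Once upon a time", return_tensors="pt")
# Epsilon sampling: drop tokens whose probability falls below epsilon_cutoff.
out_eps = model.generate(**inputs, do_sample=True, epsilon_cutoff=3e-4, max_new_tokens=20)
# Eta sampling: entropy-dependent cutoff.
out_eta = model.generate(**inputs, do_sample=True, eta_cutoff=3e-4, max_new_tokens=20)
print(tokenizer.decode(out_eps[0], skip_special_tokens=True))
print(tokenizer.decode(out_eta[0], skip_special_tokens=True))
```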
Arthur
bb300ac686
Whisper Timestamp processor and prediction ( #20620 )
...
* add draft logit processor
* add template functions
* update timestamp processor parameters
* draft script
* simplify code
* cleanup
* fixup and clean
* update pipeline
* style
* clean up previous idea
* add tokenization utils
* update tokenizer and asr output
* fit whisper type
* style and update test
* clean test
* style test
* update tests
* update error test
* update code (not based on review yet)
* update tokenization
* update asr pipeline
* update code
* cleanup and update test
* fmt
* remove text verification
* cleanup
* cleanup
* add model test
* update tests
* update code add docstring
* update code and add docstring
* fix pipeline tests
* Small update.
* Fixup.
* Tmp.
* More support.
* Making `forced_decoder_ids` non mandatory for users to set.
* update and fix first bug
* properly process sequence right after merge if last
* todo
* allow list inputs + compute begin index better
* start adding tests
* add the 3 edge cases
* style
* format sequences
* fixup
* update
* update
* style
* test passes, edge cases should be good
* update last value
* remove Trie
* update tests and expected values
* handle bigger chunk_length
* clean tests a bit
* refactor chunk iter and clean pipeline
* update tests
* style
* refactor chunk iter and clean pipeline
* update
* resolve comments
* Apply suggestions from code review
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
* take stride right into account
* update test expected values
* Update code based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
2023-01-17 15:50:09 +01:00
Nicolas Patry
25ddd91b24
Fixing offline mode for pipeline (when inferring task). ( #21113 )
...
* Fixing offline mode for pipeline (when inferring task).
* Update src/transformers/pipelines/__init__.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Updating test to reflect change in exception.
* Fixing offline mode.
* Clean.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-17 15:24:40 +01:00
amyeroberts
0dde58978a
Rename test_feature_extraction files ( #21140 )
...
* Rename files
* Update file names in tests
2023-01-17 14:04:07 +00:00
Alara Dirik
2411f0e465
Add Mask2Former ( #20792 )
...
* Adds Mask2Former to transformers
Co-authored-by: Shivalika Singh <shivalikasingh95@gmail.com>
Co-authored-by: Shivalika Singh <73357305+shivalikasingh95@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-16 20:37:07 +03:00
NielsRogge
9edf375834
[GIT] Fix training ( #21133 )
...
* Fix training
* Add test
* Fix failing tests
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2023-01-16 15:37:38 +01:00
Yih-Dar
a45914193a
Fix RealmModelIntegrationTest.test_inference_open_qa ( #21136 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-16 15:09:52 +01:00
Nicolas Patry
488a179ce1
Fixing batching pipelines on single items for ChunkPipeline ( #21132 )
...
* Fixing #20783
* Update src/transformers/pipelines/base.py
* Fixing some tests.
* Fixup.
* Remove ffmpeg dep + a bit more relaxed for bigbird QA precision.
* Better dataset.
* Prevent failing on TF.
* Better condition. We can't use `can_use_iterator` since we cannot use it
directly.
2023-01-16 15:04:27 +01:00
NielsRogge
4ed89d48ab
Add UperNet ( #20648 )
...
* First draft
* More improvements
* Add convnext backbone
* Add conversion script
* Add more improvements
* Comment out to_dict
* Add to_dict method
* Add default config
* Fix config
* Fix backbone
* Fix backbone some more
* Add docs, auto mapping, tests
* Fix some tests
* Fix more tests
* Fix more tests
* Add conversion script
* Improve conversion script
* Add support for getting reshaped undownsampled hidden states
* Fix forward pass
* Add print statements
* Comment out set_shift_and_window_size
* More improvements
* Correct downsampling layers conversion
* Fix style
* First draft
* Fix conversion script
* Remove config attribute
* Fix more tests
* Update READMEs
* Update ConvNextBackbone
* Fix ConvNext tests
* Align ConvNext with Swin
* Remove files
* Fix index
* Improve docs
* Add output_attentions to model forward
* Add backbone mixin, improve tests
* More improvements
* Update init_weights
* Fix interpolation of logits
* Add UperNetImageProcessor
* Improve image processor
* Fix image processor
* Remove print statements
* Remove script
* Update import
* Add image processor tests
* Remove print statements
* Fix test
* Add integration test
* Add convnext integration test
* Update docstring
* Fix README
* Simplify config
* Apply suggestions
* Improve docs
* Rename class
* Fix test_initialization
* Fix import
* Address review
* Fix config
* Convert all checkpoints
* Fix default backbone
* Use same processor as segformer
* Apply suggestions
* Fix init_weights, update conversion scripts
* Improve config
* Use Auto API instead of creating a new image processor
* Fix docs
* Add doctests
* Remove ResNetConfig dependency
* Add always_partition argument
* Fix rebase
* Improve docs
* Convert checkpoints
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2023-01-16 09:39:13 +01:00
Yih-Dar
b210c83a78
Fix torchscript tests for AltCLIP ( #21102 )
...
fix torchscript tests for AltCLIP
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-13 10:03:19 +01:00
Yih-Dar
b3a0aad37d
Fix past CI ( #20967 )
...
* Fix for Past CI
* make style
* clean up
* unindent 2 blocks
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-12 18:04:21 +01:00
Stas Bekman
41b0564b35
[bnb optim] fixing test ( #21030 )
...
* [bnb optim] fixing test
* force 1 gpu
* fix
* fix
* fix
* finalize
* improve commentary
* fix
* cleanup
* more fixes
2023-01-12 08:52:54 -08:00
Susnato Dhar
b5be744d3c
Fixed issue #21039 ( #21062 )
...
Fixed issue #21039 and added test for low_cpu_mem_usage
2023-01-12 10:03:13 +01:00
Arthur
e3ecbaa4ab
Patch-past-refactor ( #21050 )
...
* small patches, forgot a line
* refactor PT
* the actual fix
2023-01-09 18:12:13 +01:00
Sylvain Gugger
9a046cc14e
Skip failing test until Arthur looks at it.
2023-01-08 04:53:20 -05:00
NielsRogge
4f1c9d162e
[CLIPSeg] Fix integration test ( #20995 )
...
Fix integration test
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2023-01-05 14:30:32 +01:00
Sylvain Gugger
12313838d3
Make sure dynamic objects can be saved and reloaded ( #21008 )
...
* Make sure dynamic objects can be saved and reloaded
* Remove processor test
2023-01-05 07:30:25 -05:00
Younes Belkada
bf82c9b74f
[BLIP] Fix daily CI failing test ( #20877 )
2023-01-05 13:24:31 +01:00
Joao Gante
b91048968b
Generate: Fix CI related to #20727 ( #21003 )
2023-01-04 20:26:56 +00:00
Joao Gante
a6c850e4f4
Generate: TF uses GenerationConfig as the basis for .generate() parametrization ( #20994 )
2023-01-04 18:23:20 +00:00
Alara Dirik
52c9e6af29
Fix bug in segmentation postprocessing ( #20198 )
...
* Fix post_process_instance_segmentation
* Add test for label fusing
2023-01-04 18:34:58 +03:00
amyeroberts
292acd71d6
Update image processor parameters if creating with kwargs ( #20866 )
...
* Update parameters if creating with kwargs
* Shallow copy to prevent mutating input
* Pass all args in constructor dict - warnings in init
* Fix typo
2023-01-04 14:29:48 +00:00
Jongjyh
ce85686a1f
Add AltCLIP ( #20446 )
...
* add altclip
* update
* fix wrong title
* fix the copyright in readme
* add altclip model
* add altclip
* fix test_gradient_checkpointing_enable_disable
* code
* add return class
* add projection_state
* "fix pretrained model bug"
* delete print and fix 2 test instances.
* delete token
* rm xlmr
* one model one file.
* empty commit to trigger CI
* Fix modeling_outputs.py
* Fix __init__
* Fix quality
* Fix modeling file docstring
* Fix README.md
* Fix test file
* add vision model
* empty commit to trigger CI
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* del token in mdx file
* fix
* fix
* fix
* remove altrob from test list
* add vision test
* fix fx
* fix
* fix
* fix
* trigger CI
* fix copies
* fix tests
* fix style
* fix quality
* update
* recover import
* recover
* add ,
* recover
* fix copies
* trigger CI
* fix
* some of review
* update
* remove import
* last 2
* fix
* fix style
* fix style
* fix bug
* fix uncomment
* fix
* update
* fix
* second review
* empty commit to trigger CI
* empty commit to trigger CI
* fix position
* fix
* empty commit to trigger CI
* empty commit to trigger CI
* third comment
* Update docs/source/en/model_doc/altclip.mdx
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update docs/source/en/model_doc/altclip.mdx
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update src/transformers/__init__.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update src/transformers/models/altclip/configuration_altclip.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update src/transformers/models/altclip/modeling_altclip.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update src/transformers/models/altclip/processing_altclip.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Update src/transformers/models/altclip/modeling_altclip.py
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* fix merge
* fix copies
* update
* update
* empty commit to trigger CI
* fix code example
* empty commit to trigger CI
* fix
* empty commit to trigger CI
* empty commit to trigger CI
Co-authored-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: shunxing1234 <33774367+shunxing1234@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2023-01-04 09:18:57 +01:00
Motoki Wu
45da7cec5a
Add custom stop token ids for generation ( #20727 )
...
* Add StopIdStoppingCriteria
* add a working test for stop id criteria
* add to global scope
* add stop_ids to generate
* add pipeline test
* use tokenizer encode in test
* add test to generation utils
* reformat
* fixup
* make-fix-copies
* rename to stop_token_id
* use stop_tokens instead
* add to text to text generation
* make fixup
* make repo-consistency
* Add support for list of ints for eos_token_id inside generation/utils.py
* Instead of having if elses, cast the eos_token_id into a List[int]
* Add List[int] support for logits_process.py
* add List[int] for beam_search.py
* add List[int] for forced_eos_token_id
* revert stop token id stopping criteria changes
* make fixup
* fix tests
* add eos_token_id to generation/utils.py and added tests test_utils.py
* add eos_token_id type hints and fix for pad tokens
* add comments
* remove some prints and remove forced false test
* fix
* put back test_stop_sequence_stopping_criteria
* remove unused import and make fixup
* add a none check
* update docstring
* add more docstring for list ints
* make fixup
2023-01-03 15:18:24 -05:00
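A minimal sketch of the shape this PR converged on: generation can stop on any of several token ids by passing a list to eos_token_id. The newline token is just an illustrative extra stop token.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The meeting ended because", return_tensors="pt")
newline_id = tokenizer.encode("\n")[0]
outputs = model.generate(
    **inputs,
    max_new_tokens=40,
    eos_token_id=[tokenizer.eos_token_id, newline_id],  # stop on either token
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0]))
```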
Alara Dirik
cd2457809f
Improve OWL-ViT postprocessing ( #20980 )
...
* add post_process_object_detection method
* style changes
2023-01-03 19:25:09 +03:00
samuelpullely
15c68c67f4
Enable decoder_attention_mask in generate function ( #20726 )
...
* Enable `decoder_attention_mask` in `generate` function
* Make style corrections
* Run `make repo-consistency`
* Add integration test
2023-01-03 09:59:08 -05:00
NielsRogge
9c6f7485a6
Add GIT (GenerativeImage2Text) ( #20295 )
...
* First draft
* Make model instantiation work
* Fix copied from statement
* More fixes
* Add correct output head
* Improve configuration
* Add conversion script
* Improve conversion script
* Remove token_type_ids
* Fix conversion of projection layers
* Convert all weights
* Use cats image
* Make logits match
* Generate caption on cats image
* Add GITProcessor
* Update conversion script
* Add support for more checkpoints
* Fix conversion script
* Add initial tests
* Remove cross-attention
* More improvements
* Remove is_decoder
* Improve model tests
* Improve tests
* Improve model outputs
* Fix model outputs equivalence
* Fix more tests
* Remove unused code
* Use generate to generate text, no use of cache for now
* Use generate more appropriately
* Fix config tests
* Fix style
* Add support for use_cache
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix style
* Fix GIT vision encoder
* Update README
* Fix integration test
* Set bos and eos token ids
* Improve docs
* Improve code
* Add support for provided attention_mask
* Add copied from statement
* Fix gradient checkpointing test
* Set model_input_names
* Investigate model_input_names
* Remove script
* Fix model inputs
* Fix docstring
* Rename GIT to Git
* Support more models
* Add support for textvqa model
* Add video support
* Extend conversion script for video
* Add support for large variant
* Add support for more models
* Fix config archive map
* Update integration test
* Fix README
* Fix CLIP mean and std
* Update processor
* Fix use_cache for video, thanks @gante
* Remove print statements
* Remove assertion
* Add processor tests
* Fix model_input_names
* Use Auto API for processor
* Fix processor tests
* Fix integration test
* Fix pipeline test
* Make tests faster
* Update conversion script
* Update conversion script
* Convert more checkpoints
* Update conversion script
* Fix typo
* Update docstrings
* Improve code snippets
* Fix doc tests
* Add more code examples
* Fix doc tests
* Add integration tests
* Fix unused variable
* revert
* Add GIT to Japanese README
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-03 14:17:18 +01:00
Konstantin Kotik
367fdf3330
MinNewTokensLengthLogitsProcessor for .generate method #20814 ( #20892 )
...
* feat: add min new length logit processor
* test: add min new length logit processor
* docs: add MinNewTokensLengthLogitsProcessor
* feat: import MinNewTokensLengthLogitsProcessor
* fix: update pytorch dummy objects
* refactor & fix: rename attributes and var and get rid of dynamic attribute
* tests: align test with new interface
* docs: fix typo
* docs: minor clarification
* Empty-Commit
* empty commit
* run automated quality edits
Co-authored-by: Joao Gante <joao@huggingface.co>
2023-01-03 06:29:02 -05:00
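A hedged sketch of the user-facing side of the processor above: the min_new_tokens generation argument suppresses EOS until at least N newly generated tokens have been produced (unlike min_length, the prompt does not count).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("In short,", return_tensors="pt")
outputs = model.generate(
    **inputs,
    min_new_tokens=16,   # counts only newly generated tokens, not the prompt
    max_new_tokens=32,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```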
Hao Wang
375801d5e6
update pyknp to rhoknp ( #20890 )
...
* update pyknp to rhoknp
* fix linter
* fix linter
* fix linter
* fix linter
* fix linter
* support rhoknp==1.1.0, fix testcase
2022-12-31 01:22:26 -05:00
bofeng huang
47c9b22d08
Add generate kwargs to AutomaticSpeechRecognitionPipeline ( #20952 )
...
* Add generate kwargs to AutomaticSpeechRecognitionPipeline
* Add test for generation kwargs
2022-12-31 01:13:28 -05:00
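A hedged sketch of forwarding generation arguments through the ASR pipeline via generate_kwargs, as enabled above. The audio file path and checkpoint are illustrative.

```python
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
result = asr(
    "sample.flac",  # path to an audio file (illustrative)
    generate_kwargs={"max_new_tokens": 128, "num_beams": 2},
)
print(result["text"])
```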
bofeng huang
fe65657de1
Fix FP16 inference in TextGenerationPipeline ( #20913 )
...
* add torch_dtype attribute to Pipeline
* Use torch_dtype to cast input tensor type in AutomaticSpeechRecognitionPipeline
* Fix code quality
* Add TextGenerationPipeline fp16 test
* Fix code quality
* Remove useless require in tests
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2022-12-29 02:19:25 -05:00
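A hedged sketch of the scenario the fix above addresses: a text-generation pipeline created with torch_dtype=float16 now casts its input tensors accordingly, so FP16 inference works on GPU.

```python
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="gpt2",
    torch_dtype=torch.float16,
    device=0,  # FP16 inference needs a CUDA device
)
print(generator("Hello, I'm a language model,", max_new_tokens=20)[0]["generated_text"])
```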
Yih-Dar
5fa0b17c3d
[Past CI] 🔥 Leave Past CI failures in the past 🔥 ( #20861 )
...
* torch.jit._state
* Fix past CI
* Fix for perceiver
* Fix REALM
* Fix for Bloom
* Fix for SwinMode
* Fix for TrajectoryTransformerModel
* Fix for test_wav2vec2_with_lm
* make style
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-27 18:37:25 +01:00
Arthur
a081f292ca
[RobertaPreLayerNorm] Fixes the CI daily test ( #20886 )
...
get correct checkpoint
2022-12-23 19:55:17 +01:00
Nicolas Patry
f7f0ec2f54
Adding support for fp16 for asr pipeline. ( #20864 )
...
* Supporting `fp16` for asr pipeline
* Adding test.
* Style.
* Oops.
* Flake8 update ?
* Fixing flake8 ?
* Revert "Flake8 update ?"
This reverts commit 0b917fcb52.
* Style (accidentally deleted flake8 F401.)
* Move to a bigger test (no small whisper model, and s2t doesn't seem to
accept torch_dtype=fp16).
Also we need to use a GPU to actually compute on fp16.
* Using BatchFeature capability.
2022-12-23 10:18:45 +01:00
Syed Abdul Gaffar Shakhadri
15bc776fec
Add Onnx Config for PoolFormer ( #20868 )
...
poolformer onnx
Co-authored-by: syed <syed.abdul@sandlogic.com>
2022-12-23 01:30:57 -05:00
Younes Belkada
52dd2b61bf
[MobileNet-v2] Fix ONNX typo ( #20860 )
...
* fix typo `onnx`
* fix test
2022-12-22 18:52:54 +01:00
Yih-Dar
39e620c134
Update HubertModelIntegrationTest.test_inference_keyword_spotting ( #20863 )
...
fix ci
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-21 18:40:14 +01:00
Yih-Dar
3090e70857
Fix past CI by skipping LevitModelTest.test_problem_types ( #20859 )
...
* Fix past CI
* Fix past CI
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-21 14:29:13 +01:00
İdil Sülo
0ae58204c6
Add visual prompt to processor of CLIPSeg model ( #20816 )
...
Adds visual_prompt argument to CLIPSegProcessor to enable image-guided segmentation
2022-12-21 15:23:45 +03:00
Younes Belkada
0d284bd574
Add BLIP ( #20716 )
...
* add new model like
* add v1
* v1
* v1
* vision encoder logits match
* v2
* fix
* add docstring
* CI tests pass
* fix tests
* make fixup
* add to `toctree`
* fix processors
* fix processors
* fix doc
* fill title
* add content doc
* remove from tokenization auto
* fix config
* change order
* add `# Copied from`
* few fixes
- add correct license on modeling text
- remove dummy argument
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* replace name
* refactor a bit
* more refactor
* remove unused arg
* make fixup + remove some `# Adapted from ...`
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* more `# Copied from`
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* now `generate` supports no prefix
* remove `FeatureExtractor`
* fix path
* correct dependency
* fix tests
* few fixes
* add integration tests
* add correct conversion script
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add `blip` to tokenization auto
* fix docstrings
* fix test + add image
* remove processor from uncorrect place
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* clean up a bit
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* clean pixel mask
* clean pixel mask
* fix `F`
* Update src/transformers/models/blip/modeling_blip.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix output
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix pad token id
* remove `token_type_ids`
* make fixup
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* make fixup
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* add comments
* Update src/transformers/models/blip/modeling_blip.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* remove `token_type_ids`
* make fixup
* better name
* replace with `image_attention_mask`
* refactor
* make fixup
* better docstring
* replace `answer_xx`
* remove ununsed args
* add `labels`
* add `labels`
* fix processing tests
* make fixup
* make fixup
* put correct repo
* remove `pad`
* remove `crop` and `center_crop`
* Update src/transformers/models/blip/image_processing_blip.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix
* remove `size_divisor`
* fix weights `init`
* remove unneeded functions
* add suggestions
* minor changes
- change slow test output for PT 1.13
- docstring order
* replace `feature_extractor` by `image_processor`
* fix doctests
* fix weight init order + add fp16 slow test
* add `blip` to doctest
* add correct repo name and fix test
* Update src/transformers/models/blip/processing_blip.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* fix tests
* use `convert_to_rgb` from `image_transforms`
* make fixup
* fix large loading issue
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-21 09:39:10 +01:00
NielsRogge
2875fa971c
[SegFormer] Add support for segmentation masks with one label ( #20279 )
...
* Add support for binary segmentation
* Fix loss calculation and add test
* Remove space
* use fstring
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2022-12-20 16:46:50 +01:00
fzyzcjy
ae3cbbcaf6
Fix tiny typo ( #20841 )
...
* Fix typo
* Update README.md
* Update run_mlm_flax_stream.py
* Update README.md
2022-12-20 03:17:59 -05:00
Thomas-MMJ
7ef3f19c3c
fix typo output not ouput in bitsandbytes trainer test ( #20839 )
...
fix typo output not ouput
typo was causing an error on pytest collection
2022-12-20 03:16:26 -05:00
Andreas Madsen
b4b613b102
Implement Roberta PreLayerNorm ( #20305 )
...
* Copy RoBERTa
* formatting
* implement RoBERTa with prelayer normalization
* update test expectations
* add documentation
* add conversion script for DinkyTrain weights
* update checkpoint repo
Unfortunately the original checkpoints assume a hacked roberta model
* add to RoBERTa-PreLayerNorm docs to toc
* run utils/check_copies.py
* lint files
* remove unused import
* fix check_repo reporting wrongly a test is missing
* fix import error, caused by rebase
* run make fix-copies
* add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS
* Fix documentation <Facebook> -> Facebook
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup: Fix documentation <Facebook> -> Facebook
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add missing Flax header
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* expected_slice -> EXPECTED_SLICE
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update copies after rebase
* add missing copied from statements
* make fix-copies
* make prelayernorm explicit in code
* fix checkpoint path for the original implementation
* add flax integration tests
* improve docs
* update utils/documentation_tests.txt
* lint files
* Remove Copyright notice
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make fix-copies
* Remove EXPECTED_SLICE calculation comments
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-19 09:30:17 +01:00