transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
NielsRogge	f9a0008d2d	Add VideoMAE (#17821 ) * First draft * Add VideoMAEForVideoClassification * Improve conversion script * Add VideoMAEForPreTraining * Add VideoMAEFeatureExtractor * Improve VideoMAEFeatureExtractor * Improve docs * Add first draft of model tests * Improve VideoMAEForPreTraining * Fix base_model_prefix * Make model take pixel_values of shape (B, T, C, H, W) * Add loss computation of VideoMAEForPreTraining * Improve tests * Improve model tests * Make all tests pass * Add VideoMAE to main README * Add tests for VideoMAEFeatureExtractor * Add integration test * Improve conversion script * Rename patch embedding class * Remove VideoMAELayer from init * Update design of patch embeddings * Improve comments * Improve conversion script * Improve conversion script * Add conversion of pretrained model * Add loss verification of pretrained model * Add loss verification of unnormalized targets * Add integration test for pretraining model * Apply suggestions from code review * Fix bug to make feature extractor resize only shorter edge * Address more comments * Improve normalization of videos * Add doc examples * Move constants to dedicated script * Remove scripts * Transfer checkpoints, fix docs * Update script * Update image mean and std * Fix doc tests * Set return_tensors to NumPy by default * Revert the previous change Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-04 18:02:55 +02:00
Thomas Wang	672b66262a	Add FX support for torch.baddbmm and torch.Tensor.baddbmm (#18363 )	2022-08-04 16:02:16 +02:00
Sylvain Gugger	df28de0581	Fix load of model checkpoints in the Trainer (#18470 )	2022-08-04 08:22:25 -04:00
Kian Sierra McGettigan	330247ede2	Update no trainer scripts for multiple-choice (#18468 ) * swag_no_trainer updated for with gather_metrics * Removed unused variable samples_seen	2022-08-04 07:29:32 -04:00
Michael Benayoun	c74befc9e3	HFTracer.trace can now take callables and torch.nn.Module (#18457 ) * Enable HFTracer to trace with custom dummy inputs instead of pre-computed ones * Add HFTracer.trace docstring, and make it possible to handle callable and torch.nn.Module in general * Remove pdb comment * Apply suggestions	2022-08-04 13:29:18 +02:00
nlpcat	fc1d841b2d	change shape to support dynamic batch input in tf.function XLA generate for tf serving (#18372 ) * change shape to support dynamic batch input in tf.generate * add tests Co-authored-by: nlpcatcode <nlpcodecat@gmail.com>	2022-08-04 11:26:11 +01:00
Thomas Wang	b69a62d579	[BLOOM] Clean modeling code (#18344 ) * Cleanup some code * Improve signatures * Try to reduce the number of reshape/copies * I don't think we actually need the layer_num scaling trick * No need for duplication * Try to fix beam_search * Fix beam search * Removing layer num normalization seems to be breaking * Not sure self.layer_number normalization actually matters * Try and be backward compatible * Try to fix beam_search * Revert attempt to be backward compatible * Improve documentation on past_key_values format * Optimize the device allocation in case of hidden_states in multiple devices * No need to manually cast the values to a specific device * Rename with long version of variables * Improve type hinting * Add comment that explains that some methods return views * Actually i think the attention casting only makes sense when we use torch.float16 * We don't actually need layer_number to be passed anymore * Fix FX test * Bypass torch.baddbmm * Apply suggestions from code review * Add comment about support for torchScript v1.11 * fix ONNX support for bloom (#18456) Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com> Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>	2022-08-04 11:08:03 +02:00
LSinev	02b176c4ce	Fix torch version comparisons (#18460 ) Comparisons like version.parse(torch.__version__) > version.parse("1.6") are True for torch==1.6.0+cu101 or torch==1.6.0+cpu. Comparisons using version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py).	2022-08-03 13:37:18 -04:00
Sayak Paul	be41eaf55f	fix: keras fit tests for segformer tf and minor refactors. (#18412 ) * fix: keras fit tests for segformer tf and minor refactors. * refactor: test_keras_fit to make it simpler using the existing one. * fix: styling issues.	2022-08-03 16:39:54 +01:00
Alara Dirik	fc546332d7	add zero-shot obj detection notebook to docs (#18453 )	2022-08-03 17:14:39 +03:00
Daniel Suess	8fb7c908c8	Fix failing tests for XLA generation in TF (#18298 ) * Fix failing test_xla_generate_slow tests * Fix failing speech-to-text xla_generate tests	2022-08-03 09:45:15 -04:00
Omar Sanseviero	a507908cd3	Update pinned hhub version (#18448 ) * Update pinned hhub version * Make style	2022-08-03 08:37:42 -04:00
Ritik Nandwal	3db4378bd7	Update no trainer scripts for language modeling and image classification examples (#18443 ) * Update no_trainer script for image-classification * Update no_trainer scripts for language-modeling examples * Remove unused variable * Removing truncation from losses array for language modeling examples	2022-08-03 08:33:18 -04:00
Ian Castillo	10e1ec9a8c	Add Spanish translation of run_scripts.mdx (#18415 ) * Add file in spanish docs to be translated * Translate first two sections to Spanish * Translate four additional sections to Spanish * Finish translation to Spanish * Improve writing style in Spanish * Add suggested changes from reviewer	2022-08-03 07:32:20 -04:00
Gary Miguel	9d7b70bcd7	support ONNX export of XDropout in deberta{,_v2} and sew_d (#17502 ) * support ONNX export of XDropout in deberta{,_v2} * black * copy to sew_d * add test * isort * use pytest.mark.filterwarnings * review comments	2022-08-03 06:33:44 -04:00
Steven Liu	92915ebec2	Update _toctree.yml (#18440 ) This PR moves GroupViT and LXMert to their correct sections. As pointed out by @NielsRogge and @LysandreJik, GroupViT and LXMert are both multimodal models.	2022-08-03 12:26:01 +02:00
Sourab Mangrulkar	22a0dd2ef7	fixing error when using sharded ddp (#18435 )	2022-08-03 08:39:58 +05:30
Christopher Akiki	5096a654b7	Add programming languages (#18434 ) The current wording makes it sound as if the programming languages are part of the 46 natural languages.	2022-08-02 16:02:25 -04:00
David	042f420364	Update pipeline word heuristic to work with whitespace in token offsets (#18402 ) * Update pipeline word heuristic to work with whitespace in token offsets This change checks for whitespace in the input string at either the character preceding the token or in the first character of the token. This works with tokenizers that return offsets excluding whitespace between words or with offsets including whitespace. fixes #18111 starting * Use smaller model, ensure expected tokenization * Re-run CI (please squash)	2022-08-02 15:31:01 -04:00
Yih-Dar	c382ed8a2f	Accept `trust_remote_code` and ignore it in `PreTrainedModel.from_pretrained` (#18428 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-02 21:03:59 +02:00
João Lages	dbd9641c8c	Improve `generate` docstring (#18198 ) * improve generate docstring * Remove 'defaults to None' comment	2022-08-02 13:22:55 -04:00
Yih-Dar	5546fb61ab	fix run_clip README (#18332 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-02 19:14:46 +02:00
Yih-Dar	2959d09072	Fix `test_load_default_pipelines_tf` test error (#18422 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-02 18:51:10 +02:00
Alara Dirik	8ae7784256	update maskformer docs (#18423 ) * update maskformer docs * fix typo	2022-08-02 18:43:58 +03:00
Yih-Dar	0b8c1b6994	Change audio kwarg to images in TROCR processor (#18421 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-02 15:04:45 +02:00
Yih-Dar	dd21fb378f	Fix the hub user name in a longformer doctest checkpoint (#18418 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-02 15:04:10 +02:00
Piotr Dabkowski	68a894a587	Fix uninitialized parameter in conformer relative attention. (#18368 ) `torch.Tensor` creates an uninitialized tensor (as via `torch.empty`); this leads to nondeterministic behavior, poor initialization, and NaNs if you have an unlucky init. The paper does not specify the initialization for bias terms, so I guess zero seems like a good choice - no bias initially. `torch.Tensor` is usually populated with zeros, so this fix will be close to the intended behavior: ``` >>> torch.Tensor(100, 100).sum() tensor(0.) >>> torch.Tensor(100, 100).sum() tensor(nan) >>> torch.Tensor(100, 100).sum() tensor(0.) ```	2022-08-02 10:34:10 +01:00
Yassine	df5e4232f5	fix: create a copy for tokenizer object (#18408 )	2022-08-01 15:32:12 -04:00
Kelvin Kong	24845aeb6d	Layoutlmv2 tesseractconfig (#17733 ) * Added option for users to modify config parameter used by pytesseract during feature extraction - Added optional 'tess_config' kwarg when setting up LayoutLMV2 processor that is used by pytesseract during feature extraction - Eg. Can be used to modify psm values by setting tess_config to '--psm 7' - Different psm values significantly influences the output of layoutlmv2 * Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/layoutlmv2/feature_extraction_layoutlmv2.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updated variable names to be more explicit * Fixed styles * Added option for users to modify config parameter when calling pytesseract during feature extraction - Added option to set "tesseract_config" parameter during LayoutLMV3 processor initialization - Can be used to modify PSM values, eg. by setting tesseract_config="--psm 6" * Removed from function signature Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2022-08-01 12:24:43 -04:00
Steven Liu	151a2aaa4e	Split model list on modality (#18328 ) * 📝 split up model list * Adapt script to reorg * apply niels feedback Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>	2022-08-01 11:10:20 -05:00
Sylvain Gugger	01db72abd4	Rewrite push_to_hub to use upload_files (#18366 ) * Rewrite push_to_hub to use upload_files * Adapt the doc a bit * Address review comments and clean doc	2022-08-01 12:07:30 -04:00
Duong A. Nguyen	3909d7f139	Add Flax BART pretraining script (#18297 ) * add bart pretraining flax script * fixup * add bart pretraining flax script * add BART to README * add BART to README * add BART to README * add BART to README * add BART to README * add bos eos document * Update README.md * Update README.md * Update examples/flax/language-modeling/run_bart_dlm_flax.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * final * final * final * remove use_auth_token ing from_config Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2022-08-01 12:06:30 -04:00
Sylvain Gugger	941d233153	Fix ROUGE add example check and update README (#18398 ) * Fix ROUGE add example check and update README * Stay consistent in values	2022-08-01 11:14:49 -04:00
Ikuya Yamada	62098b9348	Adding fine-tuning models to LUKE (#18353 ) * add LUKE models for downstream tasks * add new LUKE models to docs * fix typos * remove commented lines * exclude None items from tuple return values	2022-08-01 11:09:47 -04:00
NielsRogge	7b9e995b70	Fix docs (#18399 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-01 17:02:51 +02:00
Sylvain Gugger	e0bc4c73e8	Add balanced strategies for device_map in from_pretrained (#18349 ) * Add balanced strategies for device_map in from_pretrained * Add safeguards for Accelerate version * Update src/transformers/modeling_utils.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> * Style Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2022-08-01 10:28:26 -04:00
NielsRogge	39e76d76fd	Fix doc tests (#18397 ) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-08-01 15:56:10 +02:00
Arthur	1141371103	Fix OPT doc tests (#18365 )	2022-08-01 15:19:45 +02:00
Sylvain Gugger	af1e6b4d87	Add evaluate to test dependencies (#18396 )	2022-08-01 08:55:44 -04:00
Yih-Dar	bd6d1b4300	Add a check regarding the number of occurrences of ``` (#18389 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-08-01 14:23:02 +02:00
YouJiacheng	1cd7c6f154	Fix from_pretrained kwargs passing (#18387 ) Fix #18385 I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess it should.	2022-08-01 08:16:24 -04:00
amyeroberts	96b5d7db9c	Remove pt-like calls on tf tensor (#18393 )	2022-08-01 13:06:30 +01:00
Ogundepo Odunayo	679d68a11b	Correct the spelling of bleu metric (#18375 )	2022-08-01 07:51:27 -04:00
atturaioe	1f84399171	Migrate metric to Evaluate in Pytorch examples (#18369 ) * Migrate metric to Evaluate in pytorch examples * Remove unused imports	2022-08-01 07:40:25 -04:00
dependabot[bot]	25ec12eaf7	Bump mistune from 0.8.4 to 2.0.3 in /examples/research_projects/lxmert (#18370 ) Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3. - [Release notes](https://github.com/lepture/mistune/releases) - [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst) - [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3) --- updated-dependencies: - dependency-name: mistune dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-01 04:46:57 -04:00
dependabot[bot]	a7360385f4	Bump mistune in /examples/research_projects/visual_bert (#18371 ) Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3. - [Release notes](https://github.com/lepture/mistune/releases) - [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst) - [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3) --- updated-dependencies: - dependency-name: mistune dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-01 04:46:31 -04:00
Sourab Mangrulkar	b2e4b091f0	fix FSDP ShardedGradScaler (#18358 ) renaming it	2022-07-30 10:07:56 +05:30
Yih-Dar	51227e26ab	Fix TFSegformerForSemanticSegmentation doctest (#18362 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-07-29 16:30:59 +02:00
Michael Benayoun	4e2f4a92dd	[FX] Symbolic trace for Bloom (#18356 ) * Bloom model can now be traced * Bloom traced model can be torch scripted and serialized * Bloom can be traced with variable keyword arguments * Enable XLNet support * Disable XLNet for now	2022-07-29 16:12:27 +02:00
Yih-Dar	1763770bd9	Fix some doctests (#18359 ) * Fix some doctests Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-07-29 14:13:28 +02:00
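
Note on commit 02b176c4ce (Fix torch version comparisons): the pitfall described in that message can be reproduced without torch installed. The following is a minimal sketch, not the code from the commit. It assumes the third-party `packaging` library is available, and the helper name `is_newer_than` is hypothetical, chosen here for illustration only.

```python
from packaging import version  # third-party library; assumed to be installed


def is_newer_than(installed: str, minimum: str) -> bool:
    """Strictly-greater check that ignores local build tags such as +cu101.

    Hypothetical helper for illustration; it is not the function from the commit.
    """
    # base_version drops the local segment ("+cu101", "+cpu") before comparing.
    base = version.parse(version.parse(installed).base_version)
    return base > version.parse(minimum)


# Naive comparison: a local build of 1.6.0 sorts above plain "1.6".
naive = version.parse("1.6.0+cu101") > version.parse("1.6")

# Comparison on the base version: 1.6.0 is not newer than 1.6.
fixed = is_newer_than("1.6.0+cu101", "1.6")

print(naive, fixed)
```

The naive check treats a CUDA or CPU build of 1.6.0 as strictly newer than 1.6, which is the behavior the commit message reports. Comparing on `base_version` removes the local build tag first.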

10370 Commits