transformers

mirror of https://github.com/huggingface/transformers.git synced 2025-07-31 02:02:21 +06:00

Author	SHA1	Message	Date
dependabot[bot]	32ff06403d	Bump redis from 4.1.4 to 4.5.3 in /examples/research_projects/decision_transformer (#22410 ) Bump redis in /examples/research_projects/decision_transformer Bumps [redis](https://github.com/redis/redis-py) from 4.1.4 to 4.5.3. - [Release notes](https://github.com/redis/redis-py/releases) - [Changelog](https://github.com/redis/redis-py/blob/master/CHANGES) - [Commits](https://github.com/redis/redis-py/compare/v4.1.4...v4.5.3) --- updated-dependencies: - dependency-name: redis dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-03-27 20:23:55 -04:00
Kshiteej K	3ec7a47664	[neptune] fix checkpoint bug with relative out_dir (#22102 ) * [neptune] fix checkpoint bug with relative out_dir * update imports * reformat with black * check neptune without imports * fix typing-related issue * run black on code * use os.path.sep instead of raw \ * simplify imports and remove type annotation * make ruff happy * apply review suggestions --------- Co-authored-by: Aleksander Wojnarowicz <alwojnarowicz@gmail.com>	2023-03-27 15:00:16 -04:00
Arthur	19ade2426a	[WIP]`NLLB-MoE` Adds the moe model (#22024 ) * Initial commit * update modeling code * update doc * add functions necessary * fix impotrs * revert changes * fixup * more styling to get going * remove standalone encoder * update code * styling * fix config and model * update code and some refactoring * make more tests pass * Adding NLLB-200 - MoE - 54.5B for no language left behind Fixes #21300 * fix mor common tests * styke * update testing file * update * update * Router2 doc * update check config with sparse layer * add dummy router * update current conversion script * create on the fly conversion script * Fixup * style * style 2 * fix empty return * fix return * Update default config sparse layers * easier to create sparse layers * update * update conversion script * update modeling * add to toctree * styling * make ruff happy * update docstring * update conversion script * update, will break tests but impelemting top2 * update * ❗local groups are supported here * ⚠️ Support for local groups is now removed ⚠️ This is because it has to work with model parallelism that we do not support * finish simplificaiton * Fix forward * style * fixup * Update modelling and test, refactoring * update tests * remove final layer)norm as it is done in the FF * routing works! Logits test added * nit in test * remove top1router * style * make sure sparse are tested. Had to change route_tokens a liottle bit * add support for unslip models when converting * fixup * style * update test s * update test * REFACTOR * encoder outputs match! * style * update testing * 🎉encoder and decoder logits match 🎉 * styleing * update tests * cleanup tests * fix router test and CIs * cleanup * cleanup test styling * fix tests * Finally the generation tests match! 
* cleanup * update test * style testing file * remove script * cleanup * more cleanup * nits * update * NLLB tokenizer is wrong and will be fixed soon * use LongTensors * update tests * revert some small changes * fix second expert sampling and batch prioritized routing * update tests * finish last tests * make ruff happy * update * ruff again * style * Update docs/source/en/model_doc/nllb-moe.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Updates based on review * style and fix import issue * nit * more nits * cleanup * styling * update test_seconde_expert_policy * fix name * last nit on the markdown examples --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-27 19:42:00 +02:00
Sylvain Gugger	057e1d7473	Fix quality	2023-03-27 13:17:14 -04:00
Donny Greenberg	f02e3a2b18	Hardware Auto-Setup for Examples (#22319 ) * Add initial remote hardware auto-setup docs * Fix a few typos and clarify some language * Add missing dependency * Update self-hosted launch script with Sylvain's comments. * Formatting. * Trigger CI * Style	2023-03-27 13:07:53 -04:00
Joao Gante	738944c9ee	Trainer: missing None check (#22404 ) missing None check	2023-03-27 18:04:28 +01:00
Joao Gante	53155b520d	Trainer: move Seq2SeqTrainer imports under the typing guard (#22401 )	2023-03-27 16:39:26 +01:00
NielsRogge	0e708178ed	[Pix2Struct] Add support to resize embeddings (#22394 ) * First draft * Fix integration test * Remove script * Fix test and typos * Fix one more test * Skip tied embeddings test * Remove line * Address comments	2023-03-27 11:38:07 -04:00
Sylvain Gugger	f6b80a0139	Transformers env safetensors (#22400 ) * Report safetensors version in transformers-cli env * Styling * Trigger CI maybe	2023-03-27 11:12:42 -04:00
Younes Belkada	d324b70f00	[`bnb`] Force `requires_grad` to be `False` (#22396 ) for rg to be `False`	2023-03-27 16:55:55 +02:00
Joao Gante	7dcd8703ef	Generate: support for left-padding on GPTNeoX and Llama (#22382 )	2023-03-27 15:48:23 +01:00
Nathan Fradet	5506d04969	Seq2seq trainer generation config arg (#22323 ) * seq2seq trainer and training arguments accepting GenerationConfig arg * seq2seq Trainer and training arguments docstring fixes * Update training_args_seq2seq.py docstring Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fixing trainer_seq2seq.py docstring Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * seq2seq trainer: legacy gen args back & GenerationConfig created at init * Seq2seq trainer: fix in case gen_config.max_new_tokens is None Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * seq2seq trainer: adding legacy arg retrocompatibility * seq2seq trainer and training arguments accepting GenerationConfig arg * seq2seq Trainer and training arguments docstring fixes * Update training_args_seq2seq.py docstring Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Fixing trainer_seq2seq.py docstring Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * seq2seq trainer: legacy gen args back & GenerationConfig created at init * Seq2seq trainer: fix in case gen_config.max_new_tokens is None Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * seq2seq trainer: adding legacy arg retrocompatibility * seq2seq trainer: evaluate and predict untouched * Apply suggestions from code review Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * seq2seq trainer: adding init args, keeping IDEs hints --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-27 15:47:35 +01:00
Vladislav Sokolovskii	03966cacf9	Wav2Vec2ProcessorWithLM can return N best hypotheses now (#22235 ) * Wav2Vec2ProcessorWithLM can return N best hypotheses now Signed-off-by: Vladislav Sokolovskii <vladislav@parrothq.com> * Wav2Vec2ProcessorWithLM n_best cannot be None Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Batch decoding can return N best hypotheses now batch_decode was extended with the same functionality as decode function, N best hypotheses per sample can be returned Signed-off-by: Vladislav Sokolovskii <vladislav@parrothq.com> --------- Signed-off-by: Vladislav Sokolovskii <vladislav@parrothq.com> Co-authored-by: Vladislav Sokolovskii <vladislav@parrothq.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-03-27 10:37:46 -04:00
кѳѳsнī	66d1eee682	load_in_8bit now respects 'balanced' device maps in multi-gpu environments (#22377 ) balanced 8bit memory	2023-03-27 10:34:52 -04:00
Sylvain Gugger	8cfc6678da	Adapt find_tied_parameters to handle breaking change in Accelerate (#22360 )	2023-03-27 10:11:14 -04:00
Nicola Procopio	204737fcc5	Translated documentation in italian (#22388 ) * updated toctree * added and translated mdx documents	2023-03-27 09:48:49 -04:00
Charlie-Bell	d5c2c71c0f	Changed world_size() to get_world_size() bugfix (#22381 ) Edited one line in src/transformers/generation/utils.py. Changed dist.world_size() to dist.get_world_size() since world_size() doesn't exist in torch.distributed.	2023-03-27 09:24:25 -04:00
Joao Gante	c746eb1603	TensorFlow: additional missing `cmake` dependencies in CI (#22383 ) * missing cmake * more cmake	2023-03-27 09:20:56 -04:00
Stas Bekman	cae78c46d6	[safetensors] don't use in `torch<1.10` (#22370 ) * [safetensors] don't use in pt<1.10 * better fix	2023-03-24 16:23:27 -04:00
Sylvain Gugger	cfab34e188	Fix TF pipeline job	2023-03-24 16:16:43 -04:00
Stas Bekman	500fce073b	[Trainer] add disclaimer that full_determinism is slow (#22368 )	2023-03-24 12:46:41 -07:00
Shubhamai	a0cbbba31f	Resnet flax (#21472 ) * [WIP] flax resnet * added pretrained flax models, results reproducible * Added pretrained flax models, results reproducible * working on tests * no real code change, just some comments * [flax] adding support for batch norm layers * fixing bugs related to pt+flax integration * removing loss from modeling flax output class * fixing classifier tests * fixing comments, model output * cleaning comments * review changes * review changes * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * renaming Flax to PyTorch --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-03-24 19:45:57 +00:00
Joao Gante	88dae78f4d	TensorFlow: pin maximum version to 2.12 (#22364 )	2023-03-24 18:45:03 +00:00
Samuel Bubán	3a7f5fa9d2	Improve error message (#22361 ) * Improve error message * Fix consistency	2023-03-24 18:09:01 +00:00
Sylvain Gugger	6587125c0a	Pin tensorflow-text to go with tensorflow (#22362 ) * Pin tensorflow-text to go with tensorflow * Make it more convenient to pin TensorFlow * setup don't like f-strings	2023-03-24 10:54:06 -04:00
Yih-Dar	01203475c9	Update docker files to use official torch 2.0.0 (#22357 ) * update docker files to use official torch 2.0.0 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-24 14:29:05 +01:00
Mitch Naylor	57f25f4b7f	Add Mega: Moving Average Equipped Gated Attention (#21766 ) * add mega file structure and plain pytorch version of mega source code * added config class with old naming conventions * filled in mega documentation * added config class and embeddings with optional token types * updated notes * starting the conversion process, deleted intermediate and added use_cache back to config * renamed config attributes in modeling_mega.py * checkpointing before refactoring incremental decoding functions * removed stateful incremental key/values for EMA and self-attention * refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask * MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement * more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention * bug fix in attention mask handling in MovingAverageGatedAttention * removed incremental state from GatedCrossAttention and removed IncrementalState class * finished gated cross attention and got MegaLayer working * fixed causal masking in mega decoder * fixed how padding and causal masks are passed through MegaLayer with and without k/v caching * finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids * added optional dense hidden layer for masked and causal LM classes * docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention * removed before_attn_fn in Mega class and updated docstrings and comments up to there * bug fix in MovingAverageGatedAttention masking * working conversion of MLM checkpoint in scratchpad script -- perfect matches * moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters * renamed gamma and beta parameters to avoid HF renaming when loading from 
checkpoint * finished checkpoint conversion script * cleanup old class in mega config script * removed 'copied from' statements and passing integration tests * added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing * fixed tuple output of megamodel * all common tests passing after fixing issues in decoder, gradient retention, and initialization * added mega-specific tests, ready for more documentation and style checks * updated docstrings; checkpoint before style fixes * style and quality checks, fixed initialization problem in float_tensor, ready for PR * added mega to toctree * removed unnecessary arg in megaconfig * removed unused arg and fixed code samples with leftover roberta models * Apply suggestions from code review Applied all suggestions except the one renaming a class, as I'll need to update that througout Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA * removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACTFN, and added sequencenorm to layer norms * reformatted .forward() docstrings to match style and removed unused mask input in cross-attention * removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights() * renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files * variable names in NFFN * manual Mega->MEGA changes in docs * Mega->MEGA in config auto * style and quality fixes * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments * commit before dealing with merge conflicts * made 
new attention activation functions available in ACT2FN and added generation test from OPT * style and quality in activations and tests * documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings * style and quality fixes after latest updates, before rotary position ids * causal mask in MegaBlock docstring + added missing device passing * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR * style and quality fixes + readme updates pointing to main --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-24 08:17:27 -04:00
Joao Gante	0fa46524ac	Generate: Add GPTNeoX integration test (#22346 )	2023-03-24 11:33:16 +00:00
Ashwin Mathur	b79607656b	Fix typo in Greedy Search Description (#22345 ) Fix typo in greedy search docs	2023-03-24 07:32:18 -04:00
James Reed	c0fa2aa0b8	[HFTracer] Make embeddings ops take on the dtype of the weight (#22347 ) * [HFTracer] Make embeddings ops take on the dtype of the weight * fix bug	2023-03-24 07:04:51 -04:00
Yih-Dar	e8cc02555e	Automatically create/update tiny models (#22275 ) * Automatically create or update tiny models * Skip failed tests * update workflow file * use revision --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-23 19:14:17 +01:00
кѳѳsнī	a92e0ad2e2	Enable training Llama with model or pipeline parallelism (#22329 ) * Llama - Move target tokens to final pipeline device if needed * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-23 13:15:51 -04:00
Joao Gante	502fec779b	Generate: add test for left-padding support (#22322 )	2023-03-23 17:00:22 +00:00
jeffhataws	ec9b18f62d	Fix --bf16 option support for Neuron after PR #22300 (#22307 ) This PR fixes the "RuntimeError: No CUDA GPUs are available" when running with --bf16 option on Neuron. Related PRs: https://github.com/huggingface/transformers/pull/20684 https://github.com/huggingface/transformers/pull/22300	2023-03-23 12:27:13 -04:00
Batese2001	aef488c503	Added type hints to TFDeiTModel (#22327 ) * Added type hints to TFDeiTModel * make style --------- Co-authored-by: Matt <rocketknight1@gmail.com>	2023-03-23 15:31:32 +00:00
Samuel Larkin	59b9351b78	Minor typo in pipeline FillMaskPipeline's documentation. (#22339 )	2023-03-23 11:14:11 -04:00
Sylvain Gugger	506e7c6361	Fix various imports (#22281 ) * Fix various imports * Fix copies * Fix import	2023-03-23 10:34:17 -04:00
Quentin Lhoest	053c2153f8	Mention why one needs to specify max_steps in Trainer (#22333 ) * Mention why one needs to specify max_steps in Trainer * dummy change to trigger CI	2023-03-23 15:26:51 +01:00
mollerup23	5a9eb31477	Fixed gradient checkpoint bug for TimeSeriesTransformer (#22272 ) * Fixed gradient checkpoint bug for this model * Updating PR indentation (maintainer feedback) * make fixup --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2023-03-23 08:45:13 -04:00
Younes Belkada	ff20f9cf36	[`MBart`] Add `accelerate` support for MBart (#22309 ) add `accelerate` support for MBart	2023-03-23 10:34:43 +01:00
Stas Bekman	61f79b2986	[gptj] support older pytorch version (#22325 ) * [gptj] support older pytorch version * contributor * contributor * make copies --------- Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com> Co-authored-by: Nick Hill <nickhill@us.ibm.com>	2023-03-22 18:35:04 -07:00
Sylvain Gugger	80e3b36361	Really fix quality due to ruff release	2023-03-22 20:56:22 -04:00
Sylvain	ef28df0572	Fix quality due to ruff release	2023-03-22 20:45:08 -04:00
Stas Bekman	73fdc8c5b4	[deepspeed zero3] need `generate(synced_gpus=True, ...)` (#22242 ) * [deepspeed zero3] need generate(synced_gpus=True, ...) * fix * rework per Sylvain's suggestion * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 12:18:57 -07:00
Yih-Dar	8b05ace014	Fix PipelineTests skip conditions (#22320 ) * check what tests fail * Skip failing tests * Skip failing tests * Skip failing tests * Skip failing tests * clean up * clean up --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-22 20:02:24 +01:00
Luc CAILLIAU	d62e7d8842	Chunkable token classification pipeline (#21771 ) * Chunkable classification pipeline The TokenClassificationPipeline is now able to process sequences longer than 512. No matter the framework, the model, the tokenizer. We just have to pass process_all=True and a stride number (optional). The behavior remains the same if you don't pass these optional parameters. For overlapping parts when using stride above 0, we consider only the max scores for each overlapped token in all chunks where the token is. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * update with latest black format * update black format * Update token_classification.py * Update token_classification.py * format correction * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update comments * Update src/transformers/pipelines/token_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * Update token_classification.py Correct spaces, remove process_all and keep only stride. If stride is provided, the pipeline is applied to the whole text. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update chunk aggregation Update the chunk aggregation strategy based on entities aggregation. 
* Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py Remove unnecessary pop from outputs dict * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add chunking tests * correct formating * correct formatting * correct model id for test chunking * update scores with nested simplify * Update test_pipelines_token_classification.py * Update test_pipelines_token_classification.py * update model to a tiny one * Update test_pipelines_token_classification.py * Adding smaller test for chunking. * Fixup * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 14:13:20 -04:00
Tom Aarsen	f48d3314e4	docs: Resolve incorrect type typo in trainer methods (#22316 ) Resolve incorrect type typo in trainer methods	2023-03-22 11:57:08 -04:00
Younes Belkada	0f68a7f408	Add Pix2Struct (#21400 ) * v1 all keys match * clean up * forward pass ok * add correct image transform * generate works, logits matching * clean up * more refactor * revert * revert * clean up * clean ups * clean up * refactor * refactor * fix doc * fix tokenizer test * fix toctree * revert toctree * oops * few fixes * replace to `pixel_embeds` * make fixup * test processing & feat extractor * fix some tests * more fixes * make fixup * clean up * more clean up * add a single slow test * fix test * make fixup * fix * fix authors * fix toctree * update docs * add docstring * revert change * Update src/transformers/models/pix2struct/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix tokenizer * fix processor test * fix test * make fixup * refactor * fix config * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * format * fix * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * make fixup * add docstring * fix issues * fix * fix * fix * add slow test * fix * fix * fix batched issue * fix training issues * fix ci test * fix slow test * fix conversion script * remove unneeded classes * fix slow test * fix require backends * fix masked fill * revert * fix softmax * add large models support * fix conditional generation * few fixes * add instructions * rm unneeded file * Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py * fix ci test * fix ci test really * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix nit * fix nits * fix image processors nits * docstring * clean up * fix nit * fix tests * docstring nit * fix reshape * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: 
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix nit * fix repetition * refactor processor * make patch size consistent * refactor forward * fix docstring * fix max_patches issue * update docstirng * update docstring * fix coped from * add skip reasons * few fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * format * fix doctests * refactor and fix * fix doc build issue * fix processor test * small fix conversion script * replace correct weights * make fixup * fix some issues * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * revert config and fixes * Update src/transformers/models/pix2struct/image_processing_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * more details * fixes * fix processor * fix processor test * fix * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * fix processor * Update src/transformers/models/pix2struct/modeling_pix2struct.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add copied * make fixup * fix copies * update docstring * refactor * fix docstring * fix conversion script * fix vqa issue * replace to `flattened_patches` * nit * fix numpy issue * fix image processors * add batched vqa support * fix vqa conversion * make fixup * fix conversion script * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup * add correct docstring * update docstring * fix module level + channel dim * use `make_list_of_images` * refactor * correct docstring * fix authors * remove `data_format` * add header text test * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: amyeroberts 
<22614925+amyeroberts@users.noreply.github.com> * make fixup * add checkpoints --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2023-03-22 16:53:52 +01:00
Joao Gante	fd3eb3e3cd	Beef up Llama tests (#22314 ) * tmp commit * beef up llama tests	2023-03-22 15:20:48 +00:00
Joao Gante	12febc20db	Generate: Export TF generate with a TF tokenizer (#22310 ) * Export TF generate with a TF tokenizer * remove unused lines	2023-03-22 15:00:20 +00:00

12443 Commits