transformers/docs/source/en
Alazar 94306352f4
Port IDEFICS to tensorflow (#26870)
* Initial commit

* Just a copy of modeling_idefics.py that will be ported to TF

* - Prepend TF to the name of all classes
- Convert pytorch ops to TF (not all operations are converted yet)

* Add TF imports

* Add autotranslated files

* Add TF classes to model_tf_auto.py

* Add the TF classes in model_doc

* include auto-translated code

* Adopted from auto-translated version

* Add a forgotten super().build

* Add test code for TF version.

* Fix indentation and load pytorch weights for now

* Some fixes. Many tests are still failing but some are passing now.

- I have added TODOs for some of the hacks I made to unblock myself
  and I will address them soon
- I have temporarily hacked processing_idefics.py locally to support TF

* Add ALL_LAYERNORM_LAYERS to match pytorch

* Revert "Add ALL_LAYERNORM_LAYERS to match pytorch"

This reverts commit 7e0a35119b4d7a6284d04d8c543fba1b29e573c9 as it
is not needed in the tf implementation.

* Fix freeze_relevant_params()

* Some more fixes

* Fix test_attention_outputs

* Add tf stuff to processing_idefics.py

processing_idefics.py supports both pytorch and tf now.

test_processor_idefics.py for pytorch is passing, so I didn't break anything,
but there are still some issues with tf. I also need to add tf tests to
test_processor_idefics.py.
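
As a rough illustration of the dual pytorch/tf support described above (the real logic lives in processing_idefics.py and is more involved; the helper name below is made up for this sketch):

```python
import numpy as np


def to_framework_tensor(array: np.ndarray, return_tensors: str = "pt"):
    """Hypothetical helper: return `array` as a tensor of the requested framework."""
    if return_tensors == "pt":
        import torch

        return torch.from_numpy(array)
    if return_tensors == "tf":
        import tensorflow as tf

        return tf.convert_to_tensor(array)
    raise ValueError(f"Unsupported return_tensors value: {return_tensors!r}")
```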

* Pass return_tensors to image processing code and fix test

* Pass return_tensors to the image processor __init__

* Fix several test cases

- Make input to some of the forward pass of type `TFModelInputType`
- Decorate main layer forward pass with `@unpack_inputs`
- Decorate main layer with `@keras_serializable`
- Pass `inputs` to TFIdeficsModel
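
A minimal sketch of the main-layer pattern these bullets describe, using a toy class name (the actual class is the TFIdefics main layer and does much more):

```python
import tensorflow as tf

from transformers import IdeficsConfig
from transformers.modeling_tf_utils import keras_serializable, unpack_inputs


@keras_serializable
class TFToyMainLayer(tf.keras.layers.Layer):
    config_class = IdeficsConfig  # required by @keras_serializable

    def __init__(self, config: IdeficsConfig, **kwargs):
        super().__init__(**kwargs)
        self.config = config

    @unpack_inputs
    def call(self, input_ids=None, attention_mask=None, training=False):
        # @unpack_inputs normalizes positional/keyword/dict-style inputs
        # before they reach the body of call()
        return input_ids
```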

* Some more fixes forgotten in last commit

* Fix processing code and vision_tf.py

* Fix perceiver bug

* Import from

* Auto-add build() methods + style pass

* Fix build() errors due to `None` being passed as shape to some layers

* Change name in TFIdeficsForVisionText2Text to attribute in IdeficsForVisionText2Text

* Fix pytorch weights load for tf2

There were a lot of `name=` missing in weight initialization code.
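
For context, a hedged sketch of why the `name=` arguments matter: TF derives variable paths from layer names, and the pytorch-to-TF cross-loading matches weights by those names, so unnamed sublayers get auto-generated names (e.g. `dense_3`) that don't line up with the checkpoint. Illustrative layer only, not the actual IDEFICS code:

```python
import tensorflow as tf


class ToyAttention(tf.keras.layers.Layer):
    def __init__(self, hidden_size: int, **kwargs):
        super().__init__(**kwargs)
        # explicit name= so the resulting variable paths ("q_proj/kernel", ...)
        # match what the PyTorch -> TF weight conversion expects
        self.q_proj = tf.keras.layers.Dense(hidden_size, use_bias=False, name="q_proj")
        self.k_proj = tf.keras.layers.Dense(hidden_size, use_bias=False, name="k_proj")
```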

* Attempt to fix CI

* Add back accidentally removed line

* Remove torch-specific stuff from the TF test file

* make fix-copies, make style, remove autotranslated files

* Fixes to imports/docstrings

* Let's try the from future import in desperation

* Fix the core random_attention_mask fn to match the torch/flax behaviour
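
Roughly, the behaviour being matched is a random 0/1 mask that is guaranteed to attend to at least one token per row; a sketch (which position gets forced to 1 is an assumption here, the exact convention lives in the shared test utilities):

```python
import tensorflow as tf


def random_attention_mask(shape):
    # random 0/1 mask with at least one attended token per row; here the first
    # position is forced to 1 as an example
    mask = tf.random.uniform(shape, minval=0, maxval=2, dtype=tf.int32)
    first_col = tf.ones_like(mask[:, :1])
    return tf.concat([first_col, mask[:, 1:]], axis=-1)
```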

* Clean random_attention_mask up correctly

* Remove torch-only test

* Fix loss shape, couple of nits

* make style

* Don't test for OOB embeddings because IDEFICS uses those deliberately

* Fix loss computation to handle masking
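
A hedged sketch of what masked loss computation looks like in TF, assuming the usual convention that label -100 marks positions to ignore (the actual code follows the shared loss helpers in modeling_tf_utils):

```python
import tensorflow as tf


def masked_lm_loss(labels, logits):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    mask = tf.cast(labels != -100, logits.dtype)
    # replace ignored labels with 0 so the loss fn never sees -100
    safe_labels = tf.where(labels == -100, tf.zeros_like(labels), labels)
    per_token_loss = loss_fn(safe_labels, logits)
    # average only over the unmasked positions
    return tf.reduce_sum(per_token_loss * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```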

* Fix test failures when flattening

* Fix some test failures

- Add the cross attention gate, which was missing and wasn't being passed around
  (idea sketched below)
- Fix overwriting of image_attention_mask due to a hack I had for dummy inputs
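
A hedged sketch of the idea behind the cross attention gate (the real computation lives in the modeling code and may differ in details): text positions whose image_attention_mask attends to no image token at all should not receive any cross-attention output, so their contribution is zeroed before the residual add.

```python
import tensorflow as tf


def apply_cross_attention_gate(cross_attn_output, image_attention_mask, hidden_states):
    # gate[b, t] is 0 when token t attends to no image token at all, else 1
    gate = tf.cast(
        tf.reduce_any(image_attention_mask != 0, axis=-1), cross_attn_output.dtype
    )
    gated = cross_attn_output * gate[..., None]
    return hidden_states + gated
```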

* Add a proper stateless scaled_dot_product_attention
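
A minimal, stateless scaled dot-product attention in TF for reference (the version added in the commit lives in modeling_tf_idefics.py and handles more arguments):

```python
import tensorflow as tf


def scaled_dot_product_attention(query, key, value, attn_mask=None):
    # scores: (..., q_len, k_len)
    d_k = tf.cast(tf.shape(key)[-1], query.dtype)
    scores = tf.matmul(query, key, transpose_b=True) / tf.math.sqrt(d_k)
    if attn_mask is not None:
        # additive mask: large negative values at disallowed positions
        scores = scores + attn_mask
    weights = tf.nn.softmax(scores, axis=-1)
    return tf.matmul(weights, value)
```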

* make style

* Adding missing attribute from the PyTorch version

* Small cleanups to decoupledlinearlayer in case that helps

* Pass epsilon to LayerNormalization

* Attempt to fix pytorch weight cross-loading for TFIdeficsEmbedding

* Fix a bug in TFIdeficsGatedCrossAttentionLayer

* Patching up build() methods

* Constant self.inv_freq

* Constant self.inv_freq
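
For reference, a hedged sketch of keeping the rotary-embedding inverse frequencies as a constant rather than a checkpointed variable (the formula follows the common rotary formulation; the class name is illustrative, not the exact IDEFICS code):

```python
import tensorflow as tf


class ToyRotaryEmbedding(tf.keras.layers.Layer):
    def __init__(self, dim: int, base: float = 10000.0, **kwargs):
        super().__init__(**kwargs)
        # standard rotary inverse frequencies: 1 / base**(2i / dim)
        inv_freq = 1.0 / (base ** (tf.range(0, dim, 2, dtype=tf.float32) / dim))
        # kept as a plain tensor rather than add_weight()/tf.Variable, so it is
        # constant and never appears in the layer's checkpointed weights
        self.inv_freq = inv_freq
```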

* First working version

The TF implementation works now. There was a bug in TFIdeficsDecoupledLinear
where the weights were mis-initialized as (in_features, out_features)
when they should be (out_features, in_features).

I have tested this so far with tiny-random and idefics-9b-instruct
and it gives correct output.

I also dumped the final outputs for both pytorch and TF
and they are identical.
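
A hedged illustration of the shape bug described above, using a toy layer: when a TF variable has to be filled directly from a torch.nn.Linear checkpoint, it needs the PyTorch (out_features, in_features) layout, with a transposed matmul in the forward pass.

```python
import tensorflow as tf


class ToyLinear(tf.keras.layers.Layer):
    def __init__(self, in_features: int, out_features: int, **kwargs):
        super().__init__(**kwargs)
        self.in_features = in_features
        self.out_features = out_features

    def build(self, input_shape):
        # PyTorch nn.Linear stores its weight as (out_features, in_features),
        # so the TF variable mirrors that layout for weight cross-loading
        self.weight = self.add_weight(
            name="weight", shape=(self.out_features, self.in_features)
        )
        super().build(input_shape)

    def call(self, x):
        # y = x @ W^T, matching torch.nn.functional.linear
        return tf.matmul(x, self.weight, transpose_b=True)
```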

* Fix some test failures

* remove print statement

* Fix return_tensors

* Fix CI test failure check_code_quality

* Attempt to fix CI failures by running `make fixup`

The hardcoded IDs in test_modeling_tf_idefics.py are for the integration
test; they make that file unreadable and should probably be moved to a separate file.

* Attempt to fix tests_pr_documentation_tests

* Fix a test failure in test_image_processing_idefics.py

* Fix test test_pt_tf_model_equivalence

* Fix a few failures

* Tiny fix

* Some minor fixes

* Remove a duplicate test

* Override a few test failures for IDEFICS

- `test_keras_save_load` is passing now
- `test_compile_tf_model` is still failing

* Fix processing_idefics.py after rebase

* Guard import keras with is_tf_available

* fix check code quality

* fix check code quality

* Minor fixes

* Skip test_save_load temporarily

This test passes on my local box but fails on the CI; skipping it
for now to see if there are other remaining failures on the CI.

* Run `ruff format tests src utils`

* Fix last failing test, `test_compile_tf_model`

* Add fixes for vision_tf.py

I forgot to add this file in last commit.

* Minor fixes

* Replace "<<<" with "<<" for doc tests

IDEFICS-9B is too big for the doctest runner, so don't run it there

* Make code more readable

* Fix bug after code review

I added a layer_norm_eps to IdeficsConfig but I don't even need it
since the vision config has a layer_norm_eps.

* Fix after code review

Use the original tokenizer.convert_tokens_to_ids code

* Keep PyTorch as the default return_tensors

* Fixes to modeling_tf after code review

* Fixes from code review

- Remove all references to `TF_IDEFICS_PRETRAINED_MODEL_ARCHIVE_LIST`
- Pass 1e-5 to LayerNormalization in perceiver

* Run ruff

* Undo a change

* Refactor processing code after Matt's suggestion

* Remove TODO's that aren't needed anymore

* For pytorch, Use original pytorch processing code from main

Since this PR is a TF port, it shouldn't make any modifications
to the pytorch IDEFICS code. This change undoes the pytorch processing
modifications I made and uses the original code from main.

* Update tests/models/idefics/test_modeling_idefics.py

* Update tests/models/idefics/test_modeling_tf_idefics.py

* Add missing imports for is_pt_tf_cross_test

* [DO NOT MERGE]: This is a commit for debugging and will be reverted

The cross test `test_pt_tf_model_equivalence` passes locally but
fails when running on the CI. This commit is to help debug that
and will be reverted.

* Revert "[DO NOT MERGE]: This is a commit for debugging and will be reverted"

This reverts commit 8f0d709ec5bd46685fb0b4259d914ffee794875b.

* [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted

* [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted

* Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"

This reverts commit 998cc38b8c3d313bf5e5eb55a7f5b7b881897b89.

* Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"

This reverts commit 1c695ac4219c4ae4d39b330b01744dc27deb7dd4.

* Don't skip test_save_load

IIRC test_save_load was also failing on the CI but not on my local
box; it might be easier to debug that on the CI first than the cross tests.

* Debugging commit, will be reverted

* Revert "Debugging commit, will be reverted"

This reverts commit 8eafc8e41e20c4e95a3a90834f06a6e9f445e2d5.

* Override `test_save_load` and push model to save

Maybe this will help me repro this weird bug

* pass my repo_id

* add endpoint

* Pass a temp (write) token just for this CI

* Undo last few commits, still pushing to hub for model debugging

The issue seems to be with save_pretrained(): when I looked at the model saved
from the CI test failure, it is basically empty and has no weights.
`self.save_weights(..)` seems to be failing in save_pretrained but needs
more debugging.

* Add logging to modeling tf utils, will be reverted just for debugging

* Debugging, will revert

* Revert "Debugging, will revert"

This reverts commit 9d0d3075fb7c82d8cde3a5c76bc8f3876c5c55d3.

* Revert "Add logging to modeling tf utils, will be reverted just for debugging"

This reverts commit 774b6b7b1c17b3ce5d7634ade768f2f686cee617.

* Remove `test_save_load`

The CI failures are gone after my latest rebase (no idea why).
I was still saving the model to my hub on HF, and the tf_model.h5
file now has everything.

* Run make fix-copies

* Run ruff format tests src utils

* Debugging commit, will be reverted

* Run ruff, also trigger CI run

* Run ruff again

* Undo debugging commit

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-05-13 15:59:46 +01:00
internal Generate: add min_p sampling (#30639) 2024-05-09 14:36:53 +01:00
main_classes Reboot Agents (#30387) 2024-05-07 12:59:49 +02:00
model_doc Port IDEFICS to tensorflow (#26870) 2024-05-13 15:59:46 +01:00
tasks Update object detection guide (#30683) 2024-05-08 15:16:14 +01:00
_config.py [#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) 2024-04-08 14:21:16 +01:00
_redirects.yml Extended semantic segmentation to image segmentation (#27039) 2023-11-23 15:58:21 +00:00
_toctree.yml Reboot Agents (#30387) 2024-05-07 12:59:49 +02:00
accelerate.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
add_new_model.md Remove add-new-model in favor of add-new-model-like (#30424) 2024-04-24 09:38:18 +02:00
add_new_pipeline.md add push_to_hub to pipeline (#29172) 2024-04-16 15:34:04 +01:00
agents.md Reboot Agents (#30387) 2024-05-07 12:59:49 +02:00
attention.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
autoclass_tutorial.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
benchmarks.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
bertology.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
big_models.md [docs] Big model loading (#29920) 2024-04-01 18:47:32 -07:00
chat_templating.md Deprecate default chat templates (#30346) 2024-04-19 15:41:26 +01:00
community.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
conversations.md Add sidebar tutorial for chat models (#30401) 2024-04-25 19:38:48 +01:00
create_a_model.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
custom_models.md [Docs] Add language identifiers to fenced code blocks (#28955) 2024-02-12 10:48:31 -08:00
debugging.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
deepspeed.md Rename torch.run to torchrun (#30405) 2024-04-23 09:04:17 -07:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fsdp.md [docs] Trainer docs (#28145) 2023-12-20 10:37:23 -08:00
generation_strategies.md Docs: fix generate-related rendering issues (#30600) 2024-05-02 14:42:25 +01:00
glossary.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
hf_quantizer.md [CI] Quantization workflow (#29046) 2024-02-28 10:09:25 -05:00
hpo_train.md Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
index.md Port IDEFICS to tensorflow (#26870) 2024-05-13 15:59:46 +01:00
installation.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
llm_optims.md Cache: Static cache as a standalone object (#30476) 2024-04-30 16:37:19 +01:00
llm_tutorial_optimization.md F.scaled_dot_product_attention support (#26572) 2023-12-09 05:38:14 +09:00
llm_tutorial.md Generate: update links on LLM tutorial doc (#30550) 2024-04-30 18:14:12 +01:00
model_memory_anatomy.md 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190) 2024-04-18 12:49:43 -04:00
model_sharing.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
model_summary.md model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) 2024-03-23 18:29:39 -07:00
multilingual.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md [Doc] Spanish translation of pad_truncation.md (#27890) 2023-12-08 10:32:18 -08:00
peft.md [Peft] modules_to_save support for peft integration (#27466) 2023-11-14 10:32:57 +01:00
perf_hardware.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_infer_cpu.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md Fix GroundingDINO, DPR after BERT SDPA update (#30506) 2024-04-26 18:04:41 +01:00
perf_torch_compile.md Fix rendering for torch.compile() docs (#25432) 2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_cpu.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_gpu_one.md Fix minor typo: softare => software (#29602) 2024-03-12 10:39:56 +00:00
perf_train_special.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
performance.md [docs] Update CPU/GPU inference docs (#26881) 2023-10-31 09:44:51 -07:00
perplexity.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_tutorial.md More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
pipeline_webserver.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
pr_checks.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
preprocessing.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
quantization.md Add HQQ quantization support (#29637) 2024-05-02 17:51:49 +01:00
quicktour.md Add HQQ quantization support (#29637) 2024-05-02 17:51:49 +01:00
run_scripts.md Fix broken link to Transformers notebooks (#30512) 2024-04-29 10:57:51 +01:00
sagemaker.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
serialization.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
task_summary.md More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
tasks_explained.md [docs] Spanish translation of tasks_explained.md (#29224) 2024-02-26 08:18:15 -08:00
testing.md [doc] fix some typos and add xpu to the testing documentation (#29894) 2024-03-28 09:42:49 +00:00
tf_xla.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tflite.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tokenizer_summary.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
torchscript.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
trainer.md 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190) 2024-04-18 12:49:43 -04:00
training.md 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190) 2024-04-18 12:49:43 -04:00
troubleshooting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00