transformers/docs/source/en
Alazar 94306352f4
Port IDEFICS to tensorflow (#26870)
* Initial commit

* Just a copy of modeling_idefics.py that will be ported to TF

* - Prepend TF to the name of all classes
- Convert pytorch ops to TF (not all operations are converted yet)

* Add TF imports

* Add autotranslated files

* Add TF classes to model_tf_auto.py

* Add the TF classes in model_doc

* include auto-translated code

* Adopted from auto-translated version

* Add a forgotten super().build

* Add test code for TF version.

* Fix indentation and load pytorch weights for now

* Some fixes. Many tests are still failing but some are passing now.

- I have added TODOs for some of the hacks I made to unblock myself
  and I will address them soon
- I have temporarily hacked processing_idefics.py locally to support TF

* Add ALL_LAYERNORM_LAYERS to match pytorch

* Revert "Add ALL_LAYERNORM_LAYERS to match pytorch"

This reverts commit 7e0a35119b4d7a6284d04d8c543fba1b29e573c9 as it
is not needed in the tf implementation.

* Fix freeze_relevant_params()

* Some more fixes

* Fix test_attention_outputs

* Add tf stuff to processing_idefics.py

processing_idefics.py supports both pytorch and tf now.

test_processor_idefics.py for pytorch is passing, so I didn't break anything,
but there are still some issues with tf. I also need to add tf tests to
test_processor_idefics.py.
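
As a rough illustration of the dual pytorch/tf support described above (the real logic lives in processing_idefics.py and is more involved; the helper name below is made up for this sketch):

```python
import numpy as np


def to_framework_tensor(array: np.ndarray, return_tensors: str = "pt"):
    """Hypothetical helper: return `array` as a tensor of the requested framework."""
    if return_tensors == "pt":
        import torch

        return torch.from_numpy(array)
    if return_tensors == "tf":
        import tensorflow as tf

        return tf.convert_to_tensor(array)
    raise ValueError(f"Unsupported return_tensors value: {return_tensors!r}")
```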

* Pass return_tensors to image processing code and fix test

* Pass return_tensors to the image processor __init__

* Fix several test cases

- Make input to some of the forward pass of type `TFModelInputType`
- Decorate main layer forward pass with `@unpack_inputs`
- Decorate main layer with `@keras_serializable`
- Pass `inputs` to TFIdeficsModel
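
A minimal sketch of the main-layer pattern these bullets describe, using a toy class name (the actual class is the TFIdefics main layer and does much more):

```python
import tensorflow as tf

from transformers import IdeficsConfig
from transformers.modeling_tf_utils import keras_serializable, unpack_inputs


@keras_serializable
class TFToyMainLayer(tf.keras.layers.Layer):
    config_class = IdeficsConfig  # required by @keras_serializable

    def __init__(self, config: IdeficsConfig, **kwargs):
        super().__init__(**kwargs)
        self.config = config

    @unpack_inputs
    def call(self, input_ids=None, attention_mask=None, training=False):
        # @unpack_inputs normalizes positional/keyword/dict-style inputs
        # before they reach the body of call()
        return input_ids
```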

* Some more fixes forgotten in last commit

* Fix processing code and vision_tf.py

* Fix perceiver bug

* Import from

* Auto-add build() methods + style pass

* Fix build() errors due to `None` being passed as shape to some layers

* Change name in TFIdeficsForVisionText2Text to attribute in IdeficsForVisionText2Text

* Fix pytorch weights load for tf2

There were a lot of `name=` missing in weight initialization code.
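
For context, a hedged sketch of why the `name=` arguments matter: TF derives variable paths from layer names, and the pytorch-to-TF cross-loading matches weights by those names, so unnamed sublayers get auto-generated names (e.g. `dense_3`) that don't line up with the checkpoint. Illustrative layer only, not the actual IDEFICS code:

```python
import tensorflow as tf


class ToyAttention(tf.keras.layers.Layer):
    def __init__(self, hidden_size: int, **kwargs):
        super().__init__(**kwargs)
        # explicit name= so the resulting variable paths ("q_proj/kernel", ...)
        # match what the PyTorch -> TF weight conversion expects
        self.q_proj = tf.keras.layers.Dense(hidden_size, use_bias=False, name="q_proj")
        self.k_proj = tf.keras.layers.Dense(hidden_size, use_bias=False, name="k_proj")
```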

* Attempt to fix CI

* Add back accidentally removed line

* Remove torch-specific stuff from the TF test file

* make fix-copies, make style, remove autotranslated files

* Fixes to imports/docstrings

* Let's try the from future import in desperation

* Fix the core random_attention_mask fn to match the torch/flax behaviour
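
Roughly, the behaviour being matched is a random 0/1 mask that is guaranteed to attend to at least one token per row; a sketch (which position gets forced to 1 is an assumption here, the exact convention lives in the shared test utilities):

```python
import tensorflow as tf


def random_attention_mask(shape):
    # random 0/1 mask with at least one attended token per row; here the first
    # position is forced to 1 as an example
    mask = tf.random.uniform(shape, minval=0, maxval=2, dtype=tf.int32)
    first_col = tf.ones_like(mask[:, :1])
    return tf.concat([first_col, mask[:, 1:]], axis=-1)
```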

* Clean random_attention_mask up correctly

* Remove torch-only test

* Fix loss shape, couple of nits

* make style

* Don't test for OOB embeddings because IDEFICS uses those deliberately

* Fix loss computation to handle masking
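
A hedged sketch of what masked loss computation looks like in TF, assuming the usual convention that label -100 marks positions to ignore (the actual code follows the shared loss helpers in modeling_tf_utils):

```python
import tensorflow as tf


def masked_lm_loss(labels, logits):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    mask = tf.cast(labels != -100, logits.dtype)
    # replace ignored labels with 0 so the loss fn never sees -100
    safe_labels = tf.where(labels == -100, tf.zeros_like(labels), labels)
    per_token_loss = loss_fn(safe_labels, logits)
    # average only over the unmasked positions
    return tf.reduce_sum(per_token_loss * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```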

* Fix test failures when flattening

* Fix some test failures

- Add the cross attention gate, which was missing and wasn't being passed around
  (idea sketched below)
- Fix overwriting of image_attention_mask due to a hack I had for dummy inputs
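
A hedged sketch of the idea behind the cross attention gate (the real computation lives in the modeling code and may differ in details): text positions whose image_attention_mask attends to no image token at all should not receive any cross-attention output, so their contribution is zeroed before the residual add.

```python
import tensorflow as tf


def apply_cross_attention_gate(cross_attn_output, image_attention_mask, hidden_states):
    # gate[b, t] is 0 when token t attends to no image token at all, else 1
    gate = tf.cast(
        tf.reduce_any(image_attention_mask != 0, axis=-1), cross_attn_output.dtype
    )
    gated = cross_attn_output * gate[..., None]
    return hidden_states + gated
```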

* Add a proper stateless scaled_dot_product_attention
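
A minimal, stateless scaled dot-product attention in TF for reference (the version added in the commit lives in modeling_tf_idefics.py and handles more arguments):

```python
import tensorflow as tf


def scaled_dot_product_attention(query, key, value, attn_mask=None):
    # scores: (..., q_len, k_len)
    d_k = tf.cast(tf.shape(key)[-1], query.dtype)
    scores = tf.matmul(query, key, transpose_b=True) / tf.math.sqrt(d_k)
    if attn_mask is not None:
        # additive mask: large negative values at disallowed positions
        scores = scores + attn_mask
    weights = tf.nn.softmax(scores, axis=-1)
    return tf.matmul(weights, value)
```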

* make style

* Adding missing attribute from the PyTorch version

* Small cleanups to decoupledlinearlayer in case that helps

* Pass epsilon to LayerNormalization

* Attempt to fix pytorch weight cross-loading for TFIdeficsEmbedding

* Fix a bug in TFIdeficsGatedCrossAttentionLayer

* Patching up build() methods

* Constant self.inv_freq

* Constant self.inv_freq
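
For reference, a hedged sketch of keeping the rotary-embedding inverse frequencies as a constant rather than a checkpointed variable (the formula follows the common rotary formulation; the class name is illustrative, not the exact IDEFICS code):

```python
import tensorflow as tf


class ToyRotaryEmbedding(tf.keras.layers.Layer):
    def __init__(self, dim: int, base: float = 10000.0, **kwargs):
        super().__init__(**kwargs)
        # standard rotary inverse frequencies: 1 / base**(2i / dim)
        inv_freq = 1.0 / (base ** (tf.range(0, dim, 2, dtype=tf.float32) / dim))
        # kept as a plain tensor rather than add_weight()/tf.Variable, so it is
        # constant and never appears in the layer's checkpointed weights
        self.inv_freq = inv_freq
```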

* First working version

The TF implementation works now. There was a bug in TFIdeficsDecoupledLinear
where the weights were mis-initialized as (in_features, out_features)
when they should be (out_features, in_features).

I have tested this so far with tiny-random and idefics-9b-instruct
and it gives correct output.

I also dumped the final outputs for both pytorch and TF
and they are identical.
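
A hedged illustration of the shape bug described above, using a toy layer: when a TF variable has to be filled directly from a torch.nn.Linear checkpoint, it needs the PyTorch (out_features, in_features) layout, with a transposed matmul in the forward pass.

```python
import tensorflow as tf


class ToyLinear(tf.keras.layers.Layer):
    def __init__(self, in_features: int, out_features: int, **kwargs):
        super().__init__(**kwargs)
        self.in_features = in_features
        self.out_features = out_features

    def build(self, input_shape):
        # PyTorch nn.Linear stores its weight as (out_features, in_features),
        # so the TF variable mirrors that layout for weight cross-loading
        self.weight = self.add_weight(
            name="weight", shape=(self.out_features, self.in_features)
        )
        super().build(input_shape)

    def call(self, x):
        # y = x @ W^T, matching torch.nn.functional.linear
        return tf.matmul(x, self.weight, transpose_b=True)
```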

* Fix some test failures

* remove print statement

* Fix return_tensors

* Fix CI test failure check_code_quality

* Attempt to fix CI failures by running `make fixup`

The hardcoded IDs in test_modeling_tf_idefics.py are for the integration
test; they make that file unreadable and should probably be moved to a separate file.

* Attempt to fix tests_pr_documentation_tests

* Fix a test failure in test_image_processing_idefics.py

* Fix test test_pt_tf_model_equivalence

* Fix a few failures

* Tiny fix

* Some minor fixes

* Remove a duplicate test

* Override a few test failures for IDEFICS

- `test_keras_save_load` is passing now
- `test_compile_tf_model` is still failing

* Fix processing_idefics.py after rebase

* Guard import keras with is_tf_available

* fix check code quality

* fix check code quality

* Minor fixes

* Skip test_save_load temporarily

This test passes on my local box but fails on the CI; skipping it
for now to see if there are other remaining failures on the CI.

* Run `ruff format tests src utils`

* Fix last failing test, `test_compile_tf_model`

* Add fixes for vision_tf.py

I forgot to add this file in last commit.

* Minor fixes

* Replace "<<<" with "<<" for doc tests

IDEFICS-9B is too big for the doctest runner, so don't run it there

* Make code more readable

* Fix bug after code review

I added a layer_norm_eps to IdeficsConfig but I don't even need it
since the vision config has a layer_norm_eps.

* Fix after code review

Use the original tokenizer.convert_tokens_to_ids code

* Keep PyTorch as the default return_tensors

* Fixes to modeling_tf after code review

* Fixes from code review

- Remove all references to `TF_IDEFICS_PRETRAINED_MODEL_ARCHIVE_LIST`
- Pass 1e-5 to LayerNormalization in perceiver

* Run ruff

* Undo a change

* Refactor processing code after Matt's suggestion

* Remove TODO's that aren't needed anymore

* For pytorch, Use original pytorch processing code from main

Since this PR is a TF port, it shouldn't make any modifications
to the pytorch IDEFICS code. This change undoes the pytorch processing
modifications I made and uses the original code from main.

* Update tests/models/idefics/test_modeling_idefics.py

* Update tests/models/idefics/test_modeling_tf_idefics.py

* Add missing imports for is_pt_tf_cross_test

* [DO NOT MERGE]: This is a commit for debugging and will be reverted

The cross test `test_pt_tf_model_equivalence` passes locally but
fails when running on the CI. This commit is to help debug that
and will be reverted.

* Revert "[DO NOT MERGE]: This is a commit for debugging and will be reverted"

This reverts commit 8f0d709ec5bd46685fb0b4259d914ffee794875b.

* [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted

* [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted

* Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"

This reverts commit 998cc38b8c3d313bf5e5eb55a7f5b7b881897b89.

* Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"

This reverts commit 1c695ac4219c4ae4d39b330b01744dc27deb7dd4.

* Don't skip test_save_load

IIRC test_save_load was also failing on the CI but not on my local
box; it might be easier to debug that on the CI first than the cross tests.

* Debugging commit, will be reverted

* Revert "Debugging commit, will be reverted"

This reverts commit 8eafc8e41e20c4e95a3a90834f06a6e9f445e2d5.

* Override `test_save_load` and push model to save

Maybe this will help me repro this weird bug

* pass my repo_id

* add endpoint

* Pass a temp (write) token just for this CI

* Undo last few commits, still pushing to hub for model debugging

The issue seems to be with save_pretrained(): when I looked at the model saved
from the CI test failure, it is basically empty and has no weights.
`self.save_weights(..)` seems to be failing in save_pretrained but needs
more debugging.

* Add logging to modeling tf utils, will be reverted just for debugging

* Debugging, will revert

* Revert "Debugging, will revert"

This reverts commit 9d0d3075fb7c82d8cde3a5c76bc8f3876c5c55d3.

* Revert "Add logging to modeling tf utils, will be reverted just for debugging"

This reverts commit 774b6b7b1c17b3ce5d7634ade768f2f686cee617.

* Remove `test_save_load`

The CI failures are gone after my latest rebase (no idea why).
I was still saving the model to my hub on HF, and the tf_model.h5
file now has everything.

* Run make fix-copies

* Run ruff format tests src utils

* Debugging commit, will be reverted

* Run ruff, also trigger CI run

* Run ruff again

* Undo debugging commit

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2024-05-13 15:59:46 +01:00
internal Generate: add min_p sampling (#30639) 2024-05-09 14:36:53 +01:00
main_classes Reboot Agents (#30387) 2024-05-07 12:59:49 +02:00
model_doc Port IDEFICS to tensorflow (#26870) 2024-05-13 15:59:46 +01:00
tasks Update object detection guide (#30683) 2024-05-08 15:16:14 +01:00
_config.py [#29174] ImportError Fix: Trainer with PyTorch requires accelerate>=0.20.1 Fix (#29888) 2024-04-08 14:21:16 +01:00
_redirects.yml Extended semantic segmentation to image segmentation (#27039) 2023-11-23 15:58:21 +00:00
_toctree.yml Reboot Agents (#30387) 2024-05-07 12:59:49 +02:00
accelerate.md Fix typos (#25936) 2023-09-04 11:15:12 +01:00
add_new_model.md Remove add-new-model in favor of add-new-model-like (#30424) 2024-04-24 09:38:18 +02:00
add_new_pipeline.md add push_to_hub to pipeline (#29172) 2024-04-16 15:34:04 +01:00
agents.md Reboot Agents (#30387) 2024-05-07 12:59:49 +02:00
attention.md [Docs] Fix broken links and syntax issues (#28918) 2024-02-08 14:13:35 -08:00
autoclass_tutorial.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
benchmarks.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
bertology.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
big_models.md [docs] Big model loading (#29920) 2024-04-01 18:47:32 -07:00
chat_templating.md Deprecate default chat templates (#30346) 2024-04-19 15:41:26 +01:00
community.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
contributing.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
conversations.md Add sidebar tutorial for chat models (#30401) 2024-04-25 19:38:48 +01:00
create_a_model.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
custom_models.md [Docs] Add language identifiers to fenced code blocks (#28955) 2024-02-12 10:48:31 -08:00
debugging.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
deepspeed.md Rename torch.run to torchrun (#30405) 2024-04-23 09:04:17 -07:00
fast_tokenizers.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
fsdp.md [docs] Trainer docs (#28145) 2023-12-20 10:37:23 -08:00
generation_strategies.md Docs: fix generate-related rendering issues (#30600) 2024-05-02 14:42:25 +01:00
glossary.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
hf_quantizer.md [CI] Quantization workflow (#29046) 2024-02-28 10:09:25 -05:00
hpo_train.md Remove-auth-token (#27060) 2023-11-13 14:20:54 +01:00
index.md Port IDEFICS to tensorflow (#26870) 2024-05-13 15:59:46 +01:00
installation.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
llm_optims.md Cache: Static cache as a standalone object (#30476) 2024-04-30 16:37:19 +01:00
llm_tutorial_optimization.md F.scaled_dot_product_attention support (#26572) 2023-12-09 05:38:14 +09:00
llm_tutorial.md Generate: update links on LLM tutorial doc (#30550) 2024-04-30 18:14:12 +01:00
model_memory_anatomy.md 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190) 2024-04-18 12:49:43 -04:00
model_sharing.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
model_summary.md model_summary.md - Restore link to Harvard's Annotated Transformer. (#29702) 2024-03-23 18:29:39 -07:00
multilingual.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
notebooks.md Enable doc in Spanish (#16518) 2022-04-04 10:25:46 -04:00
pad_truncation.md [Doc] Spanish translation of pad_truncation.md (#27890) 2023-12-08 10:32:18 -08:00
peft.md [Peft] modules_to_save support for peft integration (#27466) 2023-11-14 10:32:57 +01:00
perf_hardware.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_infer_cpu.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
perf_infer_gpu_one.md Fix GroundingDINO, DPR after BERT SDPA update (#30506) 2024-04-26 18:04:41 +01:00
perf_torch_compile.md Fix rendering for torch.compile() docs (#25432) 2023-08-10 13:25:00 +02:00
perf_train_cpu_many.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_cpu.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_gpu_many.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_gpu_one.md Fix minor typo: softare => software (#29602) 2024-03-12 10:39:56 +00:00
perf_train_special.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
perf_train_tpu_tf.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
performance.md [docs] Update CPU/GPU inference docs (#26881) 2023-10-31 09:44:51 -07:00
perplexity.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
philosophy.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
pipeline_tutorial.md More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
pipeline_webserver.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
pr_checks.md [Docs] Fix spelling and grammar mistakes (#28825) 2024-02-02 08:45:00 +01:00
preprocessing.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
quantization.md Add HQQ quantization support (#29637) 2024-05-02 17:51:49 +01:00
quicktour.md Add HQQ quantization support (#29637) 2024-05-02 17:51:49 +01:00
run_scripts.md Fix broken link to Transformers notebooks (#30512) 2024-04-29 10:57:51 +01:00
sagemaker.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
serialization.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
task_summary.md More fixes for doctest (#30265) 2024-04-16 11:58:55 +02:00
tasks_explained.md [docs] Spanish translation of tasks_explained.md (#29224) 2024-02-26 08:18:15 -08:00
testing.md [doc] fix some typos and add xpu to the testing documentation (#29894) 2024-03-28 09:42:49 +00:00
tf_xla.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tflite.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
tokenizer_summary.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
torchscript.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
trainer.md 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190) 2024-04-18 12:49:43 -04:00
training.md 🚨🚨🚨Deprecate evaluation_strategy to eval_strategy🚨🚨🚨 (#30190) 2024-04-18 12:49:43 -04:00
troubleshooting.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00