* Fix FSDP Initialization for resume training
* Added init_fsdp function to work with dummy values
* Fix FSDP initialization for resuming training
* Added CUDA decorator for tests
* Added torch_gpu decorator to FSDP tests
* Fixup for failing code quality tests
* add idefics
* conflicts after merging main
* enable tests but need to fix some
* fix tests
* no print
* fix/skip some slow tests
* continue not skip
* rebasing broke something, this is the fix
* mistral qna start
* mixtral qna
* oops
* qwen2 qna
* qwen2moe qna
* add missing input embed methods
* add copied from to all methods, can't copy directly from llama due to the prefix
* make top level copied from
* refactor: benchmarks
Based on a discussion with @LysandreJik & @ArthurZucker, the goal of
this PR is to improve transformers' benchmark system.
This is a WIP; for the moment, the infrastructure required to make things
work is not ready. Will update the PR description once it is.
* feat: add db init in benchmarks CI
* fix: pg_config is missing in runner
* fix: add psql to the runner
* fix: connect info from env vars + PR comments
* refactor: set database as env var
* fix: invalid working directory
* fix: `commit_msg` -> `commit_message`
* fix: git marking checked out repo as unsafe
* feat: add logging
* fix: invalid device
* feat: update grafana dashboard for prod grafana
* feat: add `commit_id` to header table
* feat: commit latest version of dashboard
* feat: move measurements into json field
* feat: remove drop table migration queries
* fix: `torch.arrange` -> `torch.arange`
* fix: add missing `s` to `cache_position` positional argument
* fix: change model
* revert: `cache_positions` -> `cache_position`
* fix: set device for `StaticCache`
* fix: set `StaticCache` dtype
* feat: limit max cache len
* fix script
* raise error on failure!
* not try catch
* try to skip generate compilation
* update
* update docker image!
* update
* update again!
* update
* updates
* ???
* ??
* use `torch.cuda.synchronize()`
* fix json
* nits
* fix
* fixed!
* f**k
* feat: add TTNT panels
* feat: add try except
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
* Generate using exported model and enable gemma2-2b in ExecuTorch
* [run_slow] gemma, gemma2
* truncate expected output message
* Bump required torch version to support gemma2 export
* [run_slow] gemma, gemma2
---------
Co-authored-by: Guang Yang <guangyang@fb.com>
* Default synced_gpus to True when using FullyShardedDataParallel
Fixes #30228
Related:
* https://github.com/pytorch/pytorch/issues/100069
* https://github.com/pytorch/pytorch/issues/123962
Similar to DeepSpeed ZeRO Stage 3, when using FSDP with multiple GPUs and differently sized data per rank, the ranks reach different synchronization points at the same time, leading to deadlock.
To avoid this, we can automatically set `synced_gpus` to True if we detect that a `PreTrainedModel` is being managed by FSDP using `_is_fsdp_managed_module`, which was added in PyTorch 2.0.0 for torch.compile: https://github.com/pytorch/pytorch/blob/v2.0.0/torch/distributed/fsdp/_dynamo_utils.py
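A minimal sketch of the defaulting logic described above, not the actual implementation. `resolve_synced_gpus` and the two stand-in model classes are hypothetical names used only for illustration; the only thing taken from this message is the `_is_fsdp_managed_module` attribute, which the sketch assumes is set on modules that FSDP manages. It deliberately avoids importing torch so it can run anywhere.

```python
from typing import Optional


def resolve_synced_gpus(model: object, synced_gpus: Optional[bool] = None) -> bool:
    """Hypothetical helper: choose a default for synced_gpus.

    An explicit True/False from the caller always wins. Otherwise default to
    True only when the model carries the FSDP marker attribute (assumption:
    FSDP sets `_is_fsdp_managed_module` on the modules it manages).
    """
    if synced_gpus is not None:
        return synced_gpus
    return bool(getattr(model, "_is_fsdp_managed_module", False))


class PlainModel:
    """Stand-in for a model that is not wrapped by FSDP."""


class FsdpManagedModel:
    """Stand-in for a model that FSDP has marked as managed."""

    _is_fsdp_managed_module = True


plain_default = resolve_synced_gpus(PlainModel())
fsdp_default = resolve_synced_gpus(FsdpManagedModel())
fsdp_override = resolve_synced_gpus(FsdpManagedModel(), synced_gpus=False)
```

Keeping the explicit argument as the highest priority means callers who know every rank receives the same amount of data can still opt out.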
* Remove test file
* ruff formatting
* ruff format
* Update copyright year
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Add test for FSDP-wrapped model generation
Before #33483, these tests would have hung for 10 minutes before crashing due to a timeout error
* Ruff format
* Move argparse import
* Remove barrier
I think this might cause more problems if one of the workers was killed
* Move import into function to decrease load time
https://github.com/huggingface/transformers/pull/33483#discussion_r1787972735
* Add test for accelerate and Trainer
https://github.com/huggingface/transformers/pull/33483#discussion_r1790309675
* Refactor imports
* Ruff format
* Use nullcontext
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Allow for hyphenated field names in long-options
argparse converts hyphens into underscores before assignment (e.g., an
option passed as `--long-option` will be stored under `long_option`), so
there is no need to pass options as literal attributes, as in
`--long_option` (with an underscore instead of a hyphen). This commit
ensures that this behavior is respected by `parse_args_into_dataclasses`
as well.
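The argparse behavior referred to above can be checked with the standard library alone. This is an illustrative sketch, not the code of this commit: the option name `long-option` is just the example from this message, and registering both spellings on one argument is only one possible way to accept either form.

```python
import argparse

# Plain argparse: a hyphenated long option is stored under an underscored name.
plain = argparse.ArgumentParser()
plain.add_argument("--long-option", type=int, default=0)
plain_args = plain.parse_args(["--long-option", "3"])

# One way to accept both spellings: register them as aliases of one argument.
# argparse derives the destination from the first long option string.
both = argparse.ArgumentParser()
both.add_argument("--long_option", "--long-option", type=int, default=0)
hyphenated = both.parse_args(["--long-option", "4"])
underscored = both.parse_args(["--long_option", "5"])

print(plain_args.long_option, hyphenated.long_option, underscored.long_option)
```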
Issue: #33933
Co-authored-by: Daniel Marti <mrtidm@amazon.com>