Mirror of https://github.com/huggingface/transformers.git (synced 2025-07-04 05:10:06 +06:00)
Image transforms library (#18520)
* Adapt FE methods to transforms library
* Mixin for saving the image processor
* Base processor skeleton
* BatchFeature for packaging image processor outputs
* Initial image processor for GLPN
* Remove accidental import
* Fixup and docs
* Mixin for saving the image processor
* Fixup and docs
* Import BatchFeature from feature_extraction_utils
* Fixup and docs
* Fixup and docs
* Fixup and docs
* Fixup and docs
* BatchFeature for packaging image processor outputs
* Import BatchFeature from feature_extraction_utils
* Import BatchFeature from feature_extraction_utils
* Fixup and docs
* Fixup and docs
* BatchFeature for packaging image processor outputs
* Import BatchFeature from feature_extraction_utils
* Fixup and docs
* Mixin for saving the image processor
* Fixup and docs
* Add rescale back and remove ImageType
* Fix import mistake
* Fix enum var reference
* Can transform and specify image data format
* Remove redundant function
* Update reference
* Data format flag for rescale
* Fix typo
* Fix dimension check
* Fixes to make IP and FE outputs match
* Add tests for transforms
* Add test for utils
* Update some docstrings
* Make sure in channels last before converting to PIL
* Remove default to numpy batching
* Fix up
* Add docstring and model_input_types
* Use feature processor config from hub
* Alias GLPN feature extractor to image processor
* Alias feature extractor mixin
* Add return_numpy=False flag for resize
* Fix up
* Fix up
* Use different frameworks safely
* Safely import PIL
* Call function checking if PIL available
* Only import if vision available
* Address Sylvain PR comments Co-authored-by: Sylvain.gugger@gmail.com
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/image_transforms.py Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* Update src/transformers/models/glpn/feature_extraction_glpn.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add in docstrings
* Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226) Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
* Refactor `TFSwinLayer` to increase serving compatibility (#18352)
* Refactor `TFSwinLayer` to increase serving compatibility Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
* Fix missed parameters while refactoring Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
* Fix window_reverse to calculate batch size Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com> Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Add TF prefix to TF-Res test class (#18481) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Remove py.typed (#18485)
* Fix pipeline tests (#18487)
* Fix pipeline tests
* Make sure all pipelines tests run with init changes
* Use new huggingface_hub tools for download models (#18438)
* Draft new cached_file
* Initial draft for config and model
* Small fixes
* Fix first batch of tests
* Look in cache when internet is down
* Fix last tests
* Bad black, not fixing all quality errors
* Make diff less
* Implement change for TF and Flax models
* Add tokenizer and feature extractor
* For compatibility with main
* Add utils to move the cache and auto-do it at first use.
* Quality
* Deal with empty commit shas
* Deal with empty etag
* Address review comments
* Fix `test_dbmdz_english` by updating expected values (#18482) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Move cache folder to huggingface/hub for consistency with hf_hub (#18492)
* Move cache folder to just huggingface
* Thank you VsCode for this needless import
* Move to hub
* Forgot one
* Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Forgot one new_ for cache migration
* disable Onnx test for google/long-t5-tglobal-base (#18454) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Typo reported by Joel Grus on TWTR (#18493)
* Just re-reading the whole doc every couple of months 😬 (#18489)
* Delete valohai.yaml
* NLP => ML
* typo
* website supports https
* datasets
* 60k + modalities
* unrelated link fixing for accelerate
* Ok those links were actually broken
* Fix link
* Make `AutoTokenizer` auto-link
* wording tweak
* add at least one non-nlp task
* `transformers-cli login` => `huggingface-cli login` (#18490)
* zero chance anyone's using that constant no?
* `transformers-cli login` => `huggingface-cli login`
* `transformers-cli repo create` => `huggingface-cli repo create`
* `make style`
* Add seed setting to image classification example (#18519)
* [DX fix] Fixing QA pipeline streaming a dataset. (#18516)
* [DX fix] Fixing QA pipeline streaming a dataset. QuestionAnsweringArgumentHandler would iterate over the whole dataset, effectively killing all properties of the pipeline. This restores nice properties when using `Dataset` or `Generator`, since those are meant to be consumed lazily.
* Handling TF better.
* Clean up hub (#18497)
* Clean up utils.hub
* Remove imports
* More fixes
* Last fix
* update fsdp docs (#18521)
* updating fsdp documentation
* typo fix
* Fix compatibility with 1.12 (#17925)
* Fix compatibility with 1.12
* Remove pin from examples requirements
* Update torch scatter version
* Fix compatibility with 1.12
* Remove pin from examples requirements
* Update torch scatter version
* fix torch.onnx.symbolic_opset12 import
* Reject bad version Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Remove debug statement
* Specify en in doc-builder README example (#18526) Co-authored-by: Ankur Goyal <ankur@impira.com>
* New cache fixes: add safeguard before looking in folders (#18522)
* unpin resampy (#18527) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* ✨ update to use interlibrary links instead of Markdown (#18500)
* Add example of multimodal usage to pipeline tutorial (#18498)
* 📝 add example of multimodal usage to pipeline tutorial
* 🖍 apply feedbacks
* 🖍 apply niels feedback
* [VideoMAE] Add model to doc tests (#18523)
* Add videomae to doc tests
* Add pip install decord Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Update perf_train_gpu_one.mdx (#18532)
* Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473)
* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script
* make fixup changes
* PR comments
* changed input to Accelerator based on PR comment, ran make fixup
* Added comment explaining the sync_gradients statement
* Fixed lr scheduler max steps
* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper
* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper
* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script
* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py
* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script
* Add Spanish translation of converting_tensorflow_models.mdx (#18512)
* Add file in spanish docs to be translated
* Finish translation to Spanish
* Improve Spanish wording
* Add suggested changes from review
* Spanish translation of summarization.mdx (#15947) (#18477)
* Add Spanish translation of summarization.mdx
* Apply suggestions from code review Co-authored-by: Omar U. Espejel <espejelomar@gmail.com> Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
* Let's not cast them all (#18471)
* add correct dtypes when checking for params dtype
* forward contrib credits
* Update src/transformers/modeling_utils.py Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
* more comments - added more comments on why we cast only floating point parameters
* Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
* fix: data2vec-vision Onnx ready-made configuration. (#18427)
* feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization
* fix: wrong config
* Add mt5 onnx config (#18394)
* update features
* MT5OnnxConfig added with updated tests and docs
* fix imports
* fix onnx_config_cls for mt5 Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
* Minor update of `run_call_with_unpacked_inputs` (#18541) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* BART - Fix attention mask device issue on copied models (#18540)
* attempt to fix attn mask device
* fix bart `_prepare_decoder_attention_mask` - add correct device - run `make fix-copies` to propagate the fix
* Adding a new `align_to_words` param to qa pipeline. (#18010)
* Adding a new `align_to_words` param to qa pipeline.
* Update src/transformers/pipelines/question_answering.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Import protection. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* 📝 update metric with evaluate (#18535)
* Restore _init_weights value in no_init_weights (#18504)
* Recover _init_weights value in no_init_weights for potential nested use. In addition, users might modify private no_init_weights as well.
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Remove private variable change check Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Clean up comment
* 📝 update documentation build section (#18548)
* `bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901)
* first commit
* correct replace function
* add final changes - works like charm! - cannot implement tests yet - tested
* clean up a bit
* add bitsandbytes dependencies
* working version - added import function - added bitsandbytes utils file
* small fix
* small fix - fix import issue
* fix import issues
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor a bit - move bitsandbytes utils to utils - change comments on functions
* reformat docstring - reformat docstring on init_empty_weights_8bit
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* revert bad formatting
* change to bitsandbytes
* refactor a bit - remove init8bit since it is useless
* more refactoring - fixed init empty weights issue - added threshold param
* small hack to make it work
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* remove the small hack
* modify utils file
* make style + refactor a bit
* correctly create device map
* add correct dtype for device map creation
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions - remove with torch.grad - do not rely on Python bool magic!
* add docstring - add docstring for new kwargs
* add docstring - comment `replace_8bit_linear` function - fix weird formatting
* - added more documentation - added new utility function for memory footprint tracking - colab demo to add
* few modifs - typo doc - force cast into float16 when load_in_8bit is enabled
* added colab link
* add test architecture + docstring a bit
* refactor a bit testing class
* make style + refactor a bit
* enhance checks - add more checks - start writing saving test
* clean up a bit
* make style
* add more details on doc
* add more tests - still needs to fix 2 tests
* replace by "or" - could not fix it from GitHub GUI Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor a bit testing code + add readme
* make style
* fix import issue
* Update src/transformers/modeling_utils.py Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* add few comments
* add more docstring + make style
* more docstring
* raise error when loaded in 8bit
* make style
* add warning if loaded on CPU
* add small sanity check
* fix small comment
* add bitsandbytes on dockerfile
* Improve documentation - improve documentation from comments
* add few comments
* slow tests pass on the VM but not on the CI VM
* Fix merge conflict
* make style
* another test should pass on a multi gpu setup
* fix bad import in testing file
* Fix slow tests - remove dummy batches - no more CUDA illegal memory errors
* modify dockerfile
* Update docs/source/en/main_classes/model.mdx
* Update Dockerfile
* Update model.mdx
* Update Dockerfile
* Apply suggestions from code review
* few modifications - lm head can stay on disk/cpu - change model name so that test passes
* change test value - change test value to the correct output - torch bmm changed to baddmm in bloom modeling when merging
* modify installation guidelines
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* replace `n` by `name`
* merge `load_in_8bit` and `low_cpu_mem_usage`
* first try - keep the lm head in full precision
* better check - check the attribute `base_model_prefix` instead of computing the number of parameters
* added more tests
* Update src/transformers/utils/bitsandbytes.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit
* improve documentation - fix typos for installation - change title in the documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* TF: XLA-trainable DeBERTa v2 (#18546)
* fix deberta issues
* add different code paths for gpu and tpu
* shorter gpu take along axis
* Stable Dropout without tf cond
* variable must be float
* Preserve hub-related kwargs in AutoModel.from_pretrained (#18545)
* Preserve hub-related kwargs in AutoModel.from_pretrained
* Fix tests
* Remove debug statement
* TF Examples Rewrite (#18451)
* Finished QA example
* Dodge a merge conflict
* Update text classification and LM examples
* Update NER example
* New Keras metrics WIP, fix NER example
* Update NER example
* Update MC, summarization and translation examples
* Add XLA warnings when shapes are variable
* Make sure batch_size is consistently scaled by num_replicas
* Add PushToHubCallback to all models
* Add docs links for KerasMetricCallback
* Add docs links for prepare_tf_dataset and jit_compile
* Correct inferred model names
* Don't assume the dataset has 'lang'
* Don't assume the dataset has 'lang'
* Write metrics in text classification
* Add 'framework' to TrainingArguments and TFTrainingArguments
* Export metrics in all examples and add tests
* Fix training args for Flax
* Update command line args for translation test
* make fixup
* Fix accidentally running other tests in fp16
* Remove do_train/do_eval from run_clm.py
* Remove do_train/do_eval from run_mlm.py
* Add tensorflow tests to circleci
* Fix circleci
* Update examples/tensorflow/language-modeling/run_mlm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/test_tensorflow_examples.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/translation/run_translation.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update examples/tensorflow/token-classification/run_ner.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix save path for tests
* Fix some model card kwargs
* Explain the magical -1000
* Actually enable tests this time
* Skip text classification PR until we fix shape inference
* make fixup Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Use commit hash to look in cache instead of calling head (#18534)
* Use commit hash to look in cache instead of calling head
* Add tests
* Add attr for local configs too
* Stupid typos
* Fix tests
* Update src/transformers/utils/hub.py Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Address Julien's comments Co-authored-by: Julien Chaumond <julien@huggingface.co>
* `pipeline` support for `device="mps"` (or any other string) (#18494)
* `pipeline` support for `device="mps"` (or any other string)
* Simplify `if` nesting
* Update src/transformers/pipelines/base.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix? @sgugger
* passing `attr=None` is not the same as not passing `attr` 🤯 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update philosophy to include other preprocessing classes (#18550)
* 📝 update philosophy to include other preprocessing classes
* 🖍 apply feedbacks
* Properly move cache when it is not in default path (#18563)
* Adds CLIP to models exportable with ONNX (#18515)
* onnx config for clip
* default opset as 14
* changes from the original repo
* input values order fix
* outputs fix
* remove unused import
* ran make fix-copies
* black format
* review comments: forward ref, import fix, model change revert, .to cleanup
* make style
* formatting fixes
* revert groupvit
* comment for cast to int32
* comment fix
* make .T as .t() for onnx conversion
* ran make fix-copies
* remove unneeded comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies
* remove comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* raise atol for MT5OnnxConfig (#18560) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* fix string (#18568)
* Segformer TF: fix output size in documentation (#18572)
* Segformer TF: fix output size in doc
* Segformer pytorch: fix output size in doc Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
* Fix resizing bug in OWL-ViT (#18573)
* Fixes resizing bug in OWL-ViT
* Defaults to square resize if size is set to an int
* Sets do_center_crop default value to False
* Fix LayoutLMv3 documentation (#17932)
* fix typos
* fix sequence_length docs of LayoutLMv3Model
* delete trailing white spaces
* fix layoutlmv3 docs more
* apply make fixup & quality
* change to two versions of input docstring
* apply make fixup & quality
* Skip broken tests
* Change BartLearnedPositionalEmbedding's forward method signature to support Opacus training (#18486)
* changing BartLearnedPositionalEmbedding forward signature and references to it
* removing debugging dead code (thanks style checker)
* blackened modeling_bart file
* removing copy inconsistencies via make fix-copies
* changing references to copied signatures in Bart variants
* make fix-copies once more
* using expand over repeat (thanks @michaelbenayoun)
* expand instead of repeat for all model copies Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
* german docs translation (#18544)
* Create _config.py
* Create _toctree.yml
* Create index.mdx (not sure about "du / ihr" or "sie")
* Create quicktour.mdx
* Update _toctree.yml
* Update build_documentation.yml
* Update build_pr_documentation.yml
* fix build
* Update index.mdx
* Update quicktour.mdx
* Create installation.mdx
* Update _toctree.yml
* Deberta V2: Fix critical trace warnings to allow ONNX export (#18272)
* Fix critical trace warnings to allow ONNX export
* Force input to `sqrt` to be float type
* Cleanup code
* Remove unused import statement
* Update model sew
* Small refactor Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Use broadcasting instead of repeat
* Implement suggestion Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Match deberta v2 changes in sew_d
* Improve code quality
* Update code quality
* Consistency of small refactor
* Match changes in sew_d Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* [FX] _generate_dummy_input supports audio-classification models for labels (#18580)
* Support audio classification architectures for labels generation, as well as provide a flag to print warnings or not
* Use ENV_VARS_TRUE_VALUES
* Fix docstrings with last version of hf-doc-builder styler (#18581)
* Fix docstrings with last version of hf-doc-builder styler
* Remove empty Parameter block
* Bump nbconvert from 6.0.1 to 6.3.0 in /examples/research_projects/lxmert (#18565) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump nbconvert in /examples/research_projects/visual_bert (#18566) Bumps [nbconvert](https://github.com/jupyter/nbconvert) from 6.0.1 to 6.3.0. - [Release notes](https://github.com/jupyter/nbconvert/releases) - [Commits](https://github.com/jupyter/nbconvert/compare/6.0.1...6.3.0) --- updated-dependencies: - dependency-name: nbconvert dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix owlvit tests, update docstring examples (#18586)
* Return the permuted hidden states if return_dict=True (#18578)
* Load sharded pt to flax (#18419)
* initial commit
* add small test
* add cross pt tf flag to test
* fix quality
* style
* update test with new repo
* fix failing test
* update
* fix wrong param ordering
* style
* update based on review
* update related to recent new caching mechanism
* quality
* Update based on review Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* quality and style
* Update src/transformers/modeling_flax_utils.py Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add type hints for ViLT models (#18577)
* Add type hints for Vilt models
* Add missing return type for TokenClassification class
* update doc for perf_train_cpu_many, add intel mpi introduction (#18576)
* update doc for perf_train_cpu_many, add mpi introduction Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Update docs/source/en/perf_train_cpu_many.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/perf_train_cpu_many.mdx Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* typos (#18594)
* FSDP bug fix for `load_state_dict` (#18596)
* Add `TFAutoModelForSemanticSegmentation` to the main `__init__.py` (#18600) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Generate: validate `model_kwargs` (and catch typos in generate arguments) (#18261)
* validate generate model_kwargs
* generate tests -- not all models have an attn mask
* Supporting seq2seq models for `bitsandbytes` integration (#18579)
* Supporting seq2seq models for `bitsandbytes` integration - the `bitsandbytes` integration now supports seq2seq models - check if a model has tied weights as an additional check
* small modification - tie the weights before looking at tied weights!
* Add Donut (#18488)
* First draft
* Improve script
* Update script
* Make conversion work
* Add final_layer_norm attribute to Swin's config
* Add DonutProcessor
* Convert more models
* Improve feature extractor and convert base models
* Fix bug
* Improve integration tests
* Improve integration tests and add model to README
* Add doc test
* Add feature extractor to docs
* Fix integration tests
* Remove register_buffer
* Fix toctree and add missing attribute
* Add DonutSwin
* Make conversion script work
* Improve conversion script
* Address comment
* Fix bug
* Fix another bug
* Remove deprecated method from docs
* Make Swin and Swinv2 untouched
* Fix code examples
* Fix processor
* Update model_type to donut-swin
* Add feature extractor tests, add token2json method, improve feature extractor
* Fix failing tests, remove integration test
* Add do_thumbnail for consistency
* Improve code examples
* Add code example for document parsing
* Add DonutSwin to MODEL_NAMES_MAPPING
* Add model to appropriate place in toctree
* Update namespace to appropriate organization Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Fix URLs (#18604) Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Update BLOOM parameter counts (#18531)
* Update BLOOM parameter counts
* Update BLOOM parameter counts
* [doc] fix anchors (#18591) The manual anchors end up being duplicated with automatically added anchors and no longer work.
* [fsmt] deal with -100 indices in decoder ids (#18592)
* [fsmt] deal with -100 indices in decoder ids. Fixes: https://github.com/huggingface/transformers/issues/17945. Decoder ids get the default index -100, which breaks the model; like t5 and many other models, we add a fix to replace -100 with the correct pad index. For some reason this use case hadn't been exercised with this model until recently, so the issue seems to have been there since the beginning. Any suggestions on how to add a simple test here? Or perhaps we have something similar already? The user's script is quite massive.
* style
* small change (#18584)
* Flax Remat for LongT5 (#17994)
* [Flax] Add remat (gradient checkpointing)
* fix variable naming in test
* flip: checkpoint using a method
* fix naming
* fix class naming
* apply PVP's suggestions from code review
* add gradient_checkpointing to examples
* Add gradient_checkpointing to run_mlm_flax
* Add remat to longt5
* Add gradient checkpointing test longt5
* Fix args errors
* Fix remaining tests
* Make fixup & quality fixes
* replace kwargs
* remove unnecessary kwargs
* Make fixup changes
* revert long_t5_flax changes
* Remove return_dict and copy to LongT5
* Remove test_gradient_checkpointing Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
* mac m1 `mps` integration (#18598)
* mac m1 `mps` integration
* Update docs/source/en/main_classes/trainer.mdx Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* addressing comments
* Apply suggestions from code review Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
* resolve comment Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
* Change scheduled CIs to use torch 1.12.1 (#18644) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Add checks for some workflow jobs (#18583) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* TF: Fix generation repetition penalty with XLA (#18648)
* Update longt5.mdx (#18634)
* Update run_translation_no_trainer.py (#18637)
* Update run_translation_no_trainer.py: found an error in selecting `no_decay` parameters, plus some small modifications for when the user continues to train from a checkpoint
* fixes `no_decay` and `resume_step` issues: 1. change the `no_decay` list; 2. if the user continues to train their model from a provided checkpoint, `resume_step` will not be initialized properly if `args.gradient_accumulation_steps != 1`
* [bnb] Minor modifications (#18631)
* bnb minor modifications - refactor documentation - add troubleshooting README - add PyPi library on DockerFile
* Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* put in one block - put bash instructions in one block
* update readme - refactor hardware requirements a bit
* change text a bit
* Apply suggestions from code review Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* apply suggestions Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* add link to paper
* Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update tests/mixed_int8/README.md
* Apply suggestions from code review
* refactor a bit
* add instructions for Turing & Ampere Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* add A6000
* clarify a bit
* remove small part
* Update tests/mixed_int8/README.md Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* Examples: add Bloom support for token classification (#18632)
* examples: add Bloom support for token classification (FLAX, PyTorch and TensorFlow)
* examples: remove support for Bloom in token classification (FLAX and TensorFlow currently have no support for it)
* Fix Yolos ONNX export test (#18606) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Fixup
* Fix up
* Move PIL default arguments inside function for safe imports
* Add image utils to toctree
* Update `rescale` method to reflect changes in #18677
* Update docs/source/en/internal/image_processing_utils.mdx Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Address Niels PR comments
* Apply suggestions from code review - remove defaults to None Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix docstrings and revert to PIL.Image.XXX resampling. Use PIL.Image.XXX resampling values instead of the PIL.Image.Resampling.XXX enum, as the enum only exists in recent versions (>= 9.1.0), the Pillow version is not yet pinned, and older-version support is deprecated rather than removed
* Some more docstrings and PIL.Image tidy up
* Reorganise arguments so flags are grouped by modifiers
* Few last docstring fixes

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Amy Roberts <amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Ankur Goyal <ankrgyl@gmail.com>
Co-authored-by: Ankur Goyal <ankur@impira.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
Co-authored-by: Rasmus Arpe Fogh Jensen <Rasmus.arpe@gmail.com>
Co-authored-by: Ian Castillo <7807897+donelianc@users.noreply.github.com>
Co-authored-by: AguilaCudicio <aguila.cudicio@gmail.com>
Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Niklas Hansson <niklas.sven.hansson@gmail.com>
Co-authored-by: Thomas Chaigneau <t.chaigneau.tc@gmail.com>
Co-authored-by: YouJiacheng <1503679330@qq.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: Dhruv Karan <k4r4n.dhruv@gmail.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Maxime G <joihn@users.noreply.github.com>
Co-authored-by: Maxime Gardoni <maxime.gardoni@ecorobotix.com>
Co-authored-by: Wonseok Lee (Jack) <rollerkid02@snu.ac.kr>
Co-authored-by: Dan Jones <dan.j.jones2@gmail.com>
Co-authored-by: Daniel Jones <jonesdaniel@microsoft.com>
Co-authored-by: flozi00 <flozi00.fz@gmail.com>
Co-authored-by: iiLaurens <iiLaurens@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Wang, Yi <yi.a.wang@intel.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>
Co-authored-by: Karim Foda <35491698+KMFODA@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Dan Saattrup Nielsen <47701536+saattrupdan@users.noreply.github.com>
Co-authored-by: zhoutang776 <47708118+zhoutang776@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
This commit is contained in:
parent a2c90a7f7b
commit 1973b7716b
docs/source/en/_toctree.yml
@@ -521,6 +521,8 @@
       title: Utilities for Trainer
     - local: internal/generation_utils
       title: Utilities for Generation
+    - local: internal/image_processing_utils
+      title: Utilities for Image Processors
     - local: internal/file_utils
       title: General Utilities
   title: Internal Helpers
docs/source/en/internal/image_processing_utils.mdx (new file)
@@ -0,0 +1,30 @@
+<!--Copyright 2022 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# Utilities for Image Processors
+
+This page lists all the utility functions used by the image processors, mainly the functional
+transformations used to process the images.
+
+Most of those are only useful if you are studying the code of the image processors in the library.
+
+## Image Transformations
+
+[[autodoc]] image_transforms.rescale
+
+[[autodoc]] image_transforms.resize
+
+[[autodoc]] image_transforms.to_pil_image
+
+## ImageProcessorMixin
+
+[[autodoc]] image_processing_utils.ImageProcessorMixin
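For orientation, a minimal usage sketch of the three documented transforms (assuming NumPy and Pillow are installed; the random array below stands in for a real image and is not part of the commit):

```python
import numpy as np

from transformers.image_transforms import rescale, resize, to_pil_image

# A dummy channels-last uint8 image standing in for real pixel data.
image = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)

scaled = rescale(image, scale=1 / 255)    # float32 pixels in [0, 1]
resized = resize(image, size=(224, 224))  # numpy array of shape (224, 224, 3)
pil_image = to_pil_image(scaled)          # back to a uint8 PIL.Image.Image
```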
src/transformers/__init__.py
@@ -680,6 +680,8 @@ except OptionalDependencyNotAvailable:
         name for name in dir(dummy_vision_objects) if not name.startswith("_")
     ]
 else:
+    _import_structure["image_processing_utils"] = ["ImageProcessorMixin"]
+    _import_structure["image_transforms"] = ["rescale", "resize", "to_pil_image"]
     _import_structure["image_utils"] = ["ImageFeatureExtractionMixin"]
     _import_structure["models.beit"].append("BeitFeatureExtractor")
     _import_structure["models.clip"].append("CLIPFeatureExtractor")
@@ -3648,6 +3650,8 @@ if TYPE_CHECKING:
     except OptionalDependencyNotAvailable:
         from .utils.dummy_vision_objects import *
     else:
+        from .image_processing_utils import ImageProcessorMixin
+        from .image_transforms import rescale, resize, to_pil_image
         from .image_utils import ImageFeatureExtractionMixin
         from .models.beit import BeitFeatureExtractor
         from .models.clip import CLIPFeatureExtractor
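With these entries in place, the new names resolve from the top-level package whenever the vision dependencies are available (a minimal sketch; when Pillow is missing, the `dummy_vision_objects` fallback above takes over and raises a helpful error on use):

```python
# Top-level imports work once Pillow is installed.
from transformers import ImageProcessorMixin, rescale, resize, to_pil_image
from transformers.utils import is_vision_available

print(is_vision_available())  # True when PIL can be imported
```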
src/transformers/image_processing_utils.py (new file)
@@ -0,0 +1,54 @@
+# coding=utf-8
+# Copyright 2022 The HuggingFace Inc. team.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from .feature_extraction_utils import BatchFeature as BaseBatchFeature
+from .feature_extraction_utils import FeatureExtractionMixin
+from .utils import logging
+
+
+logger = logging.get_logger(__name__)
+
+
+# TODO: Move BatchFeature to be imported by both feature_extraction_utils and image_processing_utils
+# We override the class string here, but logic is the same.
+class BatchFeature(BaseBatchFeature):
+    r"""
+    Holds the output of the image processor specific `__call__` methods.
+
+    This class is derived from a python dictionary and can be used as a dictionary.
+
+    Args:
+        data (`dict`):
+            Dictionary of lists/arrays/tensors returned by the __call__ method ('pixel_values', etc.).
+        tensor_type (`Union[None, str, TensorType]`, *optional*):
+            You can give a tensor_type here to convert the lists of integers in PyTorch/TensorFlow/Numpy Tensors at
+            initialization.
+    """
+
+
+# We use aliasing whilst we phase out the old API. Once feature extractors for vision models
+# are deprecated, ImageProcessor mixin will be implemented. Any shared logic will be abstracted out.
+ImageProcessorMixin = FeatureExtractionMixin
+
+
+class BaseImageProcessor(ImageProcessorMixin):
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)
+
+    def __call__(self, images, **kwargs) -> BatchFeature:
+        return self.preprocess(images, **kwargs)
+
+    def preprocess(self, images, **kwargs) -> BatchFeature:
+        raise NotImplementedError("Each image processor must implement its own preprocess method")
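The `__call__`/`preprocess` contract above is what concrete processors implement. A minimal sketch of a hypothetical subclass (the class name and its pass-through logic are illustrative, not part of this commit):

```python
import numpy as np

from transformers.image_processing_utils import BaseImageProcessor, BatchFeature


class IdentityImageProcessor(BaseImageProcessor):
    # Hypothetical processor: packages images unchanged into a BatchFeature.
    def preprocess(self, images, **kwargs) -> BatchFeature:
        images = [np.asarray(image) for image in images]
        return BatchFeature(data={"pixel_values": images})


processor = IdentityImageProcessor()
features = processor([np.zeros((32, 32, 3), dtype=np.uint8)])  # __call__ forwards to preprocess
print(features["pixel_values"][0].shape)  # (32, 32, 3)
```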
src/transformers/image_transforms.py (new file)
@@ -0,0 +1,259 @@
+# coding=utf-8
+# Copyright 2022 The HuggingFace Inc. team.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import TYPE_CHECKING, List, Optional, Tuple, Union
+
+import numpy as np
+
+from transformers.utils.import_utils import is_flax_available, is_tf_available, is_torch_available, is_vision_available
+
+
+if is_vision_available():
+    import PIL
+
+    from .image_utils import (
+        ChannelDimension,
+        get_image_size,
+        infer_channel_dimension_format,
+        is_jax_tensor,
+        is_tf_tensor,
+        is_torch_tensor,
+    )
+
+
+if TYPE_CHECKING:
+    if is_torch_available():
+        import torch
+    if is_tf_available():
+        import tensorflow as tf
+    if is_flax_available():
+        import jax.numpy as jnp
+
+
+def to_channel_dimension_format(image: np.ndarray, channel_dim: Union[ChannelDimension, str]) -> np.ndarray:
+    """
+    Converts `image` to the channel dimension format specified by `channel_dim`.
+
+    Args:
+        image (`numpy.ndarray`):
+            The image to have its channel dimension set.
+        channel_dim (`ChannelDimension`):
+            The channel dimension format to use.
+
+    Returns:
+        `np.ndarray`: The image with the channel dimension set to `channel_dim`.
+    """
+    if not isinstance(image, np.ndarray):
+        raise ValueError(f"Input image must be of type np.ndarray, got {type(image)}")
+
+    current_channel_dim = infer_channel_dimension_format(image)
+    target_channel_dim = ChannelDimension(channel_dim)
+    if current_channel_dim == target_channel_dim:
+        return image
+
+    if target_channel_dim == ChannelDimension.FIRST:
+        image = image.transpose((2, 0, 1))
+    elif target_channel_dim == ChannelDimension.LAST:
+        image = image.transpose((1, 2, 0))
+    else:
+        raise ValueError("Unsupported channel dimension format: {}".format(channel_dim))
+
+    return image
+
+
+def rescale(
+    image: np.ndarray, scale: float, data_format: Optional[ChannelDimension] = None, dtype=np.float32
+) -> np.ndarray:
+    """
+    Rescales `image` by `scale`.
+
+    Args:
+        image (`np.ndarray`):
+            The image to rescale.
+        scale (`float`):
+            The scale to use for rescaling the image.
+        data_format (`ChannelDimension`, *optional*):
+            The channel dimension format of the image. If not provided, it will be the same as the input image.
+        dtype (`np.dtype`, *optional*, defaults to `np.float32`):
+            The dtype of the output image. Defaults to `np.float32`. Used for backwards compatibility with feature
+            extractors.
+
+    Returns:
+        `np.ndarray`: The rescaled image.
+    """
+    if not isinstance(image, np.ndarray):
+        raise ValueError(f"Input image must be of type np.ndarray, got {type(image)}")
+
+    rescaled_image = image * scale
+    if data_format is not None:
+        rescaled_image = to_channel_dimension_format(rescaled_image, data_format)
+    rescaled_image = rescaled_image.astype(dtype)
+    return rescaled_image
+
+
+def to_pil_image(
+    image: Union[np.ndarray, PIL.Image.Image, "torch.Tensor", "tf.Tensor", "jnp.Tensor"],
+    do_rescale: Optional[bool] = None,
+) -> PIL.Image.Image:
+    """
+    Converts `image` to a PIL Image. Optionally rescales it and puts the channel dimension back as the last axis if
+    needed.
+
+    Args:
+        image (`PIL.Image.Image` or `numpy.ndarray` or `torch.Tensor` or `tf.Tensor`):
+            The image to convert to the `PIL.Image` format.
+        do_rescale (`bool`, *optional*):
+            Whether or not to apply the scaling factor (to make pixel values integers between 0 and 255). Will default
+            to `True` if the image type is a floating type, `False` otherwise.
+
+    Returns:
+        `PIL.Image.Image`: The converted image.
+    """
+    if isinstance(image, PIL.Image.Image):
+        return image
+
+    # Convert all tensors to numpy arrays before converting to PIL image
+    if is_torch_tensor(image) or is_tf_tensor(image):
+        image = image.numpy()
+    elif is_jax_tensor(image):
+        image = np.array(image)
+    elif not isinstance(image, np.ndarray):
+        raise ValueError("Input image type not supported: {}".format(type(image)))
+
+    # If the channel has been moved to first dim, we put it back at the end.
+    image = to_channel_dimension_format(image, ChannelDimension.LAST)
+
+    # PIL.Image can only store uint8 values, so we rescale the image to be between 0 and 255 if needed.
+    do_rescale = isinstance(image.flat[0], float) if do_rescale is None else do_rescale
+    if do_rescale:
+        image = rescale(image, 255)
+    image = image.astype(np.uint8)
+    return PIL.Image.fromarray(image)
+
+
+def get_resize_output_image_size(
+    input_image: np.ndarray,
+    size: Union[int, Tuple[int, int], List[int], Tuple[int]],
+    default_to_square: bool = True,
+    max_size: Optional[int] = None,
+) -> tuple:
+    """
+    Find the target (height, width) dimension of the output image after resizing given the input image and the desired
+    size.
+
+    Args:
+        input_image (`np.ndarray`):
+            The image to resize.
+        size (`int` or `Tuple[int, int]` or `List[int]` or `Tuple[int]`):
+            The size to use for resizing the image. If `size` is a sequence like (h, w), output size will be matched to
+            this.
+
+            If `size` is an int and `default_to_square` is `True`, then image will be resized to (size, size). If
+            `size` is an int and `default_to_square` is `False`, then smaller edge of the image will be matched to this
+            number, i.e., if height > width, then image will be rescaled to (size * height / width, size).
+        default_to_square (`bool`, *optional*, defaults to `True`):
+            How to convert `size` when it is a single int. If set to `True`, the `size` will be converted to a square
+            (`size`, `size`). If set to `False`, will replicate
+            [`torchvision.transforms.Resize`](https://pytorch.org/vision/stable/transforms.html#torchvision.transforms.Resize)
+            with support for resizing only the smallest edge and providing an optional `max_size`.
+        max_size (`int`, *optional*):
+            The maximum allowed for the longer edge of the resized image: if the longer edge of the image is greater
+            than `max_size` after being resized according to `size`, then the image is resized again so that the longer
+            edge is equal to `max_size`. As a result, `size` might be overruled, i.e. the smaller edge may be shorter
+            than `size`. Only used if `default_to_square` is `False`.
+
+    Returns:
+        `tuple`: The target (height, width) dimension of the output image after resizing.
+    """
+    if isinstance(size, (tuple, list)):
+        if len(size) == 2:
+            return tuple(size)
+        elif len(size) == 1:
+            # Perform same logic as if size was an int
+            size = size[0]
+        else:
+            raise ValueError("size must have 1 or 2 elements if it is a list or tuple")
+
+    if default_to_square:
+        return (size, size)
+
+    height, width = get_image_size(input_image)
+    short, long = (width, height) if width <= height else (height, width)
+    requested_new_short = size
+
+    if short == requested_new_short:
+        return (height, width)
+
+    new_short, new_long = requested_new_short, int(requested_new_short * long / short)
+
+    if max_size is not None:
+        if max_size <= requested_new_short:
+            raise ValueError(
+                f"max_size = {max_size} must be strictly greater than the requested "
+                f"size for the smaller edge size = {size}"
+            )
+        if new_long > max_size:
+            new_short, new_long = int(max_size * new_short / new_long), max_size
+
+    return (new_long, new_short) if width <= height else (new_short, new_long)
+
+
+def resize(
+    image,
+    size: Tuple[int, int],
+    resample=PIL.Image.BILINEAR,
+    data_format: Optional[ChannelDimension] = None,
+    return_numpy: bool = True,
+) -> np.ndarray:
+    """
+    Resizes `image` to (h, w) specified by `size` using the PIL library.
+
+    Args:
+        image (`PIL.Image.Image` or `np.ndarray` or `torch.Tensor`):
+            The image to resize.
+        size (`Tuple[int, int]`):
+            The size to use for resizing the image.
+        resample (`int`, *optional*, defaults to `PIL.Image.BILINEAR`):
+            The filter to use for resampling.
+        data_format (`ChannelDimension`, *optional*):
+            The channel dimension format of the output image. If `None`, will use the inferred format from the input.
+        return_numpy (`bool`, *optional*, defaults to `True`):
+            Whether or not to return the resized image as a numpy array. If False a `PIL.Image.Image` object is
+            returned.
+
+    Returns:
+        `np.ndarray`: The resized image.
+    """
+    if not len(size) == 2:
+        raise ValueError("size must have 2 elements")
+
+    # For all transformations, we want to keep the same data format as the input image unless otherwise specified.
+    # The resized image from PIL will always have channels last, so find the input format first.
+    data_format = infer_channel_dimension_format(image) if data_format is None else data_format
+
+    # To maintain backwards compatibility with the resizing done in previous image feature extractors, we use
+    # the pillow library to resize the image and then convert back to numpy
+    if not isinstance(image, PIL.Image.Image):
+        # PIL expects image to have channels last
+        image = to_channel_dimension_format(image, ChannelDimension.LAST)
+        image = to_pil_image(image)
+    height, width = size
+    # PIL images are in the format (width, height)
+    resized_image = image.resize((width, height), resample=resample)
+
+    if return_numpy:
+        resized_image = np.array(resized_image)
+        resized_image = to_channel_dimension_format(resized_image, data_format)
+    return resized_image
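To make the aspect-ratio logic of `get_resize_output_image_size` concrete, a small worked sketch (the shapes and numbers are illustrative only):

```python
import numpy as np

from transformers.image_transforms import get_resize_output_image_size, resize

image = np.zeros((480, 640, 3), dtype=np.uint8)  # (height, width, channels)

# Match the shorter edge (the height here) to 256, keeping the aspect ratio:
# int(256 * 640 / 480) == 341, so the target is (256, 341).
print(get_resize_output_image_size(image, size=256, default_to_square=False))

# Cap the longer edge at 320: 341 > 320, so both edges shrink to (240, 320),
# since int(320 * 256 / 341) == 240.
print(get_resize_output_image_size(image, size=256, default_to_square=False, max_size=320))

resized = resize(image, size=(256, 341))  # numpy array of shape (256, 341, 3)
```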
@ -14,33 +14,128 @@
|
|||||||
# limitations under the License.
|
# limitations under the License.
|
||||||
|
|
||||||
import os
|
import os
|
||||||
from typing import List, Union
|
from typing import TYPE_CHECKING, List, Tuple, Union
|
||||||
|
|
||||||
import numpy as np
|
import numpy as np
|
||||||
import PIL.Image
|
|
||||||
import PIL.ImageOps
|
|
||||||
|
|
||||||
import requests
|
import requests
|
||||||
|
|
||||||
from .utils import is_torch_available
|
from .utils import is_flax_available, is_tf_available, is_torch_available, is_vision_available
|
||||||
from .utils.constants import ( # noqa: F401
|
from .utils.constants import ( # noqa: F401
|
||||||
IMAGENET_DEFAULT_MEAN,
|
IMAGENET_DEFAULT_MEAN,
|
||||||
IMAGENET_DEFAULT_STD,
|
IMAGENET_DEFAULT_STD,
|
||||||
IMAGENET_STANDARD_MEAN,
|
IMAGENET_STANDARD_MEAN,
|
||||||
IMAGENET_STANDARD_STD,
|
IMAGENET_STANDARD_STD,
|
||||||
)
|
)
|
||||||
from .utils.generic import _is_torch
|
from .utils.generic import ExplicitEnum, _is_jax, _is_tensorflow, _is_torch, to_numpy
|
||||||
|
|
||||||
|
|
||||||
|
if is_vision_available():
|
||||||
|
import PIL.Image
|
||||||
|
import PIL.ImageOps
|
||||||
|
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
if is_torch_available():
|
||||||
|
import torch
|
||||||
|
|
||||||
|
|
||||||
ImageInput = Union[
|
ImageInput = Union[
|
||||||
PIL.Image.Image, np.ndarray, "torch.Tensor", List[PIL.Image.Image], List[np.ndarray], List["torch.Tensor"] # noqa
|
"PIL.Image.Image", np.ndarray, "torch.Tensor", List["PIL.Image.Image"], List[np.ndarray], List["torch.Tensor"]
|
||||||
]
|
] # noqa
|
||||||
|
|
||||||
|
|
||||||
|
class ChannelDimension(ExplicitEnum):
|
||||||
|
FIRST = "channels_first"
|
||||||
|
LAST = "channels_last"
|
||||||
|
|
||||||
|
|
||||||
def is_torch_tensor(obj):
|
def is_torch_tensor(obj):
|
||||||
return _is_torch(obj) if is_torch_available() else False
|
return _is_torch(obj) if is_torch_available() else False
|
||||||
|
|
||||||
|
|
||||||
|
def is_tf_tensor(obj):
|
||||||
|
return _is_tensorflow(obj) if is_tf_available() else False
|
||||||
|
|
||||||
|
|
||||||
|
def is_jax_tensor(obj):
|
||||||
|
return _is_jax(obj) if is_flax_available() else False
|
||||||
|
|
||||||
|
|
||||||
|
def is_valid_image(img):
|
||||||
|
return (
|
||||||
|
isinstance(img, (PIL.Image.Image, np.ndarray))
|
||||||
|
or is_torch_tensor(img)
|
||||||
|
or is_tf_tensor(img)
|
||||||
|
or is_jax_tensor(img)
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def valid_images(imgs):
|
||||||
|
return all(is_valid_image(img) for img in imgs)
|
||||||
|
|
||||||
|
|
||||||
|
def is_batched(img):
|
||||||
|
if isinstance(img, (list, tuple)):
|
||||||
|
return is_valid_image(img[0])
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def to_numpy_array(img) -> np.ndarray:
|
||||||
|
if isinstance(img, PIL.Image.Image):
|
||||||
|
return np.array(img)
|
||||||
|
return to_numpy(img)
|
||||||
|
|
||||||
|
|
||||||
|
def infer_channel_dimension_format(image: np.ndarray) -> ChannelDimension:
|
||||||
|
"""
|
||||||
|
Infers the channel dimension format of `image`.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
image (`np.ndarray`):
|
||||||
|
The image to infer the channel dimension of.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
The channel dimension of the image.
|
||||||
|
"""
|
||||||
|
if image.ndim == 3:
|
||||||
|
first_dim, last_dim = 0, 2
|
||||||
|
elif image.ndim == 4:
|
||||||
|
first_dim, last_dim = 1, 3
|
||||||
|
else:
|
||||||
|
raise ValueError(f"Unsupported number of image dimensions: {image.ndim}")
|
||||||
|
|
||||||
|
if image.shape[first_dim] in (1, 3):
|
||||||
|
return ChannelDimension.FIRST
|
||||||
|
elif image.shape[last_dim] in (1, 3):
|
||||||
|
return ChannelDimension.LAST
|
||||||
|
raise ValueError("Unable to infer channel dimension format")
|
||||||
|
+def get_image_size(image: np.ndarray, channel_dim: ChannelDimension = None) -> Tuple[int, int]:
+    """
+    Returns the (height, width) dimensions of the image.
+
+    Args:
+        image (`np.ndarray`):
+            The image to get the dimensions of.
+        channel_dim (`ChannelDimension`, *optional*):
+            Which dimension the channel dimension is in. If `None`, will infer the channel dimension from the image.
+
+    Returns:
+        A tuple of the image's height and width.
+    """
+    if channel_dim is None:
+        channel_dim = infer_channel_dimension_format(image)
+
+    if channel_dim == ChannelDimension.FIRST:
+        return image.shape[-2], image.shape[-1]
+    elif channel_dim == ChannelDimension.LAST:
+        return image.shape[-3], image.shape[-2]
+    else:
+        raise ValueError(f"Unsupported data format: {channel_dim}")
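A quick sketch of `get_image_size`: the same (height, width) comes back whichever layout the array is in, and the format can be forced explicitly.

import numpy as np
from transformers.image_utils import ChannelDimension, get_image_size

assert get_image_size(np.zeros((3, 480, 640))) == (480, 640)  # format inferred automatically
assert get_image_size(np.zeros((480, 640, 3))) == (480, 640)
assert get_image_size(np.zeros((3, 480, 640)), channel_dim=ChannelDimension.FIRST) == (480, 640)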
def load_image(image: Union[str, "PIL.Image.Image"]) -> "PIL.Image.Image":
    """
    Loads `image` to a PIL Image.

@@ -236,7 +331,7 @@ class ImageFeatureExtractionMixin:
        else:
            return (image - mean) / std

-    def resize(self, image, size, resample=PIL.Image.BILINEAR, default_to_square=True, max_size=None):
+    def resize(self, image, size, resample=None, default_to_square=True, max_size=None):
        """
        Resizes `image`. Enforces conversion of input to PIL.Image.

@@ -266,6 +361,8 @@ class ImageFeatureExtractionMixin:
        Returns:
            image: A resized `PIL.Image.Image`.
        """
+        resample = resample if resample is not None else PIL.Image.BILINEAR
+
        self._ensure_format_supported(image)

        if not isinstance(image, PIL.Image.Image):

@@ -393,7 +490,7 @@ class ImageFeatureExtractionMixin:

        return image[::-1, :, :]

-    def rotate(self, image, angle, resample=PIL.Image.NEAREST, expand=0, center=None, translate=None, fillcolor=None):
+    def rotate(self, image, angle, resample=None, expand=0, center=None, translate=None, fillcolor=None):
        """
        Returns a rotated copy of `image`. This method returns a copy of `image`, rotated the given number of degrees
        counter clockwise around its centre.

@@ -406,6 +503,8 @@ class ImageFeatureExtractionMixin:
        Returns:
            image: A rotated `PIL.Image.Image`.
        """
+        resample = resample if resample is not None else PIL.Image.NEAREST
+
        self._ensure_format_supported(image)

        if not isinstance(image, PIL.Image.Image):
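The signature changes above swap eager `PIL.Image.*` defaults for `None` and resolve the real default inside the body. The point: default values on a `def` line are evaluated when the module is imported, so with PIL now imported only under `is_vision_available()`, the old signatures would fail at import time without Pillow. A minimal standalone sketch of the pattern (hypothetical function, assuming Pillow and NumPy are installed):

import numpy as np
import PIL.Image

def resize(image: np.ndarray, size, resample=None):
    # The default is resolved at call time, not at module-import time, so the
    # module defining this function imports cleanly even if PIL were optional.
    resample = resample if resample is not None else PIL.Image.BILINEAR
    return PIL.Image.fromarray(image).resize(size[::-1], resample=resample)

print(resize(np.zeros((60, 80, 3), dtype=np.uint8), (30, 40)).size)  # (40, 30) -- PIL reports (width, height)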
@@ -14,126 +14,11 @@
 # limitations under the License.
 """Feature extractor class for GLPN."""

-from typing import Optional, Union
-
-import numpy as np
-from PIL import Image
-
-from ...feature_extraction_utils import BatchFeature, FeatureExtractionMixin
-from ...image_utils import ImageFeatureExtractionMixin, ImageInput, is_torch_tensor
-from ...utils import TensorType, logging
+from ...utils import logging
+from .image_processing_glpn import GLPNImageProcessor


 logger = logging.get_logger(__name__)


-class GLPNFeatureExtractor(FeatureExtractionMixin, ImageFeatureExtractionMixin):
-    r"""
-    Constructs a GLPN feature extractor.
-
-    This feature extractor inherits from [`FeatureExtractionMixin`] which contains most of the main methods. Users
-    should refer to this superclass for more information regarding those methods.
-
-    Args:
-        do_resize (`bool`, *optional*, defaults to `True`):
-            Whether to resize the input based on certain `size_divisor`.
-        size_divisor (`int` or `Tuple(int)`, *optional*, defaults to 32):
-            Make sure the input is divisible by this value. Only has an effect if `do_resize` is set to `True`.
-        resample (`int`, *optional*, defaults to `PIL.Image.BILINEAR`):
-            An optional resampling filter. This can be one of `PIL.Image.NEAREST`, `PIL.Image.BOX`,
-            `PIL.Image.BILINEAR`, `PIL.Image.HAMMING`, `PIL.Image.BICUBIC` or `PIL.Image.LANCZOS`. Only has an effect
-            if `do_resize` is set to `True`.
-        do_rescale (`bool`, *optional*, defaults to `True`):
-            Whether or not to apply the scaling factor (to make pixel values floats between 0. and 1.).
-    """
-
-    model_input_names = ["pixel_values"]
-
-    def __init__(self, do_resize=True, size_divisor=32, resample=Image.BILINEAR, do_rescale=True, **kwargs):
-        super().__init__(**kwargs)
-        self.do_resize = do_resize
-        self.size_divisor = size_divisor
-        self.resample = resample
-        self.do_rescale = do_rescale
-
-    def _resize(self, image, size_divisor, resample):
-        if not isinstance(image, Image.Image):
-            image = self.to_pil_image(image)
-
-        width, height = image.size
-        new_h, new_w = height // size_divisor * size_divisor, width // size_divisor * size_divisor
-
-        image = self.resize(image, size=(new_w, new_h), resample=resample)
-
-        return image
-
-    def __call__(
-        self, images: ImageInput, return_tensors: Optional[Union[str, TensorType]] = None, **kwargs
-    ) -> BatchFeature:
-        """
-        Main method to prepare for the model one or several image(s).
-
-        <Tip warning={true}>
-
-        NumPy arrays and PyTorch tensors are converted to PIL images when resizing, so the most efficient is to pass
-        PIL images.
-
-        </Tip>
-
-        Args:
-            images (`PIL.Image.Image`, `np.ndarray`, `torch.Tensor`, `List[PIL.Image.Image]`, `List[np.ndarray]`, `List[torch.Tensor]`):
-                The image or batch of images to be prepared. Each image can be a PIL image, NumPy array or PyTorch
-                tensor. In case of a NumPy array/PyTorch tensor, each image should be of shape (C, H, W), where C is a
-                number of channels, H and W are image height and width.
-
-            return_tensors (`str` or [`~utils.TensorType`], *optional*, defaults to `'np'`):
-                If set, will return tensors of a particular framework. Acceptable values are:
-
-                - `'tf'`: Return TensorFlow `tf.constant` objects.
-                - `'pt'`: Return PyTorch `torch.Tensor` objects.
-                - `'np'`: Return NumPy `np.ndarray` objects.
-                - `'jax'`: Return JAX `jnp.ndarray` objects.
-
-        Returns:
-            [`BatchFeature`]: A [`BatchFeature`] with the following fields:
-
-            - **pixel_values** -- Pixel values to be fed to a model, of shape (batch_size, num_channels, height,
-              width).
-        """
-        # Input type checking for clearer error
-        valid_images = False
-
-        # Check that images has a valid type
-        if isinstance(images, (Image.Image, np.ndarray)) or is_torch_tensor(images):
-            valid_images = True
-        elif isinstance(images, (list, tuple)):
-            if len(images) == 0 or isinstance(images[0], (Image.Image, np.ndarray)) or is_torch_tensor(images[0]):
-                valid_images = True
-
-        if not valid_images:
-            raise ValueError(
-                "Images must of type `PIL.Image.Image`, `np.ndarray` or `torch.Tensor` (single example), "
-                "`List[PIL.Image.Image]`, `List[np.ndarray]` or `List[torch.Tensor]` (batch of examples)."
-            )
-
-        is_batched = bool(
-            isinstance(images, (list, tuple))
-            and (isinstance(images[0], (Image.Image, np.ndarray)) or is_torch_tensor(images[0]))
-        )
-
-        if not is_batched:
-            images = [images]
-
-        # transformations (resizing + rescaling)
-        if self.do_resize and self.size_divisor is not None:
-            images = [
-                self._resize(image=image, size_divisor=self.size_divisor, resample=self.resample) for image in images
-            ]
-        if self.do_rescale:
-            images = [self.to_numpy_array(image=image) for image in images]
-
-        # return as BatchFeature
-        data = {"pixel_values": images}
-        encoded_inputs = BatchFeature(data=data, tensor_type=return_tensors)
-
-        return encoded_inputs
+# Feature extractor for GLPN is being replaced by image processor
+GLPNFeatureExtractor = GLPNImageProcessor
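Because the old name is now an alias for the new class, code written against the feature extractor keeps working unchanged; a sketch, assuming a GLPN checkpoint such as "vinvino02/glpn-kitti" is available on the Hub:

from transformers import GLPNFeatureExtractor

feature_extractor = GLPNFeatureExtractor.from_pretrained("vinvino02/glpn-kitti")
print(type(feature_extractor).__name__)  # GLPNImageProcessor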
src/transformers/models/glpn/image_processing_glpn.py (new file, 186 lines)
@@ -0,0 +1,186 @@
+# coding=utf-8
+# Copyright 2022 The HuggingFace Inc. team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+"""Image processor class for GLPN."""
+
+from typing import List, Optional, Union
+
+import numpy as np
+import PIL.Image
+
+from transformers.utils.generic import TensorType
+
+from ...image_processing_utils import BaseImageProcessor, BatchFeature
+from ...image_transforms import rescale, resize, to_channel_dimension_format
+from ...image_utils import ChannelDimension, get_image_size, is_batched, to_numpy_array, valid_images
+from ...utils import logging
+
+
+logger = logging.get_logger(__name__)
+
+
+class GLPNImageProcessor(BaseImageProcessor):
+    r"""
+    Constructs a GLPN image processor.
+
+    Args:
+        do_resize (`bool`, *optional*, defaults to `True`):
+            Set the class default for the `do_resize` parameter. Controls whether to resize the image's (height,
+            width) dimensions, rounding them down to the closest multiple of `size_divisor`.
+        size_divisor (`int`, *optional*, defaults to 32):
+            Set the class default for the `size_divisor` parameter. When `do_resize` is `True`, images are resized so
+            their height and width are rounded down to the closest multiple of `size_divisor`.
+        resample (`PIL.Image` resampling filter, *optional*, defaults to `PIL.Image.BILINEAR`):
+            Set the class default for `resample`. Defines the resampling filter to use if resizing the image.
+        do_rescale (`bool`, *optional*, defaults to `True`):
+            Set the class default for the `do_rescale` parameter. Controls whether or not to apply the scaling factor
+            (to make pixel values floats between 0. and 1.).
+    """
+
+    model_input_names = ["pixel_values"]
+
+    def __init__(
+        self,
+        do_resize: bool = True,
+        size_divisor: int = 32,
+        resample=PIL.Image.BILINEAR,
+        do_rescale: bool = True,
+        **kwargs
+    ) -> None:
+        self.do_resize = do_resize
+        self.do_rescale = do_rescale
+        self.size_divisor = size_divisor
+        self.resample = resample
+        super().__init__(**kwargs)
+    def resize(
+        self, image: np.ndarray, size_divisor: int, resample, data_format: Optional[ChannelDimension] = None, **kwargs
+    ) -> np.ndarray:
+        """
+        Resize the image, rounding the (height, width) dimensions down to the closest multiple of size_divisor.
+
+        If the image is of dimension (3, 260, 170) and size_divisor is 32, the image will be resized to (3, 256, 160).
+
+        Args:
+            image (`np.ndarray`):
+                The image to resize.
+            size_divisor (`int`):
+                The image is resized so its height and width are rounded down to the closest multiple of
+                `size_divisor`.
+            resample:
+                `PIL.Image` resampling filter to use when resizing the image e.g. `PIL.Image.BILINEAR`.
+            data_format (`ChannelDimension`, *optional*):
+                The channel dimension format for the output image. If `None`, the channel dimension format of the
+                input image is used. Can be one of:
+                - `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
+                - `ChannelDimension.LAST`: image in (height, width, num_channels) format.
+
+        Returns:
+            `np.ndarray`: The resized image.
+        """
+        height, width = get_image_size(image)
+        # Rounds the height and width down to the closest multiple of size_divisor
+        new_h = height // size_divisor * size_divisor
+        new_w = width // size_divisor * size_divisor
+        image = resize(image, (new_h, new_w), resample=resample, data_format=data_format, **kwargs)
+        return image
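The rounding here is a plain floor-divide-and-multiply; the worked example from the docstring, as standalone arithmetic:

height, width, size_divisor = 260, 170, 32
new_h = height // size_divisor * size_divisor  # 8 * 32 = 256
new_w = width // size_divisor * size_divisor   # 5 * 32 = 160
print((new_h, new_w))  # (256, 160)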
+    def rescale(
+        self, image: np.ndarray, scale: float, data_format: Optional[ChannelDimension] = None, **kwargs
+    ) -> np.ndarray:
+        """
+        Rescale the image by the given scaling factor `scale`.
+
+        Args:
+            image (`np.ndarray`):
+                The image to rescale.
+            scale (`float`):
+                The scaling factor to rescale pixel values by.
+            data_format (`ChannelDimension`, *optional*):
+                The channel dimension format for the output image. If `None`, the channel dimension format of the
+                input image is used. Can be one of:
+                - `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
+                - `ChannelDimension.LAST`: image in (height, width, num_channels) format.
+
+        Returns:
+            `np.ndarray`: The rescaled image.
+        """
+        return rescale(image=image, scale=scale, data_format=data_format, **kwargs)
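`preprocess` below calls this with `scale=1 / 255`, mapping uint8 pixels into floats in [0, 1]; the arithmetic in isolation:

import numpy as np

pixels = np.array([0, 127, 255], dtype=np.uint8)
print(pixels * (1 / 255))  # [0.         0.49803922 1.        ]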
+    def preprocess(
+        self,
+        images: Union["PIL.Image.Image", TensorType, List["PIL.Image.Image"], List[TensorType]],
+        do_resize: Optional[bool] = None,
+        size_divisor: Optional[int] = None,
+        resample=None,
+        do_rescale: Optional[bool] = None,
+        return_tensors: Optional[Union[TensorType, str]] = None,
+        data_format: ChannelDimension = ChannelDimension.FIRST,
+        **kwargs
+    ) -> BatchFeature:
+        """
+        Preprocess the given images.
+
+        Args:
+            images (`PIL.Image.Image` or `TensorType` or `List[np.ndarray]` or `List[TensorType]`):
+                The image or images to preprocess.
+            do_resize (`bool`, *optional*, defaults to `self.do_resize`):
+                Whether to resize the input such that the (height, width) dimensions are a multiple of `size_divisor`.
+            size_divisor (`int`, *optional*, defaults to `self.size_divisor`):
+                When `do_resize` is `True`, images are resized so their height and width are rounded down to the
+                closest multiple of `size_divisor`.
+            resample (`PIL.Image` resampling filter, *optional*, defaults to `self.resample`):
+                `PIL.Image` resampling filter to use if resizing the image e.g. `PIL.Image.BILINEAR`. Only has an
+                effect if `do_resize` is set to `True`.
+            do_rescale (`bool`, *optional*, defaults to `self.do_rescale`):
+                Whether or not to apply the scaling factor (to make pixel values floats between 0. and 1.).
+            return_tensors (`str`, *optional*):
+                The type of tensors to return. Can be one of:
+                - `None`: Return a list of `np.ndarray`.
+                - `TensorType.TENSORFLOW` or `'tf'`: Return a batch of type `tf.Tensor`.
+                - `TensorType.PYTORCH` or `'pt'`: Return a batch of type `torch.Tensor`.
+                - `TensorType.NUMPY` or `'np'`: Return a batch of type `np.ndarray`.
+                - `TensorType.JAX` or `'jax'`: Return a batch of type `jax.numpy.ndarray`.
+            data_format (`ChannelDimension`, *optional*, defaults to `ChannelDimension.FIRST`):
+                The channel dimension format for the output image. Can be one of:
+                - `ChannelDimension.FIRST`: image in (num_channels, height, width) format.
+                - `ChannelDimension.LAST`: image in (height, width, num_channels) format.
+        """
+        do_resize = do_resize if do_resize is not None else self.do_resize
+        do_rescale = do_rescale if do_rescale is not None else self.do_rescale
+        size_divisor = size_divisor if size_divisor is not None else self.size_divisor
+        resample = resample if resample is not None else self.resample
+
+        if do_resize and size_divisor is None:
+            raise ValueError("size_divisor is required for resizing")
+
+        if not is_batched(images):
+            images = [images]
+
+        if not valid_images(images):
+            raise ValueError("Invalid image(s)")
+
+        # All transformations expect numpy arrays.
+        images = [to_numpy_array(img) for img in images]
+
+        if do_resize:
+            images = [self.resize(image, size_divisor=size_divisor, resample=resample) for image in images]
+
+        if do_rescale:
+            images = [self.rescale(image, scale=1 / 255) for image in images]
+
+        images = [to_channel_dimension_format(image, data_format) for image in images]
+
+        data = {"pixel_values": images}
+        return BatchFeature(data=data, tensor_type=return_tensors)
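An end-to-end sketch of the new processor (assuming the vision and torch backends are installed): a (3, 260, 170) image is rounded down to (3, 256, 160), rescaled, and batched.

import numpy as np
from transformers import GLPNImageProcessor

image_processor = GLPNImageProcessor()  # do_resize=True, size_divisor=32, do_rescale=True
image = np.random.randint(0, 256, (3, 260, 170), dtype=np.uint8)

inputs = image_processor.preprocess(image, return_tensors="pt")
print(inputs["pixel_values"].shape)  # torch.Size([1, 3, 256, 160])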
@@ -164,6 +164,7 @@ SAFE_WEIGHTS_NAME = "model.safetensors"
 SAFE_WEIGHTS_INDEX_NAME = "model.safetensors.index.json"
 CONFIG_NAME = "config.json"
 FEATURE_EXTRACTOR_NAME = "preprocessor_config.json"
+IMAGE_PROCESSOR_NAME = FEATURE_EXTRACTOR_NAME
 MODEL_CARD_NAME = "modelcard.json"

 SENTENCEPIECE_UNDERLINE = "▁"
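Aliasing the constant means image processors serialize to the same preprocessor_config.json that feature extractors already write, so existing Hub checkpoints stay loadable; a quick check of what the diff establishes:

from transformers.utils import FEATURE_EXTRACTOR_NAME, IMAGE_PROCESSOR_NAME

assert IMAGE_PROCESSOR_NAME == FEATURE_EXTRACTOR_NAME == "preprocessor_config.json"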
@@ -3,6 +3,25 @@
 from ..utils import DummyObject, requires_backends


+class ImageProcessorMixin(metaclass=DummyObject):
+    _backends = ["vision"]
+
+    def __init__(self, *args, **kwargs):
+        requires_backends(self, ["vision"])
+
+
+def rescale(*args, **kwargs):
+    requires_backends(rescale, ["vision"])
+
+
+def resize(*args, **kwargs):
+    requires_backends(resize, ["vision"])
+
+
+def to_pil_image(*args, **kwargs):
+    requires_backends(to_pil_image, ["vision"])
+
+
 class ImageFeatureExtractionMixin(metaclass=DummyObject):
     _backends = ["vision"]
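These dummies stand in for the real objects when Pillow is not installed. A simplified sketch of the mechanism (not the library's exact code, which routes through `requires_backends`): any attempt to use a vision-only class fails fast with an install hint.

class DummyObject(type):
    # Simplified: the real metaclass calls requires_backends, which raises an
    # ImportError naming the missing backend.
    def __call__(cls, *args, **kwargs):
        raise ImportError(f"{cls.__name__} requires the vision backend: pip install Pillow")

class ImageProcessorMixin(metaclass=DummyObject):
    _backends = ["vision"]

try:
    ImageProcessorMixin()
except ImportError as err:
    print(err)  # ImageProcessorMixin requires the vision backend: pip install Pillow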
tests/test_image_transforms.py (new file, 174 lines)
@@ -0,0 +1,174 @@
+# coding=utf-8
+# Copyright 2022 HuggingFace Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import unittest
+
+import numpy as np
+
+from parameterized import parameterized
+from transformers.testing_utils import require_flax, require_tf, require_torch, require_vision
+from transformers.utils.import_utils import is_flax_available, is_tf_available, is_torch_available, is_vision_available
+
+
+if is_torch_available():
+    import torch
+
+if is_tf_available():
+    import tensorflow as tf
+
+if is_flax_available():
+    import jax
+
+if is_vision_available():
+    import PIL.Image
+
+    from transformers.image_transforms import (
+        get_resize_output_image_size,
+        resize,
+        to_channel_dimension_format,
+        to_pil_image,
+    )
+
+
+def get_random_image(height, width, num_channels=3, channels_first=True):
+    shape = (num_channels, height, width) if channels_first else (height, width, num_channels)
+    random_array = np.random.randint(0, 256, shape, dtype=np.uint8)
+    return random_array
+@require_vision
+class ImageTransformsTester(unittest.TestCase):
+    @parameterized.expand(
+        [
+            ("numpy_float_channels_first", (3, 4, 5), np.float32),
+            ("numpy_float_channels_last", (4, 5, 3), np.float32),
+            ("numpy_int_channels_first", (3, 4, 5), np.int32),
+            ("numpy_uint_channels_first", (3, 4, 5), np.uint8),
+        ]
+    )
+    @require_vision
+    def test_to_pil_image(self, name, image_shape, dtype):
+        image = np.random.randint(0, 256, image_shape).astype(dtype)
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+
+    @require_tf
+    def test_to_pil_image_from_tensorflow(self):
+        # channels_first
+        image = tf.random.uniform((3, 4, 5))
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+
+        # channels_last
+        image = tf.random.uniform((4, 5, 3))
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+
+    @require_torch
+    def test_to_pil_image_from_torch(self):
+        # channels first
+        image = torch.rand((3, 4, 5))
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+
+        # channels last
+        image = torch.rand((4, 5, 3))
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+
+    @require_flax
+    def test_to_pil_image_from_jax(self):
+        key = jax.random.PRNGKey(0)
+        # channel first
+        image = jax.random.uniform(key, (3, 4, 5))
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+
+        # channel last
+        image = jax.random.uniform(key, (4, 5, 3))
+        pil_image = to_pil_image(image)
+        self.assertIsInstance(pil_image, PIL.Image.Image)
+        self.assertEqual(pil_image.size, (5, 4))
+    def test_to_channel_dimension_format(self):
+        # Test that function doesn't reorder if channel dim matches the input.
+        image = np.random.rand(3, 4, 5)
+        image = to_channel_dimension_format(image, "channels_first")
+        self.assertEqual(image.shape, (3, 4, 5))
+
+        image = np.random.rand(4, 5, 3)
+        image = to_channel_dimension_format(image, "channels_last")
+        self.assertEqual(image.shape, (4, 5, 3))
+
+        # Test that function reorders if channel dim doesn't match the input.
+        image = np.random.rand(3, 4, 5)
+        image = to_channel_dimension_format(image, "channels_last")
+        self.assertEqual(image.shape, (4, 5, 3))
+
+        image = np.random.rand(4, 5, 3)
+        image = to_channel_dimension_format(image, "channels_first")
+        self.assertEqual(image.shape, (3, 4, 5))
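Under the hood the conversion these assertions exercise is just an axis move; a minimal sketch of the channels-first to channels-last case (the library version also infers the input format before transposing):

import numpy as np

def to_channels_last(image: np.ndarray) -> np.ndarray:
    return np.transpose(image, (1, 2, 0))  # (C, H, W) -> (H, W, C)

assert to_channels_last(np.random.rand(3, 4, 5)).shape == (4, 5, 3)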
+    def test_get_resize_output_image_size(self):
+        image = np.random.randint(0, 256, (3, 224, 224))
+
+        # Test the output size defaults to (x, x) if an int is given.
+        self.assertEqual(get_resize_output_image_size(image, 10), (10, 10))
+        self.assertEqual(get_resize_output_image_size(image, [10]), (10, 10))
+        self.assertEqual(get_resize_output_image_size(image, (10,)), (10, 10))
+
+        # Test the output size is the same as the input if a two element tuple/list is given.
+        self.assertEqual(get_resize_output_image_size(image, (10, 20)), (10, 20))
+        self.assertEqual(get_resize_output_image_size(image, [10, 20]), (10, 20))
+        self.assertEqual(get_resize_output_image_size(image, (10, 20), default_to_square=True), (10, 20))
+        # To match pytorch behaviour, max_size is only relevant if size is an int
+        self.assertEqual(get_resize_output_image_size(image, (10, 20), max_size=5), (10, 20))
+
+        # Test output size = (int(size * height / width), size) if size is an int and height > width
+        image = np.random.randint(0, 256, (3, 50, 40))
+        self.assertEqual(get_resize_output_image_size(image, 20, default_to_square=False), (25, 20))
+
+        # Test output size = (size, int(size * width / height)) if size is an int and width <= height
+        image = np.random.randint(0, 256, (3, 40, 50))
+        self.assertEqual(get_resize_output_image_size(image, 20, default_to_square=False), (20, 25))
+
+        # Test size is resized if longer size > max_size
+        image = np.random.randint(0, 256, (3, 50, 40))
+        self.assertEqual(get_resize_output_image_size(image, 20, default_to_square=False, max_size=22), (22, 17))
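The max_size case is the least obvious assertion above; the arithmetic behind (22, 17), spelled out:

# A (50, 40) image with size=20: the short side (40) maps to 20 and the long
# side scales proportionally to 50 * 20 / 40 = 25, giving (25, 20). Since 25
# exceeds max_size=22, both sides shrink by 22/25: (22, int(20 * 22 / 25)).
print((50 * 20 // 40, 20))      # (25, 20)
print((22, int(20 * 22 / 25)))  # (22, 17)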
+    def test_resize(self):
+        image = np.random.randint(0, 256, (3, 224, 224))
+
+        # Check the channel order is the same by default
+        resized_image = resize(image, (30, 40))
+        self.assertIsInstance(resized_image, np.ndarray)
+        self.assertEqual(resized_image.shape, (3, 30, 40))
+
+        # Check channel order is changed if specified
+        resized_image = resize(image, (30, 40), data_format="channels_last")
+        self.assertIsInstance(resized_image, np.ndarray)
+        self.assertEqual(resized_image.shape, (30, 40, 3))
+
+        # Check PIL.Image.Image is returned if return_numpy=False
+        resized_image = resize(image, (30, 40), return_numpy=False)
+        self.assertIsInstance(resized_image, PIL.Image.Image)
+        # PIL size is in (width, height) order
+        self.assertEqual(resized_image.size, (40, 30))
@@ -17,8 +17,10 @@ import unittest

 import datasets
 import numpy as np
+import pytest

 from transformers import is_torch_available, is_vision_available
+from transformers.image_utils import ChannelDimension
 from transformers.testing_utils import require_torch, require_vision


@@ -29,7 +31,7 @@ if is_vision_available():
     import PIL.Image

     from transformers import ImageFeatureExtractionMixin
-    from transformers.image_utils import load_image
+    from transformers.image_utils import get_image_size, infer_channel_dimension_format, load_image


 def get_random_image(height, width):
@@ -485,3 +487,51 @@ class LoadImageTester(unittest.TestCase):
             img_arr_with_exif_transpose.shape,
             (500, 333, 3),
         )
+
+
+class UtilFunctionTester(unittest.TestCase):
+    def test_get_image_size(self):
+        # Test we can infer the size and channel dimension of an image.
+        image = np.random.randint(0, 256, (32, 64, 3))
+        self.assertEqual(get_image_size(image), (32, 64))
+
+        image = np.random.randint(0, 256, (3, 32, 64))
+        self.assertEqual(get_image_size(image), (32, 64))
+
+        # Test the channel dimension can be overridden
+        image = np.random.randint(0, 256, (3, 32, 64))
+        self.assertEqual(get_image_size(image, channel_dim=ChannelDimension.LAST), (3, 32))
+
+    def test_infer_channel_dimension(self):
+        # Test we fail with invalid input
+        with pytest.raises(ValueError):
+            infer_channel_dimension_format(np.random.randint(0, 256, (10, 10)))
+
+        with pytest.raises(ValueError):
+            infer_channel_dimension_format(np.random.randint(0, 256, (10, 10, 10, 10, 10)))
+
+        # Test we fail if neither the first nor the last dimension is of size 3 or 1
+        with pytest.raises(ValueError):
+            infer_channel_dimension_format(np.random.randint(0, 256, (10, 1, 50)))
+
+        # Test we correctly identify the channel dimension
+        image = np.random.randint(0, 256, (3, 4, 5))
+        inferred_dim = infer_channel_dimension_format(image)
+        self.assertEqual(inferred_dim, ChannelDimension.FIRST)
+
+        image = np.random.randint(0, 256, (1, 4, 5))
+        inferred_dim = infer_channel_dimension_format(image)
+        self.assertEqual(inferred_dim, ChannelDimension.FIRST)
+
+        image = np.random.randint(0, 256, (4, 5, 3))
+        inferred_dim = infer_channel_dimension_format(image)
+        self.assertEqual(inferred_dim, ChannelDimension.LAST)
+
+        image = np.random.randint(0, 256, (4, 5, 1))
+        inferred_dim = infer_channel_dimension_format(image)
+        self.assertEqual(inferred_dim, ChannelDimension.LAST)
+
+        # We can take a batched array of images and find the dimension
+        image = np.random.randint(0, 256, (1, 3, 4, 5))
+        inferred_dim = infer_channel_dimension_format(image)
+        self.assertEqual(inferred_dim, ChannelDimension.FIRST)
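One consequence of the checking order worth noting (a sketch, assuming this branch is installed): when both the first and the last dimension could plausibly be channels, channels-first wins because it is tested first.

import numpy as np
from transformers.image_utils import ChannelDimension, infer_channel_dimension_format

image = np.random.randint(0, 256, (3, 10, 3))  # ambiguous: both ends have size 3
assert infer_channel_dimension_format(image) == ChannelDimension.FIRST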
@@ -353,6 +353,7 @@ SPECIAL_MODULE_TO_TEST_MAP = {
     "feature_extraction_sequence_utils.py": "test_sequence_feature_extraction_common.py",
     "feature_extraction_utils.py": "test_feature_extraction_common.py",
     "file_utils.py": ["utils/test_file_utils.py", "utils/test_model_output.py"],
+    "image_transforms.py": "test_image_transforms.py",
     "utils/generic.py": ["utils/test_file_utils.py", "utils/test_model_output.py", "utils/test_generic.py"],
     "utils/hub.py": "utils/test_hub_utils.py",
     "modelcard.py": "utils/test_model_card.py",