Commit Graph

98 Commits

Author SHA1 Message Date
Matt
d128f2ffab
Stop requiring Torch for our TF examples! (#21997)
* Stop requiring Torch for our TF examples!

* Slight tweak to logging in the example itself
2023-03-07 15:54:10 +00:00
Matt
5d8efc79db
Add TF contrastive image text finetuning example (#21939)
* Initial commit

* stash commit

* Add model checkpointing and pushing

* Fix model name inference

* Update README

* Update README

* Remove a couple of Torch references

* Update copyright date

* make fixup

* Update PushToHubCallback args!

* Remove the torch summary

* Add strategy.scope
2023-03-06 16:57:40 +00:00
Matt
1d3a1cc44b
Add check for different embedding types in examples (#21881)
* Add check for different embedding types in examples

* Correctly update summarization example
2023-03-01 16:57:06 +00:00
Aaron Gokaslan
5e8c8eb5ba
Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
Sylvain Gugger
6f79d26442
Update quality tooling for formatting (#21480)
* Result of black 23.1

* Update target to Python 3.7

* Switch flake8 to ruff

* Configure isort

* Configure isort

* Apply isort with line limit

* Put the right black version

* adapt black in check copies

* Fix copies
2023-02-06 18:10:56 -05:00
amyeroberts
e5db7051a8
Add TF image classification example script (#19956)
* TF image classification script

* Update requirements

* Fix up

* Add tests

* Update test fetcher
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix directory path

* Adding `zero-shot-object-detection` pipeline doctest. (#20274)

* Adding `zero-shot-object-detection` pipeline doctest.

* Remove nested_simplify.

* Add generate kwargs to `AutomaticSpeechRecognitionPipeline` (#20952)

* Add generate kwargs to AutomaticSpeechRecognitionPipeline

* Add test for generation kwargs

* Trigger CI

* Data collator returns np

* Update feature extractor -> image processor

* Bug fixes - updates to reflect changes in API

* Update flags to match PT & run faster

* Update instructions - Maria's comment

* Update examples/tensorflow/image-classification/README.md

* Remove slow decorator

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: bofeng huang <bofenghuang7@gmail.com>
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
2023-02-01 19:09:36 +00:00
Matt
071529bd54
Use return_tensors="np" instead of "tf" (#21266)
Return NP instead of TF tensors for our data loading pipeline
2023-01-24 13:37:49 +00:00
Sylvain Gugger
7119bb052a
v4.27.0.dev0 2023-01-23 16:52:35 -05:00
Roy Hvaara
35a7052b61
[NumPy] Remove references to deprecated NumPy type aliases (#21022)
[NumPy] Remove references to deprecated NumPy type aliases.

This change replaces references to a number of deprecated NumPy type aliases (np.bool, np.int, np.float, np.complex, np.object, np.str) with their recommended replacement (bool, int, float, complex, object, str).

NumPy 1.24 drops the deprecated aliases, so we must remove uses before updating NumPy.

Co-authored-by: Peter Hawkins <phawkins@google.com>

Co-authored-by: Peter Hawkins <phawkins@google.com>
2023-01-05 13:02:10 -05:00
Sylvain Gugger
60d1f31bb0
v4.26.0.dev0 2022-12-01 16:19:33 -05:00
Sylvain Gugger
a3f7458066
Pin to the right version... 2022-11-18 07:12:55 -05:00
Sylvain Gugger
06886d5a68
Only resize embeddings when necessary (#20043)
* Only resize embeddings when necessary

* Add comment
2022-11-03 12:05:04 -04:00
Sylvain Gugger
c3a93d8d82
v4.25.0.dev0 2022-10-31 21:48:40 -04:00
Lysandre
10100979ed Dev version 2022-10-10 17:25:40 -04:00
Matt
83dc6377d0
Reduce LR for TF MLM example test (#19156) 2022-09-22 08:51:27 -04:00
Lysandre
16913b3c92 Dev version 2022-09-14 14:58:20 -04:00
Matt
6eb51450fa
TF Examples Rewrite (#18451)
* Finished QA example

* Dodge a merge conflict

* Update text classification and LM examples

* Update NER example

* New Keras metrics WIP, fix NER example

* Update NER example

* Update MC, summarization and translation examples

* Add XLA warnings when shapes are variable

* Make sure batch_size is consistently scaled by num_replicas

* Add PushToHubCallback to all models

* Add docs links for KerasMetricCallback

* Add docs links for prepare_tf_dataset and jit_compile

* Correct inferred model names

* Don't assume the dataset has 'lang'

* Don't assume the dataset has 'lang'

* Write metrics in text classification

* Add 'framework' to TrainingArguments and TFTrainingArguments

* Export metrics in all examples and add tests

* Fix training args for Flax

* Update command line args for translation test

* make fixup

* Fix accidentally running other tests in fp16

* Remove do_train/do_eval from run_clm.py

* Remove do_train/do_eval from run_mlm.py

* Add tensorflow tests to circleci

* Fix circleci

* Update examples/tensorflow/language-modeling/run_mlm.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/test_tensorflow_examples.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/translation/run_translation.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Update examples/tensorflow/token-classification/run_ner.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix save path for tests

* Fix some model card kwargs

* Explain the magical -1000

* Actually enable tests this time

* Skip text classification PR until we fix shape inference

* make fixup

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2022-08-10 16:49:51 +01:00
Julien Chaumond
9129fd0377
transformers-cli login => huggingface-cli login (#18490)
* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`
2022-08-06 09:42:55 +02:00
Julien Chaumond
8d1f9039d0
Just re-reading the whole doc every couple of months 😬 (#18489)
* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task
2022-08-06 09:38:55 +02:00
Sylvain Gugger
941d233153
Fix ROUGE add example check and update README (#18398)
* Fix ROUGE add example check and update README

* Stay consistent in values
2022-08-01 11:14:49 -04:00
Sylvain Gugger
986526a0e4
Replace as_target context managers by direct calls (#18325)
* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py

Co-authored-by: amyeroberts <amy@huggingface.co>

* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
2022-07-29 08:09:09 -04:00
Vijay S Kalmath
a2586795e5
Migrate metric to Evaluate library for tensorflow examples (#18327)
* Migrate metric to Evaluate library in tf examples

Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.

Fix for #18306

* Migrate metric to Evaluate library in tf examples

Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.

Fix for #18306

* Migrate `metric` to Evaluate for all tf examples

Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.
2022-07-28 14:24:27 -04:00
Lysandre
c89a592e87 Dev version 2022-07-27 17:13:57 +02:00
John Giorgi
fde22c75a1
Add summarization name mapping for MultiNews (#18117)
* Add summarization name mapping for MultiNews

* Add summarization name mapping for MultiNews
2022-07-13 08:19:20 -04:00
Sylvain Gugger
7c6ec195ad v4.21.0.dev0 2022-06-16 12:20:53 -04:00
Sylvain Gugger
3cab90279f
Add examples telemetry (#17552)
* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow
2022-06-07 11:57:52 -04:00
Sylvain Gugger
afe5d42d8d
Black preview (#17217)
* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black
2022-05-12 16:25:55 -04:00
Lysandre Debut
5294fa12ee Dev version 2022-05-12 11:04:23 -04:00
Leonid Boytsov
c82e017aa9
Misc. fixes for Pytorch QA examples: (#16958)
1. Fixes evaluation errors popping up when you train/eval on squad v2 (one was newly encountered and one that was previously reported Running SQuAD 1.0 sample command raises IndexError #15401 but not completely fixed).
2. Removes boolean arguments that don't use store_true. Please, don't use these: *ANY non-empty string is being converted to True in this case and this clearly is not the desired behavior (and it creates a LOT of confusion).
3. All no-trainer test scripts are now saving metric values in the same way (with the right prefix eval_), which is consistent with the trainer-based versions.
4. Adds forgotten model.eval() in the no-trainer versions. This improved some results, but not everything (see the discussion in the end). Please, see the F1 scores and the discussion below.
2022-04-27 08:51:39 -04:00
Wonjae Kim
b74a955325
fix rum_clm.py seeking text column name twice (#16624) 2022-04-19 14:38:25 +01:00
Lysandre Debut
a180efe7fd Dev version 2022-04-06 11:08:12 -04:00
Karim Foda
24a85cca61
Add use_auth to load_datasets for private datasets to PT and TF examples (#16521)
* fix formatting and remove use_auth

* Add use_auth_token to Flax examples
2022-04-04 10:27:45 -04:00
Stas Bekman
a73281e3e4
[examples] max samples can't be bigger than the len of dataset (#16501)
* [examples] max samples can't be bigger than then len of dataset

* do tf and flax
2022-03-30 12:33:16 -07:00
Sylvain Gugger
088c1880b7
Big file_utils cleanup (#16396)
* Big file_utils cleanup

* This one still needs to be treated separately
2022-03-25 07:25:20 -04:00
Sylvain Gugger
4975002df5
Reorganize file utils (#16264)
* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wront move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit
2022-03-23 10:26:33 -04:00
Lysandre Debut
eca77f4719
Updates the default branch from master to main (#16326)
* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 03:46:59 -04:00
Joao Gante
e7f34ccd4f
Swag example: Update doc format (#16014) 2022-03-09 13:25:34 +00:00
Joao Gante
62d847602a
Update TF multiple choice example (#15868) 2022-03-08 13:16:34 +00:00
Sylvain Gugger
79d28e80b6 v4.18.0.dev.0 2022-03-03 10:19:58 -05:00
Joao Gante
05c237ea94
Update TF QA example (#15870) 2022-03-02 10:38:13 +00:00
Joao Gante
3f2e636850
Update TF LM examples (#15855) 2022-03-01 14:12:58 +00:00
Joao Gante
3956b133b6
TF text classification examples (#15704)
* Working example with to_tf_dataset

* updated text_classification

* more comments
2022-02-21 17:17:59 +00:00
Sylvain Gugger
d0b5ed110a
Harder check for IndexErrors in QA scripts (#15438)
* Harder check for IndexErrors in QA scripts

* Make test stronger
2022-02-01 15:49:13 -05:00
Lysandre
eab338104d Docs for version v4.16.0 2022-01-27 13:11:51 -05:00
Lysandre
f87db5e412 Release: v4.16.0 2022-01-27 13:06:33 -05:00
Russell Klopfer
27b819b0e3
use block_size instead of max_seq_length in tf run_clm example (#15036)
* use block_size instead of max_seq_length

* fixup

* remove pad_to_block_size

Co-authored-by: Russell Klopfer <russell@kloper.us>
2022-01-12 08:57:00 -05:00
Patrick von Platen
fa39ff9fc4 Docs for v4.16.0dev0 2021-12-22 20:39:44 +01:00
Patrick von Platen
05fa1a7ac1 Release: v4.15.0 2021-12-22 18:43:15 +01:00
Lysandre
7c9c41f43c Docs for v4.14.0 2021-12-15 18:29:53 +01:00
Lysandre
960d8cb41d Release: v4.14.0 2021-12-15 18:20:35 +01:00