Patrick von Platen
31d06729f4
Update README.md
2021-07-20 14:19:37 +02:00
Patrick von Platen
c6b9095cb2
Update README.md
2021-07-17 19:22:26 +02:00
Patrick von Platen
b4b562d834
[Wav2Vec2] Padded vectors should not allowed to be sampled ( #12764 )
...
* fix_torch_device_generate_test
* remove @
* finish
* correct script
* correct script
2021-07-16 19:07:08 +02:00
Suraj Patil
8ef3f36561
fix typos ( #12757 )
2021-07-16 16:44:59 +05:30
Patrick von Platen
a76dd7ee82
Update README.md
2021-07-16 00:16:30 +01:00
Patrick von Platen
2e9fb13fb1
[Wav2Vec2] Correctly pad mask indices for PreTraining ( #12748 )
...
* fix_torch_device_generate_test
* remove @
* start adding tests
* correct wav2vec2 pretraining
* up
* up
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-15 21:40:25 +01:00
Suraj Patil
44f5b260fe
flax model parallel training ( #12590 )
...
* update scripts
* add copyright
* add logging
* cleanup
* add z loss
* add readme
* shard description
* update readme
2021-07-14 22:55:44 +05:30
Omar Sanseviero
c523b241c2
Update timeline for Flax event evaluation
2021-07-12 21:24:58 +02:00
Eduardo Gonzalez Ponferrada
2dd9440d08
Point to the right file for hybrid CLIP ( #12599 )
2021-07-12 12:16:22 +05:30
Omar Sanseviero
8fe836af5a
Add Flax sprint project evaluation section ( #12592 )
2021-07-09 08:52:30 +02:00
Suraj Patil
d7e156bd1a
fix loading clip vision model ( #12566 )
2021-07-07 22:50:27 +05:30
Patrick von Platen
7d321b7689
[Flax] Allow retraining from save checkpoint ( #12559 )
...
* fix_torch_device_generate_test
* remove @
* finish
2021-07-07 19:13:43 +05:30
SaulLu
09af5bdea3
Replace nn.Moudle
by nn.Module
( #12541 )
2021-07-06 11:31:45 -04:00
Patrick von Platen
f42a0abf4b
Update README.md
2021-07-06 15:14:48 +01:00
Suzana Ilić
029b9d3f40
Update README ( #12540 )
2021-07-06 16:12:16 +02:00
Suraj Patil
f5b0c1ecf0
[Flax] Fix hybrid clip ( #12519 )
...
* fix saving and loading
* update readme
2021-07-06 11:12:47 +05:30
Patrick von Platen
7d6285a921
[Wav2Vec2] Flax - Adapt wav2vec2 script ( #12520 )
...
* fix_torch_device_generate_test
* remove @
* adapt flax pretrain script
2021-07-05 23:49:47 +01:00
Patrick von Platen
9b90810558
[Flax] Dataset streaming example ( #12470 )
...
* fix_torch_device_generate_test
* remove @
* upload
* finish dataset streaming
* adapt readme
* finish
* up
* up
* up
* up
* Apply suggestions from code review
* finish
* make style
* make style2
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
2021-07-05 15:13:10 +01:00
Navjot
eceb1042c1
flax.linen.apply takes state as the first param, followed by the input ( #12510 )
2021-07-05 19:33:14 +05:30
Akmal
e799e0f1ed
[Flax] Fix wav2vec2 pretrain arguments ( #12498 )
2021-07-05 13:35:20 +01:00
Suraj Patil
23ab0b6980
[examples/flax] clip style image-text training example ( #12491 )
...
* clip style example
* fix post init
* add requirements
* update readme, few small fixes
2021-07-05 13:26:44 +05:30
Lysandre Debut
89a8739f0c
Add Repository
import to the FLAX example script ( #12501 )
2021-07-05 03:51:11 -04:00
Patrick von Platen
2df63282e0
Update README.md
2021-07-04 13:16:29 +01:00
Omar Sanseviero
a76eebfc80
Add guide on how to build demos for the Flax sprint ( #12468 )
2021-07-02 20:35:17 +02:00
Patrick von Platen
b21905e03d
Update README.md
2021-07-02 14:12:47 +01:00
Patrick von Platen
d24a523130
Update README.md
2021-07-02 13:41:14 +01:00
Patrick von Platen
e3fce2f868
Update README.md
...
Thanks a lot @BirgerMoell
2021-07-02 12:12:54 +01:00
Matthew LeMay
b4ecc6bef2
fixed typo in flax-projects readme ( #12466 )
2021-07-02 12:27:39 +05:30
Patrick von Platen
7f87bfc910
Add TPU README ( #12463 )
...
* Add TPU README
* Apply suggestions from code review
* Update examples/research_projects/jax-projects/README.md
* Update examples/research_projects/jax-projects/README.md
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
2021-07-01 17:11:54 +01:00
Patrick von Platen
1457839fc5
Update README.md
2021-07-01 15:52:11 +01:00
Suzana Ilić
c18af5d40c
Added talk details ( #12465 )
2021-07-01 16:19:23 +02:00
Patrick von Platen
b655f16d4e
[Flax community event] How to use hub during training ( #12447 )
...
* fix_torch_device_generate_test
* remove @
* upload
* finish doc
* Apply suggestions from code review
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* finish
Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2021-07-01 11:41:22 +01:00
Patrick von Platen
0d1f67e651
[Flax] Add wav2vec2 ( #12271 )
...
* fix_torch_device_generate_test
* remove @
* start flax wav2vec2
* save intermediate
* forward pass has correct shape
* add weight norm
* add files
* finish ctc
* make style
* finish gumbel quantizer
* correct docstrings
* correct some more files
* fix vit
* finish quality
* correct tests
* correct docstring
* correct tests
* start wav2vec2 pretraining script
* save intermediate
* start pretraining script
* finalize pretraining script
* finish
* finish
* small typo
* finish
* correct
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* make style
* push
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-06-30 18:44:23 +01:00
Suraj Patil
3f36a2c064
[JAX/Flax readme] add philosophy doc ( #12419 )
...
* add philosophy doc
* fix typos
* update doc
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* address Patricks suggestions
* add a training example and fix typos
* jit the training step
* jit train step
* fix example code
* typo
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-06-30 21:40:12 +05:30
Suzana Ilić
1ad1c4a864
Add to talks section ( #12442 )
2021-06-30 16:58:03 +02:00
Suzana Ilić
90d69456eb
Added to talks section ( #12433 )
...
Added one more confirmed speaker, zoom links and gcal event links
2021-06-30 13:14:11 +02:00
Suzana Ilić
b440b8d1ce
Added talks ( #12415 )
2021-06-29 16:01:16 +01:00
Shamane Siri
5257818e68
minor fixes in original RAG training ( #12395 )
2021-06-29 13:39:48 +01:00
Patrick von Platen
31c3e7e75b
[Flax] Add T5 pretraining script ( #12355 )
...
* fix_torch_device_generate_test
* remove @
* add length computatan
* finish masking
* finish
* upload
* fix some bugs
* finish
* fix dependency table
* correct tensorboard
* Apply suggestions from code review
* correct processing
* slight change init
* correct some more mistakes
* apply suggestions
* improve readme
* fix indent
* Apply suggestions from code review
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
* correct tokenizer
* finish
* finish
* finish
* finish
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>
2021-06-28 20:11:29 +01:00
Patrick von Platen
27b6ac4611
Update README.md
2021-06-28 17:22:10 +01:00
Patrick von Platen
89b57a6669
[Flax community event] Add more description to readme ( #12398 )
...
* fix_torch_device_generate_test
* remove @
* boom boom
* correct typos
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestions from code review
Co-authored-by: Suzana Ilić <io.suzanai@gmail.com>
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Suzana Ilić <io.suzanai@gmail.com>
2021-06-28 17:18:42 +01:00
Stas Bekman
4a872caef4
remove extra white space from log format ( #12360 )
2021-06-25 13:20:14 -07:00
Vasudev Gupta
332a245861
Add FlaxBigBird QuestionAnswering script ( #12233 )
...
* port bigbird script
* adapt script a bit
* change location
* adapt more
* save progress
* init commit
* style
* dataset script tested
* readme add
2021-06-25 18:05:48 +01:00
Patrick von Platen
aa550c4a11
Update README.md
2021-06-25 11:55:51 +01:00
Marc van Zee
f2c4ce7e33
Add flax/jax quickstart ( #12342 )
2021-06-24 17:04:18 +01:00
Patrick von Platen
44739c8180
[Flax/JAX] Add how to propose projects markdown ( #12311 )
...
* fix_torch_device_generate_test
* remove @
* finish
* make style
2021-06-23 14:50:35 +01:00
Patrick von Platen
64029abe4c
[Flax] Main doc for event orga ( #12305 )
...
* fix_torch_device_generate_test
* remove @
* push
* finish
* some typos
* add more info on communication
* add suggestions
2021-06-22 18:02:52 +01:00
Vishal Burman
b53bc55ba9
Fix for making student ProphetNet for Seq2Seq Distillation ( #12130 )
...
* make_student.py: fix to make student ProphetNet
* reformat
2021-06-21 09:36:44 -04:00
Stas Bekman
88e84186e5
[style] consistent nn. and nn.functional: part 4 examples
( #12156 )
...
* consistent nn. and nn.functional: p4 examples
* restore
2021-06-14 12:28:24 -07:00
Stas Bekman
61e191987d
rm require_version_examples ( #12088 )
2021-06-09 11:02:52 -07:00
Anton Lozhkov
d472bd7b18
Wav2Vec2 Pretraining ( #11306 )
...
* Working quantizer forward
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Working quantizer forward
* Clean up unused model parts, test reproducibility
* Remove custom outputs from the shared ones
* correct conversion
* correct bug
* add first pretrain script
* save intermediate
* static shapes
* save intermediate
* finish first pretrain script version
* more refactor
* remove wanddb
* refactor more
* improve test
* correct perplexity compute bug
* finish model implementation
* add to docs
* finish docs
* finish pretraining script
* finish pretraining script
* remove wandb
* finish PR for merge
* finish config
* finish
* make deepspeed work
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* apply suggestions
* fix flaky test
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-09 18:40:56 +01:00
Stas Bekman
d14e0af274
sync LayerDrop for Wav2Vec2Encoder + tests ( #12076 )
2021-06-09 13:21:03 +01:00
Stas Bekman
11d86d3de4
[Deepspeed Wav2vec2] integration ( #11638 )
...
* wip
* wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044
* cleanup
* workaround
* working 5/8 modes
* solve fp32 distributed zero3
* style
* sync
* sync
* rework
* deprecation
* cleanup
* https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged
* clean up
* add a guide
* more prose
* more prose
* fix
* more prose
* sub_group_size was too big
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refactor
* bug fix
* make the true check explicit
* new deepspeed release
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-06-08 12:32:03 -07:00
Mario Šaško
f5eec0d8e9
Replace legacy tensor.Tensor with torch.tensor/torch.empty ( #12027 )
...
* Replace legacy torch.Tensor constructor with torch.{tensor, empty}
* Remove torch.Tensor in examples
2021-06-08 13:58:38 +01:00
Shamane Siri
e33085d648
updated the original RAG implementation to be compatible with latest Pytorch-Lightning ( #11806 )
...
* updated the original RAG implementation to be compatible with the latest PL version
* updated the requirements.txt file
* execute make style
* code quality test
* code quality
* conflix resolved in requirement.txt
* code quality
* changed the MyDDP class name to CustomDDP
2021-06-08 13:42:49 +01:00
dependabot[bot]
6db3a87de2
Bump urllib3 from 1.25.8 to 1.26.5 in /examples/research_projects/lxmert ( #11983 )
...
Bumps [urllib3](https://github.com/urllib3/urllib3 ) from 1.25.8 to 1.26.5.
- [Release notes](https://github.com/urllib3/urllib3/releases )
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst )
- [Commits](https://github.com/urllib3/urllib3/compare/1.25.8...1.26.5 )
---
updated-dependencies:
- dependency-name: urllib3
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-02 03:40:20 -04:00
Shamane Siri
9ec0f01b6c
RAG-2nd2end-revamp ( #11893 )
...
* initial
* code quality test
* code quality
* added test functions in test_modeling_rag.py and test_retrieval_rag.py to test end2end retreiver
* minor change in test_modeling_rag
* fixed tests
* Update examples/research_projects/rag-end2end-retriever/README.md
typo corrected as suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update examples/research_projects/rag-end2end-retriever/finetune_rag.py
type change suggested by lhoestq
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* Update src/transformers/models/rag/retrieval_rag.py
Adding this change as mentioned by lhoestq.
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
* completed the minor changes suggested by the reviewers
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
2021-06-01 07:32:26 +01:00
Philip May
77f4c46b50
remove defaults to None if optional ( #11703 )
2021-05-12 09:11:10 -04:00
Quentin Lhoest
1a0b41781d
Update requirements.txt ( #11634 )
2021-05-10 11:19:52 +05:30
Tommy Chiang
7e406f4a65
[Examples] Fix invalid links after reorg ( #11650 )
2021-05-10 11:16:48 +05:30
Manuel Romero
58c789e3d2
Update README.md ( #11489 )
...
Add link to code
2021-04-30 04:29:59 -04:00
Jaimeen Ahn
0661abc545
Variable Correction for Consistency in Distillation Example ( #11444 )
...
As the error comes from the inconsistency of variable meaning number of gpus in parser and its actual usage in the train.py script, 'gpus' and 'n_gpu' respectively, the correction makes the example work
2021-04-26 13:30:48 -04:00
Patrick von Platen
32dbb2d954
make style ( #11442 )
2021-04-26 13:50:34 +02:00
Sudharsan S T
f25444cb22
Close open files to suppress ResourceWarning ( #11240 )
...
Co-authored-by: Sudharsan Thirumalai <sudharsan.t@sprinklr.com>
2021-04-14 10:31:04 -04:00
Nithin Holla
653076ca30
Save the Wav2Vec2 processor before training starts ( #10910 )
...
Co-authored-by: nithin19 <nithin@amberscript.com>
2021-04-14 14:52:06 +03:00
Stas Bekman
c9035e4537
fix: The 'warn' method is deprecated ( #11105 )
...
* The 'warn' method is deprecated
* fix test
2021-04-07 09:20:06 -04:00
Stas Bekman
3d39226a51
s|Pretrained|PreTrained| ( #11048 )
2021-04-04 18:08:42 -07:00
versis
335c0ca35c
fixed typo: logging instead of logger ( #11025 )
2021-04-02 09:22:22 -04:00
Yih-Dar
e031162a6b
fix md file to avoid evaluation crash ( #10962 )
2021-03-30 21:26:22 +03:00
Stas Bekman
05c966f24b
[vulnerability] dep fix ( #10954 )
...
Fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pygments/open
@LysandreJik
2021-03-29 17:25:47 -04:00
Stas Bekman
3c27d246e5
[vulnerability] fix dependency ( #10914 )
...
this PR fixes https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/PyYAML/open
2021-03-26 09:06:11 -04:00
Stas Bekman
8fb4671811
[vulnerability] in example deps fix ( #10817 )
...
Takes care of:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/jinja2/open
@LysandreJik
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-22 09:05:24 -04:00
dependabot[bot]
dbfe379514
Bump jinja2 from 2.11.2 to 2.11.3 in /examples/research_projects/lxmert ( #10818 )
...
Bumps [jinja2](https://github.com/pallets/jinja ) from 2.11.2 to 2.11.3.
- [Release notes](https://github.com/pallets/jinja/releases )
- [Changelog](https://github.com/pallets/jinja/blob/master/CHANGES.rst )
- [Commits](https://github.com/pallets/jinja/compare/2.11.2...2.11.3 )
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-03-22 08:54:50 -04:00
Qiushi Pan
29904a967b
Update FINE_TUNE_XLSR_WAV2VEC2.md ( #10849 )
...
Fix typo.
2021-03-22 07:58:59 -04:00
Patrick von Platen
0f226f78ce
push ( #10846 )
2021-03-22 10:32:21 +03:00
Suraj Patil
82b8d8c7b0
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-21 22:47:09 +05:30
Patrick von Platen
af6125ffdb
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-21 12:31:33 +03:00
Patrick von Platen
5aaf6e1460
small improvements for wav2vec2 info script ( #10829 )
2021-03-21 11:41:44 +03:00
Suraj Patil
68b55885ed
add doc for Local machine ( #10828 )
2021-03-21 13:25:34 +05:30
Julien Chaumond
1438c487df
wav2vec doc tweaks ( #10808 )
...
* wording/typos tweaks
* Make model upload instructions simpler
2021-03-19 12:48:54 -04:00
Patrick von Platen
b9570a813c
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 19:45:28 +03:00
Patrick von Platen
e8968bd03a
[XLSR-Wav2Vec2 Info doc] Add a couple of lines ( #10806 )
...
* finish
* fix
* fix
* fix
* fix
2021-03-19 12:52:54 +03:00
Stas Bekman
427ea3fecb
addressing vulnerability report in research project deps ( #10802 )
...
Following up on a security alert:
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/Pillow/open
2021-03-18 22:02:10 -04:00
Patrick von Platen
2ae678229f
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:29:20 +03:00
Patrick von Platen
68a3215949
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:27:40 +03:00
Patrick von Platen
03df3fbcb4
Update FINE_TUNE_XLSR_WAV2VEC2.md
2021-03-19 00:26:49 +03:00
Patrick von Platen
e84adbed40
Add XLSR-Wav2Vec2 Fine-Tuning README.md ( #10786 )
...
* upload
* upload fine-tuning script
* improve
* adapt
* Apply suggestions from code review
* correct
* upload
* finalize
* remove @
* correct typos
2021-03-19 00:22:43 +03:00
Suraj Patil
5f19c07a70
add run_common_voice script ( #10767 )
...
* add initial script
* finish script
* add shell script example
* accept chars_to_ignor as cl arg
* align the script with other example scripts
* add torchaudio dep
2021-03-18 17:21:16 +05:30
Mohamed El-Geish
af8afdc88d
wav2vec2: support datasets other than LibriSpeech ( #10581 )
...
* wav2vec2: support datasets other than LibriSpeech
* Formatting run_asr.py to pass code quality test
* bundled orthography options and added verbose logs
* fixing a typo in timit fine-tuning script
* update comment for clarity
* resize_lm_head and load custom vocab from file
* adding a max_duration_in_seconds filter
* do not assign `duration_filter` lambda, use a def
* log untransliterated text as well
* fix base model for arabic
* fix duration filter when target_sr is not set
* drop duration_in_seconds when unneeded
* script for wav2vec2-large-lv60-timit-asr
* fix for "tha" in arabic corpus (huggingface#10581)
* adding more options to work with common_voice
* PR feedback (huggingface#10581)
* small README change
2021-03-18 10:20:26 +03:00
Joe Davison
966ba081c9
zero-shot pipeline multi_class -> multi_label ( #10727 )
2021-03-15 16:02:46 -06:00
Stas Bekman
f284089ec4
[examples tests on multigpu] resolving require_torch_non_multi_gpu_but_fix_me ( #10561 )
...
* batch 1
* this is tpu
* deebert attempt
* the rest
2021-03-08 11:11:40 -08:00
Patrick von Platen
395ffcd757
fix run seq2seq ( #10547 )
2021-03-05 18:17:12 +03:00
Patrick von Platen
0234de8418
Add Fine-Tuning for Wav2Vec2 ( #10145 )
...
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push confg for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-01 12:13:17 +03:00
Joe Davison
cbadb5243c
Zero shot distillation script cuda patch ( #10284 )
2021-02-19 14:06:57 -05:00
Joe Davison
c6fe17557e
Script for distilling zero-shot classifier to more efficient student ( #10244 )
...
* add zero-shot distillation script
* readme wordsmithing
* clean up code
* add multi-gpu teacher inference
plus tidying up more code
* add use_fast_tokenizer arg
* update results in readme
* more readme wordsmithing
* style
* Add handle to readme
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix code block
* add error+docs about distributed & tpu
* add @sgugger format requests
* xla -> tpu
* support fp16 for teacher preds
* no checkpoint by default
* add demo colab link
* add model sharing prompt + model link
* correct resulting acc of example
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-02-18 17:08:45 -05:00
Lysandre Debut
0d8e554d42
Line endings should be LF across repo and not CRLF ( #10119 )
2021-02-10 10:50:00 -05:00
Stas Bekman
d55e10beab
[research proj] [lxmert] rm bleach dependency ( #9970 )
...
Looks like a vulnerability and it's not really used anywhere in the code, so just as well remove it completely from deps.
https://github.com/huggingface/transformers/security/dependabot/examples/research_projects/lxmert/requirements.txt/bleach/open
2021-02-03 05:24:40 -05:00
wlhgtc
1682804ebd
Fit chinese wwm to new datasets ( #9887 )
...
* MOD: fit chinese wwm to new datasets
* MOD: move wwm to new folder
* MOD: formate code
* Styling
* MOD add param and recover trainer
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
2021-02-01 03:37:59 -05:00
Sylvain Gugger
3ec40299c1
Remove nested lxmert ( #9440 )
2021-01-07 04:10:41 -05:00
Patrick von Platen
eef66035a2
[PyTorch Bart] Split Bart into different models ( #9343 )
...
* first try
* remove old template
* finish bart
* finish mbart
* delete unnecessary line
* init pegasus
* save intermediate
* correct pegasus
* finish pegasus
* remove cookie cutter leftover
* add marian
* finish blenderbot
* replace in file
* correctly split blenderbot
* delete "old" folder
* correct "add statement"
* adapt config for tf comp
* correct configs for tf
* remove ipdb
* fix more stuff
* fix mbart
* push pegasus fix
* fix mbart
* more fixes
* fix research projects code
* finish docs for bart, mbart, and marian
* delete unnecessary file
* correct attn typo
* correct configs
* remove pegasus for seq class
* correct peg docs
* correct peg docs
* finish configs
* further improve docs
* add copied from statements to mbart
* fix copied from in mbart
* add copy statements to marian
* add copied from to marian
* add pegasus copied from
* finish pegasus
* finish copied from
* Apply suggestions from code review
* make style
* backward comp blenderbot
* apply lysandres and sylvains suggestions
* apply suggestions
* push last fixes
* fix docs
* fix tok tests
* fix imports code style
* fix doc
2021-01-05 22:00:05 +01:00