Mirror of https://github.com/huggingface/transformers.git (synced 2025-08-03 03:31:05 +06:00)
Parent: eb79b55bf3
Commit: 8017a59091
@@ -225,7 +225,7 @@ For users, a rule of thumb is:
 - **Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the
   only way to go.**
-- If you are latency constrained (live product doing inference), don't batch
+- If you are latency constrained (live product doing inference), don't batch.
 - If you are using CPU, don't batch.
 - If you are using throughput (you want to run your model on a bunch of static data), on GPU, then:
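The "measure, measure, and keep measuring" advice in the diff above can be sketched as a minimal throughput harness. This is an illustrative sketch, not transformers code: `run_model` is a hypothetical stand-in for your actual pipeline call, and the batch sizes are arbitrary.

```python
import time

def run_model(batch):
    # Hypothetical stand-in for a real model call; swap in your own
    # pipeline invocation (e.g. a transformers pipeline) when measuring.
    return [len(x) for x in batch]

def throughput(batch_size, items, runs=3):
    """Return items/second for a given batch size (best of `runs` passes)."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Walk the data in chunks of `batch_size`, timing the whole pass.
        for i in range(0, len(items), batch_size):
            run_model(items[i:i + batch_size])
        best = min(best, time.perf_counter() - start)
    return len(items) / best

items = ["some input text"] * 256
for bs in (1, 8, 64):
    print(f"batch_size={bs}: {throughput(bs, items):.0f} items/s")
```

The point of the harness is the comparison, not the absolute numbers: run it on your own hardware with your own load, since (as the doc says) real numbers are the only way to go.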