mirror of https://github.com/huggingface/transformers.git
synced 2025-08-02 11:11:05 +06:00
parent eb79b55bf3
commit 8017a59091
@@ -225,7 +225,7 @@ For users, a rule of thumb is:
 - **Measure performance on your load, with your hardware. Measure, measure, and keep measuring. Real numbers are the
   only way to go.**
-- If you are latency constrained (live product doing inference), don't batch
+- If you are latency constrained (live product doing inference), don't batch.
 - If you are using CPU, don't batch.
 - If you are using throughput (you want to run your model on a bunch of static data), on GPU, then:
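The throughput case in the last bullet can be sketched with a small helper that chunks static data into fixed-size batches before feeding it to a model. The `batched` helper and its parameter names are hypothetical, not part of the transformers API; it only illustrates the batching pattern the doc recommends.

```python
from typing import Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield fixed-size chunks of `items` so a GPU model can process
    full batches instead of one example at a time.

    The final chunk may be smaller than `batch_size` when the data
    does not divide evenly.
    """
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


# Example: 7 static inputs, batch size 3 -> batches of 3, 3, 1.
texts = [f"example {n}" for n in range(7)]
batches = list(batched(texts, batch_size=3))
```

In practice, transformers pipelines expose a similar knob directly: calling a pipeline on a list or dataset with `batch_size=...` batches the forward passes for you, and the right value is whatever your own measurements on your hardware say it is.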