Mirror of https://github.com/huggingface/transformers.git, synced 2025-07-13 17:48:22 +06:00
* add tests for linear shape behavior
* fix linear shape behavior

  Ended up adding the reshape at the end, after `f8f8bf16_rowwise`, because adding it directly after `quantize_fp8_per_row` caused `f8f8bf16_rowwise` to drop the seq_len dimension (i.e., (17, 23, 1014) -> (17, 1024)).

* save shape up front + comment
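The pattern described in the commit can be sketched as follows. This is a minimal illustration, not the actual transformers implementation: `torch.matmul` stands in for the fbgemm `quantize_fp8_per_row` + `f8f8bf16_rowwise` pair (which operates on 2D row-major inputs), and the function name `linear_keep_shape` is hypothetical. The key idea is saving the input shape up front, flattening leading dimensions to 2D for the row-wise kernel, and reshaping only at the very end.

```python
import torch

def linear_keep_shape(x: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    # Save the full output shape up front, before any flattening,
    # so the leading (batch, seq_len) dimensions can be restored later.
    output_shape = (*x.shape[:-1], weight.shape[0])

    # The row-wise kernel expects 2D input: collapse all leading dims.
    x2d = x.reshape(-1, x.shape[-1])

    # Stand-in for the quantize_fp8_per_row + f8f8bf16_rowwise pair,
    # which consumes and produces 2D tensors.
    out2d = torch.matmul(x2d, weight.t())

    # Reshape at the end, after the kernel, so seq_len is preserved.
    return out2d.reshape(output_shape)

x = torch.randn(17, 23, 1024)
w = torch.randn(512, 1024)
out = linear_keep_shape(x, w)
print(out.shape)
```

Reshaping after the kernel (rather than immediately after quantization) avoids the collapsed 2D shape leaking into the output.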
Files:

* __init__.py
* test_fbgemm_fp8.py