Pull requests: pytorch/FBGEMM
Pull requests list
Add optional zero_start_index_M argument to triton fp8 rowwise quantization
cla signed
fb-exported
#3628
opened Jan 28, 2025 by
jwfromm
finish #1808 cherry-pick, adjust interface
cla signed
fb-exported
#3627
opened Jan 28, 2025 by
coconutruben
Fix calling numel on symbolic shapes issue
cla signed
fb-exported
#3621
opened Jan 27, 2025 by
Microve
Re-land D67407935 (Optimized backward pass for ROCm devices, pt 2)
ciflow/rocm
cla signed
fb-exported
module: rocm
#3619
opened Jan 27, 2025 by
q10
Performance Optimization: Optimized TileShape Configuration for f8
cla signed
#3617
opened Jan 27, 2025 by
MatrixAssembler
Fix handling of dynamic FP8 grouped gemm on Nvidia
cla signed
fb-exported
#3616
opened Jan 26, 2025 by
jwfromm
Updating split_table_batched_embeddings_ops_training.py
cla signed
fb-exported
#3613
opened Jan 24, 2025 by
basilwong
Replace runners prefix amz2023. (#2895)
cla signed
fb-exported
module: rocm
#3612
opened Jan 24, 2025 by
q10
Port oss f16_fast_gemv into fbcode
cla signed
fb-exported
#3610
opened Jan 23, 2025 by
YUNQIUGUO
Fix autodeps for torch/custom_class.h and use it in kv_tensor_wrapper_cpu (#677)
cla signed
fb-exported
#3596
opened Jan 22, 2025 by
r-barnes
Performance Optimization: Optimized TileShape Configuration for bf16 and Mixed Formats
cla signed
#3591
opened Jan 20, 2025 by
MatrixAssembler
Support INT4 Dequant onto GPU for Seq INT TBE look up
cla signed
fb-exported
#3584
opened Jan 17, 2025 by
faran928
Unifying TBE API using List (Backend)
cla signed
fb-exported
#3563
opened Jan 11, 2025 by
spcyppt
Refactor FP8 grouped GEMM with dynamic and static versions
cla signed
fb-exported
#3561
opened Jan 10, 2025 by
jiawenliu64
Support FP8 grouped GEMM with rowwise scaling
cla signed
fb-exported
#3560
opened Jan 10, 2025 by
jiawenliu64
Switch dynamic FP8 grouped gemm to accept tensor inputs
cla signed
fb-exported
#3552
opened Jan 6, 2025 by
jwfromm
env variable to select rounding mode
cla signed
fb-exported
#3515
opened Dec 19, 2024 by
hhyuanf