Loosen unit test atol/rtol tolerance to eliminate UT flakiness (#3664)

Summary:
Pull Request resolved: #3664

X-link: facebookresearch/FBGEMM#739

Further local testing showed that the fast_gemv unit test fails in roughly 1 out of 10 iterations because a small fraction (0.1% to 0.2%) of the outputs are noisy. Loosen the tolerance to reduce flakiness.

Example of a failed test:
```
Mismatched elements: 1 / 1280 (0.1%)
Greatest absolute difference: 0.001953125 at index (0, 1064) (up to 0.001 allowed)
Greatest relative difference: 0.007415771484375 at index (0, 1064) (up to 0.001 allowed)
```

Reviewed By: q10

Differential Revision: D69208318

fbshipit-source-id: b2f820b4e7ab5f5e8fcfc59e398f9b3328a963d6
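To illustrate why the reported mismatch fails under the old tolerance and passes under the new one, here is a minimal sketch of the closeness rule documented for `torch.testing.assert_close`, `|actual - expected| <= atol + rtol * |expected|`, written in plain Python. The helper `is_close` is hypothetical, and the reference value `135 / 512` is an assumption inferred from the ratio of the reported absolute and relative differences; the actual tensor values are not in the log. The reported absolute difference, 0.001953125, equals 2^-9, which is the spacing between adjacent bfloat16 values in [0.25, 0.5), so the failure is consistent with a single-step rounding difference.

```python
def is_close(actual: float, expected: float, atol: float, rtol: float) -> bool:
    """Hypothetical helper mirroring the documented assert_close rule."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Assumed reference value (not in the log): chosen so that the absolute
# difference of 2**-9 gives a relative difference near the reported 0.0074.
expected = 135 / 512
actual = expected + 2**-9  # one bfloat16 step above the reference

# Old tolerance: allowed error is 0.001 + 0.001 * |expected|, below 2**-9.
old_ok = is_close(actual, expected, atol=1.0e-3, rtol=1.0e-3)

# New tolerance: allowed error is 0.009 + 0.009 * |expected|, above 2**-9.
new_ok = is_close(actual, expected, atol=9.0e-3, rtol=9.0e-3)

print(f"old tolerance passes: {old_ok}")
print(f"new tolerance passes: {new_ok}")
```

Under these assumptions, the old setting leaves no room for a one-step bfloat16 difference at this magnitude, while the new setting allows several such steps.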
pytorch · Feb 6, 2025 · 9a343a0
1 parent f98ab29
commit 9a343a0
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/fbgemm_gpu/experimental/gen_ai/test/quantize/quantize_test.py b/fbgemm_gpu/experimental/gen_ai/test/quantize/quantize_test.py
@@ -1131,7 +1131,7 @@ def test_bf16_gemv(self) -> None:
             z = torch.ops.fbgemm.bf16_fast_gemv(x, w)
             z_ref = (x @ w.T).to(torch.bfloat16).to("cuda")
 
-            torch.testing.assert_close(z, z_ref, atol=1.0e-3, rtol=1.0e-3)
+            torch.testing.assert_close(z, z_ref, atol=9.0e-3, rtol=9.0e-3)
 
     @unittest.skipIf(
         torch.version.hip, "Skip on AMD: cuda quantize op is yet supported."