From 19c8586e7c47f5d0dc9d79ec36541bb3bcd211ba Mon Sep 17 00:00:00 2001
From: PawelGorny
Date: Mon, 24 Jan 2022 20:18:13 +0100
Subject: [PATCH 1/2] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3fe77df..25d5bfa 100644
--- a/README.md
+++ b/README.md
@@ -73,7 +73,7 @@ Go to WifSolverCuda/ subfolder and execute _make all_. If your device does not s
 Performance
 -----------
-One's must modify number of blocks and number of threads in each block to find the ones which are the best for his card. Number of test performed by each thread also could have impact of global performance/latency.
+User should modify number of blocks and number of threads in each block to find values which are the best for his card. Number of test performed by each thread also could have impact of global performance/latency.
 Test card: RTX3060 (eGPU!) with 224 BLOCKS & 640 BLOCK_THREADS (program default values) checks around 10000 MKey/s for compressed address with missing characters in the middle (collision with checksum) and around 1300-1400 Mkey/s for other cases; other results (using default values of blocks, threads and steps per thread):

From 3c56de816c4501fad9dfc389fb817054c668af47 Mon Sep 17 00:00:00 2001
From: PawelGorny
Date: Mon, 24 Jan 2022 20:39:49 +0100
Subject: [PATCH 2/2] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 25d5bfa..eb94d03 100644
--- a/README.md
+++ b/README.md
@@ -73,7 +73,7 @@ Go to WifSolverCuda/ subfolder and execute _make all_. If your device does not s
 Performance
 -----------
-User should modify number of blocks and number of threads in each block to find values which are the best for his card. Number of test performed by each thread also could have impact of global performance/latency.
+User should modify number of blocks and number of threads in each block to find values which are the best for his card. Number of tests performed by each thread also could have impact of global performance/latency.
 Test card: RTX3060 (eGPU!) with 224 BLOCKS & 640 BLOCK_THREADS (program default values) checks around 10000 MKey/s for compressed address with missing characters in the middle (collision with checksum) and around 1300-1400 Mkey/s for other cases; other results (using default values of blocks, threads and steps per thread):