Comparison with other methods

To keep the comparison as fair as possible and to reduce overfitting in perplexity evaluation on Wikitext or C4, we used 512 samples from NeelNanda/pile-10k to calibrate all methods unless explicitly stated otherwise. For wikitext2/ptb-new/c4-new ppl, we follow the GPTQ code and set the sequence length to 2048. For lm-eval wikitext ppl, we use lm-eval. The lm-eval-harness git commit used for the results below is 008fc2a23245c40384f2312718433eeb1e0f87a9, and all evaluations were run on QDQ (quantize-dequantize) fake-quantized models.
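As a rough illustration of the fixed-window perplexity protocol mentioned above, the sketch below cuts a token stream into non-overlapping windows of 2048 tokens and combines per-window losses into one perplexity. The function names are hypothetical, and the per-window mean negative log-likelihood is assumed to come from a forward pass of the model, which is not shown. This is a sketch in the style of GPTQ-like evaluation scripts, not the exact code used for the numbers here.

```python
import math


def split_into_windows(token_ids, seqlen=2048):
    """Cut a token stream into non-overlapping windows of exactly `seqlen` tokens.

    Trailing tokens that do not fill a whole window are dropped.
    """
    n_windows = len(token_ids) // seqlen
    return [token_ids[i * seqlen:(i + 1) * seqlen] for i in range(n_windows)]


def perplexity(mean_nll_per_window, seqlen=2048):
    """Combine per-window mean negative log-likelihoods into one perplexity.

    Each entry is assumed to be the mean NLL (natural log) over one window.
    """
    if not mean_nll_per_window:
        raise ValueError("need at least one window")
    total_nll = sum(nll * seqlen for nll in mean_nll_per_window)
    total_tokens = len(mean_nll_per_window) * seqlen
    return math.exp(total_nll / total_tokens)
```

Because every window has the same length, the result equals the exponential of the average per-window loss; the explicit token weighting is kept so the formula stays correct if the window length is changed.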

Due to memory constraints, we kept the original sequence length of 512 for AWQ, while GPTQ, Omniquant, and our approach use a sequence length of 2048. HQQ is a data-free method and needs no calibration.

For GPTQ, we enabled act-order and true-sequential, and also enabled static groups when group_size != -1. The notation GPTQ* indicates that we adjusted the random seed or the data preprocessing to work around a non-positive-definite Hessian matrix or other failures.

For Omniquant, we adhere to the official settings, which include running for 20 epochs and disabling 'let'. We ran calibration with sample sizes of 512 and 128, and also with a sample size of 512 at a batch size of 4. Using 512 samples typically gives comparable or slightly higher accuracy for models <=13B, so we report the results obtained with 512 samples. For 70B models, because of a NaN loss issue and to reduce the tuning cost, we used 128 samples for calibration.

For AutoRound, we used the default settings: iters=200 with enable_quanted_input and enable_minmax_tuning turned on. Both lr and minmax_lr are set to 1/iters, i.e. 5e-3.
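The learning-rate rule above is simple arithmetic. The helper below is a hypothetical illustration of it (the function name is not part of AutoRound), showing how the value 5e-3 follows from 200 iterations.

```python
def autoround_default_lrs(iters=200):
    """Return (lr, minmax_lr), both set to 1/iters as described in the text.

    Hypothetical helper for illustration; not an AutoRound API.
    """
    if iters <= 0:
        raise ValueError("iters must be positive")
    lr = 1.0 / iters
    return lr, lr


lr, minmax_lr = autoround_default_lrs(200)
print(lr, minmax_lr)  # 0.005 0.005
```

Under the same rule, 1000 iterations would give 1e-3. Several rows in the "Other data" tables set minmax_lr explicitly instead.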

With these configurations, the tuning costs of GPTQ, AWQ, and our approach are similar, while HQQ is much faster and Omniquant is noticeably slower.
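The tables that follow label each setting as WxGy, which we read as x-bit weights with a group size of y (G-1 meaning a group size of -1, i.e. one group per row), and the RTN baseline is plain round-to-nearest. The sketch below shows what a quantize-dequantize (QDQ) pass of that kind might look like on a single row of weights. It is a simplified, pure-Python illustration with a hypothetical function name: it uses the group minimum as a floating-point offset rather than an integer zero point, and it is not the code used to produce the results.

```python
def rtn_qdq(weights, bits=4, group_size=128):
    """Asymmetric round-to-nearest quantize-dequantize on a 1-D list of weights.

    group_size=-1 treats the whole row as a single group.
    Simplified illustration; real implementations differ in details.
    """
    if group_size == -1:
        group_size = len(weights)
    qmax = 2 ** bits - 1  # highest integer code, e.g. 15 for 4 bits
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to 1.0 for constant groups to avoid dividing by zero.
        scale = (hi - lo) / qmax or 1.0
        for w in group:
            # Quantize to an integer code clamped to [0, qmax].
            q = min(max(round((w - lo) / scale), 0), qmax)
            # Dequantize back to a float on the quantization grid.
            out.append(q * scale + lo)
    return out
```

Each group can take at most 2^bits distinct values after the pass, which is why smaller groups (for example G32 instead of G128) generally preserve more of the original weight range at the cost of storing more scales.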


1. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W4G-1.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
|  | RTN | 55.92 | 66.10 | 59.01 | 71.35 | 80.14 | 24.85 | 29.00 | 79.17 | 57.76 | 77.95 | 45.99 | 58.84 |
|  | GPTQ | 58.22 | 73.45 | 59.47 | 74.03 | 80.20 | 26.93 | 31.00 | 81.50 | 64.98 | 78.24 | 47.01 | 61.37 |
|  | AWQ | 57.20 | 71.45 | 59.21 | 73.64 | 79.43 | 25.34 | 30.40 | 82.69 | 68.95 | 79.25 | 47.44 | 61.36 |
|  | HQQ | 52.65 | 66.58 | 59.09 | 70.56 | 79.60 | 23.13 | 27.80 | 80.03 | 59.57 | 77.02 | 46.33 | 58.40 |
|  | Omniquant | 57.52 | 70.00 | 60.27 | 72.93 | 79.87 | 23.99 | 30.80 | 81.53 | 63.90 | 78.54 | 46.42 | 60.52 |
|  | Ours | 59.52 | 73.76 | 60.75 | 73.32 | 80.09 | 27.17 | 33.00 | 82.02 | 66.07 | 80.47 | 49.49 | 62.33 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
|  | RTN | 36.87 | 67.96 | 55.63 | 68.51 | 76.82 | 26.19 | 30.60 | 73.64 | 58.84 | 74.07 | 41.30 | 55.49 |
|  | GPTQ | 39.66 | 71.92 | 55.89 | 68.03 | 77.58 | 25.09 | 30.20 | 76.67 | 62.09 | 75.55 | 41.72 | 56.76 |
|  | AWQ | 40.24 | 71.20 | 56.26 | 69.61 | 76.93 | 26.07 | 32.60 | 77.31 | 63.18 | 75.00 | 41.30 | 57.25 |
|  | HQQ | 28.94 | 43.96 | 48.43 | 59.43 | 71.82 | 23.62 | 24.80 | 52.11 | 53.79 | 64.90 | 34.73 | 46.05 |
|  | Omniquant | 39.82 | 71.45 | 55.76 | 67.56 | 76.88 | 25.09 | 30.80 | 76.15 | 64.98 | 74.12 | 40.19 | 56.62 |
|  | Ours | 39.97 | 71.63 | 56.52 | 68.43 | 77.91 | 25.70 | 31.60 | 76.18 | 65.70 | 76.01 | 42.58 | 57.48 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
|  | RTN | 50.37 | 74.35 | 59.12 | 71.98 | 79.00 | 24.85 | 33.00 | 81.77 | 64.98 | 79.08 | 46.59 | 60.46 |
|  | GPTQ | 51.14 | 75.37 | 59.14 | 72.06 | 78.02 | 25.34 | 32.20 | 80.46 | 62.09 | 77.36 | 44.54 | 59.79 |
|  | AWQ | 51.16 | 75.98 | 59.51 | 70.80 | 78.40 | 25.21 | 34.60 | 78.26 | 66.79 | 79.12 | 46.59 | 60.58 |
|  | HQQ | 35.92 | 49.54 | 46.27 | 58.01 | 72.47 | 23.99 | 19.80 | 61.77 | 51.26 | 62.84 | 33.19 | 46.82 |
|  | Omniquant | 51.01 | 75.45 | 59.48 | 71.74 | 78.94 | 24.60 | 33.20 | 77.37 | 66.07 | 78.75 | 46.76 | 60.31 |
|  | Ours | 52.30 | 75.96 | 59.79 | 72.30 | 78.84 | 25.58 | 34.00 | 80.15 | 66.79 | 79.38 | 48.12 | 61.20 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
|  | RTN | 63.85 | 77.62 | 63.38 | 76.72 | 81.50 | 28.89 | 37.80 | 83.39 | 68.23 | 81.99 | 54.10 | 65.22 |
|  | GPTQ | 64.81 | 79.27 | 63.86 | 76.87 | 81.61 | 31.46 | 36.40 | 82.23 | 70.04 | 82.53 | 54.18 | 65.75 |
|  | AWQ | 65.08 | 78.77 | 64.14 | 77.11 | 81.45 | 30.48 | 37.20 | 83.64 | 72.92 | 82.49 | 55.80 | 66.28 |
|  | HQQ | 56.45 | 66.74 | 53.67 | 73.32 | 76.50 | 25.58 | 33.40 | 67.95 | 61.73 | 72.90 | 43.94 | 57.47 |
|  | Omniquant | 64.40 | 79.20 | 63.91 | 76.95 | 81.94 | 31.70 | 37.60 | 82.35 | 69.31 | 82.24 | 54.18 | 65.80 |
|  | Ours | 65.43 | 79.55 | 64.47 | 78.06 | 82.10 | 30.60 | 36.40 | 83.91 | 71.12 | 82.53 | 54.78 | 66.27 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
|  | RTN | 31.34 | 70.02 | 55.35 | 69.77 | 77.69 | 20.32 | 32.60 | 73.43 | 59.57 | 74.45 | 41.30 | 55.08 |
|  | GPTQ | 29.06 | 71.08 | 55.11 | 70.01 | 77.37 | 20.93 | 32.20 | 72.69 | 63.90 | 74.66 | 41.64 | 55.33 |
|  | AWQ | 33.33 | 70.81 | 55.98 | 68.27 | 78.07 | 21.18 | 31.40 | 74.37 | 64.62 | 74.03 | 41.21 | 55.75 |
|  | Ours | 31.80 | 71.96 | 56.57 | 69.53 | 79.00 | 21.91 | 33.20 | 75.72 | 66.79 | 74.83 | 43.09 | 56.76 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
|  | RTN | 39.57 | 70.93 | 58.82 | 71.98 | 78.02 | 24.85 | 32.00 | 78.20 | 66.43 | 75.67 | 44.62 | 58.28 |
|  | GPTQ* | 40.01 | 74.67 | 58.92 | 71.03 | 78.45 | 26.44 | 33.60 | 77.09 | 68.23 | 76.85 | 44.97 | 59.12 |
|  | AWQ | 44.56 | 74.13 | 59.13 | 71.27 | 78.94 | 25.83 | 33.20 | 76.42 | 66.06 | 76.89 | 46.67 | 59.37 |
|  | Ours | 43.94 | 75.82 | 59.51 | 72.22 | 78.78 | 25.70 | 32.80 | 77.34 | 67.51 | 76.47 | 46.67 | 59.71 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
|  | RTN | 53.05 | 75.65 | 62.08 | 74.82 | 80.09 | 25.95 | 35.80 | 81.87 | 63.54 | 79.76 | 50.26 | 62.08 |
|  | GPTQ | 53.04 | 77.22 | 61.95 | 73.80 | 80.69 | 27.29 | 34.60 | 81.07 | 66.06 | 78.79 | 49.15 | 62.15 |
|  | AWQ | 54.13 | 76.77 | 62.78 | 74.11 | 81.07 | 27.78 | 35.00 | 82.66 | 67.15 | 79.97 | 51.71 | 63.01 |
|  | Ours | 54.72 | 77.84 | 62.91 | 75.06 | 80.69 | 26.68 | 36.40 | 82.60 | 66.79 | 80.13 | 52.13 | 63.27 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
|  | RTN | 58.74 | 76.42 | 64.12 | 76.72 | 81.01 | 29.25 | 38.60 | 84.13 | 70.40 | 80.72 | 51.88 | 64.73 |
|  | GPTQ* | 59.10 | 78.17 | 63.78 | 75.69 | 81.34 | 28.27 | 38.40 | 83.76 | 68.59 | 80.98 | 51.62 | 64.52 |
|  | AWQ | 58.86 | 77.37 | 63.86 | 76.56 | 80.85 | 28.27 | 35.20 | 83.94 | 71.48 | 78.75 | 50.94 | 64.19 |
|  | Ours | 59.21 | 79.16 | 64.37 | 76.64 | 81.34 | 26.81 | 37.80 | 84.40 | 69.68 | 80.98 | 51.79 | 64.74 |
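The Avg. column appears to be the unweighted mean of the 11 task accuracies, rounded to two decimals. The short check below reproduces it for the two Mistral-7B rows (FP16 and Ours) of the W4G-1 table above; the helper name is illustrative.

```python
def average_accuracy(task_scores):
    """Unweighted mean of per-task accuracies, rounded to two decimals."""
    return round(sum(task_scores) / len(task_scores), 2)


# Mistral-7B rows copied from the W4G-1 table (11 tasks, Mmlu through ARC-c).
mistral_fp16 = [61.35, 75.68, 61.27, 74.03, 80.79, 28.03,
                32.80, 83.67, 67.51, 80.81, 50.34]
mistral_ours = [59.52, 73.76, 60.75, 73.32, 80.09, 27.17,
                33.00, 82.02, 66.07, 80.47, 49.49]

print(average_accuracy(mistral_fp16))  # 63.3, reported as 63.30
print(average_accuracy(mistral_ours))  # 62.33
```

The same function can be applied to any other row to spot transcription errors in the tables.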

2. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W4G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
|  | RTN | 59.72 | 74.44 | 61.06 | 73.40 | 80.36 | 27.17 | 32.60 | 83.67 | 64.62 | 79.63 | 49.32 | 62.36 |
|  | GPTQ | 59.17 | 74.52 | 60.37 | 74.90 | 80.58 | 26.68 | 31.00 | 83.33 | 67.15 | 79.67 | 48.12 | 62.32 |
|  | AWQ | 60.20 | 75.14 | 60.43 | 73.80 | 80.03 | 27.05 | 30.40 | 84.01 | 62.09 | 80.39 | 50.26 | 62.16 |
|  | HQQ | 60.02 | 75.41 | 60.79 | 74.11 | 81.01 | 27.29 | 32.60 | 82.97 | 66.79 | 79.92 | 49.32 | 62.75 |
|  | Omniquant | 59.71 | 73.94 | 60.62 | 73.56 | 80.36 | 26.68 | 30.80 | 83.58 | 65.70 | 80.01 | 49.06 | 62.18 |
|  | Ours | 60.47 | 75.59 | 61.03 | 73.88 | 80.09 | 27.54 | 31.60 | 83.09 | 66.07 | 79.97 | 49.49 | 62.62 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
|  | RTN | 40.91 | 72.44 | 56.91 | 68.35 | 77.58 | 24.97 | 31.20 | 77.61 | 56.32 | 76.26 | 43.52 | 56.92 |
|  | GPTQ | 42.57 | 73.28 | 56.36 | 69.06 | 78.02 | 25.34 | 30.20 | 75.72 | 57.04 | 75.63 | 42.15 | 56.85 |
|  | AWQ | 41.00 | 72.60 | 56.40 | 68.98 | 77.31 | 25.70 | 31.60 | 78.75 | 58.48 | 76.14 | 43.86 | 57.35 |
|  | HQQ | 41.79 | 73.20 | 56.21 | 68.43 | 77.58 | 25.83 | 31.60 | 76.09 | 62.82 | 75.84 | 42.15 | 57.41 |
|  | Omniquant | 41.72 | 73.04 | 56.59 | 68.98 | 77.91 | 24.97 | 30.80 | 75.81 | 61.37 | 75.76 | 43.34 | 57.30 |
|  | Ours | 41.82 | 72.75 | 56.79 | 68.67 | 78.13 | 25.58 | 30.20 | 77.49 | 63.54 | 75.76 | 42.58 | 57.57 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
|  | RTN | 52.10 | 76.27 | 59.77 | 72.14 | 78.62 | 24.72 | 34.20 | 80.24 | 62.09 | 79.00 | 47.95 | 60.65 |
|  | GPTQ | 52.66 | 76.54 | 59.76 | 72.14 | 78.35 | 25.70 | 34.00 | 79.33 | 66.43 | 78.58 | 47.53 | 61.00 |
|  | AWQ | 52.39 | 76.89 | 59.97 | 73.24 | 79.00 | 25.21 | 32.60 | 80.40 | 63.54 | 79.04 | 47.70 | 60.91 |
|  | HQQ | 52.09 | 75.74 | 59.46 | 72.14 | 78.45 | 24.36 | 33.60 | 79.17 | 66.06 | 79.00 | 47.01 | 60.65 |
|  | Omniquant | 52.01 | 76.17 | 59.53 | 72.06 | 78.35 | 23.87 | 33.40 | 80.80 | 66.07 | 78.37 | 47.18 | 60.51 |
|  | Ours | 51.92 | 76.46 | 59.87 | 71.67 | 79.00 | 25.83 | 35.20 | 79.60 | 63.54 | 79.25 | 47.01 | 60.85 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
|  | RTN | 64.91 | 79.06 | 63.93 | 78.14 | 81.66 | 30.11 | 37.00 | 83.61 | 68.59 | 82.79 | 54.78 | 65.87 |
|  | GPTQ | 65.63 | 79.22 | 64.45 | 78.22 | 81.88 | 31.09 | 37.00 | 84.19 | 69.31 | 82.79 | 54.61 | 66.22 |
|  | AWQ | 65.79 | 79.76 | 64.48 | 77.58 | 82.32 | 30.72 | 38.00 | 83.06 | 68.95 | 82.70 | 55.12 | 66.23 |
|  | HQQ | 65.34 | 79.14 | 64.56 | 77.35 | 81.56 | 30.48 | 37.20 | 83.67 | 69.31 | 82.83 | 55.20 | 66.06 |
|  | Omniquant | 65.30 | 79.39 | 64.52 | 77.51 | 81.88 | 30.60 | 37.40 | 83.39 | 68.23 | 82.91 | 55.12 | 66.02 |
|  | Ours | 65.65 | 79.49 | 64.60 | 78.30 | 82.05 | 31.58 | 37.40 | 84.83 | 68.95 | 82.87 | 54.52 | 66.39 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
|  | RTN | 32.63 | 72.31 | 56.26 | 70.01 | 78.45 | 20.93 | 33.60 | 74.74 | 64.26 | 74.71 | 42.75 | 56.42 |
|  | GPTQ | 31.16 | 72.40 | 55.85 | 70.09 | 78.13 | 22.28 | 30.40 | 74.65 | 64.26 | 74.20 | 40.19 | 55.78 |
|  | AWQ | 33.42 | 72.95 | 56.30 | 68.75 | 77.97 | 21.42 | 32.80 | 74.89 | 62.09 | 75.00 | 41.21 | 56.07 |
|  | Ours | 32.15 | 72.85 | 56.45 | 70.17 | 78.51 | 22.28 | 32.80 | 75.14 | 67.87 | 75.13 | 41.89 | 56.84 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
|  | RTN | 42.71 | 75.26 | 59.30 | 72.53 | 79.54 | 25.95 | 32.60 | 76.76 | 65.34 | 76.98 | 45.82 | 59.34 |
|  | GPTQ* | 42.65 | 75.41 | 59.51 | 72.93 | 79.33 | 24.97 | 32.40 | 77.49 | 68.23 | 76.89 | 45.56 | 59.58 |
|  | AWQ | 42.66 | 75.76 | 59.50 | 72.77 | 78.89 | 26.56 | 33.60 | 77.46 | 68.59 | 76.94 | 45.48 | 59.84 |
|  | Ours | 42.27 | 76.17 | 59.53 | 73.56 | 79.33 | 25.70 | 32.80 | 78.20 | 70.04 | 76.94 | 46.25 | 60.07 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
|  | RTN | 54.24 | 77.02 | 62.90 | 74.35 | 80.52 | 27.29 | 34.20 | 81.96 | 67.15 | 80.89 | 52.05 | 62.96 |
|  | GPTQ | 54.20 | 77.41 | 62.79 | 75.14 | 80.41 | 27.54 | 34.60 | 81.93 | 67.51 | 80.05 | 50.51 | 62.92 |
|  | AWQ | 55.14 | 77.49 | 63.08 | 75.77 | 80.52 | 27.29 | 34.20 | 82.87 | 67.15 | 80.43 | 52.90 | 63.35 |
|  | Ours | 54.68 | 77.90 | 62.93 | 74.82 | 80.47 | 28.15 | 35.80 | 82.39 | 66.79 | 80.13 | 51.11 | 63.20 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
|  | RTN | 59.53 | 79.51 | 64.63 | 77.35 | 80.96 | 27.91 | 38.40 | 84.43 | 71.48 | 81.48 | 52.22 | 65.26 |
|  | GPTQ* | 60.47 | 78.79 | 64.45 | 76.24 | 81.18 | 28.03 | 37.40 | 83.85 | 68.95 | 81.57 | 53.07 | 64.91 |
|  | AWQ | 59.45 | 79.31 | 64.67 | 76.72 | 81.56 | 28.15 | 38.00 | 84.43 | 71.12 | 81.10 | 52.13 | 65.15 |
|  | Ours | 58.93 | 79.22 | 64.48 | 77.03 | 81.28 | 27.91 | 38.60 | 84.31 | 70.76 | 81.19 | 52.22 | 65.08 |

3. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W3G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
|  | RTN | 53.49 | 68.74 | 58.12 | 68.27 | 79.33 | 24.60 | 29.60 | 79.97 | 57.40 | 76.89 | 43.77 | 58.20 |
|  | GPTQ | 55.84 | 73.04 | 57.61 | 70.24 | 78.67 | 24.85 | 30.80 | 81.44 | 63.54 | 77.27 | 45.65 | 59.91 |
|  | AWQ | 55.61 | 73.69 | 57.86 | 71.27 | 79.82 | 26.07 | 29.00 | 81.10 | 59.21 | 79.00 | 46.93 | 59.96 |
|  | HQQ | 53.97 | 68.66 | 58.59 | 72.22 | 78.73 | 25.70 | 30.00 | 80.24 | 63.90 | 76.81 | 43.86 | 59.33 |
|  | Omniquant | 54.79 | 69.34 | 58.42 | 68.51 | 79.38 | 24.85 | 28.80 | 80.15 | 56.68 | 77.74 | 45.14 | 58.53 |
|  | Ours | 57.54 | 73.01 | 59.60 | 72.85 | 79.54 | 25.70 | 31.60 | 81.74 | 58.12 | 78.70 | 46.33 | 60.43 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
|  | RTN | 34.22 | 65.96 | 54.90 | 67.56 | 76.28 | 24.48 | 30.80 | 71.68 | 54.51 | 72.98 | 38.57 | 53.81 |
|  | GPTQ | 36.11 | 69.61 | 53.66 | 68.59 | 76.01 | 21.91 | 27.80 | 73.43 | 54.51 | 73.74 | 40.19 | 54.14 |
|  | AWQ | 35.82 | 69.90 | 54.98 | 67.40 | 76.01 | 25.21 | 29.80 | 74.68 | 57.76 | 74.07 | 41.64 | 55.21 |
|  | HQQ | 34.40 | 66.64 | 53.27 | 67.01 | 75.46 | 25.46 | 28.80 | 73.58 | 61.37 | 72.94 | 38.48 | 54.31 |
|  | Omniquant | 34.51 | 69.75 | 54.42 | 66.69 | 76.77 | 24.24 | 31.40 | 73.21 | 56.68 | 74.37 | 39.85 | 54.72 |
|  | Ours | 40.13 | 71.01 | 55.33 | 68.27 | 76.82 | 25.34 | 32.80 | 75.32 | 60.29 | 75.25 | 42.92 | 56.68 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
|  | RTN | 48.01 | 72.33 | 57.74 | 70.72 | 78.07 | 25.21 | 32.00 | 77.28 | 60.65 | 77.69 | 44.62 | 58.57 |
|  | GPTQ | 49.56 | 75.24 | 57.83 | 70.88 | 78.56 | 24.97 | 33.40 | 78.44 | 62.82 | 77.99 | 45.65 | 59.58 |
|  | AWQ | 49.77 | 75.22 | 58.58 | 71.82 | 77.75 | 24.11 | 34.20 | 79.97 | 53.43 | 77.95 | 44.62 | 58.86 |
|  | HQQ | 48.40 | 73.22 | 57.66 | 69.77 | 77.31 | 24.11 | 30.60 | 76.97 | 60.29 | 77.15 | 43.60 | 58.10 |
|  | Omniquant | 47.25 | 73.67 | 58.46 | 70.01 | 78.40 | 24.36 | 33.60 | 79.79 | 64.62 | 77.86 | 46.16 | 59.18 |
|  | Ours | 49.64 | 75.20 | 59.11 | 71.59 | 78.29 | 24.85 | 34.20 | 78.47 | 58.12 | 78.58 | 45.82 | 59.44 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
|  | RTN | 61.15 | 77.95 | 61.98 | 77.90 | 80.79 | 29.74 | 36.00 | 81.28 | 64.62 | 81.10 | 52.39 | 64.08 |
|  | GPTQ | 63.15 | 79.06 | 62.94 | 77.66 | 81.45 | 30.72 | 36.20 | 81.53 | 67.87 | 81.65 | 53.67 | 65.08 |
|  | AWQ | 64.09 | 79.47 | 63.75 | 76.48 | 81.77 | 29.74 | 37.20 | 82.69 | 66.06 | 81.40 | 53.67 | 65.12 |
|  | HQQ | 63.45 | 78.05 | 63.12 | 77.03 | 81.01 | 29.38 | 36.60 | 82.23 | 66.43 | 81.78 | 53.67 | 64.80 |
|  | Omniquant | 63.18 | 78.63 | 63.54 | 76.48 | 81.50 | 30.35 | 35.80 | 82.57 | 70.40 | 81.02 | 52.82 | 65.12 |
|  | Ours | 64.94 | 78.89 | 63.83 | 76.56 | 81.50 | 31.21 | 37.20 | 81.41 | 68.59 | 81.73 | 52.56 | 65.31 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
|  | RTN | 28.00 | 67.67 | 53.43 | 66.38 | 76.50 | 21.42 | 31.20 | 72.72 | 59.21 | 70.92 | 38.31 | 53.25 |
|  | GPTQ | 30.16 | 66.31 | 53.92 | 67.48 | 76.82 | 21.42 | 29.60 | 71.31 | 59.21 | 72.22 | 38.74 | 53.38 |
|  | AWQ | 30.33 | 70.19 | 54.53 | 68.98 | 76.71 | 20.81 | 31.60 | 74.68 | 64.62 | 73.23 | 38.91 | 54.96 |
|  | Ours | 25.85 | 70.95 | 55.45 | 69.69 | 77.37 | 21.66 | 32.00 | 73.88 | 60.29 | 73.48 | 39.33 | 54.54 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
|  | RTN | 34.87 | 69.65 | 57.25 | 70.48 | 77.31 | 26.93 | 32.00 | 71.44 | 62.82 | 75.63 | 43.94 | 56.57 |
|  | GPTQ | 35.51 | 73.08 | 57.89 | 70.80 | 77.37 | 24.48 | 31.40 | 77.52 | 62.82 | 74.41 | 43.26 | 57.14 |
|  | AWQ | 40.53 | 73.94 | 57.89 | 69.53 | 78.94 | 26.68 | 33.40 | 74.83 | 65.34 | 75.93 | 45.05 | 58.37 |
|  | Ours | 39.16 | 75.22 | 58.64 | 71.59 | 78.94 | 25.95 | 35.20 | 76.30 | 65.34 | 76.52 | 45.39 | 58.93 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
|  | RTN | 52.41 | 75.08 | 61.45 | 74.27 | 79.87 | 25.95 | 33.00 | 81.38 | 65.34 | 79.12 | 48.89 | 61.52 |
|  | GPTQ | 51.39 | 74.97 | 60.35 | 75.30 | 79.60 | 26.93 | 34.80 | 82.75 | 64.62 | 78.11 | 48.46 | 61.57 |
|  | AWQ | 53.84 | 76.71 | 61.94 | 75.14 | 80.03 | 25.34 | 34.40 | 81.90 | 67.15 | 79.59 | 50.77 | 62.44 |
|  | Ours | 54.39 | 77.49 | 62.13 | 74.03 | 80.47 | 27.30 | 35.00 | 79.76 | 68.59 | 79.46 | 48.98 | 62.51 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
|  | RTN | 57.47 | 77.43 | 63.23 | 75.93 | 80.41 | 28.64 | 38.40 | 82.69 | 66.43 | 80.22 | 51.19 | 63.82 |
|  | GPTQ* | 57.92 | 78.69 | 62.98 | 76.87 | 80.63 | 27.66 | 37.60 | 84.16 | 68.95 | 80.89 | 51.19 | 64.32 |
|  | AWQ | 58.87 | 77.94 | 63.77 | 75.37 | 80.96 | 27.66 | 36.80 | 85.02 | 71.12 | 81.10 | 50.34 | 64.45 |
|  | Ours | 58.30 | 78.11 | 63.60 | 76.56 | 80.85 | 29.50 | 37.80 | 84.80 | 70.04 | 80.22 | 50.68 | 64.59 |

4. Accuracies $\uparrow$ across 11 tasks (0-shot) of LLaMA and Mistral models at W2G128.

| Model | Method | Mmlu | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mistral-7B | FP16 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 63.30 |
|  | RTN | 23.45 | 0.14 | 27.43 | 49.64 | 54.30 | 24.24 | 15.20 | 38.69 | 51.99 | 29.08 | 21.59 | 30.52 |
|  | GPTQ | 25.23 | 30.47 | 38.28 | 53.83 | 64.91 | 24.11 | 17.40 | 58.29 | 50.90 | 47.77 | 24.57 | 39.61 |
|  | AWQ | 25.38 | 0.00 | 25.71 | 52.01 | 51.58 | 23.99 | 17.60 | 37.83 | 47.29 | 26.98 | 22.27 | 30.06 |
|  | HQQ | 23.35 | 0.85 | 27.77 | 51.62 | 56.69 | 26.68 | 15.80 | 40.55 | 53.43 | 28.62 | 20.14 | 31.41 |
|  | Omniquant | 23.24 | 5.38 | 29.38 | 49.72 | 56.09 | 26.32 | 16.60 | 41.99 | 52.71 | 32.11 | 20.39 | 32.17 |
|  | Ours | 40.46 | 58.61 | 50.87 | 62.90 | 75.84 | 24.85 | 22.80 | 78.56 | 57.04 | 70.88 | 37.03 | 52.71 |
| V2-7B | FP16 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 57.98 |
|  | RTN | 23.98 | 0.02 | 26.04 | 49.49 | 52.50 | 24.85 | 15.20 | 41.01 | 49.10 | 27.48 | 19.71 | 29.94 |
|  | GPTQ | 23.65 | 11.72 | 32.59 | 55.17 | 58.32 | 25.95 | 15.80 | 52.14 | 51.99 | 40.45 | 21.25 | 35.37 |
|  | AWQ | 25.38 | 0.00 | 25.69 | 49.96 | 52.34 | 23.75 | 17.80 | 37.83 | 52.71 | 24.62 | 21.08 | 30.10 |
|  | HQQ | 24.51 | 0.02 | 26.06 | 49.49 | 53.26 | 24.72 | 13.80 | 37.92 | 50.90 | 26.52 | 21.33 | 29.87 |
|  | Omniquant | 22.97 | 35.53 | 40.28 | 55.88 | 65.13 | 22.89 | 15.60 | 63.24 | 53.07 | 50.13 | 23.46 | 40.74 |
|  | Ours | 27.20 | 55.25 | 47.35 | 61.01 | 72.96 | 24.85 | 25.60 | 68.07 | 54.51 | 65.99 | 32.25 | 48.64 |
| V2-13B | FP16 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 61.42 |
|  | RTN | 23.77 | 7.47 | 33.08 | 49.01 | 57.94 | 26.19 | 16.00 | 47.74 | 53.43 | 32.03 | 21.93 | 33.51 |
|  | GPTQ | 24.69 | 45.20 | 41.06 | 55.80 | 67.08 | 23.26 | 19.80 | 54.40 | 52.35 | 55.60 | 27.82 | 42.46 |
|  | AWQ | 27.04 | 0.00 | 25.80 | 51.85 | 52.99 | 23.62 | 13.60 | 62.17 | 47.29 | 26.22 | 23.12 | 32.16 |
|  | HQQ | 23.48 | 8.17 | 31.27 | 52.17 | 61.86 | 24.85 | 17.20 | 50.46 | 54.51 | 42.85 | 21.25 | 35.28 |
|  | Omniquant | 25.53 | 49.84 | 46.23 | 57.93 | 70.13 | 24.60 | 21.80 | 66.85 | 55.60 | 63.22 | 30.29 | 46.55 |
|  | Ours | 34.33 | 63.92 | 53.35 | 64.33 | 76.17 | 25.70 | 26.00 | 72.75 | 61.73 | 71.17 | 38.57 | 53.46 |
| V2-70B | FP16 | 66.23 | 79.64 | 64.77 | 77.98 | 82.15 | 30.60 | 37.20 | 83.70 | 67.87 | 82.70 | 54.44 | 66.12 |
|  | RTN | 24.20 | 20.18 | 40.88 | 54.85 | 63.87 | 24.11 | 17.60 | 43.06 | 53.07 | 50.51 | 27.22 | 38.14 |
|  | GPTQ | 23.12 | 0.00 | 25.04 | 49.57 | 49.51 | 0.00 | 27.60 | 37.83 | 52.71 | 25.08 | 22.70 | 28.47 |
|  | AWQ | 24.46 | 0.00 | 25.46 | 51.38 | 52.50 | 23.50 | 14.20 | 62.17 | 52.71 | 25.76 | 22.35 | 32.23 |
|  | HQQ | 23.16 | 19.46 | 35.45 | 56.67 | 66.00 | 22.52 | 20.00 | 40.46 | 52.71 | 52.06 | 23.12 | 37.42 |
|  | Omniquant | 33.84 | 61.83 | 52.44 | 64.33 | 74.10 | 24.48 | 28.20 | 71.68 | 53.07 | 67.21 | 33.28 | 51.31 |
|  | Ours | 54.04 | 72.97 | 59.65 | 74.90 | 79.00 | 29.01 | 34.80 | 79.63 | 69.68 | 78.37 | 46.59 | 61.69 |
| V1-7B | FP16 | 32.74 | 73.53 | 56.94 | 70.01 | 78.67 | 22.03 | 34.60 | 75.08 | 66.43 | 75.25 | 41.81 | 57.01 |
|  | RTN | 24.36 | 0.52 | 27.24 | 49.25 | 54.24 | 24.24 | 15.20 | 39.63 | 57.40 | 27.86 | 21.84 | 31.07 |
|  | GPTQ | 22.95 | 12.75 | 33.36 | 51.70 | 60.07 | 23.99 | 13.40 | 48.62 | 53.07 | 40.82 | 21.50 | 34.75 |
|  | AWQ | 23.12 | 0.00 | 25.37 | 53.28 | 52.56 | 25.21 | 13.80 | 37.83 | 52.71 | 25.63 | 22.53 | 30.18 |
|  | Ours | 24.46 | 13.53 | 42.16 | 56.99 | 70.02 | 24.60 | 25.20 | 62.91 | 47.29 | 60.90 | 31.74 | 41.80 |
| V1-13B | FP16 | 44.21 | 76.21 | 59.92 | 72.77 | 79.16 | 25.70 | 33.20 | 77.89 | 70.76 | 77.40 | 46.42 | 60.33 |
|  | RTN | 24.66 | 4.97 | 29.67 | 49.33 | 57.24 | 25.58 | 12.40 | 44.10 | 53.79 | 32.07 | 22.01 | 32.35 |
|  | GPTQ* | 26.43 | 40.48 | 39.47 | 58.25 | 66.97 | 23.50 | 18.60 | 52.78 | 50.54 | 51.52 | 25.00 | 41.23 |
|  | AWQ | 27.04 | 0.00 | 25.59 | 50.36 | 53.05 | 24.11 | 15.60 | 62.17 | 47.29 | 25.97 | 23.21 | 32.22 |
|  | Ours | 31.87 | 59.65 | 51.25 | 67.64 | 76.28 | 25.58 | 27.80 | 69.11 | 58.48 | 70.71 | 37.12 | 52.32 |
| V1-30B | FP16 | 55.14 | 77.55 | 63.33 | 75.85 | 81.12 | 28.27 | 36.00 | 82.78 | 66.79 | 80.39 | 52.90 | 63.65 |
|  | RTN | 23.24 | 5.55 | 27.22 | 53.99 | 56.80 | 21.79 | 18.20 | 51.65 | 53.07 | 36.74 | 21.33 | 33.60 |
|  | GPTQ | 30.47 | 49.93 | 45.05 | 61.88 | 68.88 | 23.26 | 22.60 | 68.29 | 51.99 | 60.69 | 30.72 | 46.70 |
|  | AWQ | 27.04 | 0.00 | 25.41 | 50.20 | 52.94 | 24.48 | 16.60 | 62.17 | 47.29 | 24.71 | 23.38 | 32.20 |
|  | Ours | 40.83 | 67.92 | 56.73 | 68.90 | 76.17 | 24.36 | 31.60 | 75.54 | 62.45 | 74.92 | 42.41 | 56.53 |
| V1-65B | FP16 | 59.79 | 79.12 | 64.53 | 77.35 | 81.23 | 27.91 | 38.00 | 84.86 | 69.68 | 81.36 | 52.82 | 65.15 |
|  | RTN | 24.48 | 32.78 | 43.59 | 57.85 | 67.52 | 22.89 | 22.80 | 61.53 | 50.54 | 52.10 | 28.24 | 42.21 |
|  | GPTQ* | 37.06 | 67.44 | 53.97 | 69.46 | 76.44 | 24.36 | 28.00 | 73.64 | 60.29 | 71.34 | 38.57 | 54.60 |
|  | AWQ | 25.38 | 0.00 | 25.58 | 49.96 | 53.10 | 24.24 | 11.00 | 37.83 | 52.71 | 24.96 | 22.44 | 29.75 |
|  | Ours | 47.21 | 72.07 | 60.06 | 73.24 | 78.62 | 25.46 | 34.20 | 80.64 | 62.82 | 77.48 | 46.76 | 59.87 |

Other data W4G128


| Model | Method | Acc AVG. | MMLU | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | wikitext2 ppl | ptb_new ppl | c4_new ppl | lm_eval wikitext ppl |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intel/neural-chat-7b-v3-3 | FP16 | 67.92 | 61.13 | 73.03 | 66.39 | 76.40 | 81.01 | 47.37 | 38.8 | 86.97 | 75.81 | 82.66 | 57.51 | 6.00 | 48.96 | 9.65 | - |
|  | Ours | 66.90 | 60.56 | 72.19 | 65.28 | 75.37 | 81.18 | 46.76 | 36.0 | 86.91 | 73.29 | 81.73 | 56.66 | 6.21 | 59.78 | 10.01 | - |
|  | Ours iters=1K, disable_quanted_input, minmax_lr=0.002 | 67.70 | 60.57 | 73.74 | 65.62 | 77.43 | 80.85 | 47.61 | 36.8 | 86.94 | 75.09 | 82.66 | 57.34 | 6.17 | 59.12 | 9.83 | - |
| mistralai/Mixtral-8x7B-v0.1 | BF16 | 67.16 | 69.83 | 78.44 | 64.89 | 76.40 | 82.43 | 34.15 | 35.40 | 84.98 | 71.12 | 84.22 | 56.91 | 3.84 | 19.22 | 7.41 | - |
|  | Ours | 65.98 | 68.90 | 78.11 | 64.31 | 74.27 | 82.10 | 30.97 | 34.20 | 84.57 | 67.87 | 83.96 | 56.57 | 4.08 | 354 | 7.56 | - |
|  | Ours iters=1K, disable_quanted_input | 66.78 | 68.68 | 78.61 | 64.40 | 76.56 | 81.99 | 32.56 | 34.80 | 85.96 | 70.76 | 83.96 | 56.31 | 3.99 | 17.65 | 7.52 | - |
| microsoft/phi-2 | FP16 | 61.80 | 56.40 | 62.78 | 55.83 | 75.77 | 78.67 | 31.21 | 40.40 | 83.36 | 62.45 | 80.05 | 52.90 | 9.71 | 18.16 | 14.12 | 11.05 |
|  | Ours | 61.67 | 54.57 | 61.32 | 55.04 | 76.48 | 78.89 | 29.74 | 40.60 | 83.24 | 66.43 | 79.76 | 52.30 | 9.98 | 18.67 | 14.39 | 11.37 |
|  | Ours iters=1K, disable_quanted_input | 61.47 | 55.41 | 61.77 | 54.92 | 76.40 | 78.29 | 31.09 | 40.0 | 83.24 | 63.54 | 79.29 | 52.22 | 9.97 | 18.63 | 14.37 | 11.35 |

Other data W2G32

| Model | Method | Acc AVG. | MMLU | Lamb. | Hella. | Wino. | Piqa | Truth. | Open. | Boolq | RTE | ARC-e | ARC-c | wikitext2 ppl | ptb_new ppl | c4_new ppl | lm_eval wikitext ppl |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mistralai/Mistral-7B | FP16 | 63.30 | 61.35 | 75.68 | 61.27 | 74.03 | 80.79 | 28.03 | 32.80 | 83.67 | 67.51 | 80.81 | 50.34 | 5.25 | 35.00 | 8.38 | - |
|  | Ours iters=1K | 56.44 | 47.38 | 67.26 | 55.06 | 67.88 | 77.75 | 26.19 | 26.40 | 78.07 | 58.12 | 74.20 | 42.49 | 7.14 | 56.78 | 10.71 | - |
|  | Ours iters=4K, minmax_lr=0.0005 | 57.16 | 50.28 | 67.03 | 55.37 | 68.11 | 77.53 | 26.44 | 26.00 | 80.58 | 58.12 | 75.63 | 43.69 | 7.07 | 51.88 | 10.67 | - |
| Meta/LLaMA-2-13B | FP16 | 61.42 | 52.86 | 76.77 | 60.04 | 72.14 | 79.05 | 25.95 | 35.20 | 80.55 | 65.34 | 79.38 | 48.38 | 4.88 | 50.93 | 6.73 | 7.90 |
|  | Ours iters=1K, minmax_lr=0.002 | 56.95 | 42.39 | 70.87 | 55.15 | 68.03 | 77.37 | 24.11 | 30.80 | 77.58 | 64.62 | 75.63 | 39.93 | 6.26 | 78.83 | 8.70 | 11.25 |
|  | Ours iters=2K, minmax_lr=0.001 | 57.53 | 44.42 | 71.63 | 55.23 | 68.03 | 76.66 | 24.48 | 32.00 | 76.91 | 65.70 | 76.09 | 41.64 | 6.27 | 75.40 | 8.70 | 11.22 |
| Meta/LLaMA-2-7B | FP16 | 57.98 | 42.69 | 73.90 | 57.15 | 68.90 | 78.07 | 25.21 | 31.40 | 77.74 | 62.82 | 76.35 | 43.52 | 5.47 | 37.92 | 7.26 | 8.79 |
|  | Ours iters=1K, minmax_lr=0.002 | 52.29 | 27.14 | 65.48 | 50.25 | 66.61 | 74.54 | 24.11 | 29.80 | 73.30 | 56.68 | 70.20 | 37.12 | 8.72 | 1692.95 | 10.06 | 12.80 |
|  | Ours iters=2K, minmax_lr=0.0005 | 52.32 | 28.26 | 64.16 | 50.66 | 64.80 | 75.14 | 23.87 | 30.20 | 71.74 | 57.76 | 71.13 | 37.80 | 8.54 | 0.00 | 10.14 | 0.00 |