• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of nonzeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed (values: 0 or 1)
• KDB: the half-bandwidth after DB reordering (without any drop-off). If DB reordering is not executed, this reports the original half-bandwidth
• KnoDrp: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates how much K varies from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether the problem was solved. OK means it was solved; otherwise the reason for the failure is given
• Bstng: indicates whether diagonal boosting is enabled during factorization (values: 0 or 1)
• SolAcc: infinity norm of the array storing the relative errors
• TDB: time to run DB reordering for the matrix on the CPU
• TCM: time to run CM reordering for the matrix on the CPU
• TDrop: time to drop off-diagonal elements to decrease the bandwidth (done on the CPU)
• TDtransf: time for data transfer from CPU to GPU
• TAsmbl: time to copy the sparse matrix, after reordering and drop-off, into a banded matrix stored in GPU memory
• LUM: LU method (complete, ILUT or ILUULT)
• Fillin: the fill-in factor of ILUT (1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• TBC: time required to get the off-diagonal right-hand sides (Bs and Cs) from the banded matrix (done on the GPU)
• TLU: LU time
• GFlpsLU: LU GFLOPs
• TSPK: time to solve for the spikes Vs and Ws (done on the GPU)
• TLUrdcd: time required to factorize the reduced matrices (done on the GPU)
• TPreP: the sum of all preprocessing times, see NOTES
• KryM: the method used in the Krylov solve stage: BiCGStab2 (0), BiCGStab (1), or CG (2)
• nItrs: the number of Krylov solver iterations needed to solve the problem
• TKry: time spent in the Krylov solver (on the GPU)
• Total: total time to solve the problem, the sum of TPreP and TKry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso. A value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and in red if we run more than 5 times slower
• Fastest: the fastest time SaP has recorded historically
• SpdUp: the speedup of this run compared to the historically fastest run. The value is shown in green if the speedup is more than 5% and in red if the slowdown is more than 5%
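The SlwD and SpdUp entries, together with their color rules, can be sketched as follows. This is only an illustration: the function names are hypothetical, and SpdUp is assumed to be the historical fastest time divided by the current run time, which the legend does not state explicitly.

```python
def slowdown_vs_pardiso(total_ms, pardiso_ms):
    """SlwD: our total time divided by Pardiso's time (< 1 means we are faster)."""
    return total_ms / pardiso_ms


def slowdown_color(slwd):
    """Green if we are faster than Pardiso, red if more than 5 times slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None  # no highlighting in between


def speedup_vs_fastest(total_ms, fastest_ms):
    """SpdUp (assumed definition): historical fastest time / this run's time."""
    return fastest_ms / total_ms


def speedup_color(spdup, tol=0.05):
    """Green for more than 5% speedup, red for more than 5% slowdown."""
    if spdup > 1.0 + tol:
        return "green"
    if spdup < 1.0 - tol:
        return "red"
    return None
```

The 5% slowdown threshold is read here as a ratio below 0.95; a ratio below 1/1.05 would be an equally plausible reading.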
NOTES:
1) nuKf = 1/(2KN) * [sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the half-bandwidth of row i to the left of the diagonal and K_{iRight} is the half-bandwidth of row i to the right of the diagonal.
2) FRate = NNZ / ((2K+1)N), i.e. the actual number of nonzeros divided by the nominal number of entries in a band of half-bandwidth K.
3) All times reported are in milliseconds (1E-3 second).
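As an illustration of notes 1 and 2, the following sketch computes K, nuKf and FRate from the positions of the nonzeros. It assumes the bracketed term in note 1 is 2K minus the left and right row half-bandwidths, and that K is the largest row half-bandwidth on either side; the function name and the input format (0-based row/column pairs) are hypothetical.

```python
def band_stats(n, entries):
    """Return (K, nuKf, FRate) for an n x n matrix.

    entries: iterable of (row, col) pairs, 0-based, one per nonzero.
    """
    left = [0] * n   # K_{iLeft}: reach of row i to the left of the diagonal
    right = [0] * n  # K_{iRight}: reach of row i to the right of the diagonal
    nnz = 0
    for i, j in entries:
        nnz += 1
        if j < i:
            left[i] = max(left[i], i - j)
        else:
            right[i] = max(right[i], j - i)

    k = max(max(left), max(right))  # overall half-bandwidth (assumption)
    if k == 0:
        nu_kf = 0.0  # diagonal matrix: avoid dividing by zero
    else:
        nu_kf = sum(2 * k - l - r for l, r in zip(left, right)) / (2.0 * k * n)
    frate = nnz / ((2 * k + 1) * n)
    return k, nu_kf, frate


# Example: a 4 x 4 tridiagonal matrix.
tridiag = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
k, nu_kf, frate = band_stats(4, tridiag)
```

Under this reading, the first and last rows of a band are shorter than the interior rows, so nuKf is slightly above 0 even for a full band.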
N K d NPrtns Solves RelR TLU TSwDef TMMDef TPreP nItrs TSwInf TMVInf TKry TKryPIt TTotal
200000 50 0.2 80 OK 6.36112e-10 0 51.7984 0 0 58.155 4.75 162.209 34.1493 220.364
200000 200 0.2 80 OK 5.10344e-10 0 375.029 0 0 398.316 2.75 213.662 77.6953 611.978
200000 50 0.19 80 OK 1.72395e-10 0 51.7148 0 0 58.073 5.75 191.792 33.3551 249.865
200000 200 0.19 80 OK 9.59482e-10 0 370.998 0 0 393.993 2.75 212.865 77.4055 606.858
200000 50 0.18 80 OK 9.41806e-10 0 52.6795 0 0 59.124 5.75 192.826 33.535 251.95
200000 200 0.18 80 OK 1.66894e-10 0 371.116 0 0 394.142 3.25 241.504 74.3089 635.646
200000 50 0.17 80 OK 4.55752e-10 0 50.7046 0 0 57.067 6.75 221.294 32.7843 278.361
200000 200 0.17 80 OK 3.58456e-10 0 370.178 0 0 393.18 3.25 241.742 74.3822 634.922
200000 50 0.16 80 OK 5.85404e-10 0 52.1065 0 0 58.479 7.75 250.454 32.3166 308.933
200000 200 0.16 80 OK 8.11541e-10 0 370.184 0 0 393.152 3.25 241.739 74.3812 634.891
200000 50 0.15 80 OK 4.64543e-10 0 51.5259 0 0 57.88 9.25 292.756 31.6493 350.636
200000 200 0.15 80 OK 4.33092e-10 0 374.119 0 0 397.313 3.25 270.979 83.3782 668.292
200000 50 0.14 80 OK 9.95609e-10 0 51.24 0 0 57.631 11.75 368.416 31.3546 426.047
200000 200 0.14 80 OK 6.87913e-11 0 370.125 0 0 393.112 3.75 273.618 72.9648 666.73
200000 50 0.13 80 OK 3.02539e-10 0 51.4644 0 0 57.815 26.25 792.237 30.1805 850.052
200000 200 0.13 80 OK 2.44978e-10 0 370.204 0 0 393.198 3.75 273.307 72.8819 666.505
200000 50 0.12 80 NConv 2.00164e-09 0 51.5974 0 0 57.959 183.75 5429.71 29.5494 5487.67
200000 200 0.12 80 OK 9.9876e-10 0 370.406 0 0 394.63 3.75 276.724 73.7931 671.354
200000 50 0.11 80 NConv 1 0 52.0471 0 0 58.47 501.25 14537.1 29.0018 14595.6
200000 200 0.11 80 OK 6.2028e-11 0 370.327 0 0 393.294 4.75 333.763 70.2659 727.057
200000 50 0.1 80 NConv 1 0 52.264 0 0 58.687 501.25 14621.4 29.1699 14680.1
200000 200 0.1 80 OK 5.96008e-10 0 370.125 0 0 393.107 4.75 334.121 70.3413 727.228
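
The timing values in each record can be cross-checked against one another. The sketch below assumes that TTotal = TPreP + TKry and that TKryPIt = TKry / nItrs; both relations are inferred from the column names and the numbers, not stated in the source. It also assumes that the last five values of a record are TPreP, nItrs, TKry, TKryPIt and TTotal. The function name is hypothetical, and the sample values are those of the first record.

```python
import math


def check_record(t_prep, n_itrs, t_kry, t_kry_per_it, t_total, rel_tol=1e-4):
    """Return (total_ok, per_iteration_ok) for one record of timings in ms."""
    total_ok = math.isclose(t_prep + t_kry, t_total, rel_tol=rel_tol)
    per_it_ok = math.isclose(t_kry / n_itrs, t_kry_per_it, rel_tol=rel_tol)
    return total_ok, per_it_ok


# First record: N = 200000, K = 50, d = 0.2
first = check_record(58.155, 4.75, 162.209, 34.1493, 220.364)
```

A loose relative tolerance is used because the reported values are rounded to about six significant digits.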