3-30-2016 New Banded Results



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed (values: 0 or 1)
• K-DB: the half-bandwidth after DB reordering method (without any drop-off). If DB is specified not to be executed, then this reports the original half-bandwidth
• KnoDrp: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether we managed to solve the problem or not. OK means solved fine, otherwise a reason is provided for failure
• Bstng: indicates whether we enable diagonal boosting when doing factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off off-diagonal elements to decrease bandwidth. Done on the CPU.
• T-Dtransf: data transfer from CPU to GPU
• T-Asmbl: after reordering and drop-off, copy the sparse matrix to banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solving stage (can be BiCGStab2 (0), BiCGStab (1), or CG (2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso (a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and shown in red if we run more than 5 times slower.)
• Fastest: the fastest time SaP has recorded historically for this problem
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
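
The SlwD and SpdUp rules above can be sketched as follows. This is only an illustration under stated assumptions, not the code that produced the report: the function names are hypothetical, SlwD is taken to be Total / Pardiso, SpdUp is assumed to be Fastest / Total, and "slowdown of more than 5%" is assumed to mean SpdUp < 0.95.

```python
def slwd_color(total_ms, pardiso_ms):
    """Slowdown ratio versus Pardiso and its highlight color (hypothetical helper)."""
    slwd = total_ms / pardiso_ms
    if slwd < 1.0:
        color = "green"   # we are faster than Pardiso
    elif slwd > 5.0:
        color = "red"     # more than 5 times slower
    else:
        color = None      # no highlight
    return slwd, color


def spdup_color(total_ms, fastest_ms):
    """Speedup versus the historical fastest run (assumed to be Fastest / Total)."""
    spdup = fastest_ms / total_ms
    if spdup > 1.05:
        color = "green"   # more than 5% faster than the historical best
    elif spdup < 0.95:
        color = "red"     # assumed reading of "slowdown of more than 5%"
    else:
        color = None
    return spdup, color
```

For instance, a run with Total = 50 ms against a Pardiso time of 100 ms gives SlwD = 0.5 and would be shown in green.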


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = NNZ / ((2K+1)N), where NNZ is the actual number of non-zeros.
3) All times reported are in milliseconds (1E-3 seconds).
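
As a minimal sketch of notes 1) and 2), the two ratios can be computed from the positions of the non-zeros. The function name `band_stats` and the input format (a list of unique, 0-based (row, col) pairs) are assumptions made for this illustration, and K is taken here as the largest row half-bandwidth found in the matrix.

```python
def band_stats(n, entries):
    """Return (K, nuKf, FRate) for an n x n matrix.

    entries: iterable of unique (row, col) positions of the non-zeros, 0-based.
    """
    left = [0] * n    # K_iLeft: row half-bandwidth left of the diagonal
    right = [0] * n   # K_iRight: row half-bandwidth right of the diagonal
    nnz = 0
    for i, j in entries:
        nnz += 1
        if j < i:
            left[i] = max(left[i], i - j)
        else:
            right[i] = max(right[i], j - i)
    k = max(max(left), max(right))
    frate = nnz / ((2 * k + 1) * n)
    if k == 0:
        # Diagonal matrix: the nuKf formula divides by K, so report 0 here.
        return k, 0.0, frate
    nukf = sum(2 * k - l - r for l, r in zip(left, right)) / (2 * k * n)
    return k, nukf, frate


# Usage: a 4 x 4 tridiagonal matrix.
tri = [(i, i) for i in range(4)]
tri += [(i, i + 1) for i in range(3)]
tri += [(i + 1, i) for i in range(3)]
k, nukf, frate = band_stats(4, tri)
```

For this small tridiagonal case the first and last rows each lack one off-diagonal neighbor, so nuKf is slightly above zero even though the band is otherwise uniform.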


N       K   d   NPrtns Solves RelR        T-LU     T-SwDef T-MMDef T-PreP  nItrs T-SwInf T-MVInf T-Kry   T-KryPIt T-Total
1000    10  0.6 50     OK     2.00248e-07 0.826144 -       -       2.236   0.75  -       -       3.188   3.188    5.424
1000    10  0.6 80     OK     4.0537e-07  1.27392  -       -       2.213   2.75  -       -       7.963   2.89564  10.176
1000    10  0.6 80     OK     4.0537e-07  0.834656 -       -       1.111   2.75  -       -       22.403  8.14655  23.514
2000    10  0.6 50     OK     3.42891e-07 0.85264  -       -       2.265   0.25  -       -       2.424   2.424    4.689
2000    10  0.6 80     OK     2.89736e-07 1.35229  -       -       2.15    2.75  -       -       8.332   3.02982  10.482
2000    10  0.6 80     OK     2.89736e-07 0.85248  -       -       1.139   2.75  -       -       22.87   8.31636  24.009
5000    10  0.6 50     OK     1.60525e-07 0.854784 -       -       2.323   0.25  -       -       2.296   2.296    4.619
5000    10  0.6 80     OK     2.12233e-07 1.39638  -       -       2.305   2.75  -       -       8.901   3.23673  11.206
5000    10  0.6 80     OK     2.12233e-07 0.857504 -       -       1.15    2.75  -       -       23.465  8.53273  24.615
10000   10  0.6 50     OK     1.63548e-07 0.939392 -       -       2.741   0.25  -       -       3.368   3.368    6.109
10000   10  0.6 80     OK     7.38557e-07 1.27485  -       -       2.362   2.75  -       -       10.877  3.95527  13.239
10000   10  0.6 80     OK     7.38557e-07 0.866144 -       -       1.321   2.75  -       -       25.293  9.19745  26.614
20000   10  0.6 50     OK     1.64291e-07 1.44755  -       -       3.488   0.25  -       -       6.075   6.075    9.563
20000   10  0.6 80     OK     2.12089e-07 1.88666  -       -       2.892   2.75  -       -       14.097  5.12618  16.989
20000   10  0.6 80     OK     2.12089e-07 0.951488 -       -       1.362   2.75  -       -       27.469  9.98873  28.831
50000   10  0.6 50     OK     1.64607e-07 2.96771  -       -       5.953   0.25  -       -       14.229  14.229   20.182
50000   10  0.6 80     OK     8.59293e-07 2.18842  -       -       3.546   2.25  -       -       22.494  9.99733  26.04
50000   10  0.6 80     OK     8.59293e-07 1.64131  -       -       2.436   2.25  -       -       32.49   14.44    34.926
100000  10  0.6 50     OK     1.6421e-07  5.47741  -       -       9.342   0.25  -       -       26.058  26.058   35.4
100000  10  0.6 80     OK     4.82228e-07 2.9457   -       -       4.87    2.25  -       -       41.28   18.3467  46.15
100000  10  0.6 80     OK     4.82228e-07 2.80083  -       -       4.044   2.25  -       -       49.28   21.9022  53.324
200000  10  0.6 50     OK     1.63505e-07 10.4833  -       -       16.372  0.25  -       -       47.616  47.616   63.988
200000  10  0.6 80     OK     9.40335e-07 4.81462  -       -       8.111   2.25  -       -       70.425  31.3     78.536
200000  10  0.6 80     OK     9.40335e-07 5.05021  -       -       7.19    2.25  -       -       65.979  29.324   73.169
500000  10  0.6 50     OK     1.63474e-07 25.4496  -       -       36.454  0.25  -       -       108.483 108.483  144.937
500000  10  0.6 80     OK     5.8015e-07  10.7257  -       -       17.491  2.25  -       -       141.639 62.9507  159.13
500000  10  0.6 80     OK     5.8015e-07  11.7491  -       -       15.782  2.25  -       -       131.176 58.3004  146.958
1000000 10  0.6 50     OK     1.63729e-07 51.003   -       -       70.677  0.25  -       -       212.36  212.36   283.037
1000000 10  0.6 80     OK     3.50286e-07 20.6061  -       -       33.056  2.25  -       -       256.566 114.029  289.622
1000000 10  0.6 80     OK     3.50286e-07 22.9325  -       -       30.066  2.25  -       -       241.899 107.511  271.965
1000    20  0.6 50     OK     2.22957e-07 0.82944  -       -       2.297   0.75  -       -       3.286   3.286    5.583
1000    20  0.6 80     OK     5.65798e-07 1.39261  -       -       2.351   2.25  -       -       7.108   3.15911  9.459
1000    20  0.6 80     OK     5.65798e-07 0.855776 -       -       1.271   2.25  -       -       17.578  7.81244  18.849
2000    20  0.6 50     OK     2.19719e-07 0.829216 -       -       2.774   0.75  -       -       3.506   3.506    6.28
2000    20  0.6 80     OK     4.0133e-07  1.01757  -       -       1.807   2.25  -       -       6.79    3.01778  8.597
2000    20  0.6 80     OK     4.0133e-07  0.858368 -       -       1.02    2.25  -       -       17.63   7.83556  18.65
5000    20  0.6 50     OK     4.49258e-07 0.866944 -       -       3.038   0.25  -       -       3.64    3.64     6.678
5000    20  0.6 80     OK     8.87534e-07 1.21987  -       -       2.216   1.75  -       -       7.087   4.04971  9.303
5000    20  0.6 80     OK     8.87534e-07 0.883552 -       -       1.224   1.75  -       -       16.653  9.516    17.877
10000   20  0.6 50     OK     2.06028e-07 1.16019  -       -       3.51    0.25  -       -       3.894   3.894    7.404
10000   20  0.6 80     OK     5.58144e-07 1.87792  -       -       3.006   1.75  -       -       8.188   4.67886  11.194
10000   20  0.6 80     OK     5.58144e-07 1.04214  -       -       1.566   1.75  -       -       17.882  10.2183  19.448
20000   20  0.6 50     OK     2.10239e-07 1.89936  -       -       4.411   0.25  -       -       6.421   6.421    10.832
20000   20  0.6 80     OK     4.5097e-07  1.7905   -       -       2.8     1.75  -       -       10.093  5.76743  12.893
20000   20  0.6 80     OK     4.5097e-07  1.50896  -       -       2.066   1.75  -       -       19.559  11.1766  21.625
50000   20  0.6 50     OK     2.1091e-07  4.09235  -       -       7.65    0.25  -       -       14.172  14.172   21.822
50000   20  0.6 80     OK     4.51158e-07 2.29347  -       -       4.052   1.75  -       -       17.91   10.2343  21.962
50000   20  0.6 80     OK     4.51158e-07 3.11616  -       -       4.234   1.75  -       -       26.076  14.9006  30.31
100000  20  0.6 50     OK     2.10977e-07 7.87558  -       -       12.997  0.25  -       -       26.87   26.87    39.867
100000  20  0.6 80     OK     3.0805e-07  3.41856  -       -       6.489   1.75  -       -       32.773  18.7274  39.262
100000  20  0.6 80     OK     3.0805e-07  5.85885  -       -       7.825   1.75  -       -       40.238  22.9931  48.063
200000  20  0.6 50     OK     2.11231e-07 14.9447  -       -       22.959  0.25  -       -       50.875  50.875   73.834
200000  20  0.6 80     OK     2.33983e-07 5.90416  -       -       11.539  1.75  -       -       63.652  36.3726  75.191
200000  20  0.6 80     OK     2.33983e-07 11.0063  -       -       14.522  1.75  -       -       60.006  34.2891  74.528
500000  20  0.6 50     OK     2.11605e-07 36.6747  -       -       52.458  0.25  -       -       119.355 119.355  171.813
500000  20  0.6 80     OK     2.04366e-07 13.362   -       -       25.937  1.75  -       -       127.576 72.9006  153.513
500000  20  0.6 80     OK     2.04366e-07 27.1964  -       -       34.713  1.75  -       -       121.192 69.2526  155.905
1000000 20  0.6 50     OK     2.11512e-07 73.1472  -       -       101.875 0.25  -       -       225.824 225.824  327.699
1000000 20  0.6 80     OK     1.89829e-07 25.9411  -       -       50.123  1.75  -       -       231.093 132.053  281.216
1000000 20  0.6 80     OK     1.89829e-07 53.7204  -       -       67.83   1.75  -       -       222.181 126.961  290.011
1000    50  0.6 50     OK     7.96981e-07 1.21053  -       -       3.786   0.25  -       -       3.936   3.936    7.722
1000    50  0.6 80     OK     3.07104e-07 1.17363  -       -       2.219   1.75  -       -       7.043   4.02457  9.262
1000    50  0.6 80     OK     3.07104e-07 0.967456 -       -       1.373   1.75  -       -       16.57   9.46857  17.943
2000    50  0.6 50     OK     3.07656e-07 1.37814  -       -       4.781   0.75  -       -       5.296   5.296    10.077
2000    50  0.6 80     OK     5.6233e-07  1.29085  -       -       2.256   1.75  -       -       7.299   4.17086  9.555
2000    50  0.6 80     OK     5.6233e-07  1.04032  -       -       1.359   1.75  -       -       17.189  9.82229  18.548
5000    50  0.6 50     OK     3.2071e-07  1.76138  -       -       7.092   0.75  -       -       5.394   5.394    12.486
5000    50  0.6 80     OK     3.25937e-07 1.82787  -       -       2.768   1.75  -       -       7.08    4.04571  9.848
5000    50  0.6 80     OK     3.25937e-07 1.25997  -       -       1.661   1.75  -       -       17.347  9.91257  19.008
10000   50  0.6 50     OK     2.87186e-07 3.5359   -       -       9.293   0.25  -       -       6.965   6.965    16.258
10000   50  0.6 80     OK     3.35011e-07 1.92794  -       -       3.193   1.75  -       -       8.599   4.91371  11.792
10000   50  0.6 80     OK     3.35011e-07 2.28848  -       -       2.959   1.75  -       -       18.127  10.3583  21.086
20000   50  0.6 50     OK     3.30861e-07 7.17616  -       -       13.215  0.25  -       -       7.798   7.798    21.013
20000   50  0.6 80     OK     3.89157e-07 2.88845  -       -       4.264   1.75  -       -       12.048  6.88457  16.312
20000   50  0.6 80     OK     3.89157e-07 4.57942  -       -       5.327   1.75  -       -       20.112  11.4926  25.439
50000   50  0.6 50     OK     3.36738e-07 17.7675  -       -       25.382  0.25  -       -       16.676  16.676   42.058
50000   50  0.6 80     OK     8.0719e-07  5.84224  -       -       8.886   1.25  -       -       20.28   16.224   29.166
50000   50  0.6 80     OK     8.0719e-07  11.8111  -       -       13.462  1.25  -       -       25.664  20.5312  39.126
100000  50  0.6 50     OK     3.38714e-07 36.6116  -       -       46.862  0.25  -       -       32.76   32.76    79.622
100000  50  0.6 80     OK     5.38892e-07 11.2358  -       -       16.873  1.25  -       -       38.068  30.4544  54.941
100000  50  0.6 80     OK     5.38892e-07 23.7198  -       -       26.775  1.25  -       -       42.591  34.0728  69.366
200000  50  0.6 50     OK     3.39268e-07 72.122   -       -       87.26   0.25  -       -       60.447  60.447   147.707
200000  50  0.6 80     OK     6.29825e-07 21.6599  -       -       32.303  1.25  -       -       70.61   56.488   102.913
200000  50  0.6 80     OK     6.29825e-07 47.2582  -       -       52.948  1.25  -       -       68.648  54.9184  121.596
500000  50  0.6 50     OK     3.3976e-07  180.605  -       -       209.079 0.25  -       -       138.295 138.295  347.374
500000  50  0.6 80     OK     3.69019e-07 53.0191  -       -       78.254  1.25  -       -       151.179 120.943  229.433
500000  50  0.6 80     OK     3.69019e-07 119.773  -       -       132.58  1.25  -       -       145.574 116.459  278.154
1000000 50  0.6 50     OK     3.39915e-07 365.034  -       -       416.407 0.25  -       -       269.678 269.678  686.085
1000000 50  0.6 80     OK     3.23257e-07 105.004  -       -       154.47  1.25  -       -       283.319 226.655  437.789
1000000 50  0.6 80     OK     3.23257e-07 239.087  -       -       263.941 1.25  -       -       276.787 221.43   540.728
1000    100 0.6 50     OK     5.97616e-07 4.00077  -       -       7.243   0.25  -       -       6.666   6.666    13.909
1000    100 0.6 80     OK     5.53426e-07 2.00237  -       -       3.114   1.25  -       -       8.113   6.4904   11.227
1000    100 0.6 80     OK     5.53426e-07 1.80656  -       -       2.257   1.25  -       -       15.121  12.0968  17.378
2000    100 0.6 50     OK     4.37446e-07 5.05494  -       -       9.905   0.25  -       -       6.829   6.829    16.734
2000    100 0.6 80     OK     4.40778e-07 2.62528  -       -       3.594   1.75  -       -       8.958   5.11886  12.552
2000    100 0.6 80     OK     4.40778e-07 2.43309  -       -       2.808   1.75  -       -       18.855  10.7743  21.663
5000    100 0.6 50     OK     4.30175e-07 8.28192  -       -       16.917  0.25  -       -       7.5     7.5      24.417
5000    100 0.6 80     OK     6.98907e-07 3.13933  -       -       4.273   1.25  -       -       8.832   7.0656   13.105
5000    100 0.6 80     OK     6.98907e-07 3.6208   -       -       4.16    1.25  -       -       15.234  12.1872  19.394
10000   100 0.6 50     OK     6.50769e-07 14.2578  -       -       29.806  0.25  -       -       8.81    8.81     38.616
10000   100 0.6 80     OK     8.91759e-07 4.26518  -       -       5.767   1.25  -       -       9.781   7.8248   15.548
10000   100 0.6 80     OK     8.91759e-07 5.94749  -       -       6.829   1.25  -       -       15.681  12.5448  22.51
20000   100 0.6 50     OK     4.97735e-07 30.9674  -       -       47.203  0.25  -       -       9.388   9.388    56.591
20000   100 0.6 80     OK     5.28544e-07 7.12845  -       -       9.483   1.25  -       -       12.32   9.856    21.803
20000   100 0.6 80     OK     5.28544e-07 11.192   -       -       12.422  1.25  -       -       18.328  14.6624  30.75
50000   100 0.6 50     OK     4.5336e-07  81.5822  -       -       100.663 0.25  -       -       19.404  19.404   120.067
50000   100 0.6 80     OK     4.67074e-07 17.363   -       -       22.929  1.25  -       -       23.517  18.8136  46.446
50000   100 0.6 80     OK     4.67074e-07 30.6861  -       -       33.539  1.25  -       -       29.033  23.2264  62.572
100000  100 0.6 50     OK     4.60069e-07 165.834  -       -       189.544 0.25  -       -       37.06   37.06    226.604
100000  100 0.6 80     OK     3.89742e-07 34.9436  -       -       45.351  1.25  -       -       45.195  36.156   90.546
100000  100 0.6 80     OK     3.89742e-07 62.9861  -       -       68.376  1.25  -       -       57.453  45.9624  125.829
200000  100 0.6 50     OK     4.63598e-07 333.074  -       -       365.632 0.25  -       -       67.701  67.701   433.333
200000  100 0.6 80     OK     3.22318e-07 69.4524  -       -       89.769  1.25  -       -       83.702  66.9616  173.471
200000  100 0.6 80     OK     3.22318e-07 126.919  -       -       137.277 1.25  -       -       82.501  66.0008  219.778
500000  100 0.6 50     OK     4.64476e-07 837.808  -       -       896.76  0.25  -       -       157.558 157.558  1054.32
500000  100 0.6 80     OK     2.93739e-07 175.48   -       -       224.59  1.25  -       -       184.773 147.818  409.363
500000  100 0.6 80     OK     2.93739e-07 321.549  -       -       346.251 1.25  -       -       183.415 146.732  529.666
1000000 100 0.6 50     OK     4.64976e-07 1679.22  -       -       1782.19 0.25  -       -       307.516 307.516  2089.7
1000000 100 0.6 80     OK     6.73662e-07 348.912  -       -       446.606 1.25  -       -       282.286 225.829  728.892
1000000 100 0.6 80     OK     6.73662e-07 641.177  -       -       689.857 1.25  -       -       284.057 227.246  973.914
1000    200 0.6 50     OK     6.02784e-07 8.97248  -       -       13.657  0.25  -       -       8.995   8.995    22.652
1000    200 0.6 80     OK     5.81246e-07 4.32659  -       -       5.409   1.25  -       -       12.536  10.0288  17.945
1000    200 0.6 80     OK     5.81246e-07 4.25014  -       -       4.722   1.25  -       -       19.397  15.5176  24.119
2000    200 0.6 50     OK     5.97527e-07 10.9186  -       -       20.453  0.25  -       -       11.896  11.896   32.349
2000    200 0.6 80     OK     7.40485e-07 4.36771  -       -       5.335   1.25  -       -       9.025   7.22     14.36
2000    200 0.6 80     OK     7.40485e-07 4.98102  -       -       5.462   1.25  -       -       15.096  12.0768  20.558
5000    200 0.6 50     OK     7.79818e-07 21.0029  -       -       40.001  0.25  -       -       12.884  12.884   52.885
5000    200 0.6 80     OK     9.78257e-07 6.45242  -       -       7.822   1.25  -       -       10.439  8.3512   18.261
5000    200 0.6 80     OK     9.78257e-07 8.8455   -       -       9.602   1.25  -       -       15.372  12.2976  24.974
10000   200 0.6 50     OK     6.37447e-07 37.2705  -       -       81.206  0.25  -       -       14.722  14.722   95.928
10000   200 0.6 80     OK     4.50303e-07 9.28336  -       -       11.733  1.25  -       -       15.163  12.1304  26.896
10000   200 0.6 80     OK     4.50303e-07 14.7442  -       -       16.111  1.25  -       -       21.031  16.8248  37.142
20000   200 0.6 50     OK     5.8242e-07  70.7506  -       -       158.874 0.25  -       -       19.007  19.007   177.881
20000   200 0.6 80     OK     6.02812e-07 15.32    -       -       19.615  1.25  -       -       17.658  14.1264  37.273
20000   200 0.6 80     OK     6.02812e-07 27.5198  -       -       29.741  1.25  -       -       23.769  19.0152  53.51
50000   200 0.6 50     OK     6.05333e-07 200.854  -       -       294.913 0.25  -       -       26.028  26.028   320.941
50000   200 0.6 80     OK     9.64536e-07 39.1882  -       -       49.597  1.25  -       -       25.279  20.2232  74.876
50000   200 0.6 80     OK     9.64536e-07 75.0323  -       -       80.404  1.25  -       -       31.217  24.9736  111.621
100000  200 0.6 50     OK     6.24377e-07 417.824  -       -       520.855 0.25  -       -       49.087  49.087   569.942
100000  200 0.6 80     OK     8.70294e-07 83.2701  -       -       103.533 1.25  -       -       48.608  38.8864  152.141
100000  200 0.6 80     OK     8.70294e-07 158.922  -       -       169.323 1.25  -       -       55.472  44.3776  224.795
200000  200 0.6 50     OK     6.33242e-07 852.111  -       -       972.795 0.25  -       -       90.315  90.315   1063.11
200000  200 0.6 80     OK     4.99855e-07 169.714  -       -       209.451 1.25  -       -       92.2    73.76    301.651
200000  200 0.6 80     OK     4.99855e-07 325.967  -       -       346.216 1.25  -       -       95.29   76.232   441.506
500000  200 0.6 50     OK     6.3887e-07  2154.36  -       -       2327.58 0.25  -       -       209.802 209.802  2537.38
500000  200 0.6 80     OK     3.6468e-07  434.959  -       -       532.997 1.25  -       -       260.854 208.683  793.851
500000  200 0.6 80     OK     3.6468e-07  837.309  -       -       886.494 1.25  -       -       273.844 219.075  1160.34
1000000 200 0.6 50     OK     6.40405e-07 4373.88  -       -       4638.2  0.25  -       -       414.249 414.249  5052.45
1000000 200 0.6 80     OK     3.39284e-07 868.853  -       -       1063.68 1.25  -       -       402.978 322.382  1466.66
1000000 200 0.6 80     OK     3.39284e-07 1676.88  -       -       1774.64 1.25  -       -       430.187 344.15   2204.82
1000    500 0.6 50     OK     8.64151e-07 10.9167  -       -       11.491  0.25  -       -       9.236   9.236    20.727
1000    500 0.6 80     OK     8.64151e-07 11.0703  -       -       11.699  0.25  -       -       9.659   9.659    21.358
1000    500 0.6 80     OK     8.64151e-07 11.1435  -       -       11.73   0.25  -       -       9.406   9.406    21.136
2000    500 0.6 50     OK     9.01778e-07 30.1643  -       -       59.715  0.25  -       -       19.492  19.492   79.207
2000    500 0.6 80     OK     9.57158e-07 12.417   -       -       13.72   1.25  -       -       18.428  14.7424  32.148
2000    500 0.6 80     OK     9.57158e-07 18.1367  -       -       18.817  1.25  -       -       22.561  18.0488  41.378
5000    500 0.6 50     OK     6.41326e-07 80.5956  -       -       188.55  0.25  -       -       31.09   31.09    219.64
5000    500 0.6 80     OK     9.88144e-07 24.3544  -       -       26.95   1.25  -       -       20.102  16.0816  47.052
5000    500 0.6 80     OK     9.88144e-07 36.0011  -       -       37.43   1.25  -       -       24.779  19.8232  62.209
10000   500 0.6 50     OK     6.64306e-07 162.022  -       -       394.053 0.25  -       -       33.931  33.931   427.984
10000   500 0.6 80     OK     3.02423e-07 36.423   -       -       41.75   2.25  -       -       44.664  19.8507  86.414
10000   500 0.6 80     OK     3.02423e-07 66.8119  -       -       69.615  2.25  -       -       54.681  24.3027  124.296
20000   500 0.6 50     OK     8.6418e-07  328.656  -       -       820.267 0.25  -       -       43.535  43.535   863.802
20000   500 0.6 80     OK     5.15366e-07 66.5349  -       -       76.575  1.25  -       -       39.304  31.4432  115.879
20000   500 0.6 80     OK     5.15366e-07 126.824  -       -       131.863 1.25  -       -       46.868  37.4944  178.731
50000   500 0.6 50     OK     7.23157e-07 830.507  -       -       2035.46 0.25  -       -       73.567  73.567   2109.03
50000   500 0.6 80     OK     3.16642e-07 157.927  -       -       182.934 1.25  -       -       64.027  51.2216  246.961
50000   500 0.6 80     OK     3.16642e-07 310.831  -       -       323.344 1.25  -       -       75.345  60.276   398.689
100000  500 0.6 50     OK     9.20298e-07 1871.3   -       -       3094.29 0.25  -       -       88.216  88.216   3182.51
100000  500 0.6 80     OK     4.81977e-07 328.488  -       -       377.484 1.25  -       -       113.888 91.1104  491.372
100000  500 0.6 80     OK     4.81977e-07 650.768  -       -       675.434 1.25  -       -       155.93  124.744  831.364
200000  500 0.6 50     OK     9.53431e-07 3956.88  -       -       5223.93 0.25  -       -       162.644 162.644  5386.58
200000  500 0.6 80     OK     3.91306e-07 728.287  -       -       826.315 1.25  -       -       227.093 181.674  1053.41
200000  500 0.6 80     OK     3.91306e-07 1428.07  -       -       1477.06 1.25  -       -       307.156 245.725  1784.22
500000  500 0.6 50     OK     9.70971e-07 10212.8  -       -       11612.5 0.25  -       -       377.606 377.606  11990.1
500000  500 0.6 80     OK     7.06161e-07 1939.83  -       -       2183.6  1.25  -       -       398.154 318.523  2581.75
500000  500 0.6 80     OK     7.06161e-07 3805.83  -       -       3927    1.25  -       -       555.78  444.624  4482.78
1000000 500 0.6 50     OoM (in setup stage)
1000000 500 0.6 80     OoM (in setup stage)
1000000 500 0.6 80     OoM (in setup stage)