3-30-2016 New Banded Results



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed (values: 0 or 1)
• K-DB: the half-bandwidth after DB reordering (without any drop-off). If DB reordering is not executed, this reports the original half-bandwidth
• KnoDrp: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether the problem was solved. OK means it was solved; otherwise the reason for the failure is given
• Bstng: indicates whether diagonal boosting is enabled during factorization (values: 0 or 1)
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off-diagonal elements to reduce the bandwidth (done on the CPU)
• T-Dtransf: time to transfer data from the CPU to the GPU
• T-Asmbl: time to copy the sparse matrix, after reordering and drop-off, into a banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solve stage: BiCGStab2 (0), BiCGStab (1), or CG (2)
• nItrs: the number of Krylov iterations needed to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time the commercial solver "Pardiso" takes to solve the problem
• SlwD: the slowdown of our solver relative to Pardiso; a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and in red if we run more than 5 times slower
• Fastest: the fastest time SaP has historically achieved on this problem
• SpdUp: the speedup of this run relative to the historical fastest run. The value is shown in green if the speedup is more than 5% and in red if the slowdown is more than 5%
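
The SlwD entry can be illustrated in a few lines. This is a minimal sketch, assuming SlwD is the ratio Total / Pardiso (which is consistent with a value below one meaning SaP is faster). The function names and the sample times are hypothetical; they are not taken from the SaP code or from any run reported here.

```python
def slowdown(total_ms, pardiso_ms):
    """Slowdown of SaP relative to Pardiso, assumed to be Total / Pardiso."""
    return total_ms / pardiso_ms


def slowdown_color(slwd):
    """Color rule from the legend: green if faster, red if more than 5x slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None  # no highlighting in between


# Hypothetical times in milliseconds, for illustration only.
for total_ms, pardiso_ms in [(50.0, 100.0), (200.0, 100.0), (600.0, 100.0)]:
    ratio = slowdown(total_ms, pardiso_ms)
    print(ratio, slowdown_color(ratio))
```

With these made-up inputs the three ratios are 0.5, 2.0 and 6.0, so only the first would be shown in green and only the last in red.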


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = NNZ / ((2K+1)N), where NNZ is the actual number of non-zeros.
3) All times reported are in milliseconds (1E-3 second).
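
The two formulas in notes 1 and 2 can be sketched as follows. This is an illustration under stated assumptions, not the SaP implementation: the matrix is given as a dense list of rows, K is taken to be the largest row half-bandwidth on either side of the diagonal, and the name band_stats is hypothetical.

```python
def band_stats(a):
    """Return (K, nuKf, FRate) for a square matrix given as a list of rows.

    Assumes K is the largest distance of any non-zero from the diagonal.
    """
    n = len(a)
    k_left = [0] * n   # K_{iLeft}: row half-bandwidth left of the diagonal
    k_right = [0] * n  # K_{iRight}: row half-bandwidth right of the diagonal
    nnz = 0
    for i, row in enumerate(a):
        for j, value in enumerate(row):
            if value != 0:
                nnz += 1
                if j < i:
                    k_left[i] = max(k_left[i], i - j)
                else:
                    k_right[i] = max(k_right[i], j - i)
    k = max(max(k_left), max(k_right))
    if k > 0:
        # nuKf = 1/(2KN) * sum_i (2K - K_{iLeft} - K_{iRight})
        nukf = sum(2 * k - l - r for l, r in zip(k_left, k_right)) / (2 * k * n)
    else:
        nukf = 0.0  # diagonal matrix: treated as perfectly uniform
    # FRate = NNZ / ((2K+1)N)
    frate = nnz / ((2 * k + 1) * n)
    return k, nukf, frate


# A 4x4 tridiagonal matrix as a small example.
tridiagonal = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 2],
]
print(band_stats(tridiagonal))
```

For this example K = 1, nuKf = 0.25 and FRate = 10/12. Under this reading of K_{iLeft} and K_{iRight}, the first and last rows are truncated by the matrix boundary, so even a full band gives a small non-zero nuKf for tiny N.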


N        K    d    NPrtns  Solves  RelR         T-LU      T-SwDef  T-MMDef  T-PreP   nItrs  T-SwInf  T-MVInf  T-Kry    T-KryPIt  T-Total
1000     10   0.6  50      OK      8.00678e-08  0.817568  -        -        2.197    0.75   -        -        2.95     2.95      5.147
1000     10   0.6  80      OK      3.57013e-07  0.695008  -        -        1.387    2.75   -        -        8.139    2.95964   9.526
1000     10   0.6  80      OK      3.57013e-07  0.832672  -        -        0.981    2.75   -        -        22.656   8.23855   23.637
2000     10   0.6  50      OK      1.11626e-07  0.956832  -        -        2.226    0.25   -        -        2.481    2.481     4.707
2000     10   0.6  80      OK      2.17844e-07  0.719584  -        -        1.523    2.75   -        -        8.097    2.94436   9.62
2000     10   0.6  80      OK      2.17844e-07  0.843456  -        -        0.988    2.75   -        -        22.523   8.19018   23.511
5000     10   0.6  50      OK      4.06577e-09  0.8816    -        -        2.582    0.25   -        -        2.244    2.244     4.826
5000     10   0.6  80      OK      1.07166e-07  0.62368   -        -        1.532    2.75   -        -        8.894    3.23418   10.426
5000     10   0.6  80      OK      1.07166e-07  0.838208  -        -        1.138    2.75   -        -        23.593   8.57927   24.731
10000    10   0.6  50      OK      3.28059e-16  1.0776    -        -        3.011    0.25   -        -        3.82     3.82      6.831
10000    10   0.6  80      OK      6.91353e-07  0.697632  -        -        1.646    2.75   -        -        10.909   3.96691   12.555
10000    10   0.6  80      OK      6.91353e-07  0.850752  -        -        1.179    2.75   -        -        25.154   9.14691   26.333
20000    10   0.6  50      OK      3.17874e-16  1.73728   -        -        3.735    0.25   -        -        6.175    6.175     9.91
20000    10   0.6  80      OK      1.30571e-07  1.31171   -        -        2.388    2.75   -        -        13.868   5.04291   16.256
20000    10   0.6  80      OK      1.30571e-07  1.07008   -        -        1.463    2.75   -        -        27.589   10.0324   29.052
50000    10   0.6  50      OK      3.17973e-16  3.72026   -        -        6.619    0.25   -        -        14.416   14.416    21.035
50000    10   0.6  80      OK      8.46377e-07  1.9727    -        -        3.636    2.25   -        -        26.984   11.9929   30.62
50000    10   0.6  80      OK      8.46377e-07  1.95942   -        -        2.86     2.25   -        -        35.684   15.8596   38.544
100000   10   0.6  50      OK      3.17875e-16  6.97565   -        -        11.215   0.25   -        -        26.39    26.39     37.605
100000   10   0.6  80      OK      4.57895e-07  3.17661   -        -        5.902    2.25   -        -        43.758   19.448    49.66
100000   10   0.6  80      OK      4.57895e-07  3.38973   -        -        4.903    2.25   -        -        49.561   22.0271   54.464
200000   10   0.6  50      OK      3.17563e-16  13.4552   -        -        19.527   0.25   -        -        48.385   48.385    67.912
200000   10   0.6  80      OK      9.29011e-07  5.59526   -        -        9.966    2.25   -        -        74.631   33.1693   84.597
200000   10   0.6  80      OK      9.29011e-07  6.24627   -        -        8.445    2.25   -        -        66.62    29.6089   75.065
500000   10   0.6  50      OK      3.17572e-16  33.3575   -        -        45.263   0.25   -        -        111.135  111.135   156.398
500000   10   0.6  80      OK      5.61586e-07  12.662    -        -        22.129   2.25   -        -        146.892  65.2853   169.021
500000   10   0.6  80      OK      5.61586e-07  14.8121   -        -        19.068   2.25   -        -        133.548  59.3547   152.616
1000000  10   0.6  50      OK      3.17757e-16  65.5384   -        -        86.718   0.25   -        -        214.207  214.207   300.925
1000000  10   0.6  80      OK      3.18509e-07  24.5683   -        -        42.492   2.25   -        -        267.433  118.859   309.925
1000000  10   0.6  80      OK      3.18509e-07  29.073    -        -        36.668   2.25   -        -        246.68   109.636   283.348
1000     20   0.6  50      OK      5.03984e-10  0.83312   -        -        2.508    0.75   -        -        3.324    3.324     5.832
1000     20   0.6  80      OK      5.21227e-07  0.620192  -        -        1.434    2.25   -        -        6.738    2.99467   8.172
1000     20   0.6  80      OK      5.21227e-07  0.822272  -        -        0.975    2.25   -        -        17.468   7.76356   18.443
2000     20   0.6  50      OK      5.2288e-10   0.881216  -        -        3.379    0.75   -        -        3.612    3.612     6.991
2000     20   0.6  80      OK      3.38398e-07  0.618304  -        -        1.417    2.25   -        -        6.753    3.00133   8.17
2000     20   0.6  80      OK      3.38398e-07  0.851776  -        -        1.147    2.25   -        -        17.74    7.88444   18.887
5000     20   0.6  50      OK      8.94553e-10  1.19126   -        -        3.71     0.25   -        -        3.767    3.767     7.477
5000     20   0.6  80      OK      8.56005e-07  0.691264  -        -        1.645    1.75   -        -        6.512    3.72114   8.157
5000     20   0.6  80      OK      8.56005e-07  0.853664  -        -        1.198    1.75   -        -        16.286   9.30629   17.484
10000    20   0.6  50      OK      2.35799e-11  2.01062   -        -        4.68     0.25   -        -        3.891    3.891     8.571
10000    20   0.6  80      OK      5.06078e-07  1.17347   -        -        2.226    1.75   -        -        8.153    4.65886   10.379
10000    20   0.6  80      OK      5.06078e-07  1.13485   -        -        1.539    1.75   -        -        17.532   10.0183   19.071
20000    20   0.6  50      OK      4.01432e-16  3.5839    -        -        6.567    0.25   -        -        6.257    6.257     12.824
20000    20   0.6  80      OK      3.86471e-07  1.48166   -        -        2.749    1.75   -        -        10.291   5.88057   13.04
20000    20   0.6  80      OK      3.86471e-07  1.67955   -        -        2.242    1.75   -        -        19.327   11.044    21.569
50000    20   0.6  50      OK      4.061e-16    8.18842   -        -        12.459   0.25   -        -        15.058   15.058    27.517
50000    20   0.6  80      OK      4.05497e-07  2.35728   -        -        4.816    1.75   -        -        21.052   12.0297   25.868
50000    20   0.6  80      OK      4.05497e-07  3.50925   -        -        4.819    1.75   -        -        28.29    16.1657   33.109
100000   20   0.6  50      OK      4.0605e-16   16.5372   -        -        22.588   0.25   -        -        27.511   27.511    50.099
100000   20   0.6  80      OK      2.45005e-07  4.00659   -        -        8.481    1.75   -        -        34.189   19.5366   42.67
100000   20   0.6  80      OK      2.45005e-07  6.59776   -        -        8.931    1.75   -        -        40.089   22.908    49.02
200000   20   0.6  50      OK      4.06458e-16  32.8206   -        -        41.996   0.25   -        -        50.962   50.962    92.958
200000   20   0.6  80      OK      1.49427e-07  7.1104    -        -        15.001   1.75   -        -        66.827   38.1869   81.828
200000   20   0.6  80      OK      1.49427e-07  12.671    -        -        16.495   1.75   -        -        60.358   34.4903   76.853
500000   20   0.6  50      OK      4.06573e-16  82.57     -        -        100.851  0.25   -        -        116.678  116.678   217.529
500000   20   0.6  80      OK      1.02155e-07  16.7895   -        -        35.195   1.75   -        -        132.678  75.816    167.873
500000   20   0.6  80      OK      1.02155e-07  31.1332   -        -        39.366   1.75   -        -        120.96   69.12     160.326
1000000  20   0.6  50      OK      4.06679e-16  156.851   -        -        189.966  0.25   -        -        223.739  223.739   413.705
1000000  20   0.6  80      OK      7.20293e-08  32.4026   -        -        68.105   1.75   -        -        239.208  136.69    307.313
1000000  20   0.6  80      OK      7.20293e-08  61.574    -        -        77.231   1.75   -        -        224.204  128.117   301.435
1000     50   0.6  50      OK      8.37586e-07  1.32656   -        -        4.317    0.25   -        -        4.158    4.158     8.475
1000     50   0.6  80      OK      5.15115e-08  0.84608   -        -        1.754    1.75   -        -        6.641    3.79486   8.395
1000     50   0.6  80      OK      5.15115e-08  1.00227   -        -        1.288    1.75   -        -        16.6     9.48571   17.888
2000     50   0.6  50      OK      7.09194e-13  1.52618   -        -        5.472    0.75   -        -        5.211    5.211     10.683
2000     50   0.6  80      OK      4.78034e-07  0.839744  -        -        1.766    1.75   -        -        7.125    4.07143   8.891
2000     50   0.6  80      OK      4.78034e-07  1.14294   -        -        1.464    1.75   -        -        16.836   9.62057   18.3
5000     50   0.6  50      OK      1.84129e-12  2.45178   -        -        8.504    0.75   -        -        5.677    5.677     14.181
5000     50   0.6  80      OK      1.94855e-08  1.48342   -        -        2.586    1.75   -        -        7.497    4.284     10.083
5000     50   0.6  80      OK      1.94855e-08  1.84698   -        -        2.248    1.75   -        -        17.012   9.72114   19.26
10000    50   0.6  50      OK      1.87344e-10  5.22739   -        -        11.511   0.25   -        -        7.14     7.14      18.651
10000    50   0.6  80      OK      2.29515e-08  1.88221   -        -        3.206    1.75   -        -        8.823    5.04171   12.029
10000    50   0.6  80      OK      2.29515e-08  2.52208   -        -        3.058    1.75   -        -        17.826   10.1863   20.884
20000    50   0.6  50      OK      1.35814e-11  10.8566   -        -        17.715   0.25   -        -        7.792    7.792     25.507
20000    50   0.6  80      OK      1.59875e-08  3.13277   -        -        5.095    1.75   -        -        11.616   6.63771   16.711
20000    50   0.6  80      OK      1.59875e-08  5.10659   -        -        5.919    1.75   -        -        20.178   11.5303   26.097
50000    50   0.6  50      OK      6.36376e-16  29.1401   -        -        38.111   0.25   -        -        18.397   18.397    56.508
50000    50   0.6  80      OK      7.46547e-07  7.14931   -        -        11.669   1.25   -        -        23.443   18.7544   35.112
50000    50   0.6  80      OK      7.46547e-07  13.0004   -        -        14.906   1.25   -        -        28.307   22.6456   43.213
100000   50   0.6  50      OK      6.43815e-16  55.9851   -        -        68.028   0.25   -        -        33.683   33.683    101.711
100000   50   0.6  80      OK      4.69721e-07  13.5514   -        -        22.246   1.25   -        -        39.893   31.9144   62.139
100000   50   0.6  80      OK      4.69721e-07  25.7011   -        -        29.304   1.25   -        -        43.501   34.8008   72.805
200000   50   0.6  50      OK      6.45465e-16  118.024   -        -        135.683  0.25   -        -        61.742   61.742    197.425
200000   50   0.6  80      OK      5.81929e-07  26.5349   -        -        42.709   1.25   -        -        76.32    61.056    119.029
200000   50   0.6  80      OK      5.81929e-07  52.3704   -        -        58.701   1.25   -        -        71.538   57.2304   130.239
500000   50   0.6  50      OK      6.4646e-16   294.792   -        -        329.355  0.25   -        -        143.129  143.129   472.484
500000   50   0.6  80      OK      2.89614e-07  67.0134   -        -        106.046  1.25   -        -        159.194  127.355   265.24
500000   50   0.6  80      OK      2.89614e-07  129.176   -        -        143.583  1.25   -        -        151.938  121.55    295.521
1000000  50   0.6  50      OK      6.45776e-16  597.797   -        -        659.843  0.25   -        -        278.876  278.876   938.719
1000000  50   0.6  80      OK      2.37119e-07  133.445   -        -        210.33   1.25   -        -        296.892  237.514   507.222
1000000  50   0.6  80      OK      2.37119e-07  259.647   -        -        287.444  1.25   -        -        288.261  230.609   575.705
1000     100  0.6  50      OK      3.61969e-08  4.24115   -        -        8.411    0.25   -        -        6.781    6.781     15.192
1000     100  0.6  80      OK      3.26765e-07  2.01808   -        -        2.961    1.25   -        -        7.859    6.2872    10.82
1000     100  0.6  80      OK      3.26765e-07  2.14534   -        -        2.468    1.25   -        -        14.82    11.856    17.288
2000     100  0.6  50      OK      4.28075e-08  5.45562   -        -        11.122   0.25   -        -        7.093    7.093     18.215
2000     100  0.6  80      OK      7.88853e-09  2.56224   -        -        3.601    1.75   -        -        8.77     5.01143   12.371
2000     100  0.6  80      OK      7.88853e-09  2.61338   -        -        2.98     1.75   -        -        18.542   10.5954   21.522
5000     100  0.6  50      OK      8.34302e-08  9.15306   -        -        19.384   0.25   -        -        8.022    8.022     27.406
5000     100  0.6  80      OK      5.50533e-07  3.35491   -        -        4.654    1.25   -        -        8.411    6.7288    13.065
5000     100  0.6  80      OK      5.50533e-07  3.99072   -        -        4.532    1.25   -        -        15.311   12.2488   19.843
10000    100  0.6  50      OK      1.32711e-07  15.619    -        -        34.715   0.25   -        -        10.078   10.078    44.793
10000    100  0.6  80      OK      7.7227e-07   4.74819   -        -        6.687    1.25   -        -        9.83     7.864     16.517
10000    100  0.6  80      OK      7.7227e-07   6.45315   -        -        7.265    1.25   -        -        15.713   12.5704   22.978
20000    100  0.6  50      OK      2.44804e-07  34.2656   -        -        54.497   0.25   -        -        10.445   10.445    64.942
20000    100  0.6  80      OK      2.73721e-07  7.61094   -        -        11.073   1.25   -        -        12.284   9.8272    23.357
20000    100  0.6  80      OK      2.73721e-07  12.3096   -        -        13.648   1.25   -        -        18.489   14.7912   32.137
50000    100  0.6  50      OK      9.82483e-16  90.6287   -        -        114.534  0.25   -        -        23.093   23.093    137.627
50000    100  0.6  80      OK      1.23672e-07  19.4731   -        -        27.794   1.25   -        -        27.53    22.024    55.324
50000    100  0.6  80      OK      1.23672e-07  34.1606   -        -        37.452   1.25   -        -        33.508   26.8064   70.96
100000   100  0.6  50      OK      8.69244e-16  185.838   -        -        215.867  0.25   -        -        42.837   42.837    258.704
100000   100  0.6  80      OK      9.87577e-08  39.5595   -        -        55.658   1.25   -        -        47.599   38.0792   103.257
100000   100  0.6  80      OK      9.87577e-08  71.4022   -        -        77.719   1.25   -        -        62.252   49.8016   139.971
200000   100  0.6  50      OK      8.74206e-16  372.668   -        -        413.292  0.25   -        -        76.558   76.558    489.85
200000   100  0.6  80      OK      6.13254e-08  78.0877   -        -        109.405  1.25   -        -        90.58    72.464    199.985
200000   100  0.6  80      OK      6.13254e-08  140.75    -        -        152.369  1.25   -        -        90.756   72.6048   243.125
500000   100  0.6  50      OK      8.7647e-16   935.144   -        -        1008.76  0.25   -        -        177.599  177.599   1186.36
500000   100  0.6  80      OK      1.02003e-07  199.523   -        -        276.315  1.25   -        -        197.878  158.302   474.193
500000   100  0.6  80      OK      1.02003e-07  361.851   -        -        389.657  1.25   -        -        204.624  163.699   594.281
1000000  100  0.6  50      OK      8.78674e-16  1881.97   -        -        2010.79  0.25   -        -        347.098  347.098   2357.89
1000000  100  0.6  80      OK      6.22013e-07  390.994   -        -        543.418  1.25   -        -        302.586  242.069   846.004
1000000  100  0.6  80      OK      6.22013e-07  708.069   -        -        763.084  1.25   -        -        317.848  254.278   1080.93
1000     200  0.6  50      OK      1.18998e-15  9.65002   -        -        15.596   0.25   -        -        9.252    9.252     24.848
1000     200  0.6  80      OK      2.33239e-08  4.95686   -        -        6.011    1.25   -        -        12.951   10.3608   18.962
1000     200  0.6  80      OK      2.33239e-08  4.89568   -        -        5.25     1.25   -        -        18.443   14.7544   23.693
2000     200  0.6  50      OK      8.2636e-09   12.102    -        -        23.823   0.25   -        -        12.528   12.528    36.351
2000     200  0.6  80      OK      4.51167e-07  4.80723   -        -        5.898    1.25   -        -        9.155    7.324     15.053
2000     200  0.6  80      OK      4.51167e-07  5.54438   -        -        6.009    1.25   -        -        14.35    11.48     20.359
5000     200  0.6  50      OK      6.42596e-09  24.273    -        -        51.561   0.25   -        -        13.998   13.998    65.559
5000     200  0.6  80      OK      7.74855e-07  7.06336   -        -        8.974    1.25   -        -        10.193   8.1544    19.167
5000     200  0.6  80      OK      7.74855e-07  10.237    -        -        11.029   1.25   -        -        14.7     11.76     25.729
10000    200  0.6  50      OK      6.46827e-09  42.7673   -        -        102.52   0.25   -        -        16.703   16.703    119.223
10000    200  0.6  80      OK      8.8246e-07   10.3517   -        -        13.706   1.25   -        -        11.695   9.356     25.401
10000    200  0.6  80      OK      8.8246e-07   16.6881   -        -        18.042   1.25   -        -        17.267   13.8136   35.309
20000    200  0.6  50      OK      6.76893e-09  81.1198   -        -        202.239  0.25   -        -        22.609   22.609    224.848
20000    200  0.6  80      OK      9.33744e-08  17.1732   -        -        23.701   1.25   -        -        17.851   14.2808   41.552
20000    200  0.6  80      OK      9.33744e-08  30.8237   -        -        33.289   1.25   -        -        24.541   19.6328   57.83
50000    200  0.6  50      OK      9.84119e-10  232.826   -        -        361.287  0.25   -        -        32.315   32.315    393.602
50000    200  0.6  80      OK      7.35074e-07  45.4467   -        -        61.431   1.25   -        -        29.172   23.3376   90.603
50000    200  0.6  80      OK      7.35074e-07  86.3513   -        -        92.407   1.25   -        -        37.713   30.1704   130.12
100000   200  0.6  50      OK      1.18308e-15  485.314   -        -        625.144  0.25   -        -        57.772   57.772    682.916
100000   200  0.6  80      OK      6.44932e-07  97.0158   -        -        128.593  1.25   -        -        52.869   42.2952   181.462
100000   200  0.6  80      OK      6.44932e-07  185.478   -        -        197.551  1.25   -        -        67.878   54.3024   265.429
200000   200  0.6  50      OK      1.19927e-15  991.031   -        -        1152.75  0.25   -        -        106.685  106.685   1259.44
200000   200  0.6  80      OK      1.81202e-07  194.996   -        -        256.945  1.25   -        -        102.598  82.0784   359.543
200000   200  0.6  80      OK      1.81202e-07  370.439   -        -        393.253  1.25   -        -        121.351  97.0808   514.604
500000   200  0.6  50      OK      1.20823e-15  2531.02   -        -        2760.91  0.25   -        -        249.954  249.954   3010.86
500000   200  0.6  80      OK      1.82869e-08  508.88    -        -        662.022  1.25   -        -        288.784  231.027   950.806
500000   200  0.6  80      OK      1.82869e-08  969.774   -        -        1025.54  1.25   -        -        353.043  282.434   1378.58
1000000  200  0.6  50      OK      1.21097e-15  5078.22   -        -        5419.47  0.25   -        -        489.768  489.768   5909.24
1000000  200  0.6  80      OK      1.01114e-07  994.096   -        -        1298.64  1.25   -        -        446.527  357.222   1745.17
1000000  200  0.6  80      OK      1.01114e-07  1902.59   -        -        2013.15  1.25   -        -        560.465  448.372   2573.61
1000     500  0.6  50      OK      1.61434e-15  13.2392   -        -        13.732   0.25   -        -        10.57    10.57     24.302
1000     500  0.6  80      OK      1.61434e-15  13.2629   -        -        13.741   0.25   -        -        10.467   10.467    24.208
1000     500  0.6  80      OK      1.61434e-15  13.223    -        -        13.708   0.25   -        -        10.479   10.479    24.187
2000     500  0.6  50      OK      1.6814e-15   34.7228   -        -        72.601   0.25   -        -        22.351   22.351    94.952
2000     500  0.6  80      OK      3.80324e-08  14.4115   -        -        16.323   1.25   -        -        20.556   16.4448   36.879
2000     500  0.6  80      OK      3.80324e-08  20.6971   -        -        21.456   1.25   -        -        25.147   20.1176   46.603
5000     500  0.6  50      OK      4.23621e-10  93.5126   -        -        228.331  0.25   -        -        35.202   35.202    263.533
5000     500  0.6  80      OK      6.01198e-08  28.1124   -        -        31.92    1.25   -        -        21.857   17.4856   53.777
5000     500  0.6  80      OK      6.01198e-08  41.478    -        -        43.058   1.25   -        -        26.774   21.4192   69.832
10000    500  0.6  50      OK      3.68614e-10  193.656   -        -        496.479  0.25   -        -        39.14    39.14     535.619
10000    500  0.6  80      OK      6.86314e-08  42.583    -        -        50.544   1.25   -        -        25.202   20.1616   75.746
10000    500  0.6  80      OK      6.86314e-08  76.3219   -        -        79.288   1.25   -        -        29.939   23.9512   109.227
20000    500  0.6  50      OK      4.41885e-10  389.686   -        -        1059.11  0.25   -        -        51.064   51.064    1110.18
20000    500  0.6  80      OK      6.67022e-08  77.0025   -        -        92.623   1.25   -        -        31.004   24.8032   123.627
20000    500  0.6  80      OK      6.67022e-08  147.015   -        -        152.707  1.25   -        -        37.954   30.3632   190.661
50000    500  0.6  50      OK      7.41744e-10  990.094   -        -        2683.28  0.25   -        -        101.628  101.628   2784.91
50000    500  0.6  80      OK      5.56515e-08  183.479   -        -        222.046  1.25   -        -        53.804   43.0432   275.85
50000    500  0.6  80      OK      5.56515e-08  362.906   -        -        377.111  1.25   -        -        69.092   55.2736   446.203
100000   500  0.6  50      OK      4.03985e-09  2240.78   -        -        3966.61  0.25   -        -        123.621  123.621   4090.23
100000   500  0.6  80      OK      1.8079e-07   391.295   -        -        468.03   1.25   -        -        96.893   77.5144   564.923
100000   500  0.6  80      OK      1.8079e-07   775.17    -        -        803.318  1.25   -        -        144.281  115.425   947.599
200000   500  0.6  50      OK      1.80715e-15  4744.2    -        -        6527.42  0.25   -        -        231.49   231.49    6758.91
200000   500  0.6  80      OK      4.80574e-08  842.812   -        -        995.418  1.25   -        -        196.151  156.921   1191.57
200000   500  0.6  80      OK      4.80574e-08  1672.06   -        -        1727.54  1.25   -        -        293.063  234.45    2020.6
500000   500  0.6  50      OoM (in setup stage)
500000   500  0.6  80      OK      2.15108e-08  2288.58   -        -        2668.72  1.25   -        -        472.735  378.188   3141.46
500000   500  0.6  80      OK      2.15108e-08  4595.7    -        -        4734.14  1.25   -        -        740.979  592.783   5475.12
1000000  500  0.6  50      OoM (in setup stage)
1000000  500  0.6  80      OoM (in setup stage)
1000000  500  0.6  80      OoM (in setup stage)
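
As a sanity check on how the timing columns relate, the sketch below recomputes two values for the second data row (N = 1000, K = 10, NPrtns = 80), reading that row's values 1.387, 2.75, 8.139, 2.95964 and 9.526 as T-PreP, nItrs, T-Kry, T-KryPIt and T-Total. It assumes T-Total = T-PreP + T-Kry, as the legend states for Total, and that T-KryPIt is T-Kry divided by nItrs. The second relation is an inference from the numbers, not something the legend states, and the helper name is hypothetical.

```python
def derived_times(t_prep, n_itrs, t_kry):
    """Return (T-Total, T-KryPIt) in milliseconds under the stated assumptions."""
    t_total = t_prep + t_kry          # Total = T-PreP + T-Kry
    t_kry_per_it = t_kry / n_itrs     # assumed: Krylov time per iteration
    return t_total, t_kry_per_it


t_total, t_kry_per_it = derived_times(t_prep=1.387, n_itrs=2.75, t_kry=8.139)
print(round(t_total, 3), round(t_kry_per_it, 5))  # 9.526 2.95964
```

Both values match the row. Rows with nItrs below 1 report T-KryPIt equal to T-Kry, so the division only reproduces the table when nItrs is at least 1.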