4-2-2016 New Banded Results



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed (values: 0 or 1)
• K-DB: the half-bandwidth after DB reordering (without any drop-off). If DB reordering is not performed, this reports the original half-bandwidth
• KnoDrp: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether the problem was solved. OK means it was solved; otherwise the reason for the failure is given
• Bstng: indicates whether diagonal boosting is enabled during factorization (values: 0 or 1)
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off-diagonal elements in order to decrease the bandwidth (done on the CPU)
• T-Dtransf: time for the data transfer from CPU to GPU
• T-Asmbl: time to copy the sparse matrix, after reordering and drop-off, into the banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to extract the off-diagonal right-hand sides (Bs and Cs) from the banded matrix (done on the GPU)
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solve stage: BiCGStab2 (0), BiCGStab (1), or CG (2)
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver relative to Pardiso. A value below 1 means that we are faster than Pardiso. The value is shown in green if we run faster and in red if we run more than 5 times slower.
• Fastest: the shortest time SaP has historically taken to solve the problem
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
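
The derived timing columns can be illustrated with a short Python sketch. It is only one reading of the definitions above, not the code that produced these results: the function names are invented for this example, SpdUp is assumed to be Fastest divided by Total, and the Pardiso time in the usage lines is a hypothetical value. The T-PreP and T-Kry inputs are those of the first result row (N = 1000, K = 10).

```python
def total_time(t_prep, t_kry):
    """Total = T-PreP + T-Kry (all times in milliseconds)."""
    return t_prep + t_kry


def slowdown(total, pardiso):
    """SlwD: our time divided by the Pardiso time; below 1 means we are faster."""
    return total / pardiso


def slwd_colour(slwd):
    """Highlight rule from the legend: green if faster, red if more than 5x slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None  # no highlight in between


def speedup(total, fastest):
    """SpdUp, ASSUMED here to be Fastest / Total (above 1 means a new best time)."""
    return fastest / total


total = total_time(0.996, 22.257)   # T-PreP and T-Kry of the first result row
pardiso_ms = 46.506                 # hypothetical Pardiso time, for illustration only
slwd = slowdown(total, pardiso_ms)
print(total, slwd, slwd_colour(slwd))
```

The colour rule for SpdUp is left out because the legend does not say whether the 5% thresholds are measured on the ratio or on its inverse.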


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = the actual number of NNZ / ((2K+1)N).
3) All times reported are in milliseconds (1E-3 seconds)
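
As a rough illustration of notes 1 and 2, the following Python sketch computes K, nuKf and FRate from the coordinates of the non-zeros. The function name `band_metrics` and the coordinate-list input are assumptions made for this example and are not part of SaP. K is taken to be the largest row half-bandwidth on either side of the diagonal, and a purely diagonal matrix (K = 0) is assigned nuKf = 0 to avoid dividing by zero.

```python
def band_metrics(n, nonzeros):
    """Return (K, nuKf, FRate) for an n-by-n matrix with n >= 1.

    nonzeros is an iterable of distinct (row, col) index pairs, 0-based.
    """
    k_left = [0] * n    # K_{iLeft}: reach of row i to the left of the diagonal
    k_right = [0] * n   # K_{iRight}: reach of row i to the right of the diagonal
    nnz = 0
    for i, j in nonzeros:
        nnz += 1
        if j < i:
            k_left[i] = max(k_left[i], i - j)
        else:
            k_right[i] = max(k_right[i], j - i)

    k = max(max(k_left), max(k_right))       # overall half-bandwidth K
    frate = nnz / ((2 * k + 1) * n)          # note 2
    if k == 0:
        return k, 0.0, frate                 # assumed convention for K = 0
    deficit = sum(2 * k - l - r for l, r in zip(k_left, k_right))
    return k, deficit / (2 * k * n), frate   # note 1


# 4x4 tridiagonal matrix: 10 non-zeros, K = 1
tridiag = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
print(band_metrics(4, tridiag))
```

In this tridiagonal case the first and last rows each lack one side of the band, so the sum in note 1 equals 2 and nuKf = 2/(2*1*4) = 0.25, while FRate = 10/12.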


| N | K | d | NPrtns | Solves | RelR | T-LU | T-SwDef | T-MMDef | T-PreP | nItrs | T-SwInf | T-MVInf | T-Kry | T-KryPIt | T-Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000 | 10 | 0.6 | 80 | OK | 3.57013e-07 | 0.855968 | | | 0.996 | 2.75 | | | 22.257 | 8.09345 | 23.253 |
| 1000 | 10 | 0.6 | 80 | OK | 3.57013e-07 | 0.626848 | | | 1.311 | 2.75 | | | 7.61 | 2.76727 | 8.921 |
| 1000 | 10 | 0.6 | 80 | OK | 3.57013e-07 | 1.31872 | | | 2.917 | 2.75 | | | 12.013 | 4.36836 | 14.93 |
| 2000 | 10 | 0.6 | 80 | OK | 2.17844e-07 | 0.883456 | | | 1.039 | 2.75 | | | 22.552 | 8.20073 | 23.591 |
| 2000 | 10 | 0.6 | 80 | OK | 2.17844e-07 | 0.622656 | | | 1.457 | 2.75 | | | 7.943 | 2.88836 | 9.4 |
| 2000 | 10 | 0.6 | 80 | OK | 2.17844e-07 | 1.47075 | | | 3.156 | 2.75 | | | 12.261 | 4.45855 | 15.417 |
| 5000 | 10 | 0.6 | 80 | OK | 1.07166e-07 | 0.852928 | | | 1.203 | 2.75 | | | 23.491 | 8.54218 | 24.694 |
| 5000 | 10 | 0.6 | 80 | OK | 1.07166e-07 | 0.659648 | | | 1.6 | 2.75 | | | 8.632 | 3.13891 | 10.232 |
| 5000 | 10 | 0.6 | 80 | OK | 1.07166e-07 | 1.31251 | | | 2.952 | 2.75 | | | 12.861 | 4.67673 | 15.813 |
| 10000 | 10 | 0.6 | 80 | OK | 6.91353e-07 | 0.799104 | | | 1.117 | 2.75 | | | 23.955 | 8.71091 | 25.072 |
| 10000 | 10 | 0.6 | 80 | OK | 6.91353e-07 | 0.627232 | | | 1.61 | 2.75 | | | 10.67 | 3.88 | 12.28 |
| 10000 | 10 | 0.6 | 80 | OK | 6.91353e-07 | 1.31334 | | | 2.997 | 2.75 | | | 14.977 | 5.44618 | 17.974 |
| 20000 | 10 | 0.6 | 80 | OK | 1.30571e-07 | 1.0377 | | | 1.426 | 2.75 | | | 26.519 | 9.64327 | 27.945 |
| 20000 | 10 | 0.6 | 80 | OK | 1.30571e-07 | 1.36109 | | | 2.563 | 2.75 | | | 13.886 | 5.04945 | 16.449 |
| 20000 | 10 | 0.6 | 80 | OK | 1.30571e-07 | 1.04717 | | | 2.977 | 2.75 | | | 17.007 | 6.18436 | 19.984 |
| 50000 | 10 | 0.6 | 80 | OK | 8.46377e-07 | 1.93446 | | | 2.81 | 2.25 | | | 40.23 | 17.88 | 43.04 |
| 50000 | 10 | 0.6 | 80 | OK | 8.46377e-07 | 2.13491 | | | 3.979 | 2.25 | | | 27.815 | 12.3622 | 31.794 |
| 50000 | 10 | 0.6 | 80 | OK | 8.46377e-07 | 2.88208 | | | 5.542 | 2.25 | | | 31.038 | 13.7947 | 36.58 |
| 100000 | 10 | 0.6 | 80 | OK | 4.57895e-07 | 3.51043 | | | 5.159 | 2.25 | | | 51.464 | 22.8729 | 56.623 |
| 100000 | 10 | 0.6 | 80 | OK | 4.57895e-07 | 3.1985 | | | 5.964 | 2.25 | | | 43.611 | 19.3827 | 49.575 |
| 100000 | 10 | 0.6 | 80 | OK | 4.57895e-07 | 3.89994 | | | 7.745 | 2.25 | | | 48.972 | 21.7653 | 56.717 |
| 200000 | 10 | 0.6 | 80 | OK | 9.29011e-07 | 6.38467 | | | 8.851 | 2.25 | | | 73.922 | 32.8542 | 82.773 |
| 200000 | 10 | 0.6 | 80 | OK | 9.29011e-07 | 5.62618 | | | 9.998 | 2.25 | | | 73.546 | 32.6871 | 83.544 |
| 200000 | 10 | 0.6 | 80 | OK | 9.29011e-07 | 6.38477 | | | 12.888 | 2.25 | | | 82.37 | 36.6089 | 95.258 |
| 500000 | 10 | 0.6 | 80 | OK | 5.61586e-07 | 14.8307 | | | 19.066 | 2.25 | | | 133.074 | 59.144 | 152.14 |
| 500000 | 10 | 0.6 | 80 | OK | 5.61586e-07 | 12.6148 | | | 22.084 | 2.25 | | | 145.89 | 64.84 | 167.974 |
| 500000 | 10 | 0.6 | 80 | OK | 5.61586e-07 | 13.3105 | | | 27.581 | 2.25 | | | 166.591 | 74.0404 | 194.172 |
| 1000000 | 10 | 0.6 | 80 | OK | 3.18509e-07 | 29.1771 | | | 36.883 | 2.25 | | | 248.509 | 110.448 | 285.392 |
| 1000000 | 10 | 0.6 | 80 | OK | 3.18509e-07 | 24.5115 | | | 42.375 | 2.25 | | | 266.679 | 118.524 | 309.054 |
| 1000000 | 10 | 0.6 | 80 | OK | 3.18509e-07 | 25.0764 | | | 52.938 | 2.25 | | | 304.259 | 135.226 | 357.197 |
| 1000 | 20 | 0.6 | 80 | OK | 5.21227e-07 | 1.08112 | | | 1.292 | 2.25 | | | 21.555 | 9.58 | 22.847 |
| 1000 | 20 | 0.6 | 80 | OK | 5.21227e-07 | 0.592768 | | | 1.394 | 2.25 | | | 6.48 | 2.88 | 7.874 |
| 1000 | 20 | 0.6 | 80 | OK | 5.21227e-07 | 1.35232 | | | 3.362 | 2.25 | | | 10.494 | 4.664 | 13.856 |
| 2000 | 20 | 0.6 | 80 | OK | 3.38398e-07 | 0.84576 | | | 1.159 | 2.25 | | | 17.512 | 7.78311 | 18.671 |
| 2000 | 20 | 0.6 | 80 | OK | 3.38398e-07 | 0.773792 | | | 1.765 | 2.25 | | | 6.583 | 2.92578 | 8.348 |
| 2000 | 20 | 0.6 | 80 | OK | 3.38398e-07 | 1.19862 | | | 2.903 | 2.25 | | | 9.853 | 4.37911 | 12.756 |
| 5000 | 20 | 0.6 | 80 | OK | 8.56005e-07 | 0.837056 | | | 1.169 | 1.75 | | | 15.947 | 9.11257 | 17.116 |
| 5000 | 20 | 0.6 | 80 | OK | 8.56005e-07 | 0.765376 | | | 1.733 | 1.75 | | | 6.46 | 3.69143 | 8.193 |
| 5000 | 20 | 0.6 | 80 | OK | 8.56005e-07 | 1.18698 | | | 2.86 | 1.75 | | | 8.431 | 4.81771 | 11.291 |
| 10000 | 20 | 0.6 | 80 | OK | 5.06078e-07 | 1.11302 | | | 1.511 | 1.75 | | | 17.092 | 9.76686 | 18.603 |
| 10000 | 20 | 0.6 | 80 | OK | 5.06078e-07 | 1.37818 | | | 2.548 | 1.75 | | | 8.033 | 4.59029 | 10.581 |
| 10000 | 20 | 0.6 | 80 | OK | 5.06078e-07 | 1.09053 | | | 3.039 | 1.75 | | | 10.105 | 5.77429 | 13.144 |
| 20000 | 20 | 0.6 | 80 | OK | 3.86471e-07 | 1.65405 | | | 2.208 | 1.75 | | | 18.962 | 10.8354 | 21.17 |
| 20000 | 20 | 0.6 | 80 | OK | 3.86471e-07 | 1.49709 | | | 2.808 | 1.75 | | | 10.32 | 5.89714 | 13.128 |
| 20000 | 20 | 0.6 | 80 | OK | 3.86471e-07 | 2.45027 | | | 4.795 | 1.75 | | | 13.446 | 7.68343 | 18.241 |
| 50000 | 20 | 0.6 | 80 | OK | 4.05497e-07 | 3.4799 | | | 4.753 | 1.75 | | | 27.334 | 15.6194 | 32.087 |
| 50000 | 20 | 0.6 | 80 | OK | 4.05497e-07 | 2.36448 | | | 4.82 | 1.75 | | | 21.004 | 12.0023 | 25.824 |
| 50000 | 20 | 0.6 | 80 | OK | 4.05497e-07 | 3.67686 | | | 7.941 | 1.75 | | | 28.894 | 16.5109 | 36.835 |
| 100000 | 20 | 0.6 | 80 | OK | 2.45005e-07 | 6.6104 | | | 8.884 | 1.75 | | | 38.901 | 22.2291 | 47.785 |
| 100000 | 20 | 0.6 | 80 | OK | 2.45005e-07 | 3.904 | | | 8.358 | 1.75 | | | 33.912 | 19.3783 | 42.27 |
| 100000 | 20 | 0.6 | 80 | OK | 2.45005e-07 | 4.64202 | | | 11.07 | 1.75 | | | 38.171 | 21.812 | 49.241 |
| 200000 | 20 | 0.6 | 80 | OK | 1.49427e-07 | 12.7211 | | | 16.515 | 1.75 | | | 59.583 | 34.0474 | 76.098 |
| 200000 | 20 | 0.6 | 80 | OK | 1.49427e-07 | 7.09798 | | | 15.166 | 1.75 | | | 66.547 | 38.0269 | 81.713 |
| 200000 | 20 | 0.6 | 80 | OK | 1.49427e-07 | 7.74992 | | | 19.793 | 1.75 | | | 80.131 | 45.7891 | 99.924 |
| 500000 | 20 | 0.6 | 80 | OK | 1.02155e-07 | 31.1244 | | | 39.439 | 1.75 | | | 120.514 | 68.8651 | 159.953 |
| 500000 | 20 | 0.6 | 80 | OK | 1.02155e-07 | 16.779 | | | 35.178 | 1.75 | | | 131.517 | 75.1526 | 166.695 |
| 500000 | 20 | 0.6 | 80 | OK | 1.02155e-07 | 15.4683 | | | 43.341 | 1.75 | | | 148.104 | 84.6309 | 191.445 |
| 1000000 | 20 | 0.6 | 80 | OK | 7.20293e-08 | 61.5971 | | | 77.248 | 1.75 | | | 224.639 | 128.365 | 301.887 |
| 1000000 | 20 | 0.6 | 80 | OK | 7.20293e-08 | 32.4463 | | | 68.013 | 1.75 | | | 238.628 | 136.359 | 306.641 |
| 1000000 | 20 | 0.6 | 80 | OK | 7.20293e-08 | 29.0819 | | | 82.878 | 1.75 | | | 264.687 | 151.25 | 347.565 |
| 1000 | 50 | 0.6 | 80 | OK | 5.15115e-08 | 1.00883 | | | 1.287 | 1.75 | | | 16.288 | 9.30743 | 17.575 |
| 1000 | 50 | 0.6 | 80 | OK | 5.15115e-08 | 0.89312 | | | 1.821 | 1.75 | | | 6.625 | 3.78571 | 8.446 |
| 1000 | 50 | 0.6 | 80 | OK | 5.15115e-08 | 1.39309 | | | 3.044 | 1.75 | | | 9.538 | 5.45029 | 12.582 |
| 2000 | 50 | 0.6 | 80 | OK | 4.78034e-07 | 1.15398 | | | 1.503 | 1.75 | | | 17.333 | 9.90457 | 18.836 |
| 2000 | 50 | 0.6 | 80 | OK | 4.78034e-07 | 0.943424 | | | 1.891 | 1.75 | | | 6.794 | 3.88229 | 8.685 |
| 2000 | 50 | 0.6 | 80 | OK | 4.78034e-07 | 1.37232 | | | 3.224 | 1.75 | | | 10.057 | 5.74686 | 13.281 |
| 5000 | 50 | 0.6 | 80 | OK | 1.94855e-08 | 1.93584 | | | 2.328 | 1.75 | | | 16.626 | 9.50057 | 18.954 |
| 5000 | 50 | 0.6 | 80 | OK | 1.94855e-08 | 1.44912 | | | 2.554 | 1.75 | | | 7.259 | 4.148 | 9.813 |
| 5000 | 50 | 0.6 | 80 | OK | 1.94855e-08 | 2.77987 | | | 5.178 | 1.75 | | | 11.01 | 6.29143 | 16.188 |
| 10000 | 50 | 0.6 | 80 | OK | 2.29515e-08 | 2.6576 | | | 3.288 | 1.75 | | | 21.45 | 12.2571 | 24.738 |
| 10000 | 50 | 0.6 | 80 | OK | 2.29515e-08 | 2.03878 | | | 3.646 | 1.75 | | | 10.152 | 5.80114 | 13.798 |
| 10000 | 50 | 0.6 | 80 | OK | 2.29515e-08 | 2.53037 | | | 5.342 | 1.75 | | | 10.59 | 6.05143 | 15.932 |
| 20000 | 50 | 0.6 | 80 | OK | 1.59875e-08 | 5.25814 | | | 6.176 | 1.75 | | | 24.18 | 13.8171 | 30.356 |
| 20000 | 50 | 0.6 | 80 | OK | 1.59875e-08 | 3.11738 | | | 5.08 | 1.75 | | | 11.468 | 6.55314 | 16.548 |
| 20000 | 50 | 0.6 | 80 | OK | 1.59875e-08 | 4.16954 | | | 7.847 | 1.75 | | | 15.844 | 9.05371 | 23.691 |
| 50000 | 50 | 0.6 | 80 | OK | 7.46547e-07 | 13.0187 | | | 15.09 | 1.25 | | | 28.323 | 22.6584 | 43.413 |
| 50000 | 50 | 0.6 | 80 | OK | 7.46547e-07 | 7.21059 | | | 11.998 | 1.25 | | | 25.919 | 20.7352 | 37.917 |
| 50000 | 50 | 0.6 | 80 | OK | 7.46547e-07 | 7.1688 | | | 14.302 | 1.25 | | | 28.214 | 22.5712 | 42.516 |
| 100000 | 50 | 0.6 | 80 | OK | 4.69721e-07 | 25.7205 | | | 29.556 | 1.25 | | | 45.801 | 36.6408 | 75.357 |
| 100000 | 50 | 0.6 | 80 | OK | 4.69721e-07 | 13.6758 | | | 22.829 | 1.25 | | | 41.307 | 33.0456 | 64.136 |
| 100000 | 50 | 0.6 | 80 | OK | 4.69721e-07 | 12.4401 | | | 25.762 | 1.25 | | | 43.396 | 34.7168 | 69.158 |
| 200000 | 50 | 0.6 | 80 | OK | 5.81929e-07 | 53.0515 | | | 59.824 | 1.25 | | | 72.038 | 57.6304 | 131.862 |
| 200000 | 50 | 0.6 | 80 | OK | 5.81929e-07 | 26.3745 | | | 42.603 | 1.25 | | | 76.513 | 61.2104 | 119.116 |
| 200000 | 50 | 0.6 | 80 | OK | 5.81929e-07 | 23.0379 | | | 48.223 | 1.25 | | | 81.148 | 64.9184 | 129.371 |
| 500000 | 50 | 0.6 | 80 | OK | 2.89614e-07 | 127.635 | | | 142.06 | 1.25 | | | 151.459 | 121.167 | 293.519 |
| 500000 | 50 | 0.6 | 80 | OK | 2.89614e-07 | 65.2711 | | | 104.274 | 1.25 | | | 158.861 | 127.089 | 263.135 |
| 500000 | 50 | 0.6 | 80 | OK | 2.89614e-07 | 55.7491 | | | 117.256 | 1.25 | | | 182.993 | 146.394 | 300.249 |
| 1000000 | 50 | 0.6 | 80 | OK | 2.37119e-07 | 260.068 | | | 288.468 | 1.25 | | | 292.558 | 234.046 | 581.026 |
| 1000000 | 50 | 0.6 | 80 | OK | 2.37119e-07 | 133.002 | | | 209.848 | 1.25 | | | 297.747 | 238.198 | 507.595 |
| 1000000 | 50 | 0.6 | 80 | OK | 2.37119e-07 | 109.979 | | | 234.74 | 1.25 | | | 323.445 | 258.756 | 558.185 |
| 1000 | 100 | 0.6 | 80 | OK | 3.26765e-07 | 2.18432 | | | 2.533 | 1.25 | | | 14.894 | 11.9152 | 17.427 |
| 1000 | 100 | 0.6 | 80 | OK | 3.26765e-07 | 2.00451 | | | 2.933 | 1.25 | | | 7.553 | 6.0424 | 10.486 |
| 1000 | 100 | 0.6 | 80 | OK | 3.26765e-07 | 2.07984 | | | 3.98 | 1.25 | | | 10.777 | 8.6216 | 14.757 |
| 2000 | 100 | 0.6 | 80 | OK | 7.88853e-09 | 2.57587 | | | 2.941 | 1.75 | | | 17.933 | 10.2474 | 20.874 |
| 2000 | 100 | 0.6 | 80 | OK | 7.88853e-09 | 2.4809 | | | 3.554 | 1.75 | | | 8.52 | 4.86857 | 12.074 |
| 2000 | 100 | 0.6 | 80 | OK | 7.88853e-09 | 2.73168 | | | 4.57 | 1.75 | | | 11.329 | 6.47371 | 15.899 |
| 5000 | 100 | 0.6 | 80 | OK | 5.50533e-07 | 4.21654 | | | 4.9 | 1.25 | | | 19.839 | 15.8712 | 24.739 |
| 5000 | 100 | 0.6 | 80 | OK | 5.50533e-07 | 3.17616 | | | 4.501 | 1.25 | | | 8.311 | 6.6488 | 12.812 |
| 5000 | 100 | 0.6 | 80 | OK | 5.50533e-07 | 3.88723 | | | 6.355 | 1.25 | | | 11.415 | 9.132 | 17.77 |
| 10000 | 100 | 0.6 | 80 | OK | 7.7227e-07 | 6.44448 | | | 7.24 | 1.25 | | | 15.294 | 12.2352 | 22.534 |
| 10000 | 100 | 0.6 | 80 | OK | 7.7227e-07 | 4.45139 | | | 6.397 | 1.25 | | | 9.698 | 7.7584 | 16.095 |
| 10000 | 100 | 0.6 | 80 | OK | 7.7227e-07 | 4.09094 | | | 7.138 | 1.25 | | | 11.658 | 9.3264 | 18.796 |
| 20000 | 100 | 0.6 | 80 | OK | 2.73721e-07 | 12.2964 | | | 13.638 | 1.25 | | | 18.132 | 14.5056 | 31.77 |
| 20000 | 100 | 0.6 | 80 | OK | 2.73721e-07 | 7.40406 | | | 10.86 | 1.25 | | | 12.233 | 9.7864 | 23.093 |
| 20000 | 100 | 0.6 | 80 | OK | 2.73721e-07 | 7.96563 | | | 17.297 | 1.25 | | | 19.231 | 15.3848 | 36.528 |
| 50000 | 100 | 0.6 | 80 | OK | 1.23672e-07 | 34.1636 | | | 37.525 | 1.25 | | | 34.056 | 27.2448 | 71.581 |
| 50000 | 100 | 0.6 | 80 | OK | 1.23672e-07 | 19.424 | | | 27.706 | 1.25 | | | 27.449 | 21.9592 | 55.155 |
| 50000 | 100 | 0.6 | 80 | OK | 1.23672e-07 | 13.417 | | | 26.324 | 1.25 | | | 29.763 | 23.8104 | 56.087 |
| 100000 | 100 | 0.6 | 80 | OK | 9.87577e-08 | 70.5505 | | | 76.774 | 1.25 | | | 53.801 | 43.0408 | 130.575 |
| 100000 | 100 | 0.6 | 80 | OK | 9.87577e-08 | 39.6717 | | | 55.896 | 1.25 | | | 48.713 | 38.9704 | 104.609 |
| 100000 | 100 | 0.6 | 80 | OK | 9.87577e-08 | 26.5242 | | | 53.404 | 1.25 | | | 53.106 | 42.4848 | 106.51 |
| 200000 | 100 | 0.6 | 80 | OK | 6.13254e-08 | 140.504 | | | 152.217 | 1.25 | | | 91.985 | 73.588 | 244.202 |
| 200000 | 100 | 0.6 | 80 | OK | 6.13254e-08 | 78.1158 | | | 109.422 | 1.25 | | | 92.315 | 73.852 | 201.737 |
| 200000 | 100 | 0.6 | 80 | OK | 6.13254e-08 | 50.3635 | | | 99.356 | 1.25 | | | 94.622 | 75.6976 | 193.978 |
| 500000 | 100 | 0.6 | 80 | OK | 1.02003e-07 | 361.303 | | | 389.208 | 1.25 | | | 205.495 | 164.396 | 594.703 |
| 500000 | 100 | 0.6 | 80 | OK | 1.02003e-07 | 199.598 | | | 276.262 | 1.25 | | | 198.976 | 159.181 | 475.238 |
| 500000 | 100 | 0.6 | 80 | OK | 1.02003e-07 | 126.044 | | | 246.929 | 1.25 | | | 208.225 | 166.58 | 455.154 |
| 1000000 | 100 | 0.6 | 80 | OK | 6.22013e-07 | 707.842 | | | 762.787 | 1.25 | | | 318.412 | 254.73 | 1081.2 |
| 1000000 | 100 | 0.6 | 80 | OK | 6.22013e-07 | 391.149 | | | 543.561 | 1.25 | | | 303.536 | 242.829 | 847.097 |
| 1000000 | 100 | 0.6 | 80 | OK | 6.22013e-07 | 248.541 | | | 491.071 | 1.25 | | | 315.3 | 252.24 | 806.371 |
| 1000 | 200 | 0.6 | 80 | OK | 2.33239e-08 | 4.91917 | | | 5.274 | 1.25 | | | 19.136 | 15.3088 | 24.41 |
| 1000 | 200 | 0.6 | 80 | OK | 2.33239e-08 | 4.67926 | | | 5.908 | 1.25 | | | 12.878 | 10.3024 | 18.786 |
| 1000 | 200 | 0.6 | 80 | OK | 2.33239e-08 | 4.55546 | | | 5.645 | 1.25 | | | 12.574 | 10.0592 | 18.219 |
| 2000 | 200 | 0.6 | 80 | OK | 4.51167e-07 | 5.60698 | | | 6.112 | 1.25 | | | 15.176 | 12.1408 | 21.288 |
| 2000 | 200 | 0.6 | 80 | OK | 4.51167e-07 | 4.7632 | | | 5.935 | 1.25 | | | 9.302 | 7.4416 | 15.237 |
| 2000 | 200 | 0.6 | 80 | OK | 4.51167e-07 | 5.17738 | | | 7.363 | 1.25 | | | 11.041 | 8.8328 | 18.404 |
| 5000 | 200 | 0.6 | 80 | OK | 7.74855e-07 | 10.2751 | | | 11.215 | 1.25 | | | 18.153 | 14.5224 | 29.368 |
| 5000 | 200 | 0.6 | 80 | OK | 7.74855e-07 | 6.98454 | | | 9.048 | 1.25 | | | 10.229 | 8.1832 | 19.277 |
| 5000 | 200 | 0.6 | 80 | OK | 7.74855e-07 | 6.45005 | | | 9.557 | 1.25 | | | 12.161 | 9.7288 | 21.718 |
| 10000 | 200 | 0.6 | 80 | OK | 8.8246e-07 | 16.7491 | | | 18.148 | 1.25 | | | 17.227 | 13.7816 | 35.375 |
| 10000 | 200 | 0.6 | 80 | OK | 8.8246e-07 | 10.3863 | | | 13.781 | 1.25 | | | 11.719 | 9.3752 | 25.5 |
| 10000 | 200 | 0.6 | 80 | OK | 8.8246e-07 | 7.82624 | | | 13.118 | 1.25 | | | 13.404 | 10.7232 | 26.522 |
| 20000 | 200 | 0.6 | 80 | OK | 9.33744e-08 | 30.817 | | | 33.283 | 1.25 | | | 24.459 | 19.5672 | 57.742 |
| 20000 | 200 | 0.6 | 80 | OK | 9.33744e-08 | 17.3465 | | | 23.961 | 1.25 | | | 19.24 | 15.392 | 43.201 |
| 20000 | 200 | 0.6 | 80 | OK | 9.33744e-08 | 11.3438 | | | 21.45 | 1.25 | | | 19.813 | 15.8504 | 41.263 |
| 50000 | 200 | 0.6 | 80 | OK | 7.35074e-07 | 86.2921 | | | 92.401 | 1.25 | | | 38.632 | 30.9056 | 131.033 |
| 50000 | 200 | 0.6 | 80 | OK | 7.35074e-07 | 45.5897 | | | 62.242 | 1.25 | | | 30.343 | 24.2744 | 92.585 |
| 50000 | 200 | 0.6 | 80 | OK | 7.35074e-07 | 27.1956 | | | 52.097 | 1.25 | | | 30.477 | 24.3816 | 82.574 |
| 100000 | 200 | 0.6 | 80 | OK | 6.44932e-07 | 185.501 | | | 197.622 | 1.25 | | | 68.199 | 54.5592 | 265.821 |
| 100000 | 200 | 0.6 | 80 | OK | 6.44932e-07 | 96.3521 | | | 127.911 | 1.25 | | | 53.741 | 42.9928 | 181.652 |
| 100000 | 200 | 0.6 | 80 | OK | 6.44932e-07 | 56.1508 | | | 105.32 | 1.25 | | | 53.488 | 42.7904 | 158.808 |
| 200000 | 200 | 0.6 | 80 | OK | 1.81202e-07 | 374.62 | | | 397.828 | 1.25 | | | 128.092 | 102.474 | 525.92 |
| 200000 | 200 | 0.6 | 80 | OK | 1.81202e-07 | 193.262 | | | 255.206 | 1.25 | | | 102.81 | 82.248 | 358.016 |
| 200000 | 200 | 0.6 | 80 | OK | 1.81202e-07 | 112.245 | | | 209.371 | 1.25 | | | 104.603 | 83.6824 | 313.974 |
| 500000 | 200 | 0.6 | 80 | OK | 1.82869e-08 | 969.995 | | | 1026.53 | 1.25 | | | 353.928 | 283.142 | 1380.46 |
| 500000 | 200 | 0.6 | 80 | OK | 1.82869e-08 | 504.118 | | | 658.783 | 1.25 | | | 289.738 | 231.79 | 948.521 |
| 500000 | 200 | 0.6 | 80 | OK | 1.82869e-08 | 289.6 | | | 532.704 | 1.25 | | | 293.496 | 234.797 | 826.2 |
| 1000000 | 200 | 0.6 | 80 | OK | 1.01114e-07 | 1902.88 | | | 2013.58 | 1.25 | | | 558.436 | 446.749 | 2572.02 |
| 1000000 | 200 | 0.6 | 80 | OK | 1.01114e-07 | 988.066 | | | 1291.81 | 1.25 | | | 447.44 | 357.952 | 1739.25 |
| 1000000 | 200 | 0.6 | 80 | OK | 1.01114e-07 | 569.459 | | | 1050.56 | 1.25 | | | 449.041 | 359.233 | 1499.6 |
| 1000 | 500 | 0.6 | 80 | OK | 1.61434e-15 | 13.4072 | | | 13.967 | 0.25 | | | 10.899 | 10.899 | 24.866 |
| 1000 | 500 | 0.6 | 80 | OK | 1.61434e-15 | 13.1786 | | | 13.655 | 0.25 | | | 10.39 | 10.39 | 24.045 |
| 1000 | 500 | 0.6 | 80 | OK | 1.61434e-15 | 13.3604 | | | 13.845 | 0.25 | | | 10.495 | 10.495 | 24.34 |
| 2000 | 500 | 0.6 | 80 | OK | 3.80324e-08 | 20.7084 | | | 21.457 | 1.25 | | | 24.962 | 19.9696 | 46.419 |
| 2000 | 500 | 0.6 | 80 | OK | 3.80324e-08 | 14.1956 | | | 16.094 | 1.25 | | | 20.25 | 16.2 | 36.344 |
| 2000 | 500 | 0.6 | 80 | OK | 3.80324e-08 | 14.218 | | | 16.122 | 1.25 | | | 20.212 | 16.1696 | 36.334 |
| 5000 | 500 | 0.6 | 80 | OK | 6.01198e-08 | 41.9975 | | | 43.578 | 1.25 | | | 26.64 | 21.312 | 70.218 |
| 5000 | 500 | 0.6 | 80 | OK | 6.01198e-08 | 28.406 | | | 32.248 | 1.25 | | | 22.095 | 17.676 | 54.343 |
| 5000 | 500 | 0.6 | 80 | OK | 6.01198e-08 | 22.0505 | | | 27.79 | 1.25 | | | 24.517 | 19.6136 | 52.307 |
| 10000 | 500 | 0.6 | 80 | OK | 6.86314e-08 | 76.1695 | | | 79.132 | 1.25 | | | 29.371 | 23.4968 | 108.503 |
| 10000 | 500 | 0.6 | 80 | OK | 6.86314e-08 | 42.1767 | | | 50.301 | 1.25 | | | 25.569 | 20.4552 | 75.87 |
| 10000 | 500 | 0.6 | 80 | OK | 6.86314e-08 | 29.2866 | | | 40.897 | 1.25 | | | 27.615 | 22.092 | 68.512 |
| 20000 | 500 | 0.6 | 80 | OK | 6.67022e-08 | 146.889 | | | 152.6 | 1.25 | | | 38.711 | 30.9688 | 191.311 |
| 20000 | 500 | 0.6 | 80 | OK | 6.67022e-08 | 76.8828 | | | 92.45 | 1.25 | | | 30.228 | 24.1824 | 122.678 |
| 20000 | 500 | 0.6 | 80 | OK | 6.67022e-08 | 46.1318 | | | 70.572 | 1.25 | | | 33.066 | 26.4528 | 103.638 |
| 50000 | 500 | 0.6 | 80 | OK | 5.56515e-08 | 363.052 | | | 377.28 | 1.25 | | | 69.514 | 55.6112 | 446.794 |
| 50000 | 500 | 0.6 | 80 | OK | 5.56515e-08 | 183.541 | | | 222.843 | 1.25 | | | 54.348 | 43.4784 | 277.191 |
| 50000 | 500 | 0.6 | 80 | OK | 5.56515e-08 | 100.5 | | | 160.577 | 1.25 | | | 60.396 | 48.3168 | 220.973 |
| 100000 | 500 | 0.6 | 80 | OK | 1.8079e-07 | 775.521 | | | 803.765 | 1.25 | | | 144.091 | 115.273 | 947.856 |
| 100000 | 500 | 0.6 | 80 | OK | 1.8079e-07 | 391.646 | | | 468.585 | 1.25 | | | 98.523 | 78.8184 | 567.108 |
| 100000 | 500 | 0.6 | 80 | OK | 1.8079e-07 | 213.231 | | | 391.826 | 1.25 | | | 91.68 | 73.344 | 483.506 |
| 200000 | 500 | 0.6 | 80 | OK | 4.80574e-08 | 1672.39 | | | 1728.11 | 1.25 | | | 293.486 | 234.789 | 2021.59 |
| 200000 | 500 | 0.6 | 80 | OK | 4.80574e-08 | 842.894 | | | 996.377 | 1.25 | | | 198.292 | 158.634 | 1194.67 |
| 200000 | 500 | 0.6 | 80 | OK | 4.80574e-08 | 456.734 | | | 697.3 | 1.25 | | | 192.195 | 153.756 | 889.495 |
| 500000 | 500 | 0.6 | 80 | OK | 2.15108e-08 | 4539.14 | | | 4676.75 | 1.25 | | | 738.673 | 590.938 | 5415.43 |
| 500000 | 500 | 0.6 | 80 | OK | 2.15108e-08 | 2288.27 | | | 2668.83 | 1.25 | | | 473.336 | 378.669 | 3142.16 |
| 500000 | 500 | 0.6 | 80 | OK | 2.15108e-08 | 1239.39 | | | 1839.27 | 1.25 | | | 431.224 | 344.979 | 2270.5 |
| 1000000 | 500 | 0.6 | 80 | OoM (in setup stage) | | | | | | | | | | | |
| 1000000 | 500 | 0.6 | 80 | OoM (in setup stage) | | | | | | | | | | | |
| 1000000 | 500 | 0.6 | 80 | OoM (in setup stage) | | | | | | | | | | | |