4-9-2016 New Banded Results



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed (values: 0 or 1)
• K-DB: the half-bandwidth after DB reordering method (without any drop-off). If DB is specified not to be executed, then this reports the original half-bandwidth
• KnoDrp: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether we managed to solve the problem or not. OK means solved fine, otherwise a reason is provided for failure
• Bstng: indicates whether we enable diagonal boosting when doing factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off off-diagonal elements to decrease bandwidth. Done on the CPU.
• T-Dtransf: data transfer from CPU to GPU
• T-Asmbl: after reordering and drop-off, copy the sparse matrix to banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solve stage (BiCGStab2 (0), BiCGStab (1), or CG (2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso (a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and shown in red if we run more than 5 times slower.)
• Fastest: the time when SaP runs fastest historically
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
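The derived columns in the list above (Total, SlwD, SpdUp) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all function names are hypothetical, SlwD is taken as our total time divided by the Pardiso time, SpdUp is taken as the historical fastest time divided by the current total, and "slowdown of more than 5%" is read as SpdUp below 0.95. The actual reporting scripts may define these differently.

```python
def total_time(t_prep_ms, t_kry_ms):
    """Total = preprocessing time + Krylov time (both in milliseconds)."""
    return t_prep_ms + t_kry_ms


def slowdown(total_ms, pardiso_ms):
    """SlwD: assumed to be our time divided by the Pardiso time."""
    return total_ms / pardiso_ms


def slowdown_color(slwd):
    """Green if we are faster than Pardiso, red if more than 5x slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None  # no highlighting


def speedup(total_ms, fastest_ms):
    """SpdUp: assumed to be historical fastest time divided by this run's time."""
    return fastest_ms / total_ms


def speedup_color(spdup):
    """Green for more than 5% speedup; red for more than 5% slowdown
    (assumed here to mean spdup < 0.95)."""
    if spdup > 1.05:
        return "green"
    if spdup < 0.95:
        return "red"
    return None  # within the 5% band


# Example with one row of the results: T-PreP = 3.446, T-Kry = 6.468
print(total_time(3.446, 6.468))
```

The Pardiso and historical-fastest times are not part of the table in this report, so any values passed to `slowdown` or `speedup` would have to come from elsewhere.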


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = the actual number of NNZ / ((2K+1)N).
3) All times reported are in milliseconds (1E-3 second)
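As a rough illustration of notes 1) and 2), the sketch below computes K, nuKf and FRate from the positions of the non-zeros. It is not the solver's own code: the function name `band_stats` and the input format (a list of 0-based (row, column) pairs) are assumptions made for this example, and K is assumed to be the largest row half-bandwidth on either side of the diagonal.

```python
def band_stats(n, entries):
    """Return (K, nuKf, FRate) for an n-by-n matrix.

    entries: iterable of 0-based (row, col) positions of the non-zeros.
    """
    left = [0] * n   # K_iLeft: row half-bandwidth left of the diagonal
    right = [0] * n  # K_iRight: row half-bandwidth right of the diagonal
    nnz = 0
    for i, j in entries:
        nnz += 1
        if j < i:
            left[i] = max(left[i], i - j)
        else:
            right[i] = max(right[i], j - i)

    # Assumption: K is the maximum half-bandwidth over all rows.
    k = max(max(left), max(right))

    # nuKf = 1/(2KN) * sum_i (2K - K_iLeft - K_iRight); defined as 0 for K = 0.
    if k == 0:
        nukf = 0.0
    else:
        nukf = sum(2 * k - l - r for l, r in zip(left, right)) / (2 * k * n)

    # FRate = NNZ / ((2K + 1) N)
    frate = nnz / ((2 * k + 1) * n)
    return k, nukf, frate


# 4x4 tridiagonal matrix: 10 non-zeros, K = 1
tridiag = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
print(band_stats(4, tridiag))
```

With this literal reading of the formula, the first and last rows of a full band are truncated by the matrix boundary and therefore contribute a small non-zero amount to nuKf.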


| N | K | d | NPrtns | T-pinv | T-mv | Solves | RelR | T-BC | T-LU | T-SPK | T-LUrdcd | T-PreP | nItrs | T-Kry | T-KryPIt | T-Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000 | 10 | 0.6 | 50 | 2.05619 | 0.167264 | OK | 8.14263e-12 | 0.055232 | 1.33264 | 0.686496 | 0.387392 | 3.446 | 1.25 | 6.468 | 5.1744 | 9.914 |
| 1000 | 10 | 0.6 | 80 | 2.86602 | 0.375712 | OK | 2.36381e-10 | 0 | 1.31456 | 0 | 0 | 1.557 | 4.75 | 52.402 | 11.032 | 53.959 |
| 2000 | 10 | 0.6 | 50 | 1.47562 | 0.119488 | OK | 2.49705e-14 | 0.0568 | 1.39277 | 0.31712 | 0.399904 | 3.365 | 0.75 | 4.819 | 4.819 | 8.184 |
| 2000 | 10 | 0.6 | 80 | 2.79891 | 0.40576 | OK | 2.97092e-10 | 0 | 1.30666 | 0 | 0 | 1.542 | 4.25 | 46.816 | 11.0155 | 48.358 |
| 5000 | 10 | 0.6 | 50 | 2.27178 | 0.155168 | OK | 1.26959e-14 | 0.07024 | 1.37891 | 0.351136 | 0.384992 | 3.941 | 0.75 | 5.487 | 5.487 | 9.428 |
| 5000 | 10 | 0.6 | 80 | 3.19485 | 0.451168 | OK | 6.41246e-10 | 0 | 1.25034 | 0 | 0 | 1.708 | 3.75 | 42.445 | 11.3187 | 44.153 |
| 10000 | 10 | 0.6 | 50 | 2.95152 | 0.140128 | OK | 3.28059e-16 | 0.054464 | 1.40858 | 0.766528 | 0.3864 | 4.191 | 0.25 | 4.921 | 4.921 | 9.112 |
| 10000 | 10 | 0.6 | 80 | 4.94858 | 0.656192 | OK | 9.39939e-10 | 0 | 1.60589 | 0 | 0 | 2.297 | 3.75 | 52.035 | 13.876 | 54.332 |
| 20000 | 10 | 0.6 | 50 | 4.97379 | 0.215392 | OK | 3.17874e-16 | 0.053248 | 1.91286 | 0.46704 | 0.369472 | 4.594 | 0.25 | 7.288 | 7.288 | 11.882 |
| 20000 | 10 | 0.6 | 80 | 7.41616 | 0.954368 | OK | 5.87459e-10 | 0 | 1.31773 | 0 | 0 | 1.842 | 3.75 | 46.642 | 12.4379 | 48.484 |
| 50000 | 10 | 0.6 | 50 | 11.7758 | 0.440544 | OK | 3.17973e-16 | 0.04528 | 3.81501 | 0.768704 | 0.335808 | 7.171 | 0.25 | 15.594 | 15.594 | 22.765 |
| 50000 | 10 | 0.6 | 80 | 21.3026 | 1.98694 | OK | 7.00301e-10 | 0 | 2.0681 | 0 | 0 | 3.132 | 3.75 | 59.285 | 15.8093 | 62.417 |
| 100000 | 10 | 0.6 | 50 | 21.3823 | 0.809408 | OK | 3.17875e-16 | 0.03616 | 6.99766 | 1.25875 | 0.267872 | 11.254 | 0.25 | 26.585 | 26.585 | 37.839 |
| 100000 | 10 | 0.6 | 80 | 35.09 | 3.67629 | OK | 2.63041e-10 | 0 | 3.46224 | 0 | 0 | 5.106 | 3.75 | 72.939 | 19.4504 | 78.045 |
| 200000 | 10 | 0.6 | 50 | 41.2039 | 1.56778 | OK | 3.17563e-16 | 0.03488 | 13.5464 | 2.20541 | 0.276352 | 19.684 | 0.25 | 49.047 | 49.047 | 68.731 |
| 200000 | 10 | 0.6 | 80 | 64.1914 | 7.08531 | OK | 3.00918e-10 | 0 | 6.28346 | 0 | 0 | 8.509 | 3.75 | 104.744 | 27.9317 | 113.253 |
| 500000 | 10 | 0.6 | 50 | 100.593 | 3.84698 | OK | 3.17572e-16 | 0.036128 | 33.1016 | 5.03811 | 0.267808 | 44.91 | 0.25 | 111.031 | 111.031 | 155.941 |
| 500000 | 10 | 0.6 | 80 | 151.736 | 17.3479 | OK | 2.25904e-10 | 0 | 14.8964 | 0 | 0 | 19.139 | 3.75 | 205.195 | 54.7187 | 224.334 |
| 1000000 | 10 | 0.6 | 50 | 200.23 | 7.6824 | OK | 3.17757e-16 | 0.03616 | 65.9341 | 9.74966 | 0.270816 | 87.196 | 0.25 | 214.941 | 214.941 | 302.137 |
| 1000000 | 10 | 0.6 | 80 | 298.688 | 34.5951 | OK | 9.93146e-10 | 0 | 29.2513 | 0 | 0 | 36.862 | 3.25 | 371.509 | 114.31 | 408.371 |
| 1000 | 20 | 0.6 | 50 | 1.59546 | 0.112 | OK | 5.03984e-10 | 0.056672 | 1.27667 | 0.339968 | 0.442624 | 3.756 | 0.75 | 4.809 | 4.809 | 8.565 |
| 1000 | 20 | 0.6 | 80 | 1.94598 | 0.249376 | OK | 1.16167e-10 | 0 | 0.81632 | 0 | 0 | 0.956 | 3.25 | 26.528 | 8.16246 | 27.484 |
| 2000 | 20 | 0.6 | 50 | 1.72483 | 0.133888 | OK | 5.2288e-10 | 0.065984 | 1.37734 | 1.34566 | 0.500352 | 5.07 | 0.75 | 4.977 | 4.977 | 10.047 |
| 2000 | 20 | 0.6 | 80 | 2.40566 | 0.341248 | OK | 7.681e-10 | 0 | 1.32224 | 0 | 0 | 1.811 | 3.25 | 36.163 | 11.1271 | 37.974 |
| 5000 | 20 | 0.6 | 50 | 2.67942 | 0.181856 | OK | 8.94553e-10 | 0.253408 | 1.4081 | 0.94736 | 0.459264 | 4.832 | 0.25 | 4.85 | 4.85 | 9.682 |
| 5000 | 20 | 0.6 | 80 | 2.57619 | 0.422528 | OK | 8.67865e-10 | 0 | 1.25171 | 0 | 0 | 1.758 | 2.75 | 33.275 | 12.1 | 35.033 |
| 10000 | 20 | 0.6 | 50 | 2.78835 | 0.176192 | OK | 2.35799e-11 | 0.237824 | 2.16387 | 1.01066 | 0.468704 | 5.694 | 0.25 | 4.724 | 4.724 | 10.418 |
| 10000 | 20 | 0.6 | 80 | 3.65328 | 0.616672 | OK | 9.73068e-10 | 0 | 1.33091 | 0 | 0 | 1.847 | 2.75 | 33.029 | 12.0105 | 34.876 |
| 20000 | 20 | 0.6 | 50 | 4.78275 | 0.28272 | OK | 4.01432e-16 | 0.224992 | 3.72781 | 1.07254 | 0.437952 | 7.356 | 0.25 | 6.966 | 6.966 | 14.322 |
| 20000 | 20 | 0.6 | 80 | 5.93981 | 0.9928 | OK | 6.92831e-10 | 0 | 1.86288 | 0 | 0 | 2.515 | 2.75 | 35.021 | 12.7349 | 37.536 |
| 50000 | 20 | 0.6 | 50 | 11.9695 | 0.610304 | OK | 4.061e-16 | 0.226368 | 8.48205 | 1.31165 | 0.382624 | 12.978 | 0.25 | 15.552 | 15.552 | 28.53 |
| 50000 | 20 | 0.6 | 80 | 16.2407 | 2.1576 | OK | 6.23816e-10 | 0 | 3.52422 | 0 | 0 | 4.807 | 2.75 | 41.781 | 15.1931 | 46.588 |
| 100000 | 20 | 0.6 | 50 | 22.3208 | 1.17594 | OK | 4.0605e-16 | 0.162208 | 16.8062 | 1.84765 | 0.361824 | 22.901 | 0.25 | 27.909 | 27.909 | 50.81 |
| 100000 | 20 | 0.6 | 80 | 28.3804 | 4.13526 | OK | 8.57747e-10 | 0 | 6.7417 | 0 | 0 | 9.116 | 2.75 | 58.239 | 21.1778 | 67.355 |
| 200000 | 20 | 0.6 | 50 | 42.6357 | 2.27152 | OK | 4.06458e-16 | 0.16176 | 32.1013 | 2.82538 | 0.356704 | 41.169 | 0.25 | 51.126 | 51.126 | 92.295 |
| 200000 | 20 | 0.6 | 80 | 52.162 | 7.97318 | OK | 2.98509e-10 | 0 | 12.7929 | 0 | 0 | 16.588 | 2.75 | 86.339 | 31.396 | 102.927 |
| 500000 | 20 | 0.6 | 50 | 103.892 | 5.60445 | OK | 4.06573e-16 | 0.170912 | 79.4479 | 5.79962 | 0.36208 | 97.611 | 0.25 | 116.09 | 116.09 | 213.701 |
| 500000 | 20 | 0.6 | 80 | 123.648 | 19.6515 | OK | 3.42927e-10 | 0 | 31.2774 | 0 | 0 | 39.557 | 2.75 | 171.537 | 62.3771 | 211.094 |
| 1000000 | 20 | 0.6 | 50 | 206.659 | 11.1994 | OK | 4.06679e-16 | 0.163328 | 162.45 | 10.7655 | 0.359488 | 195.634 | 0.25 | 225.034 | 225.034 | 420.668 |
| 1000000 | 20 | 0.6 | 80 | 243.88 | 39.2051 | OK | 1.40223e-10 | 0 | 61.9536 | 0 | 0 | 77.625 | 2.75 | 316.958 | 115.257 | 394.583 |
| 1000 | 50 | 0.6 | 50 | 3.19293 | 0.13728 | OK | 5.41854e-13 | 0.066784 | 1.53098 | 1.49469 | 0.636 | 5.735 | 0.75 | 6.398 | 6.398 | 12.133 |
| 1000 | 50 | 0.6 | 80 | 3.39414 | 0.305824 | OK | 5.73815e-10 | 0 | 1.32461 | 0 | 0 | 1.805 | 2.25 | 29.963 | 13.3169 | 31.768 |
| 2000 | 50 | 0.6 | 50 | 3.34787 | 0.181824 | OK | 7.09194e-13 | 0.261792 | 1.71142 | 2.0264 | 0.75008 | 6.812 | 0.75 | 6.61 | 6.61 | 13.422 |
| 2000 | 50 | 0.6 | 80 | 3.51952 | 0.432 | OK | 3.33326e-10 | 0 | 1.39248 | 0 | 0 | 1.867 | 2.25 | 29.797 | 13.2431 | 31.664 |
| 5000 | 50 | 0.6 | 50 | 3.66208 | 0.282912 | OK | 1.84129e-12 | 0.275744 | 2.61379 | 3.54246 | 1.03923 | 9.527 | 0.75 | 6.772 | 6.772 | 16.299 |
| 5000 | 50 | 0.6 | 80 | 3.47178 | 0.670048 | OK | 9.14694e-10 | 0 | 2.11078 | 0 | 0 | 2.643 | 2.25 | 28.249 | 12.5551 | 30.892 |
| 10000 | 50 | 0.6 | 50 | 5.45334 | 0.467552 | OK | 1.87344e-10 | 0.244896 | 5.69558 | 3.60019 | 1.01469 | 12.676 | 0.25 | 7.961 | 7.961 | 20.637 |
| 10000 | 50 | 0.6 | 80 | 4.09587 | 1.09776 | OK | 3.29765e-10 | 0 | 2.66547 | 0 | 0 | 3.289 | 2.25 | 27.985 | 12.4378 | 31.274 |
| 20000 | 50 | 0.6 | 50 | 6.00762 | 0.553472 | OK | 1.35814e-11 | 0.258688 | 10.9747 | 3.62115 | 0.98784 | 18.222 | 0.25 | 8.173 | 8.173 | 26.395 |
| 20000 | 50 | 0.6 | 80 | 6.76611 | 1.95168 | OK | 2.65793e-10 | 0 | 5.17376 | 0 | 0 | 6.024 | 2.25 | 29.422 | 13.0764 | 35.446 |
| 50000 | 50 | 0.6 | 50 | 14.4091 | 1.29798 | OK | 6.36376e-16 | 0.196864 | 29.1022 | 4.08208 | 0.958976 | 38.039 | 0.25 | 18.406 | 18.406 | 56.445 |
| 50000 | 50 | 0.6 | 80 | 18.826 | 4.56877 | OK | 3.5551e-10 | 0 | 12.8733 | 0 | 0 | 14.814 | 2.25 | 43.424 | 19.2996 | 58.238 |
| 100000 | 50 | 0.6 | 50 | 26.9908 | 2.56326 | OK | 6.43815e-16 | 0.196448 | 60.4528 | 4.96362 | 0.963488 | 72.618 | 0.25 | 33.984 | 33.984 | 106.602 |
| 100000 | 50 | 0.6 | 80 | 28.6736 | 7.64659 | OK | 9.59934e-10 | 0 | 25.9997 | 0 | 0 | 29.556 | 2.25 | 58.056 | 25.8027 | 87.612 |
| 200000 | 50 | 0.6 | 50 | 51.2458 | 5.0329 | OK | 6.45465e-16 | 0.20032 | 117.064 | 6.48086 | 0.958464 | 134.758 | 0.25 | 62.451 | 62.451 | 197.209 |
| 200000 | 50 | 0.6 | 80 | 62.0187 | 17.6468 | OK | 7.94132e-10 | 0 | 52.1716 | 0 | 0 | 58.469 | 2.25 | 102.195 | 45.42 | 160.664 |
| 500000 | 50 | 0.6 | 50 | 125.053 | 12.5068 | OK | 6.4646e-16 | 0.201408 | 299.38 | 11.1685 | 0.952256 | 333.931 | 0.25 | 143.496 | 143.496 | 477.427 |
| 500000 | 50 | 0.6 | 80 | 127.231 | 37.5401 | OK | 6.80824e-10 | 0 | 129.803 | 0 | 0 | 144.254 | 2.25 | 186.988 | 83.1058 | 331.242 |
| 1000000 | 50 | 0.6 | 50 | 248.075 | 24.9787 | OK | 6.45776e-16 | 0.178304 | 591.374 | 19.0011 | 0.944896 | 653.671 | 0.25 | 279.737 | 279.737 | 933.408 |
| 1000000 | 50 | 0.6 | 80 | 251.759 | 74.9685 | OK | 4.20445e-10 | 0 | 263.031 | 0 | 0 | 290.973 | 2.25 | 351.555 | 156.247 | 642.528 |
| 1000 | 100 | 0.6 | 50 | 5.48906 | 0.14912 | OK | 2.96759e-15 | 0.172192 | 4.24422 | 1.65782 | 0.855808 | 8.332 | 0.75 | 7.714 | 7.714 | 16.046 |
| 1000 | 100 | 0.6 | 80 | 4.8113 | 0.35088 | OK | 3.93268e-10 | 0 | 2.41613 | 0 | 0 | 2.791 | 2.25 | 28.492 | 12.6631 | 31.283 |
| 2000 | 100 | 0.6 | 50 | 5.67978 | 0.23408 | OK | 3.20539e-15 | 0.172096 | 5.41107 | 2.76003 | 1.15606 | 10.984 | 0.75 | 8.002 | 8.002 | 18.986 |
| 2000 | 100 | 0.6 | 80 | 4.76941 | 0.515712 | OK | 2.4168e-10 | 0 | 2.82858 | 0 | 0 | 3.327 | 2.25 | 27.528 | 12.2347 | 30.855 |
| 5000 | 100 | 0.6 | 50 | 6.47283 | 0.459392 | OK | 5.66363e-15 | 0.26848 | 9.26442 | 6.39126 | 2.10659 | 20.181 | 0.75 | 9.469 | 9.469 | 29.65 |
| 5000 | 100 | 0.6 | 80 | 4.57264 | 0.91232 | OK | 3.96533e-10 | 0 | 4.0065 | 0 | 0 | 4.531 | 2.25 | 21.881 | 9.72489 | 26.412 |
| 10000 | 100 | 0.6 | 50 | 7.82038 | 0.81952 | OK | 1.02547e-14 | 0.39232 | 15.6836 | 13.0511 | 3.6641 | 35.176 | 0.75 | 11.19 | 11.19 | 46.366 |
| 10000 | 100 | 0.6 | 80 | 5.11786 | 1.65488 | OK | 1.30014e-10 | 0 | 6.52304 | 0 | 0 | 7.369 | 2.25 | 25.36 | 11.2711 | 32.729 |
| 20000 | 100 | 0.6 | 50 | 12.2095 | 1.54698 | OK | 9.01164e-16 | 0.290688 | 34.4124 | 13.3009 | 3.6495 | 54.671 | 0.75 | 16.529 | 16.529 | 71.2 |
| 20000 | 100 | 0.6 | 80 | 6.64848 | 3.10995 | OK | 2.88388e-10 | 0 | 12.3721 | 0 | 0 | 13.741 | 2.25 | 26.817 | 11.9187 | 40.558 |
| 50000 | 100 | 0.6 | 50 | 18.1526 | 2.49699 | OK | 9.82483e-16 | 0.320384 | 90.9672 | 14.2816 | 3.66438 | 114.942 | 0.25 | 23.354 | 23.354 | 138.296 |
| 50000 | 100 | 0.6 | 80 | 15.664 | 6.2815 | OK | 6.54399e-10 | 0 | 34.6585 | 0 | 0 | 37.967 | 1.75 | 38.602 | 22.0583 | 76.569 |
| 100000 | 100 | 0.6 | 50 | 33.5068 | 4.93082 | OK | 8.69244e-16 | 0.291296 | 184.666 | 15.8211 | 3.6504 | 214.507 | 0.25 | 42.872 | 42.872 | 257.379 |
| 100000 | 100 | 0.6 | 80 | 28.2918 | 12.3425 | OK | 4.57002e-10 | 0 | 70.8268 | 0 | 0 | 77.142 | 1.75 | 59.842 | 34.1954 | 136.984 |
| 200000 | 100 | 0.6 | 50 | 61.8158 | 9.79565 | OK | 8.74206e-16 | 0.293408 | 372.476 | 18.8595 | 3.6625 | 413.492 | 0.25 | 77.861 | 77.861 | 491.353 |
| 200000 | 100 | 0.6 | 80 | 53.6282 | 24.517 | OK | 3.2852e-10 | 0 | 141.398 | 0 | 0 | 153.128 | 1.75 | 97.093 | 55.4817 | 250.221 |
| 500000 | 100 | 0.6 | 50 | 148.549 | 24.4127 | OK | 8.7647e-16 | 0.275488 | 938.932 | 27.8278 | 3.64038 | 1012.81 | 0.25 | 178.938 | 178.938 | 1191.75 |
| 500000 | 100 | 0.6 | 80 | 130.612 | 61.3066 | OK | 3.68961e-10 | 0 | 366.049 | 0 | 0 | 394.199 | 1.75 | 210.56 | 120.32 | 604.759 |
| 1000000 | 100 | 0.6 | 50 | 293.594 | 48.756 | OK | 8.78674e-16 | 0.277408 | 1879.42 | 42.8763 | 3.68438 | 2008.76 | 0.25 | 348.943 | 348.943 | 2357.71 |
| 1000000 | 100 | 0.6 | 80 | 257.49 | 121.905 | OK | 1.9837e-10 | 0 | 710.874 | 0 | 0 | 765.913 | 1.75 | 401.926 | 229.672 | 1167.84 |
| 1000 | 200 | 0.6 | 50 | 8.34835 | 0.178976 | OK | 1.18998e-15 | 0.268 | 9.80797 | 2.98093 | 1.61082 | 16.755 | 0.25 | 9.806 | 9.806 | 26.561 |
| 1000 | 200 | 0.6 | 80 | 8.35654 | 0.41264 | OK | 6.5029e-11 | 0 | 4.95331 | 0 | 0 | 5.31 | 1.75 | 23.312 | 13.3211 | 28.622 |
| 2000 | 200 | 0.6 | 50 | 10.9961 | 0.409024 | OK | 1.08056e-15 | 0.2616 | 12.1695 | 7.28861 | 2.70288 | 24.556 | 0.75 | 14.023 | 14.023 | 38.579 |
| 2000 | 200 | 0.6 | 80 | 7.09603 | 0.689376 | OK | 1.19787e-10 | 0 | 5.75862 | 0 | 0 | 6.349 | 1.75 | 26.662 | 15.2354 | 33.011 |
| 5000 | 200 | 0.6 | 50 | 12.0842 | 0.840352 | OK | 1.14094e-15 | 0.385568 | 24.3555 | 19.7416 | 5.31389 | 52.127 | 0.75 | 15.145 | 15.145 | 67.272 |
| 5000 | 200 | 0.6 | 80 | 7.26397 | 1.40371 | OK | 8.07397e-11 | 0 | 10.1891 | 0 | 0 | 10.98 | 1.75 | 23.31 | 13.32 | 34.29 |
| 10000 | 200 | 0.6 | 50 | 13.7377 | 1.61542 | OK | 1.13915e-15 | 0.446816 | 42.8315 | 46.2344 | 10.4282 | 103.036 | 0.75 | 17.848 | 17.848 | 120.884 |
| 10000 | 200 | 0.6 | 80 | 7.79923 | 2.69613 | OK | 8.76397e-10 | 0 | 16.7598 | 0 | 0 | 18.14 | 1.75 | 25.623 | 14.6417 | 43.763 |
| 20000 | 200 | 0.6 | 50 | 18.0436 | 3.10848 | OK | 1.13882e-15 | 0.75328 | 81.4741 | 95.5156 | 20.4296 | 203.14 | 0.75 | 23.849 | 23.849 | 226.989 |
| 20000 | 200 | 0.6 | 80 | 9.1967 | 5.19805 | OK | 2.7296e-10 | 0 | 30.9637 | 0 | 0 | 33.414 | 1.75 | 29.797 | 17.0269 | 63.211 |
| 50000 | 200 | 0.6 | 50 | 24.7159 | 5.08854 | OK | 9.84119e-10 | 0.752416 | 233.585 | 97.5024 | 20.4495 | 362.521 | 0.25 | 32.495 | 32.495 | 395.016 |
| 50000 | 200 | 0.6 | 80 | 22.776 | 12.782 | OK | 2.01488e-10 | 0 | 87.4804 | 0 | 0 | 93.644 | 1.75 | 52.212 | 29.8354 | 145.856 |
| 100000 | 200 | 0.6 | 50 | 43.7419 | 10.1175 | OK | 1.18308e-15 | 0.755648 | 487.354 | 100.55 | 20.4503 | 627.852 | 0.25 | 58.067 | 58.067 | 685.919 |
| 100000 | 200 | 0.6 | 80 | 44.1585 | 25.3399 | OK | 4.92705e-11 | 0 | 184.454 | 0 | 0 | 196.42 | 1.75 | 88.28 | 50.4457 | 284.7 |
| 200000 | 200 | 0.6 | 50 | 81.61 | 20.173 | OK | 1.19927e-15 | 0.75248 | 996.01 | 106.35 | 20.4795 | 1158.38 | 0.25 | 107.465 | 107.465 | 1265.85 |
| 200000 | 200 | 0.6 | 80 | 103.403 | 60.5247 | OK | 2.67731e-12 | 0 | 372.06 | 0 | 0 | 394.94 | 2.25 | 182.958 | 81.3147 | 577.898 |
| 500000 | 200 | 0.6 | 50 | 193.426 | 50.3572 | OK | 1.20823e-15 | 0.755872 | 2516.82 | 124.054 | 20.4261 | 2745.76 | 0.25 | 249.777 | 249.777 | 2995.54 |
| 500000 | 200 | 0.6 | 80 | 213.998 | 125.841 | OK | 3.64618e-11 | 0 | 973.919 | 0 | 0 | 1029.81 | 1.75 | 358.549 | 204.885 | 1388.36 |
| 1000000 | 200 | 0.6 | 50 | 382.11 | 100.689 | OK | 1.21097e-15 | 0.754208 | 5058.87 | 153.59 | 20.4358 | 5398.9 | 0.25 | 489.899 | 489.899 | 5888.8 |
| 1000000 | 200 | 0.6 | 80 | 427.834 | 251.631 | OK | 2.18376e-11 | 0 | 1910.8 | 0 | 0 | 2021.82 | 1.75 | 703.126 | 401.786 | 2724.95 |
| 1000 | 500 | 0.6 | 50 | 7.8856 | 0.321728 | OK | 1.61434e-15 | 0 | 13.5729 | 0 | 0 | 14.153 | 0.25 | 11.69 | 11.69 | 25.843 |
| 1000 | 500 | 0.6 | 80 | 7.89747 | 0.32176 | OK | 1.61434e-15 | 0 | 13.5788 | 0 | 0 | 14.16 | 0.25 | 11.696 | 11.696 | 25.856 |
| 2000 | 500 | 0.6 | 50 | 20.8898 | 0.552704 | OK | 1.6814e-15 | 0.369632 | 34.8972 | 27.3573 | 8.26755 | 73.197 | 0.25 | 22.37 | 22.37 | 95.567 |
| 2000 | 500 | 0.6 | 80 | 19.7565 | 1.396 | OK | 7.36599e-12 | 0 | 20.842 | 0 | 0 | 21.648 | 1.75 | 37.049 | 21.1709 | 58.697 |
| 5000 | 500 | 0.6 | 50 | 32.1537 | 1.95062 | OK | 4.23621e-10 | 0.733504 | 94.033 | 106.858 | 24.3614 | 229.436 | 0.25 | 35.463 | 35.463 | 264.899 |
| 5000 | 500 | 0.6 | 80 | 19.9847 | 3.28467 | OK | 1.65693e-12 | 0 | 42.2572 | 0 | 0 | 43.874 | 1.75 | 38.015 | 21.7229 | 81.889 |
| 10000 | 500 | 0.6 | 50 | 33.5511 | 3.86506 | OK | 3.68614e-10 | 1.56499 | 192.533 | 239.814 | 51.2868 | 491.035 | 0.25 | 39.065 | 39.065 | 530.1 |
| 10000 | 500 | 0.6 | 80 | 19.903 | 6.45155 | OK | 1.90938e-11 | 0 | 76.6506 | 0 | 0 | 79.581 | 1.75 | 41.487 | 23.7069 | 121.068 |
| 20000 | 500 | 0.6 | 50 | 42.2287 | 7.61555 | OK | 4.41885e-10 | 3.2769 | 391.294 | 551.918 | 105.296 | 1062.39 | 0.25 | 51.831 | 51.831 | 1114.22 |
| 20000 | 500 | 0.6 | 80 | 23.3752 | 12.6849 | OK | 2.87489e-11 | 0 | 147.752 | 0 | 0 | 153.454 | 1.75 | 51.469 | 29.4109 | 204.923 |
| 50000 | 500 | 0.6 | 50 | 80.113 | 18.9525 | OK | 7.41744e-10 | 8.47696 | 994.702 | 1397.22 | 269.069 | 2694.6 | 0.25 | 101.954 | 101.954 | 2796.55 |
| 50000 | 500 | 0.6 | 80 | 43.0727 | 31.6472 | OK | 4.23415e-12 | 0 | 369.759 | 0 | 0 | 384.187 | 1.75 | 89.856 | 51.3463 | 474.043 |
| 100000 | 500 | 0.6 | 50 | 141.579 | 37.8183 | OK | 2.12353e-15 | 8.43725 | 2251.34 | 1407.94 | 268.95 | 3982.21 | 0.75 | 184.804 | 184.804 | 4167.02 |
| 100000 | 500 | 0.6 | 80 | 103.71 | 63.0748 | OK | 5.55139e-12 | 0 | 780.173 | 0 | 0 | 808.407 | 1.75 | 183.679 | 104.959 | 992.086 |
| 200000 | 500 | 0.6 | 50 | 175.786 | 50.4038 | OK | 1.80715e-15 | 8.56058 | 4765.4 | 1422.62 | 269.065 | 6552 | 0.25 | 232.406 | 232.406 | 6784.4 |
| 200000 | 500 | 0.6 | 80 | 225.01 | 125.986 | OK | 4.12283e-12 | 0 | 1682 | 0 | 0 | 1737.61 | 1.75 | 368.443 | 210.539 | 2106.05 |
| 500000 | 500 | 0.6 | 50 | | | OoM (in setup stage) | | | | | | | | | | |
| 500000 | 500 | 0.6 | 80 | 595.329 | 314.87 | OK | 1.29539e-12 | 0 | 4568.02 | 0 | 0 | 4705.93 | 1.75 | 930.051 | 531.458 | 5635.98 |
| 1000000 | 500 | 0.6 | 50 | | | OoM (in setup stage) | | | | | | | | | | |
| 1000000 | 500 | 0.6 | 80 | | | OoM (in setup stage) | | | | | | | | | | |