1-18-2016 Banded Results BCR Test



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed. Values: 0 or 1.
• K-DB: the half-bandwidth after DB reordering method (without any drop-off). If DB is specified not to be executed, then this reports the original half-bandwidth
• KnoDrp: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether the problem was solved. OK means it was solved; otherwise the reason for the failure is given
• Bstng: indicates whether diagonal boosting is enabled during factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off-diagonal elements in order to decrease the bandwidth - done on the CPU
• T-Dtransf: time for data transfer from CPU to GPU
• T-Asmbl: time to copy the sparse matrix, after reordering and drop-off, into a banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solve stage (can be BiCGStab2 (0), BiCGStab (1), or CG (2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso (a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and shown in red if we run more than 5 times slower.)
• Fastest: the time when SaP runs fastest historically
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
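
The SlwD and SpdUp entries above can be sketched in a few lines of Python. This is an illustration only, not the reporting tool's code. The function names are hypothetical, and two points are assumptions: SpdUp is taken to be the historical fastest time divided by the time of this run, and "slowdown of more than 5%" is taken to mean that this run takes more than 1.05 times the fastest time.

```python
def slowdown(t_ours_ms, t_pardiso_ms):
    """SlwD: our total time divided by the Pardiso time (less than 1 means we are faster)."""
    return t_ours_ms / t_pardiso_ms


def slwd_color(slwd):
    """Green if we are faster than Pardiso, red if more than 5 times slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None  # no highlighting


def speedup(t_fastest_ms, t_this_ms):
    """SpdUp (assumed definition): historical fastest time divided by this run's time."""
    return t_fastest_ms / t_this_ms


def spdup_color(spdup):
    """Green for a speedup above 5%, red for a slowdown above 5% (assumed thresholds)."""
    if spdup > 1.05:
        return "green"
    if spdup < 1.0 / 1.05:  # this run takes more than 1.05x the fastest time
        return "red"
    return None


print(slwd_color(slowdown(50.0, 100.0)))   # green
print(spdup_color(speedup(100.0, 110.0)))  # red
```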


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = NNZ / ((2K+1)N), where NNZ is the actual number of non-zeros.
3) All times reported are in milliseconds (1E-3 second).
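
As an illustration of notes 1) and 2), the following Python sketch computes K, FRate and nuKf from the coordinates of the non-zeros. It is not taken from the solver. The function name and the input format (a list of distinct 0-based (row, column) pairs) are assumptions made for the example, and K is taken here as the largest row half-bandwidth on either side of the diagonal.

```python
def band_metrics(entries, n):
    """Return (K, FRate, nuKf) for an n x n matrix given its non-zero coordinates.

    entries: iterable of distinct 0-based (row, col) pairs (hypothetical input format).
    """
    k_left = [0] * n   # K_{iLeft}: row half-bandwidth to the left of the diagonal
    k_right = [0] * n  # K_{iRight}: row half-bandwidth to the right of the diagonal
    nnz = 0
    for i, j in entries:
        nnz += 1
        if j < i:
            k_left[i] = max(k_left[i], i - j)
        else:
            k_right[i] = max(k_right[i], j - i)

    k = max(max(k_left), max(k_right))
    frate = nnz / ((2 * k + 1) * n)          # note 2)
    if k == 0:
        nukf = 0.0                           # diagonal matrix: avoid dividing by zero
    else:
        nukf = sum(2 * k - l - r for l, r in zip(k_left, k_right)) / (2 * k * n)  # note 1)
    return k, frate, nukf


# 4 x 4 tridiagonal matrix: 10 non-zeros, K = 1
tridiag = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
k, frate, nukf = band_metrics(tridiag, 4)
print(k, round(frate, 4), nukf)  # 1 0.8333 0.25
```

In this small example the first and last rows each have only one off-diagonal entry, which is why nuKf comes out as 0.25 rather than 0.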


A dash (-) marks a column for which the row reports no value.

N       K    d  NPrtns  Solves  T-LU     T-SwDef  T-MMDef  T-PreP   nItrs    T-SwInf  T-MVInf  T-Kry    T-KryPIt  T-Total
200000  10   1  1       OK      20.3492  12.3565  34.5255  77.066   0.75     10.9205  30.4785  53.914   53.914    130.98
200000  10   1  1       OK      263.25   -        -        267.298  0.75     -        -        1203     1203      1470.3
200000  10   1  4       OK      18.8195  11.9787  33.5216  74.098   2.75     27.1888  77.2121  131.85   47.9455   205.948
200000  10   1  4       OK      67.1194  -        -        71.244   2.75     -        -        1162.38  422.685   1233.63
200000  10   1  8       OK      18.1038  11.8479  32.9729  72.657   2.75     26.7522  76.7927  130.772  47.5535   203.429
200000  10   1  8       OK      33.615   -        -        37.766   2.75     -        -        592.499  215.454   630.265
200000  10   1  16      OK      17.3357  11.7191  32.5674  71.325   2.75     26.2927  76.3913  129.561  47.1131   200.886
200000  10   1  16      OK      17.1177  -        -        21.327   2.75     -        -        312.053  113.474   333.38
200000  10   1  20      OK      16.6506  11.634   32.3417  70.373   3.25     29.8215  87.5593  147.165  45.2815   217.538
200000  10   1  20      OK      13.908   -        -        18.059   3.25     -        -        293.879  90.4243   311.938
200000  10   1  32      OK      16.6514  11.5812  32.0918  70.062   3.25     33.8424  99.5451  166.69   51.2892   236.752
200000  10   1  32      OK      9.03206  -        -        13.213   3.25     -        -        221.665  68.2046   234.878
200000  10   1  50      OK      18.2581  11.4949  31.5236  71.097   3.25     33.2897  99.041   165.051  50.7849   236.148
200000  10   1  50      OK      6.11213  -        -        10.261   3.25     -        -        155.827  47.9468   166.088
200000  20   1  1       OK      28.6199  20.7719  41.9931  101.863  0.75     10.7771  32.3196  57.82    57.82     159.683
200000  20   1  1       OK      289.145  -        -        295.548  0.75     -        -        1281.3   1281.3    1576.85
200000  20   1  4       OK      27.2978  20.3129  41.723   99.878   2.75     27.016   84.0668  144.652  52.6007   244.53
200000  20   1  4       OK      72.4893  -        -        78.978   2.75     -        -        1207.53  439.1     1286.5
200000  20   1  8       OK      26.2669  19.7319  40.2655  96.747   2.75     26.0593  82.5024  141.503  51.4556   238.25
200000  20   1  8       OK      36.6529  -        -        43.153   2.75     -        -        623.385  226.685   666.538
200000  20   1  16      OK      25.558   19.3955  39.6827  95.081   2.75     25.4546  82.3021  140.265  51.0055   235.346
200000  20   1  16      OK      19.9518  -        -        26.466   2.75     -        -        331.043  120.379   357.509
200000  20   1  20      OK      24.768   19.2154  39.2984  93.72    2.75     24.7638  81.458   138.363  50.3138   232.083
200000  20   1  20      OK      16.394   -        -        22.955   2.75     -        -        274.37   99.7709   297.325
200000  20   1  32      OK      24.8433  19.0466  39.4636  93.795   2.75     24.8935  81.861   138.934  50.5215   232.729
200000  20   1  32      OK      11.4265  -        -        17.96    2.75     -        -        184.207  66.9844   202.167
200000  20   1  50      OK      24.0452  18.8197  33.953   87.28    2.75     24.1401  81.4913  137.552  50.0189   224.832
200000  20   1  50      OK      8.55437  -        -        15.042   2.75     -        -        141.073  51.2993   156.115
200000  50   1  1       OK      58.0314  40.8238  72.4546  189.162  0.75     13.3027  32.753   63.573   63.573    252.735
200000  50   1  1       OK      1039.47  -        -        1052.56  0.75     -        -        1457.63  1457.63   2510.19
200000  50   1  4       OK      56.7008  39.1881  72.1398  185.901  2.25     27.4128  71.869   133.673  59.4102   319.574
200000  50   1  4       OK      259.28   -        -        272.653  2.25     -        -        1121.44  498.418   1394.09
200000  50   1  8       OK      55.1665  37.8595  70.2663  180.909  2.25     26.0826  70.4027  130.356  57.936    311.265
200000  50   1  8       OK      130.61   -        -        144.061  2.25     -        -        579.224  257.433   723.285
200000  50   1  16      OK      56.4256  36.6487  63.5862  174.417  2.25     24.9818  69.6983  128.317  57.0298   302.734
200000  50   1  16      OK      78.0676  -        -        91.625   2.25     -        -        309.089  137.373   400.714
200000  50   1  20      OK      54.475   36.7486  63.7013  172.682  2.25     29.5471  82.788   151.317  67.252    323.999
200000  50   1  20      OK      63.8312  -        -        77.4     2.25     -        -        301.99   134.218   379.39
200000  50   1  32      OK      53.3825  35.189   60.7676  167.081  2.25     28.35    81.2771  148.302  65.912    315.383
200000  50   1  32      OK      46.5579  -        -        60.077   2.25     -        -        203.903  90.6236   263.98
200000  50   1  50      OK      54.1159  34.8981  61.3683  168.226  2.25     23.9795  69.3601  126.621  56.276    294.847
200000  50   1  50      OK      36.3988  -        -        49.908   2.25     -        -        125.373  55.7213   175.281
200000  100  1  1       OK      109.452  153.344  144.245  439.723  0.75     17.6194  37.9714  79.144   79.144    518.867
200000  100  1  1       OK      937.449  -        -        961.64   0.75     -        -        1482.94  1482.94   2444.58
200000  100  1  4       OK      107.429  144.442  141.698  426.241  1.75     28.7662  67.3479  136.118  77.7817   562.359
200000  100  1  4       OK      387.62   -        -        412.099  1.75     -        -        947.68   541.531   1359.78
200000  100  1  8       OK      106.048  139.558  132.434  410.647  1.75     27.0383  66.3349  132.926  75.9577   543.573
200000  100  1  8       OK      248.261  -        -        272.714  1.75     -        -        491.687  280.964   764.401
200000  100  1  16      OK      104.966  134.466  128.239  400.576  1.75     25.5201  65.2039  130.225  74.4143   530.801
200000  100  1  16      OK      175.343  -        -        199.889  1.75     -        -        268.611  153.492   468.5
200000  100  1  20      OK      105.8    134.257  129.189  401.772  1.75     25.4436  65.7541  130.625  74.6429   532.397
200000  100  1  20      OK      161.812  -        -        186.236  1.75     -        -        224.089  128.051   410.325
200000  100  1  32      OK      104.641  94.349   122.662  354.483  1.75     24.2701  63.4397  126.828  72.4731   481.311
200000  100  1  32      OK      141.536  -        -        165.955  1.75     -        -        157.178  89.816    323.133
200000  100  1  50      OK      107.852  126.344  123.788  390.606  2.25     29.7704  79.3455  156.408  69.5147   547.014
200000  100  1  50      OK      132.729  -        -        157.335  2.25     -        -        141.818  63.0302   299.153
200000  200  1  1       OK      313.943  416.819  369.393  1193.6   0.75     27.9343  54.4135  119.991  119.991   1313.6
200000  200  1  1       OK      1188.76  -        -        1235.69  0.75     -        -        1510.62  1510.62   2746.31
200000  200  1  4       OK      316.903  392.275  354.109  1156.23  1.75     44.4804  96.5819  206.648  118.085   1362.88
200000  200  1  4       OK      587.392  -        -        634.899  1.75     -        -        981.874  561.071   1616.77
200000  200  1  8       OK      309.11   371.585  338.017  1111.7   1.75     41.2714  93.1607  199.771  114.155   1311.48
200000  200  1  8       OK      439.486  -        -        487.232  1.75     -        -        526.348  300.77    1013.58
200000  200  1  16      OK      315.661  231.646  329.207  970.438  1.75     39.1924  90.9563  195.379  111.645   1165.82
200000  200  1  16      OK      379.165  -        -        426.587  1.75     -        -        302.099  172.628   728.686
200000  200  1  20      OK      311.447  351.758  323.041  1078.59  1.75     38.2922  91.1988  194.782  111.304   1273.37
200000  200  1  20      OK      359.274  -        -        407.021  1.75     -        -        257.547  147.17    664.568
200000  200  1  32      OK      312.864  209.578  295.581  911.322  1.75     36.0292  84.8687  185.645  106.083   1096.97
200000  200  1  32      OK      345.565  -        -        393.281  1.75     -        -        191.432  109.39    584.713
200000  200  1  50      OK      317.581  315.395  289.584  1013.54  1.75     38.4963  84.709   188.392  107.653   1201.93
200000  200  1  50      OK      329.592  -        -        377.557  1.75     -        -        153.589  87.7651   531.146
200000  500  1  1       OK      1708.03  1551.11  1885.43  5459.16  0.75     66.8968  112.541  261.294  261.294   5720.45
200000  500  1  1       OK      2633.78  -        -        2753.85  0.75     -        -        1737.58  1737.58   4491.43
200000  500  1  4       OK      1689.04  1428.31  1754.08  5184.39  1.75     104.392  193.735  443.03   253.16    5627.42
200000  500  1  4       OK      1840.25  -        -        1960.57  1.75     -        -        1156.79  661.023   3117.36
200000  500  1  8       OK      1693.9   1340.81  1659.23  5004.41  1.75     97.2564  186.477  428.111  244.635   5432.52
200000  500  1  8       OK      1675.73  -        -        1796     1.75     -        -        658.753  376.43    2454.76
200000  500  1  16      OK      1721.11  1231.98  1513.5   4772.56  1.75     90.2528  175.286  409.211  233.835   5181.77
200000  500  1  16      OK      1563.16  -        -        1683.79  1.75     -        -        423.643  242.082   2107.43
200000  500  1  20      OK      1725.21  1182.48  1471.68  4682.82  1.75     90.4108  170.743  405.195  231.54    5088.02
200000  500  1  20      OK      1567.54  -        -        1688.02  1.75     -        -        373.703  213.545   2061.72
200000  500  1  32      NConv   1779.29  974.797  1363.73  4436.76  1000.25  41314.5  72053.7  176225   176.181   180662
200000  500  1  32      OK      1519.83  -        -        1640.36  1.75     -        -        306.237  174.993   1946.6
200000  500  1  50      OK      1812.79  958.266  1172.26  4233.82  1.75     82.458   146.331  373.455  213.403   4607.27
200000  500  1  50      OK      1494.15  -        -        1614.83  1.75     -        -        266.234  152.134   1881.06
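
The legend defines the total time as the sum of the preprocessing time and T-Kry. The short Python sketch below checks that relation on three rows copied from the results above (K = 10 with 1 partition, both rows, and the K = 500 NConv row). The helper name and the relative tolerance are assumptions, the latter chosen because the reported values appear to be printed with about six significant digits.

```python
import math

# (T-PreP, T-Kry, T-Total) in milliseconds, copied from three rows of the results
rows = [
    (77.066, 53.914, 130.98),
    (267.298, 1203.0, 1470.3),
    (4436.76, 176225.0, 180662.0),
]


def total_is_consistent(t_prep, t_kry, t_total, rel_tol=1e-5):
    """True if T-Total matches T-PreP + T-Kry up to print rounding (assumed tolerance)."""
    return math.isclose(t_prep + t_kry, t_total, rel_tol=rel_tol)


for t_prep, t_kry, t_total in rows:
    print(t_prep, t_kry, t_total, total_is_consistent(t_prep, t_kry, t_total))
```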