4-25-2016 Double Results New Nightly Test



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed. Values: 0 or 1.
• K-DB: the half-bandwidth after DB reordering method (without any drop-off). If DB is specified not to be executed, then this reports the original half-bandwidth
• K-noDrop: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether we managed to solve the problem or not. OK means solved fine, otherwise a reason is provided for failure
• Bstng: indicates whether we enable diagonal boosting when doing factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off off-diagonal elements to decrease bandwidth. Done on the CPU.
• T-Dtransf: data transfer from CPU to GPU
• T-Asmbl: after reordering and drop-off, copy the sparse matrix to banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in Krylov solving stage (can be BiCGStab2 (0), BiCGStab (1), or CG(2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• T-Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso (a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and shown in red if we run more than 5 times slower.)
• Fastest: the time when SaP runs fastest historically
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
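The derived columns in the list above (T-Total, SlwD, SpdUp) can be recomputed from the raw timings. The following is a minimal sketch, not the code used by the nightly test: the function names are hypothetical, the SpdUp formula (Fastest divided by T-Total) is an assumption inferred from the values in the table, and the "N/A" handling for runs that did not converge is omitted.

```python
def derived_columns(t_prep, t_kry, pardiso=None, fastest=None):
    """Recompute T-Total, SlwD and SpdUp from raw timings (all in ms).

    pardiso and fastest may be None when no reference time is available.
    """
    t_total = t_prep + t_kry
    # SlwD: how many times slower than Pardiso (< 1 means faster).
    slwd = t_total / pardiso if pardiso else None
    # SpdUp: assumed to be Fastest / T-Total, inferred from the table values.
    spdup = fastest / t_total if fastest else None
    return t_total, slwd, spdup


def slwd_color(slwd):
    """Color rule from the legend: green if faster, red if > 5x slower."""
    if slwd is None:
        return None
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None
```

With the first cfd1 row of the table (T-PreP = 1858.14, T-Kry = 4511.1, Pardiso = 2164.91, Fastest = 7984.73), `derived_columns` gives values that agree with the reported T-Total, SlwD and SpdUp to the printed precision.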


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = the actual number of NNZ / ((2K+1)N).
3) All times reported are in milliseconds (1E-3 second)
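The formulas in notes 1) and 2) can be illustrated with a short sketch. This is not the solver's implementation: the function name `band_metrics` and the coordinate-list input are hypothetical, and it assumes the non-zeros passed in are those of the final (reordered, dropped-off) matrix, with K taken as the largest row half-bandwidth on either side.

```python
def band_metrics(n, entries):
    """Return (K, FRate, nuKf) for an n-by-n matrix.

    entries: iterable of 0-based (row, col) positions of the non-zeros.
    """
    entries = set(entries)      # ignore duplicate positions
    k_left = [0] * n            # row half-bandwidth left of the diagonal
    k_right = [0] * n           # row half-bandwidth right of the diagonal
    for i, j in entries:
        if j < i:
            k_left[i] = max(k_left[i], i - j)
        else:
            k_right[i] = max(k_right[i], j - i)
    k = max(max(k_left), max(k_right))
    # Note 2: FRate = NNZ / ((2K+1) N)
    frate = len(entries) / ((2 * k + 1) * n)
    # Note 1: nuKf = 1/(2KN) * sum_i (2K - K_iLeft - K_iRight)
    if k == 0:
        nukf = 0.0              # purely diagonal matrix: treated as uniform
    else:
        nukf = sum(2 * k - l - r for l, r in zip(k_left, k_right)) / (2 * k * n)
    return k, frate, nukf
```

For a 4-by-4 tridiagonal matrix this yields K = 1, FRate = 10/12 and nuKf = 0.25; nuKf is not exactly 0 here because the first and last rows have a one-sided band.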


| Name | N | NNZ | SPD | DB | K-DB | K-noDrop | K | FRate | nuKf | Solves | Bstng | SolAcc | T-DB | T-CM | T-Drop | T-Dtransf | T-Asmbl | LU-M | Fill-in | NPrtns | T-BC | T-LU | GFlps-LU | T-SPK | T-LUrdcd | T-PreP | Kry-M | nItrs | T-Kry | T-Total | Pardiso | SlwD | Fastest | SpdUp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| apache2 | 715176 | 4817870 | 0 | 1 | 65837 | 3370 | 3370 | 0.0954238 | 0.986985 | OK | 1 | 0.000448549 | 201.308 | 588.493 | 0 | 53.92 | 7834.44 | LU | -1 | 50 | 0 | 808.056 | 2.8302 | 0 | 0 | 9547.4 | P-B2(SI) | 500.25 | 59680.7 | 69228.1 | 5070 | 13.6545 | | |
| apache2 | 715176 | 4817870 | 0 | 1 | 65837 | 3370 | 3370 | 0.0953937 | 0.986985 | OK | 1 | 0.000411902 | 184.89 | 564.068 | 0 | 48.405 | 8304.41 | LU | -1 | 50 | 0 | 512.949 | 4.45817 | 0 | 0 | 9855.77 | P-B2(SI) | 500.25 | 57602.8 | 67458.6 | 5070 | 13.3054 | | |
| bundle1 | 10581 | 770901 | 0 | 1 | 10461 | 10461 | 5 | 0.79373 | 0.789358 | OK | 1 | 9.57334e-12 | 32.7979 | 3.70208 | 16.459 | 13.172 | 1.125 | LU | -1 | 16 | 0 | 1.88531 | 0.0109616 | 0 | 0 | 71.668 | P-B2(SI) | 17.75 | 129.449 | 201.117 | | | | |
| bundle1 | 10581 | 770901 | 0 | 1 | 10461 | 10461 | 5 | 0.79373 | 0.789358 | OK | 1 | 9.57334e-12 | 30.4864 | 4.53863 | 16.326 | 13.621 | 1.132 | LU | -1 | 16 | 0 | 1.97773 | 0.0104494 | 0 | 0 | 278.872 | P-B2(SI) | 17.75 | 142.757 | 421.629 | | | | |
| cfd1 | 70656 | 1828364 | 0 | 1 | 6229 | 2794 | 2794 | 0.959956 | 0.944087 | OK | 1 | 2.55487e-09 | 65.4866 | 140.625 | 0 | 17.608 | 1294.62 | LU | -1 | 16 | 0 | 327.124 | 7.61386 | 0 | 0 | 1858.14 | P-B2(SI) | 105.25 | 4511.1 | 6369.24 | 2164.91 | 2.94203 | 7984.73 | 1.25364 |
| cfd1 | 70656 | 1828364 | 0 | 1 | 6229 | 2794 | 2794 | 0.959347 | 0.942929 | OK | 1 | 2.30723e-08 | 61.7592 | 134.011 | 0 | 16.457 | 1249.88 | LU | -1 | 16 | 0 | 209.131 | 12.57 | 0 | 0 | 1896.96 | P-B2(SI) | 102.75 | 3946.62 | 5843.58 | 2164.91 | 2.69922 | 7984.73 | 1.36641 |
| ckt11752_tr_0 | 49702 | 333029 | 0 | 1 | 49452 | 10258 | 10258 | 0.715493 | 0.869972 | OK | 1 | 0.00018385 | 17.4372 | 26.5258 | 0 | 6.51 | 3779.44 | LU | -1 | 6 | 0 | 24276.2 | 15.7461 | 0 | 0 | 28126.3 | P-B2(SI) | 162.25 | 140808 | 168934 | 314.685 | 536.837 | 34371.6 | 0.203461 |
| ckt11752_tr_0 | 49702 | 333029 | 0 | 1 | 49452 | 10258 | 10258 | 0.715493 | 0.869972 | OK | 1 | 0.000364038 | 18.6054 | 26.8716 | 0 | 6.25 | 3798.72 | LU | -1 | 6 | 0 | 23175.6 | 16.4939 | 0 | 0 | 27234.5 | P-B2(SI) | 177.25 | 153690 | 180925 | 314.685 | 574.939 | 34371.6 | 0.189977 |
| FEM_3D_thermal1 | 17880 | 430740 | 0 | 1 | 13787 | 13787 | 12 | 0.909423 | 0.286582 | OK | 1 | 2.33165e-11 | 19.7777 | 5.62034 | 8.94 | 4.723 | 4.124 | LU | -1 | 16 | 0 | 3.2329 | 0.446049 | 0 | 0 | 50.63 | P-B2(SI) | 12.75 | 118.437 | 169.067 | | | | |
| FEM_3D_thermal1 | 17880 | 430740 | 0 | 1 | 13787 | 13787 | 12 | 0.909423 | 0.286582 | OK | 1 | 2.33165e-11 | 17.0229 | 4.88706 | 7.771 | 5.078 | 3.647 | LU | -1 | 16 | 0 | 3.52784 | 0.408757 | 0 | 0 | 236.332 | P-B2(SI) | 12.75 | 126.534 | 362.866 | | | | |
| Ga3As3H12 | 61349 | 5970947 | 0 | 1 | 20893 | 12502 | 12502 | 0.927686 | 0.848733 | OK | 1 | 6.93539e-07 | 193.49 | 621.928 | 0 | 67.847 | 5064.93 | LU | -1 | 8 | 0 | 15054.8 | 17.8598 | 0 | 0 | 21058.5 | P-B2(SI) | 500.25 | 358810 | 379869 | 82862.7 | 4.58431 | 925062 | 2.43522 |
| Ga3As3H12 | 61349 | 5970947 | 0 | 1 | 20893 | 12502 | 12502 | 0.929354 | 0.851044 | OK | 1 | 1.01803e-07 | 198.255 | 632.68 | 0 | 68.699 | 3784.73 | LU | -1 | 8 | 0 | 10325.4 | 25.4046 | 0 | 0 | 15210.3 | P-B2(SI) | 500.25 | 358402 | 373613 | 82862.7 | 4.50882 | 925062 | 2.47599 |
| GaAsH6 | 61349 | 3381809 | 0 | 1 | 20005 | 10783 | 10783 | 0.943412 | 0.880257 | OK | 1 | 2.219e-08 | 115.555 | 332.83 | 0 | 33.086 | 6454.84 | LU | -1 | 16 | 0 | 6717.68 | 17.7136 | 0 | 0 | 13682.4 | P-B2(SI) | 211.25 | 76441.1 | 90123.5 | 81031.2 | 1.11221 | 128099 | 1.42137 |
| GaAsH6 | 61349 | 3381809 | 0 | 1 | 20005 | 10783 | 10783 | 0.943083 | 0.879047 | OK | 1 | 1.67665e-08 | 101.958 | 309.465 | 0 | 31.002 | 5985.32 | LU | -1 | 16 | 0 | 5340.55 | 22.4837 | 0 | 0 | 12003.4 | P-B2(SI) | 160.75 | 66790.4 | 78793.7 | 81031.2 | 0.972388 | 128099 | 1.62575 |
| lung2 | 109460 | 492564 | 0 | 1 | 107141 | 107141 | 10 | 0.434854 | 0.964614 | NConv | 1 | 0.0572659 | 83.3232 | 7.99283 | 16.639 | 6.628 | 6.167 | LU | -1 | 16 | 0 | 17.3895 | 0.00308692 | 0 | 0 | 141.998 | P-B2(SI) | 4.75 | 180.913 | 322.911 | 273.206 | N/A | 313.407 | N/A |
| lung2 | 109460 | 492564 | 0 | 1 | 107141 | 107141 | 10 | 0.434854 | 0.964614 | NConv | 1 | 0.0572659 | 83.2795 | 7.96351 | 15.689 | 7.593 | 5.791 | LU | -1 | 16 | 0 | 17.5984 | 0.00305028 | 0 | 0 | 338.758 | P-B2(SI) | 4.75 | 194.085 | 532.843 | 273.206 | N/A | 313.407 | N/A |
| shipsec1 | 140874 | 7813404 | 0 | 1 | 5237 | 5237 | 5237 | 0.962201 | 0.96387 | OK | 1 | 0.00346863 | 327.571 | 252.74 | 0 | 84.738 | 5293.16 | LU | -1 | 16 | 0 | 1152.14 | 5.69189 | 0 | 0 | 7148.95 | P-B2(SI) | 353.25 | 31322.6 | 38471.6 | 730 | 52.7008 | 37981.1 | 0.987251 |
| shipsec1 | 140874 | 7813404 | 0 | 1 | 5237 | 5237 | 5237 | 0.962469 | 0.964068 | OK | 1 | 0.00300758 | 317.699 | 252.826 | 0 | 83.96 | 5291.73 | LU | -1 | 16 | 0 | 751.288 | 8.60703 | 0 | 0 | 6940.89 | P-B2(SI) | 365.25 | 29127.7 | 36068.6 | 730 | 49.409 | 37981.1 | 1.05302 |
| shipsec5 | 179860 | 10113096 | 0 | 1 | 3923 | 3923 | 3923 | 0.936794 | 0.928082 | NConv | 1 | 0.0528061 | 483.577 | 387.398 | 0 | 119.924 | 2922.56 | LU | -1 | 16 | 0 | 2491.26 | 6.94731 | 0 | 0 | 6439.54 | P-B2(SI) | 484.25 | 70447.4 | 76886.9 | 950 | N/A | 26676.6 | N/A |
| shipsec5 | 179860 | 10113096 | 0 | 1 | 3923 | 3923 | 3923 | 0.936794 | 0.928082 | NConv | 1 | 0.0528061 | 468.612 | 382.755 | 0 | 115.853 | 2422.36 | LU | -1 | 16 | 0 | 1303.57 | 13.277 | 0 | 0 | 4890.98 | P-B2(SI) | 484.25 | 54956.7 | 59847.7 | 950 | N/A | 26676.6 | N/A |
| t3dh | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972768 | 0.954527 | NConv | 1 | 0.0466428 | 156.177 | 332.489 | 0 | 41.436 | 3905.33 | LU | -1 | 16 | 0 | 669.833 | 7.72253 | 0 | 0 | 5133.16 | P-B2(SI) | 178.25 | 10225.4 | 15358.6 | 3690 | N/A | 42393.3 | N/A |
| t3dh | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.97329 | 0.955748 | NConv | 1 | 0.0229472 | 149.187 | 318.17 | 0 | 45.325 | 2748.33 | LU | -1 | 16 | 0 | 375.874 | 13.0566 | 0 | 0 | 3825.55 | P-B2(SI) | 163.75 | 8841.81 | 12667.4 | 3690 | N/A | 42393.3 | N/A |
| t3dh_a | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.97293 | 0.955308 | OK | 1 | 0.00511725 | 146.769 | 321.31 | 0 | 44.125 | 2736.94 | LU | -1 | 16 | 0 | 664.939 | 7.57229 | 0 | 0 | 3933 | P-B2(SI) | 159.75 | 9033.68 | 12966.7 | 6694.36 | 1.93696 | 40472.3 | 3.12125 |
| t3dh_a | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972948 | 0.954687 | NConv | 1 | 0.0491396 | 146.562 | 322.397 | 0 | 44.509 | 2730.99 | LU | -1 | 16 | 0 | 446.005 | 11.5224 | 0 | 0 | 3876 | P-B2(SI) | 157.25 | 9084.9 | 12960.9 | 6694.36 | N/A | 40472.3 | N/A |
| thermal2 | 1228045 | 8580313 | 0 | 1 | 1226000 | 1849 | 1849 | 0.0132705 | 0.939768 | OK | 1 | 1.14557e-07 | 438.895 | 1293.45 | 0 | 98.284 | 771.986 | LU | -1 | 16 | 0 | 3442.66 | 5.71785 | 0 | 0 | 6117.84 | P-B2(SI) | 206.25 | 119034 | 125152 | 7110 | 17.6022 | 35361.2 | 0.282546 |
| thermal2 | 1228045 | 8580313 | 0 | 1 | 1226000 | 1849 | 1849 | 0.0164495 | 0.94105 | OK | 1 | 4.47124e-07 | 418.361 | 1305.07 | 0 | 102.51 | 676.971 | LU | -1 | 16 | 0 | 2130.19 | 8.87827 | 0 | 0 | 4858.98 | P-B2(SI) | 204.25 | 104960 | 109819 | 7110 | 15.4457 | 35361.2 | 0.321995 |