4-5-2016 Double Results New Nightly Test



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed. Values: 0 or 1.
• K-DB: the half-bandwidth after DB reordering method (without any drop-off). If DB is specified not to be executed, then this reports the original half-bandwidth
• K-noDrop: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether we managed to solve the problem or not. OK means solved fine, otherwise a reason is provided for failure
• Bstng: indicates whether we enable diagonal boosting when doing factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off off-diagonal elements to decrease bandwidth. Done on the CPU.
• T-Dtransf: data transfer from CPU to GPU
• T-Asmbl: after reordering and drop-off, copy the sparse matrix to banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in Krylov solving stage (can be BiCGStab2 (0), BiCGStab (1), or CG(2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• T-Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso (a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and shown in red if we run more than 5 times slower.)
• Fastest: the time when SaP runs fastest historically
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
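The two ratio columns and their highlighting rules can be sketched as follows. This is only an illustration: the legend does not spell out the ratios, so the definitions here (SlwD = T-Total / Pardiso, SpdUp = Fastest / T-Total) are inferred from the reported numbers, and the function names are hypothetical. Reading "slowdown is more than 5%" as SpdUp < 0.95 is also an assumption.

```python
def slowdown(t_total_ms, pardiso_ms):
    """SlwD: ratio of our total time to the Pardiso time (assumed definition)."""
    return t_total_ms / pardiso_ms


def speedup(fastest_ms, t_total_ms):
    """SpdUp: ratio of the historical fastest time to this run (assumed definition)."""
    return fastest_ms / t_total_ms


def slowdown_color(slwd):
    """Green if faster than Pardiso, red if more than 5 times slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return "plain"


def speedup_color(spdup):
    """Green if more than 5% faster than the historical best, red if more
    than 5% slower (assumed here to mean SpdUp < 0.95)."""
    if spdup > 1.05:
        return "green"
    if spdup < 0.95:
        return "red"
    return "plain"


# Values taken from the first cfd1 row of the results table.
slwd = slowdown(6776.53, 2164.91)
spdup = speedup(7984.73, 6776.53)
print(slwd, slowdown_color(slwd))
print(spdup, speedup_color(spdup))
```

With the first cfd1 row as input, the computed ratios agree with the reported SlwD and SpdUp to the printed precision.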


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = the actual number of NNZ / ((2K+1)N).
3) All times reported are in milliseconds (1E-3 second).
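As an illustration of notes 1) and 2), the sketch below evaluates both formulas for a matrix given by the positions of its non-zeros. It is not the solver's own code: the function name and the input format (a list of unique zero-based (row, column) pairs) are assumptions, K is taken as the largest row half-bandwidth, and returning nuKf = 0 for a purely diagonal matrix (K = 0) is a choice made here to avoid dividing by zero.

```python
def band_metrics(n, entries):
    """Return (K, FRate, nuKf) for an n-by-n matrix.

    entries: unique zero-based (row, col) positions of the non-zeros.
    """
    left = [0] * n   # K_{iLeft}: row half-bandwidth left of the diagonal
    right = [0] * n  # K_{iRight}: row half-bandwidth right of the diagonal
    nnz = 0
    for i, j in entries:
        nnz += 1
        if j < i:
            left[i] = max(left[i], i - j)
        else:
            right[i] = max(right[i], j - i)

    k = max(max(left), max(right))      # overall half-bandwidth K
    frate = nnz / ((2 * k + 1) * n)     # note 2)
    if k == 0:
        nukf = 0.0                      # assumption: diagonal matrix is uniform
    else:
        # note 1): 1/(2KN) * sum_i (2K - K_iLeft - K_iRight)
        nukf = sum(2 * k - left[i] - right[i] for i in range(n)) / (2 * k * n)
    return k, frate, nukf


# 4-by-4 tridiagonal matrix: 10 non-zeros, K = 1.
tridiag = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
print(band_metrics(4, tridiag))
```

For the tridiagonal example only the first and last rows fall short of the full band, so nuKf is small but not zero.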


| Name | N | NNZ | SPD | DB | K-DB | K-noDrop | K | FRate | nuKf | Solves | Bstng | SolAcc | T-DB | T-CM | T-Drop | T-Dtransf | T-Asmbl | LU-M | Fill-in | NPrtns | T-BC | T-LU | GFlps-LU | T-SPK | T-LUrdcd | T-PreP | Kry-M | nItrs | T-Kry | T-Total | Pardiso | SlwD | Fastest | SpdUp |
| apache2 | 715176 | 4817870 | 0 | 1 | 65837 | 3370 | 3370 | 0.0953937 | 0.986985 | OK | 1 | 0.000411902 | 267.516 | 815.902 | 0 | 83.621 | 9942.98 | LU | -1 | 50 | 0 | 809.681 | 2.82434 | 0 | 0 | 11982 | P-B2(SI) | 500.25 | 59725.4 | 71707.4 | 5070 | 14.1435 | | |
| apache2 | 715176 | 4817870 | 0 | 1 | 65837 | 3370 | 3370 | 0.0953937 | 0.986985 | OK | 1 | 0.000411902 | 194.535 | 591.517 | 0 | 52.71 | 4603.69 | LU | -1 | 50 | 0 | 505.601 | 4.52296 | 0 | 0 | 6245.67 | P-B2(SI) | 500.25 | 56825.6 | 63071.3 | 5070 | 12.4401 | | |
| bundle1 | 10581 | 770901 | 0 | 1 | 10461 | 10461 | 5 | 0.79373 | 0.789358 | OK | 1 | 9.57334e-12 | 39.6948 | 4.13418 | 25.908 | 16.112 | 1.087 | LU | -1 | 16 | 0 | 1.75978 | 0.0117435 | 0 | 0 | 91.402 | P-B2(SI) | 17.75 | 101.949 | 193.351 | | | | |
| bundle1 | 10581 | 770901 | 0 | 1 | 10461 | 10461 | 5 | 0.79373 | 0.789358 | OK | 1 | 9.57334e-12 | 36.1987 | 4.51731 | 24.726 | 14.741 | 1.159 | LU | -1 | 16 | 0 | 1.83011 | 0.0112922 | 0 | 0 | 266.385 | P-B2(SI) | 17.75 | 115.048 | 381.433 | | | | |
| cfd1 | 70656 | 1828364 | 0 | 1 | 6229 | 2794 | 2794 | 0.960031 | 0.943475 | OK | 1 | 4.5315e-09 | 80.3942 | 168 | 0 | 24.882 | 1419.04 | LU | -1 | 16 | 0 | 327.625 | 7.89226 | 0 | 0 | 2032.16 | P-B2(SI) | 111.25 | 4744.37 | 6776.53 | 2164.91 | 3.13017 | 7984.73 | 1.17829 |
| cfd1 | 70656 | 1828364 | 0 | 1 | 6229 | 2794 | 2794 | 0.95852 | 0.942973 | OK | 1 | 3.37163e-09 | 77.2386 | 164.693 | 0 | 24.537 | 1404.29 | LU | -1 | 16 | 0 | 207.666 | 12.4365 | 0 | 0 | 2111.79 | P-B2(SI) | 111.75 | 4163.8 | 6275.59 | 2164.91 | 2.89878 | 7984.73 | 1.27235 |
| ckt11752_tr_0 | 49702 | 333029 | 0 | 1 | 49452 | 10258 | 10258 | 0.715493 | 0.869972 | OK | 1 | 0.000364038 | 21.2106 | 34.8614 | 0 | 6.636 | 3529.36 | LU | -1 | 6 | 0 | 24270 | 15.7502 | 0 | 0 | 27880.9 | P-B2(SI) | 177.25 | 153435 | 181316 | 314.685 | 576.182 | 34371.6 | 0.189568 |
| ckt11752_tr_0 | 49702 | 333029 | 0 | 1 | 49452 | 10258 | 10258 | 0.715493 | 0.869972 | OK | 1 | 0.00018385 | 20.5581 | 33.5809 | 0 | 6.708 | 4299.48 | LU | -1 | 6 | 0 | 23530.3 | 16.2453 | 0 | 0 | 28174.4 | P-B2(SI) | 162.25 | 142378 | 170552 | 314.685 | 541.978 | 34371.6 | 0.201531 |
| FEM_3D_thermal1 | 17880 | 430740 | 0 | 1 | 13787 | 13787 | 12 | 0.909423 | 0.286582 | OK | 1 | 2.33165e-11 | 17.1634 | 5.49958 | 8.623 | 4.808 | 4.224 | LU | -1 | 16 | 0 | 3.07642 | 0.468737 | 0 | 0 | 45.764 | P-B2(SI) | 12.75 | 97.635 | 143.399 | | | | |
| FEM_3D_thermal1 | 17880 | 430740 | 0 | 1 | 13787 | 13787 | 12 | 0.909423 | 0.286582 | OK | 1 | 2.33165e-11 | 15.3543 | 4.79473 | 7.935 | 4.675 | 3.638 | LU | -1 | 16 | 0 | 3.35834 | 0.429388 | 0 | 0 | 217.428 | P-B2(SI) | 12.75 | 105.583 | 323.011 | | | | |
| Ga3As3H12 | 61349 | 5970947 | 0 | 1 | 20893 | 12502 | 12502 | 0.927686 | 0.848733 | OK | 1 | 6.93539e-07 | 234.265 | 668.461 | 0 | 92.374 | 3468.28 | LU | -1 | 8 | 0 | 15269.1 | 17.6091 | 0 | 0 | 19793.4 | P-B2(SI) | 500.25 | 362363 | 382156 | 82862.7 | 4.61192 | 925062 | 2.42064 |
| Ga3As3H12 | 61349 | 5970947 | 0 | 1 | 20893 | 12502 | 12502 | 0.927686 | 0.848733 | OK | 1 | 6.93539e-07 | 283.785 | 835.6 | 0 | 112.568 | 6606.38 | LU | -1 | 8 | 0 | 10798.4 | 24.8997 | 0 | 0 | 18949.6 | P-B2(SI) | 500.25 | 357589 | 376539 | 82862.7 | 4.54413 | 925062 | 2.45675 |
| GaAsH6 | 61349 | 3381809 | 0 | 1 | 20005 | 10783 | 10783 | 0.938823 | 0.880469 | OK | 1 | 3.48971e-08 | 98.3278 | 308.681 | 0 | 31.561 | 3358.97 | LU | -1 | 16 | 0 | 7252.56 | 16.4277 | 0 | 0 | 11077.5 | P-B2(SI) | 194.75 | 81425.2 | 92502.7 | 81031.2 | 1.14157 | 128099 | 1.38481 |
| GaAsH6 | 61349 | 3381809 | 0 | 1 | 20005 | 10783 | 10783 | 0.938823 | 0.880469 | OK | 1 | 3.48971e-08 | 164.796 | 511.617 | 0 | 56.205 | 6427.4 | LU | -1 | 16 | 0 | 5656.78 | 21.062 | 0 | 0 | 13091.3 | P-B2(SI) | 194.75 | 81516.2 | 94607.5 | 81031.2 | 1.16754 | 128099 | 1.354 |
| lung2 | 109460 | 492564 | 0 | 1 | 107141 | 107141 | 10 | 0.434854 | 0.964614 | NConv | 1 | 0.0572659 | 93.4618 | 9.03224 | 18.671 | 6.428 | 6.877 | LU | -1 | 16 | 0 | 17.2548 | 0.00311102 | 0 | 0 | 155.151 | P-B2(SI) | 4.75 | 175.652 | 330.803 | 273.206 | N/A | 313.407 | N/A |
| lung2 | 109460 | 492564 | 0 | 1 | 107141 | 107141 | 10 | 0.434854 | 0.964614 | NConv | 1 | 0.0572659 | 92.1925 | 8.18552 | 23.501 | 7.155 | 6.472 | LU | -1 | 16 | 0 | 17.3371 | 0.00309625 | 0 | 0 | 369.815 | P-B2(SI) | 4.75 | 187.41 | 557.225 | 273.206 | N/A | 313.407 | N/A |
| shipsec1 | 140874 | 7813404 | 0 | 1 | 5237 | 5237 | 5237 | 0.962201 | 0.96387 | OK | 1 | 0.00346863 | 427.235 | 371.211 | 0 | 141.394 | 5849.95 | LU | -1 | 16 | 0 | 1146.31 | 5.72086 | 0 | 0 | 7962.31 | P-B2(SI) | 353.25 | 30858.5 | 38820.8 | 730 | 53.1792 | 37981.1 | 0.978369 |
| shipsec1 | 140874 | 7813404 | 0 | 1 | 5237 | 5237 | 5237 | 0.962469 | 0.964068 | OK | 1 | 0.00300758 | 300.687 | 242.42 | 0 | 81.392 | 2960.73 | LU | -1 | 16 | 0 | 745.341 | 8.6757 | 0 | 0 | 4562.06 | P-B2(SI) | 365.25 | 28663.6 | 33225.6 | 730 | 45.5146 | 37981.1 | 1.14313 |
| shipsec5 | 179860 | 10113096 | 0 | 1 | 3923 | 3923 | 3923 | 0.936794 | 0.928082 | NConv | 1 | 0.0453804 | 466.747 | 367.814 | 0 | 120.517 | 1773.4 | LU | -1 | 16 | 0 | 2514.52 | 6.88303 | 0 | 0 | 5274.77 | P-B2(SI) | 452.75 | 66053.1 | 71327.9 | 950 | N/A | 26676.6 | N/A |
| shipsec5 | 179860 | 10113096 | 0 | 1 | 3923 | 3923 | 3923 | 0.936237 | 0.927508 | NConv | 1 | 0.0191234 | 466.928 | 365.895 | 0 | 116.912 | 1784.78 | LU | -1 | 16 | 0 | 1300.74 | 13.5156 | 0 | 0 | 4287.91 | P-B2(SI) | 388.25 | 43533.8 | 47821.7 | 950 | N/A | 26676.6 | N/A |
| t3dh | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972948 | 0.954687 | NConv | 1 | 0.0491396 | 183.376 | 455.271 | 0 | 61.133 | 4302.62 | LU | -1 | 16 | 0 | 666.688 | 7.70833 | 0 | 0 | 5689.32 | P-B2(SI) | 157.25 | 8812.15 | 14501.5 | 3690 | N/A | 42393.3 | N/A |
| t3dh | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972713 | 0.954752 | NConv | 1 | 0.0318838 | 137.435 | 301.068 | 0 | 41.453 | 2708.28 | LU | -1 | 16 | 0 | 447.064 | 11.472 | 0 | 0 | 3871.97 | P-B2(SI) | 153.25 | 8689.88 | 12561.8 | 3690 | N/A | 42393.3 | N/A |
| t3dh_a | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972768 | 0.954527 | NConv | 1 | 0.0466428 | 189.053 | 388.658 | 0 | 63.235 | 2774.37 | LU | -1 | 16 | 0 | 660.764 | 7.82852 | 0 | 0 | 4106.83 | P-B2(SI) | 178.25 | 9904.66 | 14011.5 | 6694.36 | N/A | 40472.3 | N/A |
| t3dh_a | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972768 | 0.954527 | NConv | 1 | 0.0466428 | 168.2 | 365.284 | 0 | 68.464 | 4222.99 | LU | -1 | 16 | 0 | 446.114 | 11.5952 | 0 | 0 | 5582.72 | P-B2(SI) | 178.25 | 10087.1 | 15669.8 | 6694.36 | N/A | 40472.3 | N/A |
| thermal2 | 1228045 | 8580313 | 0 | 1 | 1226000 | 1849 | 1849 | 0.0169094 | 0.940826 | OK | 1 | 5.62354e-08 | 487.163 | 1226.46 | 0 | 131.492 | 683.142 | LU | -1 | 16 | 0 | 3240.87 | 5.86575 | 0 | 0 | 5835.91 | P-B2(SI) | 205.25 | 117944 | 123780 | 7110 | 17.4093 | 35361.2 | 0.285678 |
| thermal2 | 1228045 | 8580313 | 0 | 1 | 1226000 | 1849 | 1849 | 0.0170191 | 0.941054 | OK | 1 | 2.04167e-07 | 373.681 | 1249.76 | 0 | 88.628 | 690.601 | LU | -1 | 16 | 0 | 2102.17 | 8.99558 | 0 | 0 | 4818.57 | P-B2(SI) | 205.25 | 106429 | 111248 | 7110 | 15.6467 | 35361.2 | 0.31786 |