4-24-2016 Double Results New Nightly Test



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed. Values: 0 or 1.
• K-DB: the half-bandwidth after DB reordering method (without any drop-off). If DB is specified not to be executed, then this reports the original half-bandwidth
• K-noDrop: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates whether the K changes a lot from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether we managed to solve the problem or not. OK means solved fine, otherwise a reason is provided for failure
• Bstng: indicates whether we enable diagonal boosting when doing factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off off-diagonal elements to decrease bandwidth. Done on the CPU.
• T-Dtransf: data transfer from CPU to GPU
• T-Asmbl: after reordering and drop-off, copy the sparse matrix to banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solve stage (can be BiCGStab2 (0), BiCGStab (1), or CG (2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• T-Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso (a value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and shown in red if we run more than 5 times slower.)
• Fastest: the time when SaP runs fastest historically
• SpdUp: the speedup of this run compared to the historical fastest run (the value is shown in green if the speedup is more than 5% and shown in red if the slowdown is more than 5%)
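As a rough sanity check on the last few columns, the sketch below recomputes T-Total, SlwD and SpdUp for the first cfd1 row of the table. The relations SlwD = T-Total / Pardiso and SpdUp = Fastest / T-Total are assumptions inferred from the reported values, not taken from the solver source, and the function names are hypothetical.

```python
# Minimal sketch: recompute the derived columns from the timing columns.
# The ratio definitions are assumptions inferred from the reported numbers.

def total_time(t_prep_ms, t_kry_ms):
    """T-Total as the sum of T-PreP and T-Kry (milliseconds)."""
    return t_prep_ms + t_kry_ms

def slowdown(t_total_ms, pardiso_ms):
    """SlwD: values below 1 mean the run was faster than Pardiso."""
    return t_total_ms / pardiso_ms

def speedup(fastest_ms, t_total_ms):
    """SpdUp: assumed to be Fastest divided by this run's T-Total."""
    return fastest_ms / t_total_ms

# Values copied from the first cfd1 row.
t_total = total_time(1714.39, 4377.24)
slwd = slowdown(t_total, 2164.91)
spdup = speedup(7984.73, t_total)
print(f"T-Total={t_total:.2f}  SlwD={slwd:.4f}  SpdUp={spdup:.5f}")
```

The results agree with the cfd1 row to the printed precision, which is what suggests these definitions; other rows can be checked the same way.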


NOTES:
1) nuKf = 1/(2KN)*[sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the row half-bandwidth to the left of the diagonal while K_{iRight} is the row half-bandwidth to the right of the diagonal.
2) FRate = the actual number of NNZ / ((2K+1)N).
3) All times reported are in milliseconds (1E-3 second)
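The two formulas in notes 1 and 2 can be illustrated on a small matrix. This is a minimal sketch, not the solver's own code: it assumes K is the largest row half-bandwidth, that the matrix passed in is non-empty and square, and that the "actual number of NNZ" is counted on that same matrix (presumably the one left after reordering and drop-off). The function name `band_stats` is hypothetical.

```python
# Minimal sketch of the nuKf and FRate formulas from the NOTES.
# Assumptions: K is the maximum row half-bandwidth, and NNZ is counted
# on the matrix passed in (a small dense list of lists for illustration).

def band_stats(matrix):
    n = len(matrix)
    nnz = 0
    half_bw = []  # (K_iLeft, K_iRight) for every row i
    for i, row in enumerate(matrix):
        cols = [j for j, v in enumerate(row) if v != 0]
        nnz += len(cols)
        left = max(i - cols[0], 0) if cols else 0
        right = max(cols[-1] - i, 0) if cols else 0
        half_bw.append((left, right))
    k = max(max(l, r) for l, r in half_bw)
    if k == 0:
        nukf = 0.0  # diagonal matrix: treat the bandwidth as uniform
    else:
        nukf = sum(2 * k - l - r for l, r in half_bw) / (2 * k * n)
    frate = nnz / ((2 * k + 1) * n)
    return k, nukf, frate

# 4x4 tridiagonal example.
tri = [
    [2, 1, 0, 0],
    [1, 2, 1, 0],
    [0, 1, 2, 1],
    [0, 0, 1, 2],
]
k, nukf, frate = band_stats(tri)
print(k, nukf, frate)
```

In this example the first and last rows have only one off-diagonal neighbour, so nuKf comes out as 0.25 rather than 0, and FRate is 10/12 because two of the twelve band slots fall outside the matrix.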


| Name | N | NNZ | SPD | DB | K-DB | K-noDrop | K | FRate | nuKf | Solves | Bstng | SolAcc | T-DB | T-CM | T-Drop | T-Dtransf | T-Asmbl | LU-M | Fill-in | NPrtns | T-BC | T-LU | GFlps-LU | T-SPK | T-LUrdcd | T-PreP | Kry-M | nItrs | T-Kry | T-Total | Pardiso | SlwD | Fastest | SpdUp |
| apache2 | 715176 | 4817870 | 0 | 1 | 65837 | 3370 | 3370 | 0.0954237 | 0.986985 | OK | 1 | 0.000269876 | 203.186 | 574.459 | 0 | 50.522 | 8922.64 | LU | -1 | 50 | 0 | 798.771 | 2.86308 | 0 | 0 | 10596.2 | P-B2(SI) | 500.25 | 59365.5 | 69961.7 | 5070 | 13.7992 | | |
| apache2 | 715176 | 4817870 | 0 | 1 | 65837 | 3370 | 3370 | 0.0953938 | 0.986985 | OK | 1 | 0.000199675 | 203.271 | 585.703 | 0 | 52.326 | 7353.32 | LU | -1 | 50 | 0 | 511.591 | 4.47015 | 0 | 0 | 8989.06 | P-B2(SI) | 500.25 | 57236.1 | 66225.1 | 5070 | 13.0622 | | |
| bundle1 | 10581 | 770901 | 0 | 1 | 10461 | 10461 | 5 | 0.79373 | 0.789358 | OK | 1 | 9.57334e-12 | 30.745 | 3.90498 | 16.307 | 12.81 | 1.112 | LU | -1 | 16 | 0 | 1.72451 | 0.0119837 | 0 | 0 | 68.785 | P-B2(SI) | 17.75 | 101.637 | 170.422 | | | | |
| bundle1 | 10581 | 770901 | 0 | 1 | 10461 | 10461 | 5 | 0.79373 | 0.789358 | OK | 1 | 9.57334e-12 | 29.5982 | 3.91782 | 18.514 | 13.206 | 1.128 | LU | -1 | 16 | 0 | 1.76038 | 0.0117395 | 0 | 0 | 221.46 | P-B2(SI) | 17.75 | 111.13 | 332.59 | | | | |
| cfd1 | 70656 | 1828364 | 0 | 1 | 6229 | 2794 | 2794 | 0.959956 | 0.944087 | OK | 1 | 2.55487e-09 | 63.9812 | 147.588 | 0 | 16.911 | 1150.43 | LU | -1 | 16 | 0 | 325.001 | 7.66358 | 0 | 0 | 1714.39 | P-B2(SI) | 105.25 | 4377.24 | 6091.63 | 2164.91 | 2.8138 | 7984.73 | 1.31077 |
| cfd1 | 70656 | 1828364 | 0 | 1 | 6229 | 2794 | 2794 | 0.957924 | 0.942428 | OK | 1 | 1.15937e-08 | 63.4309 | 140.849 | 0 | 16.908 | 1153.62 | LU | -1 | 16 | 0 | 204.494 | 13.1668 | 0 | 0 | 1761.18 | P-B2(SI) | 103.75 | 3849.63 | 5610.81 | 2164.91 | 2.59171 | 7984.73 | 1.4231 |
| ckt11752_tr_0 | 49702 | 333029 | 0 | 1 | 49452 | 10258 | 10258 | 0.715493 | 0.869972 | OK | 1 | 0.000364038 | 19.7369 | 29.5091 | 0 | 5.867 | 2575.58 | LU | -1 | 6 | 0 | 24632.6 | 15.5183 | 0 | 0 | 27282.3 | P-B2(SI) | 177.25 | 155186 | 182468 | 314.685 | 579.844 | 34371.6 | 0.18837 |
| ckt11752_tr_0 | 49702 | 333029 | 0 | 1 | 49452 | 10258 | 10258 | 0.715493 | 0.869972 | OK | 1 | 0.00018385 | 19.4008 | 28.3642 | 0 | 4.913 | 4127.95 | LU | -1 | 6 | 0 | 23176.8 | 16.493 | 0 | 0 | 27603 | P-B2(SI) | 162.25 | 140618 | 168221 | 314.685 | 534.57 | 34371.6 | 0.204324 |
| FEM_3D_thermal1 | 17880 | 430740 | 0 | 1 | 13787 | 13787 | 12 | 0.909423 | 0.286582 | OK | 1 | 2.33165e-11 | 17.9457 | 5.4933 | 8.957 | 4.863 | 4.117 | LU | -1 | 16 | 0 | 3.0497 | 0.472844 | 0 | 0 | 47.195 | P-B2(SI) | 12.75 | 97.421 | 144.616 | | | | |
| FEM_3D_thermal1 | 17880 | 430740 | 0 | 1 | 13787 | 13787 | 12 | 0.909423 | 0.286582 | OK | 1 | 2.33165e-11 | 16.9371 | 4.88791 | 9.04 | 4.164 | 4.079 | LU | -1 | 16 | 0 | 3.2912 | 0.438147 | 0 | 0 | 198.564 | P-B2(SI) | 12.75 | 104.211 | 302.775 | | | | |
| Ga3As3H12 | 61349 | 5970947 | 0 | 1 | 20893 | 12502 | 12502 | 0.927686 | 0.848733 | OK | 1 | 6.93539e-07 | 215.261 | 639.351 | 0 | 71.122 | 7609.07 | LU | -1 | 8 | 0 | 15054.3 | 17.8604 | 0 | 0 | 23650.7 | P-B2(SI) | 500.25 | 358194 | 381845 | 82862.7 | 4.60817 | 925062 | 2.42261 |
| Ga3As3H12 | 61349 | 5970947 | 0 | 1 | 20893 | 12502 | 12502 | 0.927686 | 0.848733 | OK | 1 | 6.93539e-07 | 179.739 | 610.482 | 0 | 65.318 | 6200.82 | LU | -1 | 8 | 0 | 10799.2 | 24.8977 | 0 | 0 | 18170.2 | P-B2(SI) | 500.25 | 357181 | 375351 | 82862.7 | 4.52979 | 925062 | 2.46453 |
| GaAsH6 | 61349 | 3381809 | 0 | 1 | 20005 | 10783 | 10783 | 0.943412 | 0.880257 | OK | 1 | 2.219e-08 | 112.63 | 338.248 | 0 | 31.623 | 4054.7 | LU | -1 | 16 | 0 | 6711.51 | 17.7299 | 0 | 0 | 11275.7 | P-B2(SI) | 211.25 | 76192.6 | 87468.3 | 81031.2 | 1.07944 | 128099 | 1.46452 |
| GaAsH6 | 61349 | 3381809 | 0 | 1 | 20005 | 10783 | 10783 | 0.947671 | 0.878835 | OK | 1 | 3.04681e-08 | 103.019 | 302.268 | 0 | 30.587 | 6259.39 | LU | -1 | 16 | 0 | 4920.38 | 24.3735 | 0 | 0 | 11901.7 | P-B2(SI) | 184.75 | 66841.5 | 78743.1 | 81031.2 | 0.971763 | 128099 | 1.6268 |
| lung2 | 109460 | 492564 | 0 | 1 | 107141 | 107141 | 10 | 0.434854 | 0.964614 | NConv | 1 | 0.0572659 | 93.4673 | 9.05367 | 16.84 | 6.858 | 6.769 | LU | -1 | 16 | 0 | 17.48 | 0.00307094 | 0 | 0 | 153.776 | P-B2(SI) | 4.75 | 177.023 | 330.799 | 273.206 | N/A | 313.407 | N/A |
| lung2 | 109460 | 492564 | 0 | 1 | 107141 | 107141 | 10 | 0.434854 | 0.964614 | NConv | 1 | 0.0572659 | 93.0667 | 9.05934 | 16.747 | 7.629 | 6.738 | LU | -1 | 16 | 0 | 17.2021 | 0.00312054 | 0 | 0 | 337.853 | P-B2(SI) | 4.75 | 186.593 | 524.446 | 273.206 | N/A | 313.407 | N/A |
| shipsec1 | 140874 | 7813404 | 0 | 1 | 5237 | 5237 | 5237 | 0.963837 | 0.964119 | OK | 1 | 0.00333151 | 359.469 | 267.971 | 0 | 90.595 | 3075.34 | LU | -1 | 16 | 0 | 1113.16 | 5.74044 | 0 | 0 | 4927.52 | P-B2(SI) | 360.25 | 31685.7 | 36613.2 | 730 | 50.1551 | 37981.1 | 1.03736 |
| shipsec1 | 140874 | 7813404 | 0 | 1 | 5237 | 5237 | 5237 | 0.962201 | 0.96387 | OK | 1 | 0.00346863 | 359.572 | 262.43 | 0 | 90.403 | 3081.06 | LU | -1 | 16 | 0 | 745.56 | 8.79589 | 0 | 0 | 4742.59 | P-B2(SI) | 353.25 | 27621 | 32363.6 | 730 | 44.3337 | 37981.1 | 1.17358 |
| shipsec5 | 179860 | 10113096 | 0 | 1 | 3923 | 3923 | 3923 | 0.939427 | 0.930833 | OK | 1 | 0.00267622 | 488.608 | 391.868 | 0 | 119.904 | 3095.84 | LU | -1 | 16 | 0 | 2281.16 | 7.02601 | 0 | 0 | 6411.86 | P-B2(SI) | 388.25 | 55275.3 | 61687.2 | 950 | 64.9339 | 26676.6 | 0.43245 |
| shipsec5 | 179860 | 10113096 | 0 | 1 | 3923 | 3923 | 3923 | 0.936794 | 0.928082 | NConv | 1 | 0.0528061 | 422.4 | 377.68 | 0 | 108.682 | 2923.91 | LU | -1 | 16 | 0 | 1301.73 | 13.2958 | 0 | 0 | 5431.35 | P-B2(SI) | 484.25 | 54189.1 | 59620.5 | 950 | N/A | 26676.6 | N/A |
| t3dh | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972948 | 0.954687 | NConv | 1 | 0.0491396 | 151.798 | 329.011 | 0 | 41.054 | 3925.86 | LU | -1 | 16 | 0 | 667.055 | 7.70409 | 0 | 0 | 5142.53 | P-B2(SI) | 157.25 | 8734.23 | 13876.8 | 3690 | N/A | 42393.3 | N/A |
| t3dh | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.972948 | 0.954687 | NConv | 1 | 0.0491396 | 149.274 | 324.284 | 0 | 43.998 | 2734.23 | LU | -1 | 16 | 0 | 442.482 | 11.6141 | 0 | 0 | 3900.81 | P-B2(SI) | 157.25 | 8794.5 | 12695.3 | 3690 | N/A | 42393.3 | N/A |
| t3dh_a | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.974494 | 0.955539 | OK | 1 | 0.0054874 | 149.836 | 321.641 | 0 | 43.179 | 2736.32 | LU | -1 | 16 | 0 | 582.175 | 8.46606 | 0 | 0 | 3851.57 | P-B2(SI) | 150.25 | 7687.25 | 11538.8 | 6694.36 | 1.72366 | 40472.3 | 3.50749 |
| t3dh_a | 79171 | 4352105 | 0 | 1 | 5854 | 5088 | 5088 | 0.974332 | 0.954757 | NConv | 1 | 0.0138453 | 146.756 | 318.96 | 0 | 43.827 | 2725.46 | LU | -1 | 16 | 0 | 368.007 | 13.7672 | 0 | 0 | 3807.11 | P-B2(SI) | 157.75 | 8192.11 | 11999.2 | 6694.36 | N/A | 40472.3 | N/A |
| thermal2 | 1228045 | 8580313 | 0 | 1 | 1226000 | 1849 | 1849 | 0.0169628 | 0.941314 | OK | 1 | 7.10293e-08 | 464.032 | 1289.45 | 0 | 104.512 | 758.318 | LU | -1 | 16 | 0 | 3235.48 | 5.69012 | 0 | 0 | 5967.57 | P-B2(SI) | 215.75 | 123432 | 129399 | 7110 | 18.1996 | 35361.2 | 0.273272 |
| thermal2 | 1228045 | 8580313 | 0 | 1 | 1226000 | 1849 | 1849 | 0.0132705 | 0.939768 | OK | 1 | 1.14557e-07 | 392.214 | 1266.51 | 0 | 88.952 | 766.665 | LU | -1 | 16 | 0 | 2297.81 | 8.56669 | 0 | 0 | 5171.02 | P-B2(SI) | 206.25 | 105122 | 110293 | 7110 | 15.5124 | 35361.2 | 0.320611 |