2-18-2016 Double Results BCR Test



• Name: the name of the matrix (test)
• N: the dimension of the matrix (number of rows and columns)
• NNZ: number of non-zeros
• SPD: whether the matrix is specified by the user to be symmetric positive definite (values: 0 or 1)
• DB: indicates whether DB reordering is performed. Values: 0 or 1.
• K-DB: the half-bandwidth after DB reordering (without any drop-off). If DB reordering is disabled, this reports the original half-bandwidth
• K-noDrop: the half-bandwidth after DB and CM reordering but before drop-off
• K: the half-bandwidth after reordering and drop-off
• FRate: fill-in rate. See NOTES below
• nuKf: non-uniform K factor. Indicates how much the row bandwidth varies from row to row. Values are between 0 and 1, with 0 indicating a perfectly uniform bandwidth over the entire matrix. See NOTES below
• Solves: indicates whether the problem was solved. OK means it was solved; otherwise the reason for the failure is reported (for example, NConv)
• Bstng: indicates whether diagonal boosting is enabled during factorization. Values: 0 or 1
• SolAcc: infinity norm of the array storing the relative errors
• T-DB: time to run DB reordering for the matrix on the CPU
• T-CM: time to run CM reordering for the matrix on the CPU
• T-Drop: time to drop off-diagonal elements in order to reduce the bandwidth. Done on the CPU.
• T-Dtransf: time for the data transfer from CPU to GPU
• T-Asmbl: time to copy the sparse matrix, after reordering and drop-off, into a banded matrix stored in GPU memory
• LU-M: LU method (complete, ILUT or ILUULT)
• Fill-in: the fill-in factor of ILUT (-1 indicates complete LU)
• NPrtns: the number of partitions used to solve the problem
• T-BC: time required to get off-diagonal right hand sides (Bs and Cs) from the banded matrix - done on the GPU
• T-LU: LU time
• GFlps-LU: LU GFLOPs
• T-SPK: time to solve for the spikes Vs and Ws - done on the GPU
• T-LUrdcd: time required to factorize the reduced matrices - done on the GPU
• T-PreP: the sum of all preprocessing times, see NOTES
• Kry-M: the method used in the Krylov solve stage (BiCGStab2 (0), BiCGStab (1), or CG (2))
• nItrs: the number of Krylov-solve iterations to solve the problem
• T-Kry: time spent in the Krylov solver (on the GPU)
• T-Total: total time to solve the problem, the sum of T-PreP and T-Kry
• Pardiso: the time for the commercial tool "Pardiso" to solve the problem
• SlwD: the slowdown ratio of our solver compared to Pardiso. A value less than one means that we are faster than Pardiso. The value is shown in green if we run faster and in red if we run more than 5 times slower.
• Fastest: the fastest time SaP has achieved historically
• SpdUp: the speedup of this run compared to the historical fastest run. The value is shown in green if the speedup is more than 5% and in red if the slowdown is more than 5%.
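
The derived timing columns described above can be sketched as follows. This is only an illustration, not the report generator's code: the function names are hypothetical, the ratios SlwD = T-Total / Pardiso and SpdUp = Fastest / T-Total are assumptions consistent with the legend's wording, and the 5% red threshold for SpdUp is read here as a ratio below 0.95.

```python
def total_time(t_prep, t_kry):
    """T-Total: preprocessing time plus Krylov solve time (milliseconds)."""
    return t_prep + t_kry


def slowdown(t_total, t_pardiso):
    """SlwD, assumed to be T-Total / Pardiso; below 1 means SaP is faster."""
    return t_total / t_pardiso


def slowdown_color(slwd):
    """Green if faster than Pardiso, red if more than 5 times slower."""
    if slwd < 1.0:
        return "green"
    if slwd > 5.0:
        return "red"
    return None  # no highlighting


def speedup(t_total, t_fastest):
    """SpdUp, assumed to be Fastest / T-Total; above 1 means a new best."""
    return t_fastest / t_total


def speedup_color(spdup, tol=0.05):
    """Green above a 5% speedup; red below a 5% slowdown (assumed as < 0.95)."""
    if spdup > 1.0 + tol:
        return "green"
    if spdup < 1.0 - tol:
        return "red"
    return None  # within the 5% tolerance


if __name__ == "__main__":
    # Hypothetical timings in milliseconds, not taken from the table.
    t = total_time(40.0, 160.0)
    print(t, slowdown(t, 100.0), slowdown_color(slowdown(t, 100.0)))
```

As a sanity check on the T-Total definition, the first ANCF31770 row lists 36.612, 1765.17 and 1801.78, which appears consistent with T-PreP + T-Kry.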


NOTES:
1) nuKf = 1/(2KN) * [sum over i from 1 to N of (2K - K_{iLeft} - K_{iRight})], where K_{iLeft} is the half-bandwidth of row i to the left of the diagonal and K_{iRight} is the half-bandwidth of row i to the right of the diagonal.
2) FRate = NNZ / ((2K+1)N), that is, the actual number of non-zeros divided by the number of entries in a full band of half-bandwidth K.
3) All times reported are in milliseconds (1E-3 second)
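
The two formulas in notes 1) and 2) can be checked on a small matrix with the sketch below. It is a minimal illustration under stated assumptions, not the solver's implementation: the matrix is given as a list of (row, col) positions of its non-zeros, K is taken to be the largest row half-bandwidth on either side, and all function names are hypothetical.

```python
def row_half_bandwidths(n, entries):
    """Return per-row left and right half-bandwidths (K_iLeft, K_iRight)."""
    left = [0] * n
    right = [0] * n
    for i, j in entries:
        if j < i:
            left[i] = max(left[i], i - j)
        elif j > i:
            right[i] = max(right[i], j - i)
    return left, right


def half_bandwidth(n, entries):
    """K, assumed here to be the largest row half-bandwidth on either side."""
    left, right = row_half_bandwidths(n, entries)
    return max(max(left), max(right))


def nukf(n, entries):
    """nuKf = 1/(2KN) * sum_i (2K - K_iLeft - K_iRight)."""
    left, right = row_half_bandwidths(n, entries)
    k = max(max(left), max(right))
    if k == 0:
        return 0.0  # diagonal matrix: treated as uniform (assumption)
    return sum(2 * k - l - r for l, r in zip(left, right)) / (2 * k * n)


def frate(n, entries):
    """FRate = NNZ / ((2K + 1) N)."""
    k = half_bandwidth(n, entries)
    nnz = len(set(entries))
    return nnz / ((2 * k + 1) * n)


if __name__ == "__main__":
    # 4x4 tridiagonal matrix: 10 non-zeros, K = 1.
    tri = [(i, j) for i in range(4) for j in range(4) if abs(i - j) <= 1]
    print(half_bandwidth(4, tri), nukf(4, tri), frate(4, tri))
```

For the 4x4 tridiagonal example, K = 1 and the first and last rows each have only one off-diagonal neighbour, so the formula gives nuKf = 2/8 = 0.25 and FRate = 10/12.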


Name

N

NNZ

SPD

DB

K-DB

K-noDrop

K

FRate

nuKf

Solves

Bstng

SolAcc

T-DB

T-CM

T-Drop

T-Dtransf

T-Asmbl

LU-M

Fill-in

NPrtns

T-BC

T-LU

GFlps-LU

T-SPK

T-LUrdcd

T-PreP

Kry-M

nItrs

T-Kry

T-Total

Pardiso

SlwD

Fastest

SpdUp

ANCF31770

31770

183540

248

NConv

561868

1

1

0

0

0

36.612

P-B2(SI)

1000.25

0

0

1765.17

1.76473

1801.78

ANCF31770

31770

183540

248

OK

0.00260996

1

0

180.207

248.738

P-B2(SI)

1.75

437.199

249.828

685.937

ANCF31770

31770

183540

248

NConv

0.0133456

2

1

59.0868

73.1248

68.0474

259.071

P-B2(SI)

17.25

205.58

144.897

395.836

22.947

654.907

ANCF31770

31770

183540

248

NConv

0.0254185

4

0

59.1043

545.108

P-B2(SI)

1.25

212.047

169.638

757.155

ANCF31770

31770

183540

248

NConv

0.021842

6

0

48.0152

490.83

P-B2(SI)

2.25

182.604

81.1573

673.434

ANCF31770

31770

183540

248

NConv

0.0230882

8

0

33.8179

834.184

P-B2(SI)

2.75

167.101

60.764

1001.28

ANCF31770

31770

183540

248

NConv

0.0159593

10

0

31.9936

405.561

P-B2(SI)

3.25

179.108

55.1102

584.669

ANCF31770

31770

183540

248

NConv

0.0180965

16

0

21.2707

504.718

P-B2(SI)

5.75

182.606

31.7576

687.324

ANCF31770

31770

183540

248

NConv

0.0274011

20

0

18.8192

432.362

P-B2(SI)

8.25

219.279

26.5793

651.641

ANCF31770

31770

183540

248

NConv

1.7818

4

0

66.0722

340.573

P-B2(SI)

63.75

3014.87

47.292

3355.44

ANCF31770

31770

183540

248

NConv

0.0448152

6

0

48.0781

320.714

P-B2(SI)

76.25

2466.98

32.3538

2787.69

ANCF31770

31770

183540

248

NConv

0.177938

8

0

34.4942

310.356

P-B2(SI)

96.75

2408.24

24.8914

2718.6

ANCF31770

31770

183540

248

NConv

1.30931

10

0

32.9259

308.52

P-B2(SI)

70.5

1464.57

20.7741

1773.09

ANCF31770

31770

183540

248

NConv

0.208222

16

0

22.221

298.826

P-B2(SI)

70.75

986.59

13.9447

1285.42

ANCF31770

31770

183540

248

NConv

0.0868336

20

0

19.3174

297.066

P-B2(SI)

73.25

872.17

11.9068

1169.24

ANCF88950

88950

513900

410

NConv

34817.3

1

1

0

0

0

104.991

P-B2(SI)

1000.25

0

0

2043.4

2.04289

2148.39

ANCF88950

88950

513900

410

OK

0.00267795

1

0

717.814

823.195

P-B2(SI)

1.75

1323.84

756.482

2147.04

ANCF88950

88950

513900

410

NConv

0.0109386

2

1

332.24

311.064

306.284

1136.3

P-B2(SI)

19.75

507.136

530.233

1096.7

55.5293

2233.01

ANCF88950

88950

513900

410

NConv

0.0374571

4

0

208.394

2377.26

P-B2(SI)

2.25

713.902

317.29

3091.16

ANCF88950

88950

513900

410

NConv

0.0181459

6

0

133.726

1111.14

P-B2(SI)

1.75

395.655

226.089

1506.79

ANCF88950

88950

513900

410

NConv

0.0292233

8

0

104.912

978.394

P-B2(SI)

2.75

436.858

158.857

1415.25

ANCF88950

88950

513900

410

NConv

0.0249971

10

0

103.116

843.068

P-B2(SI)

2.25

303.648

134.955

1146.72

ANCF88950

88950

513900

410

NConv

0.0334421

16

0

60.3864

785.983

P-B2(SI)

3.75

309.214

82.4571

1095.2

ANCF88950

88950

513900

410

NConv

0.0338375

20

0

56.2052

691.198

P-B2(SI)

4.25

291.953

68.6948

983.151

ANCF88950

88950

513900

410

NConv

0.0308134

4

0

207.843

579.413

P-B2(SI)

54.75

7147.27

130.544

7726.68

ANCF88950

88950

513900

410

NConv

0.108424

6

0

135.161

512.172

P-B2(SI)

51.25

4451.92

86.8667

4964.09

ANCF88950

88950

513900

410

NConv

0.101272

8

0

106.226

517.29

P-B2(SI)

58.25

3899.27

66.9402

4416.56

ANCF88950

88950

513900

410

NConv

0.524339

10

0

104.607

478.091

P-B2(SI)

68.5

3685.62

53.8047

4163.71

ANCF88950

88950

513900

410

NConv

95.0523

16

0

61.3747

438.555

P-B2(SI)

60.75

2101.03

34.5849

2539.59

ANCF88950

88950

513900

410

NConv

0.472447

20

0

57.4225

437.007

P-B2(SI)

60.25

1725.68

28.6419

2162.68

NetANCF_40by40

63603

569262

608

NConv

41287.3

1

1

0

0

0

74.681

P-B2(SI)

1000.25

0

0

1820.21

1.81976

1894.89

NetANCF_40by40

63603

569262

608

OK

2.45192e-05

1

0

554.734

630.697

P-B2(SI)

1.25

796.02

636.816

1426.72

NetANCF_40by40

63603

569262

608

OK

0.00225691

2

1

308.282

258.483

277.881

980.937

P-B2(SI)

62.75

1520.97

1385.19

3063.99

48.8285

4044.93

NetANCF_40by40

63603

569262

608

OK

0.000127663

4

0

193.544

1288.33

P-B2(SI)

1.75

445.639

254.651

1733.97

NetANCF_40by40

63603

569262

608

OK

0.000154191

6

0

110.514

2423.38

P-B2(SI)

1.75

299.217

170.981

2722.6

NetANCF_40by40

63603

569262

608

OK

0.000138704

8

0

88.364

826.246

P-B2(SI)

3.25

386.946

119.06

1213.19

NetANCF_40by40

63603

569262

608

OK

0.000155362

10

0

68.3678

831.643

P-B2(SI)

1.75

194.232

110.99

1025.88

NetANCF_40by40

63603

569262

608

OK

0.000145401

16

0

46.697

690.521

P-B2(SI)

2.25

196.42

87.2978

886.941

NetANCF_40by40

63603

569262

608

OK

0.000184039

20

0

36.6672

716.208

P-B2(SI)

4.25

275.195

64.7518

991.403

NetANCF_40by40

63603

569262

608

OK

0.0018359

4

0

193.733

522.902

P-B2(SI)

92.75

8754.04

94.3831

9276.94

NetANCF_40by40

63603

569262

608

OK

0.0013817

6

0

112.079

445.947

P-B2(SI)

103.5

6488.54

62.6912

6934.49

NetANCF_40by40

63603

569262

608

OK

0.000362897

8

0

89.0952

427.392

P-B2(SI)

90.75

4328.32

47.695

4755.71

NetANCF_40by40

63603

569262

608

OK

0.00128756

10

0

69.0402

412.041

P-B2(SI)

327.25

12663.9

38.6979

13075.9

NetANCF_40by40

63603

569262

608

OK

0.000708558

16

0

47.4747

399.906

P-B2(SI)

121.75

3108.35

25.5306

3508.26

NetANCF_40by40

63603

569262

608

OK

0.00153493

20

0

37.6231

395.85

P-B2(SI)

320.75

6642.54

20.7094

7038.39