AMD Threadripper 1950X @4.0GHz (16 cores/32 threads) compared with Intel Core i9 7960X @4.0GHz and @5.2GHz (16 cores/32 threads)!
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
18-12-2016
Made a bench run with the new Andscacs 0.89b & Andscacs 0.89zb to test whether NUMA works!
Also with the latest asmFish 2016-12-17
and the latest Stockfish dev. 171216 BMI2.
Same as before, just scroll down to the date 18-12-2016 for the new data: https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
21-11-2016
Good news: Stockfish can finally use NUMA, thanks to Marco for fixing this! Stockfish 211116 64 bmi2 Numa
Also tested the latest Cfish 211116 x64 bmi2 numa & BrainFish 161119 x64 BMI2 numa
Data is in the spreadsheet, scroll down to the date 21-11-2016: https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
02-11-2016
New benches done on a Xeon E5-2699 v4 2x22cores =44cores/88threads
OS: Windows Server 2012 R2
HT OFF
Engines used :
Stockfish 8, Cfish 8, BrainFish 161030, Stockfish 011116 Numa, pedantFish 2016-10-17
Same spreadsheet link, scroll down to the date 02-11-2016
31-10-2016
Same tests done today with HT ON!
With the same engines, plus the Stockfish builds prepared for the TCEC9 Final: I got two compiles from Kiran, one normal and one with NUMA.
Marco had put the sources on Fishcooking, so I compiled these two myself as well, to compare speed.
As before, all data is in the updated spreadsheet, scroll down to the date 31-10-2016
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
Stockfish still has work to do! NUMA doesn't work yet.
asmFish, Cfish & BrainFish are NUMA-aware and can use 88 threads at 100%
Ipman.
30-10-2016
New benches done on a Xeon E5-2699 v4 2x22cores =44cores/88threads
OS: Windows Server 2012 R2
HT OFF
Engines used :
asmFish 2016-10-17 bmi2 Numa
Cfish 301016 bmi2 Numa
BrainFish_161025_x64_bmi2 Numa
Stockfish 16103013 64 BMI2 ,last Dev. version from Abrok.eu
Andscacs 0.88bx bmi2 version with 128cores
All data in spreadsheet: https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
Scroll down to the date: 30-10-2016
02-09-2016
Here are benches from 9 engines!
6 of them are Stockfish versions without CounterMovesHistory (CMH): 3 from Marco that are or were in testing in the test framework,
and 3 versions from mstembera, who also wanted me to check them out.
Plus the 2 latest versions of asmFish & pedantFish, and a Cfish that I compiled from the latest source.
Each engine gets 10 runs, so that's 90 times copy & paste of these 3 command lines :)
I have again put all the data below in the spreadsheet for a nice comparison, just scroll down to the date 01-09-2016 for this new data:
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
Some info is in the spreadsheet.
From Marco's first version I compiled Stockfish 300816 64 BMI2_CMH and let it play for my list, and it's doing very well.
http://www.ipmanchess.yolasite.com/i7-5960x.php
Ipman.
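The copy & paste routine for these runs could in principle be scripted. Here is a minimal sketch, assuming Python 3 and a UCI engine that accepts the three commands without a prior handshake. The function names and `engine_path` are hypothetical and only for illustration; this is not how the runs in this log were made.

```python
import subprocess


def uci_commands(threads, hash_mb=1024, depth=26):
    """Build the three command lines that are pasted for every run."""
    return [
        f"setoption name threads value {threads}",
        f"setoption name hash value {hash_mb}",
        f"go depth {depth}",
    ]


def parse_info(line):
    """Pull the integer fields out of a UCI 'info' line, in any token order."""
    tokens = line.split()
    fields = {}
    for key in ("depth", "nodes", "nps", "time"):
        if key in tokens:
            fields[key] = int(tokens[tokens.index(key) + 1])
    return fields


def run_once(engine_path, threads):
    """Run one search and return the parsed last 'info' line that reports nps.

    engine_path is a hypothetical path to a UCI engine binary.
    """
    proc = subprocess.Popen(
        [engine_path], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
    )
    proc.stdin.write("\n".join(uci_commands(threads)) + "\n")
    proc.stdin.flush()
    last = ""
    for line in proc.stdout:
        if line.startswith("info") and " nps " in line:
            last = line
        elif line.startswith("bestmove"):
            break
    proc.stdin.write("quit\n")
    proc.stdin.flush()
    proc.wait()
    return parse_info(last)
```

`parse_info` looks fields up by name, so it handles both the Stockfish-style lines (nps last) and the asmFish-style lines (time and nps before the score) that appear in the data below.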
Stockfish 300816 64 BMI2_CMH
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 seldepth 36 multipv 1 score cp 25 nodes 214055533 nps 9495011
16cores
info depth 26 seldepth 36 multipv 1 score cp 22 nodes 358385571 nps 17831015
18cores=1cpu
info depth 26 seldepth 32 multipv 1 score cp 19 nodes 360810155 nps 20196482
22cores
info depth 26 seldepth 35 multipv 1 score cp 27 nodes 369745039 nps 24947374
32cores
info depth 26 seldepth 36 multipv 1 score cp 9 nodes 1560728534 nps 34864928
36cores =2cpu's
info depth 26 seldepth 34 multipv 1 score cp 12 nodes 710489804 nps 39462886
44cores
info depth 26 seldepth 41 multipv 1 score cp 10 nodes 1388361035 nps 39583766
54cores=3cpu's
info depth 26 seldepth 37 multipv 1 score cp 14 nodes 1048836479 nps 39761789
64cores
info depth 26 seldepth 40 multipv 1 score cp 12 nodes 1503213792 nps 39707683
72cores=4cpu's
info depth 26 seldepth 35 multipv 1 score cp 24 nodes 865096200 nps 39825807
------------------------------------------------------------------
Stockfish 300816 64 BMI2CMH2
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 seldepth 38 multipv 1 score cp 21 nodes 401806145 nps 9518540
16cores
info depth 26 seldepth 37 multipv 1 score cp 25 nodes 473307550 nps 18439595
18cores=1cpu
info depth 26 seldepth 38 multipv 1 score cp 13 nodes 583243311 nps 20010406
22cores
info depth 26 seldepth 37 multipv 1 score cp 19 nodes 477733138 nps 24681397
32cores
info depth 26 seldepth 33 multipv 1 score cp 13 nodes 603749626 nps 34838408
36cores =2cpu's
info depth 26 seldepth 38 multipv 1 score cp 10 nodes 1331144638 nps 39798625
44cores
info depth 26 seldepth 32 multipv 1 score cp 14 nodes 1177764580 nps 39794721
54cores=3cpu's
info depth 26 seldepth 38 multipv 1 score cp 15 nodes 1126361002 nps 39984416
64cores -> Takes very long time
info depth 26 seldepth 38 multipv 1 score cp 6 nodes 5523698121 nps 40199832
72cores=4cpu's
info depth 26 seldepth 34 multipv 1 score cp 12 nodes 1585321510 nps 40403739
------------------------------------------------------------------
Stockfish 300816 64 BMI2mst
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 seldepth 35 multipv 1 score cp 13 nodes 358909485 nps 9511567
16cores
info depth 26 seldepth 37 multipv 1 score cp 12 nodes 761497853 nps 17981059
18cores=1cpu
info depth 26 seldepth 35 multipv 1 score cp 16 nodes 567991898 nps 19257879
22cores
info depth 26 seldepth 36 multipv 1 score cp 22 nodes 511505916 nps 24389944
32cores
info depth 26 seldepth 34 multipv 1 score cp 9 nodes 1133493016 nps 33885175
36cores =2cpu's
info depth 26 seldepth 39 multipv 1 score cp 12 nodes 916328149 nps 37933770
44cores
info depth 26 seldepth 36 multipv 1 score cp 10 nodes 1004299723 nps 38079158
54cores=3cpu's
info depth 26 seldepth 36 multipv 1 score cp 16 nodes 1078875527 nps 38253927
64cores
info depth 26 seldepth 34 multipv 1 score cp 9 nodes 1595882215 nps 37774148
72cores=4cpu's
info depth 26 seldepth 40 multipv 1 score cp 13 nodes 1628443910 nps 37853182
------------------------------------------------------------------
Stockfish 300816 64 BMI2mst2
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 seldepth 35 multipv 1 score cp 19 nodes 293358077 nps 9181211
16cores
info depth 26 seldepth 35 multipv 1 score cp 8 nodes 925633066 nps 17225887
18cores=1cpu
info depth 26 seldepth 34 multipv 1 score cp 19 nodes 365213352 nps 20088743
22cores
info depth 26 seldepth 37 multipv 1 score cp 17 nodes 729413787 nps 23431217
32cores
info depth 26 seldepth 37 multipv 1 score cp 11 nodes 911256954 nps 34033873
36cores =2cpu's
info depth 26 seldepth 39 multipv 1 score cp 15 nodes 889570054 nps 37915354
44cores
info depth 26 seldepth 36 multipv 1 score cp 22 nodes 710521220 nps 37991723
54cores=3cpu's
info depth 26 seldepth 39 multipv 1 score cp 20 nodes 962220043 nps 38388990
64cores
info depth 26 seldepth 34 multipv 1 score cp 12 nodes 1096920071 nps 38261539
72cores=4cpu's -> very very slow
info depth 26 seldepth 37 multipv 1 score cp 15 nodes 9634407155 nps 38629412
info depth 17 seldepth 25 multipv 1 score cp 12 nodes 6555393053 nps 38540051
------------------------------------------------------------------
Stockfish 020916 64 BMI2mst3
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 seldepth 35 multipv 1 score cp 22 nodes 221049847 nps 9395980
16cores
info depth 26 seldepth 37 multipv 1 score cp 26 nodes 401911504 nps 18269535
18cores=1cpu
info depth 26 seldepth 36 multipv 1 score cp 11 nodes 486618272 nps 20863414
22cores
info depth 26 seldepth 37 multipv 1 score cp 9 nodes 747748105 nps 24085167
32cores
info depth 26 seldepth 39 multipv 1 score cp 15 nodes 696410851 nps 35596547
36cores =2cpu's
info depth 26 seldepth 36 multipv 1 score cp 25 nodes 527866108 nps 39354813
44cores
info depth 26 seldepth 32 multipv 1 score cp 21 nodes 406935399 nps 39298445
54cores=3cpu's
info depth 26 seldepth 38 multipv 1 score cp 8 nodes 1715176705 nps 39772213
64cores
info depth 26 seldepth 38 multipv 1 score cp 16 nodes 833563471 nps 39881511
72cores=4cpu's
info depth 26 seldepth 36 multipv 1 score cp 29 nodes 906649524 nps 40032211
------------------------------------------------------------------
Stockfish 020916 64 BMI2cmh3
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 seldepth 37 multipv 1 score cp 29 nodes 301100912 nps 9403526
16cores
info depth 26 seldepth 37 multipv 1 score cp 28 nodes 323755501 nps 18288171
18cores=1cpu
info depth 26 seldepth 32 multipv 1 score cp 29 nodes 339921221 nps 21013923
22cores
info depth 26 seldepth 34 multipv 1 score cp 22 nodes 452383861 nps 24043787
32cores
info depth 26 seldepth 36 multipv 1 score cp 12 nodes 771180084 nps 35590736
36cores =2cpu's
info depth 26 seldepth 34 multipv 1 score cp 27 nodes 555000419 nps 39434447
44cores
info depth 26 seldepth 37 multipv 1 score cp 12 nodes 956124931 nps 39665004
54cores=3cpu's
info depth 26 seldepth 34 multipv 1 score cp 22 nodes 729261250 nps 39590730
64cores
info depth 26 seldepth 35 multipv 1 score cp 13 nodes 1207781603 nps 39917427
72cores=4cpu's
info depth 26 seldepth 34 multipv 1 score cp 10 nodes 1292179539 nps 39814498
------------------------------------------------------------------
asmFishW_2016-08-30_bmi2
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 multipv 1 time 31753 nps 10972497 score cp 29 nodes 348409702
16cores
info depth 26 multipv 1 time 36902 nps 21711162 score cp 15 nodes 801185320
18cores=1cpu
info depth 26 multipv 1 time 26279 nps 24423214 score cp 18 nodes 641817658
22cores
info depth 26 multipv 1 time 22431 nps 28761248 score cp 21 nodes 645143573
32cores
info depth 26 multipv 1 time 25245 nps 40873769 score cp 8 nodes 1031858299
36cores =2cpu's
info depth 26 multipv 1 time 15610 nps 45551445 score cp 13 nodes 711058069
44cores
info depth 26 multipv 1 time 36508 nps 55228839 score cp 17 nodes 2016294484
54cores=3cpu's
info depth 26 multipv 1 time 16629 nps 65340175 score cp 15 nodes 1086541784
64cores
info depth 26 multipv 1 time 17468 nps 77270204 score cp 13 nodes 1349755940
72cores=4cpu's
info depth 26 multipv 1 time 16162 nps 86486534 score cp 15 nodes 1397795377
------------------------------------------------------------------
pedantFishW_2016-08-30_bmi2
setoption name threads value 72
setoption name hash value 1024
go depth 26
8cores
info depth 26 multipv 1 time 21027 nps 10878231 score cp 24 nodes 228736582
16cores
info depth 26 multipv 1 time 20295 nps 21646435 score cp 22 nodes 439314417
18cores=1cpu
info depth 26 multipv 1 time 28576 nps 24313067 score cp 11 nodes 694770222
22cores
info depth 26 multipv 1 time 11635 nps 27857828 score cp 24 nodes 324125832
32cores
info depth 26 multipv 1 time 17154 nps 40372184 score cp 12 nodes 692544449
36cores =2cpu's
info depth 26 multipv 1 time 23145 nps 45632643 score cp 9 nodes 1056167535
44cores
info depth 26 multipv 1 time 19434 nps 54445430 score cp 17 nodes 1058092506
54cores=3cpu's
info depth 26 multipv 1 time 27473 nps 66605592 score cp 12 nodes 1829855442
64cores
info depth 26 multipv 1 time 15605 nps 76666220 score cp 15 nodes 1196376371
72cores=4cpu's
info depth 26 multipv 1 time 15577 nps 86257272 score cp 13 nodes 1343629537
------------------------------------------------------------------
Cfish 010916 64 BMI2
setoption name threads value 72
setoption name Hash value 1024
go depth 26
8cores
info depth 26 seldepth 34 multipv 1 score cp 20 nodes 278090066 nps 9884835
16cores
info depth 26 seldepth 37 multipv 1 score cp 24 nodes 628402635 nps 18807692
18cores=1cpu
info depth 26 seldepth 36 multipv 1 score cp 15 nodes 568309036 nps 20325061
22cores
info depth 26 seldepth 37 multipv 1 score cp 13 nodes 769875461 nps 25172490
32cores
info depth 26 seldepth 35 multipv 1 score cp 15 nodes 514903226 nps 34453210
36cores =2cpu's
info depth 26 seldepth 36 multipv 1 score cp 11 nodes 1009104946 nps 39106531
44cores
info depth 26 seldepth 35 multipv 1 score cp 9 nodes 1048757566 nps 39135665
54cores=3cpu's
info depth 26 seldepth 36 multipv 1 score cp 18 nodes 1178637942 nps 39269605
64cores
info depth 26 seldepth 40 multipv 1 score cp 16 nodes 2335439489 nps 39310545
72cores=4cpu's
info depth 26 seldepth 37 multipv 1 score cp 16 nodes 1415523931 nps 39449415
------------------------------------------------------------------
28-08-2016
Yesterday evening another run, with benches of the latest asmFish 2016-08-25, the latest dev. Stockfish 270816, Hannibal 1.7 & Andscacs 0.872b
Just scroll down to the date 28-08-2016 for this latest data, which I put in this spreadsheet:
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
asmFish scales just great: going from 22 to 44 cores gains 103.79%! Stockfish hangs around only 60%.
Some info: later I will have the chance to run on this system ;)
https://www.supermicro.com/products/system/7U/7088/SYS-7088B-TR4FT.cfm
07-08-2016
System: Xeon E7-8870 v3 4x18cores=72cores/144threads -> HT Off
OS: Windows Server 2012 R2
Ran some benches yesterday evening with the latest asmFish 2016-07-26, Stockfish 020816, Texel 1.06 & Hannibal 1.6.58
I have put the data in the spreadsheet with some explanations (scroll down to the date 07-08-2016):
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
Conclusion: the engine with working NUMA-aware code will win the TCEC9 final when it runs on a 44-core system!
When you see how close Stockfish & Komodo are in strength in these LTC games, look out for Houdini 5 dev.: it is NUMA-aware, and Robert only has
to update his NUMA code so that it works well with more cores. It will be dangerous if it gets into the Final! Is it possible?! ;)
Or will Stockfish & Komodo wake up and add NUMA awareness?
04-07-2016
System: Xeon E7-8870 v3 4x18cores=72cores/144threads -> HT Off
OS: Windows Server 2012 R2
asmFish 2016.07.02 is working great on 72 cores!
I just copy the data here; I have also put it in the spreadsheet.
1- asmFish 2016.07.02 bmi2 Numa-aware 256cores
setoption name threads value 72
go depth 26
8cores
info depth 26 multipv 1 time 22556 nps 12331147 score cp 16 nodes 278141354
16cores
info depth 26 multipv 1 time 21316 nps 24409902 score cp 21 nodes 520321486
18cores=1cpu
info depth 26 multipv 1 time 27139 nps 27220856
22cores
info depth 26 multipv 1 time 28209 nps 31327220 score cp 20 nodes 883709550
32cores
info depth 26 multipv 1 time 44069 nps 42207807 score cp 16 nodes 1860055863
36cores =2cpu's
info depth 26 multipv 1 time 21664 nps 47434619 score cp 18 nodes 1027623602
44cores
info depth 26 multipv 1 time 22554 nps 58173721 score cp 14 nodes 1312050121
54cores=3cpu's
info depth 26 multipv 1 time 21809 nps 66881689 score cp 10 nodes 1458622771
64cores
info depth 26 multipv 1 time 25701 nps 80426269 score cp 17 nodes 2067035554
72cores=4cpu's
info depth 26 multipv 1 time 32731 nps 89286816 score cp 12 nodes 2922446805
---------------------------------------------------------------------------------------
setoption name threads value 72
go depth 27
54cores=3cpu's
info depth 27 multipv 1 time 28940 nps 69954328 score cp 15 nodes 2024478253
64cores
info depth 27 multipv 1 time 37896 nps 79536394 score cp 11 nodes 3014111220
72cores=4cpu's
info depth 27 multipv 1 time 37211 nps 89921467 score cp 10 nodes 3346067732
------------------------------------------------------------------------------------
2- asmFish 2016.07.02 bmi2 Numa-aware 256cores
Data I got for 2 positions:
setoption name threads value 72
setoption name hash value 1024
position fen 2b5/1r6/2kBp1p1/p2pP1P1/2pP4/1pP3K1/1R3P2/8 b - -
64cores
go depth 34
info depth 34 multipv 1 time 13402 nps 118459898 score cp 287 nodes 1587599559
72cores
go depth 34
info depth 34 multipv 1 time 23059 nps 138338434 score cp 139 nodes 3189945968
---------------------------------------------------------------------------------
setoption name threads value 72
setoption name hash value 1024
position fen 8/k1b5/P4p2/1Pp2p1p/K1P2P1P/8/3B4/8 w - -
72cores
go depth 60
info depth 60 multipv 1 time 4134 nps 141039045 score mate 24 nodes 583055416
go depth 80
info depth 80 multipv 1 time 46997 nps 224021783 score mate 24 nodes 10528351740
go depth 100
info depth 100 multipv 1 time 213240 nps 216222910 score mate 24 nodes 46107373489
And yes, with a bmi2 compile and 72 cores I pass 200 million nodes/s on this position!
At the moment it's playing on my 3 systems and starts out as the top engine on all 3 computers. With 1 core it was already first for sure, and now the jump is even bigger!
Ipman.
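The nps figures in the three mate-position runs above can be cross-checked from the node count and the time in milliseconds. A small sketch, assuming Python 3; `nps_from` is a name made up for this example, and the use of integer division is my assumption about how the figure is rounded:

```python
def nps_from(nodes, time_ms):
    """Nodes per second from a node count and an elapsed time in milliseconds."""
    return nodes * 1000 // time_ms


# (depth, time in ms, nodes, reported nps) copied from the three runs above
runs = [
    (60, 4134, 583055416, 141039045),
    (80, 46997, 10528351740, 224021783),
    (100, 213240, 46107373489, 216222910),
]
for depth, time_ms, nodes, reported in runs:
    print(depth, nps_from(nodes, time_ms), reported)
```

For these three lines the recomputed value matches the reported nps, so the figures above 200 million nodes/s for depth 80 and depth 100 are consistent with the node counts and times shown.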
03-07-2016
Yesterday evening I ran some tests with asmFish NumaTest_base.
Only after testing did I find out that it was a 64-core version, so it was normal that 72 cores gave nothing more!
This was also a base version, not a popcnt or even a bmi2 compile, which would be even faster, but the gain thanks to NUMA awareness is very clear!
So we already have two engines where NUMA awareness works as it should, Texel & asmFish. Who is next ;)
It takes just 1 sec. to see all these cores awake and running at 100%! Very fast.
The first bench was just selecting cores and using go depth 25
2- Numatest_Base.exe
setoption name threads value 72
go depth 25
8cores
info depth 25 multipv 1 time 20115 nps 11155655 score cp 19 nodes 224396020
16cores
info depth 25 multipv 1 time 20874 nps 22091255 score cp 24 nodes 461132876
18cores=1cpu
info depth 25 multipv 1 time 27752 nps 24705698 score cp 21 nodes 685632535
22cores
info depth 25 multipv 1 time 29275 nps 27782844 score cp 15 nodes 813342773
32cores
info depth 25 multipv 1 time 8600 nps 40916985 score cp 21 nodes 351886074
36cores =2cpu's
info depth 25 multipv 1 time 14514 nps 45452953 score cp 14 nodes 659704165
44cores
info depth 25 multipv 1 time 7000 nps 51543263 score cp 21 nodes 360802847
54cores=3cpu's
info depth 25 multipv 1 time 16290 nps 61816104 score cp 14 nodes 1006984341
64cores
info depth 25 multipv 1 time 15957 nps 74512417 score cp 17 nodes 1188994639
72cores=4cpu's -> 8cores Not used? same result as 64cores
info depth 25 multipv 1 time 17371 nps 74202550 score cp 17 nodes 1288972511
The next test was with 2 positions given as FEN
setoption name threads value 64
setoption name hash value 1024
position fen 2b5/1r6/2kBp1p1/p2pP1P1/2pP4/1pP3K1/1R3P2/8 b - -
64cores
go depth 34
info depth 34 multipv 1 time 28189 nps 117960186
72cores - 8cores not used
go depth 34
info depth 34 multipv 1 time 12625 nps 112852554
---------------------------------------------------------------------------------
setoption name threads value 64
setoption name hash value 1024
position fen 8/k1b5/P4p2/1Pp2p1p/K1P2P1P/8/3B4/8 w - -
64cores
go depth 34 was for Texel Numa; it finished too fast with asmFish, so I had to start with go depth 60!
go depth 60
info depth 60 multipv 1 time 5114 nps 125342789
go depth 80
info depth 80 multipv 1 time 25335 nps 170001086
go depth 100
info depth 100 multipv 1 time 211734 nps 194384966
And this was done with only the base 64-core version, so I think that with the bmi2 version and 72 cores it will easily get above 200 million nodes/sec.
on this position!
This morning I see that Mohammed Li has updated asmFish 2016-07-02, now with 256 cores!
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
01-07-2016
Sometimes programmers listen to testers, and it's nice to see that I got a response and that they sank their teeth into this NUMA-aware thing!
Thanks to Peter, Mikael & Moh, who tried things out with tools or changed code in their engines, so that we now have a working NUMA-aware engine!
Below this report you see a screen from the system. Again I could run on William's system, a Xeon E7-8870 v3 4x18cores=72cores/144threads
OS: Windows Server 2012 R2 with HT Off
Texel 1.06a48 512cores Numatest 3 from Peter Osterlund
He was so kind as to make a new release, Texel 1.06a49, with source, and I can put the link here:
(the link has been removed)
setoption name hash value 1024
setoption name threads value 72
go depth 22
8cores
info nodes 523751942 nps 8781890 time 59640
16cores
info nodes 465912687 nps 16656990 time 27971
18cores=1cpu
info nodes 700218378 nps 18361567 time 38135
22cores
info nodes 474150640 nps 20808858 time 22786
32cores
info nodes 915012019 nps 31049985 time 29469
36cores =2cpu's
info nodes 789579580 nps 35225499 time 22415
44cores
info nodes 1054602231 nps 42282183 time 24942
54cores=3cpu's
info nodes 960015462 nps 48781273 time 19680
64cores
info nodes 1113647063 nps 56077700 time 19859
72cores
info nodes 987339559 nps 62092922 time 15901
I also used go depth 24 for the last two, because go depth 22 finished too quickly
64cores - go depth 24
info nodes 4634818942 nps 63032176 time 73531
72cores - go depth 24
info nodes 6291455339 nps 69499644 time 90525
26-06-2016
Spreadsheet updated: https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit#gid=0
setoption name threads value x (x=8,16,18,22,36,44,54,64,72)
8cores
info nodes 255460822 nps 6482131 time 39410
16cores
info nodes 350094373 nps 13520811 time 25893
18cores = 1cpu
info nodes 318964178 nps 14368402 time 22199
22cores
info nodes 405473123 nps 17569682 time 23078
36cores = 2cpu's
info nodes 291845317 nps 24808340 time 11764
44cores
info nodes 441290180 nps 25184920 time 17522
54cores = 3cpu's
info nodes 229471568 nps 24573952 time 9338
64cores
info nodes 305918688 nps 25332783 time 12076
72cores = 4cpu's
info nodes 501804459 nps 24806192 time 20229
go depth 26
18cores
info nodes 259833042 nps 14963031 time 17365
36cores
info nodes 712096249 nps 25144641 time 28320
go depth 27
18cores
info nodes 789678361 nps 14579126 time 54165
36cores
info nodes 632270469 nps 25137979 time 25152
--------------------------------------------------------------------------------------
Texel had a first try at NUMA, but here too there is no gain beyond 2 CPUs, even a slowdown. The SMP could also be better!
Texel 1.06a48 512cores Numa
setoption name threads value 8
go depth 22
8cores
info nodes 535900101 nps 7208965 time 74338
16cores
info nodes 666496072 nps 13589203 time 49046
18cores
info nodes 467481214 nps 15624894 time 29919
22cores
info nodes 719367492 nps 17628099 time 40808
32cores
info nodes 906991000 nps 21920702 time 41376
36cores
info nodes 924095974 nps 22428969 time 41201
44cores
info nodes 958960416 nps 15240947 time 62920
54cores
info nodes 1357458581 nps 12538295 time 108265
64cores
info nodes 1145138765 nps 14235759 time 80441
72cores
info nodes 1056645969 nps 14224983 time 74281
------------------------------------------------------------
18-06-2016
I read that Crafty 25.0.1 is Numa aware ..time to test it out..
System : Xeon E7-8870 v3 4x18cores=72cores/144threads HT Off
OS: Windows server 2012 R2
When I open it in the console I get this:
EPD Kit revision date: 1996.04.21
unable to open book file [./book.bin].
book is disabled
unable to open book file [./books.bin].
Initializing multiple threads.
System is NUMA. 4 nodes reported by Windows -> It shows that the system is NUMA and has 4 CPUs, but it only uses two of them at most?
Node 0 CPUs:
Node 1 CPUs: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Node 2 CPUs:
Node 3 CPUs: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
Current ideal CPU is 20
Exchanging nodes 0 and 3
Crafty v25.0.1 JA (1 cpu)
White(1): bench -> first bench uses only 1core
Total nodes: 185182294
Raw nodes per second: 4314592
Total elapsed time: 42.92
time used = 42.94
White(1):
---------------------------------------------------
White(1): mt=8
max threads set to 8.
White(1): bench
Running benchmark. . .
......
Total nodes: 355860844
Raw nodes per second: 27671916
Total elapsed time: 12.86
time used = 12.94
------------------------------------------------
White(1): mt=16
max threads set to 16.
White(1): bench
Running benchmark. . .
......
Total nodes: 503496086
Raw nodes per second: 45116140
Total elapsed time: 11.16
time used = 11.23
----------------------------------------------
White(1): mt=18
max threads set to 18.
White(1): bench
Running benchmark. . .
......
Total nodes: 296057861
Raw nodes per second: 38004860
Total elapsed time: 7.79
time used = 7.83
White(1):
-----------------------------------------------------------
White(1): mt=36
max threads set to 36.
White(1): bench
Running benchmark. . .
......
Total nodes: 360114781
Raw nodes per second: 40191380
Total elapsed time: 8.96
time used = 9.03
White(1):
------------------------------------------------------------
White(1): mt=54
max threads set to 54.
White(1): bench
Running benchmark. . .
......
Total nodes: 441466484
Raw nodes per second: 1794798
Total elapsed time: 245.97
time used = 4:06
White(1):
--------------------------------------------------
White(1): mt=72
ERROR - Crafty was compiled with CPUS=64. mt can not exceed this value.
max threads set to 64.
White(1): mt=64
max threads set to 64.
White(1): bench
Running benchmark. . .
......
Total nodes: 474695747
Raw nodes per second: 4151616
Total elapsed time: 114.34
time used = 1:54
White(1):
So Crafty 25.0.1's NUMA support also doesn't work as it should on a 4-CPU system?
William showed me Cinebench: when you start it, it shows right away that there are 72 cores, and the bench uses them all, 4 x 18 cores, all 4 CPUs at 100%!
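To make the observation about Crafty's start-up output concrete, the "Node N CPUs:" lines can be counted. A small sketch, assuming Python 3; `cpus_per_node` is a name made up for this example, and the text is copied from the log above:

```python
# NUMA lines copied from the Crafty start-up output above
startup = """Node 0 CPUs:
Node 1 CPUs: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
Node 2 CPUs:
Node 3 CPUs: 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35"""


def cpus_per_node(text):
    """Map each NUMA node number to the list of CPU indices printed for it."""
    nodes = {}
    for line in text.splitlines():
        if line.startswith("Node ") and "CPUs:" in line:
            head, _, tail = line.partition("CPUs:")
            nodes[int(head.split()[1])] = [int(c) for c in tail.split()]
    return nodes


nodes = cpus_per_node(startup)
for node, cpus in nodes.items():
    print(f"node {node}: {len(cpus)} CPUs")
print("total CPUs listed:", sum(len(c) for c in nodes.values()))
```

Only nodes 1 and 3 list any CPUs, 36 in total, which fits the benches above where going past 36 threads brought no gain.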
--------------------------------------------------------------------
A request from Peter to use default depth
Stockfish 130616 without CMH
bench 1024 18 24 default depth
===========================
Total time (ms) : 290343
Nodes searched : 7010082938
Nodes/second : 24144143
bench 1024 36 24 default depth
===========================
Total time (ms) : 179340
Nodes searched : 8645517290
Nodes/second : 48207412
bench 1024 54 24 default depth
===========================
Total time (ms) : 356931
Nodes searched : 19942128779
Nodes/second : 55871103
bench 1024 22 24 default depth
===========================
Total time (ms) : 233990
Nodes searched : 7124988406
Nodes/second : 30449969
bench 1024 44 24 default depth
===========================
Total time (ms) : 480633
Nodes searched : 23304937874
Nodes/second : 48488010
Between 18 and 36 cores I get almost perfect scaling in nodes/sec.: 99.67%!
Spreadsheet is updated..
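The scaling percentage quoted for 18 to 36 cores is the relative gain in nodes/second between the two default-depth benches. A minimal sketch of that arithmetic, assuming Python 3; `speed_gain_percent` is a name made up for this example:

```python
def speed_gain_percent(nps_before, nps_after):
    """Percentage gain in nodes/second when going from one run to another."""
    return (nps_after / nps_before - 1.0) * 100.0


# Nodes/second from the 18-thread and 36-thread default-depth benches above
nps_18 = 24144143
nps_36 = 48207412
print(f"18 -> 36 threads: {speed_gain_percent(nps_18, nps_36):.2f}% gain")
```

A gain of 100% would mean that doubling the cores exactly doubled the nodes/second.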
16-06-2016
Today I got another system for testing:
System: Xeon E7-8870 v3 4x18cores=72cores/144threads
This gives me a chance to compare these results with my findings from the other system.
Used the same Stockfish 130616 without CounterMovesHistory (CMH)
Data is also updated in spreadsheet : https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit?usp=sharing
Intel Xeon E7-8870 v3 @2.1Ghz
4x18cores=72cores/144threads system
Stockfish 130616 without CMH
bench 1024 8 2000 default time
===========================
Total time (ms) : 74123
Nodes searched : 1240250555
Nodes/second : 16732330
bench 1024 16 2000 default time
===========================
Total time (ms) : 74148
Nodes searched : 2286158365
Nodes/second : 30832367
bench 1024 18 2000 default time -> 1cpu
===========================
Total time (ms) : 74132
Nodes searched : 2851865262
Nodes/second : 38470097
bench 1024 22 2000 default
===========================
Total time (ms) : 74125
Nodes searched : 3135009861
Nodes/second : 42293556
bench 1024 24 2000 default
===========================
Total time (ms) : 74158
Nodes searched : 3383158382
Nodes/second : 45620949
bench 1024 32 2000 default
===========================
Total time (ms) : 74128
Nodes searched : 4661922731
Nodes/second : 62890172
bench 1024 36 2000 default time -> 2cpu's
===========================
Total time (ms) : 74117
Nodes searched : 5397609260
Nodes/second : 72825522
bench 1024 44 2000 default time
===========================
Total time (ms) : 74103
Nodes searched : 5393135858
Nodes/second : 72778913
bench 1024 54 2000 default time -> 3cpu's
===========================
Total time (ms) : 74136
Nodes searched : 5442621036
Nodes/second : 73414009
bench 1024 72 2000 default time -> 4cpu's
===========================
Total time (ms) : 74127
Nodes searched : 5499511667
Nodes/second : 74190398
The same thing happens again: up to 2 CPUs everything goes well, but adding cores on CPU 3 & CPU 4 gives no more nodes/sec. and the system slows down!?
Engines need to be NUMA-aware if we want chess engines to use these 4-socket systems optimally!
A quick check from 18 to 36 cores (1 CPU to 2 CPUs) gives me an 89.30% speed gain in nodes/sec., about the same as the other system, which gives me 90%!
Ipman.
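One way to see where the scaling stops is to look at nodes/second per thread, relative to the smallest run. A rough sketch, assuming Python 3 and using the Nodes/second values from the bench runs above; `efficiency` is a name made up for this example:

```python
# (threads, nodes/second) from the bench runs above
results = [
    (8, 16732330), (16, 30832367), (18, 38470097), (22, 42293556),
    (24, 45620949), (32, 62890172), (36, 72825522), (44, 72778913),
    (54, 73414009), (72, 74190398),
]


def efficiency(results):
    """Per-thread nps of each run relative to the run with the fewest threads."""
    base_threads, base_nps = min(results)
    base = base_nps / base_threads
    return {threads: (nps / threads) / base for threads, nps in results}


for threads, eff in efficiency(results).items():
    print(f"{threads:3d} threads: {eff:6.1%} of the 8-thread per-thread speed")
```

Up to 36 threads each thread still delivers close to the 8-thread per-thread speed; at 72 threads it is roughly half, because the total nodes/second barely moves beyond 2 CPUs.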
-------------------------------------------
15-06-2016
Made a little spreadsheet with the data I got:
https://docs.google.com/spreadsheets/d/156Iztrz4erBxTntb6A9-4hCNAtLyazdBBh8zGy0RApE/edit?usp=sharing
And here Fishcooking link:
https://groups.google.com/forum/#!topic/fishcooking/YZ16ksLBHUc
Next test, done on 14-06-2016
System: Xeon E7-8890 v4 4x24cores=96cores/192threads
After the first test I contacted a few people to ask whether they had more ideas that I could test
on this extreme system. I don't think we will often get the chance to run benches
on a 96-core/192-thread system.
I got an e-mail from Mikael, followed later by a Stockfish source without CounterMoveHistory!
I made a compile from it and was ready to run some benches.
Another person asked whether I could also run with 44 cores, as the TCEC9 final will run on a system
with 44 cores. So I did. Thinking of it as 2x22 cores, I also tested with 22 cores,
knowing that on this system each CPU has 24 cores, so the 2 cores of difference could show up as some
difference in nodes/s, because the earlier results did not go well beyond 24 cores. But!
Wow, what a difference. This is what I want to see: great scaling up to 48 cores now, and
much higher nodes/sec.
Just check this:
Stockfish 130616 bmi2 256cores without CounterMoveHistory!!
Bench 1024 8 2000 default time
===========================
Total time (ms) : 74179
Nodes searched : 1111563212
Nodes/second : 14984877
Bench 1024 16 2000 default time
===========================
Total time (ms) : 74171
Nodes searched : 2473072243
Nodes/second : 33342846
Bench 1024 20 2000 default time
===========================
Total time (ms) : 74175
Nodes searched : 3163890488
Nodes/second : 42654404
Bench 1024 22 2000 default time -> Test run for 1 CPU, to compare with the TCEC9 final system
===========================
Total time (ms) : 74171
Nodes searched : 3439099279
Nodes/second : 46367168
Bench 1024 24 2000 default time
===========================
Total time (ms) : 74135
Nodes searched : 3714919446
Nodes/second : 50110196
Bench 1024 32 2000 default time
===========================
Total time (ms) : 74127
Nodes searched : 4856036008
Nodes/second : 65509679
Bench 1024 44 2000 default time -> Test run with 2 CPUs, to compare with the TCEC9 final system
===========================
Total time (ms) : 74131
Nodes searched : 6532558230
Nodes/second : 88121814
Bench 1024 48 2000 default time
===========================
Total time (ms) : 74128
Nodes searched : 7113106606
Nodes/second : 95957082
Bench 1024 64 2000 default time
===========================
Total time (ms) : 74140
Nodes searched : 7094556559
Nodes/second : 95691348
Bench 1024 80 2000 default time
===========================
Total time (ms) : 74148
Nodes searched : 7282006695
Nodes/second : 98209077
Bench 1024 96 2000 default time
===========================
Total time (ms) : 74136
Nodes searched : 7235127846
Nodes/second : 97592638
Compare these results with the first test: all numbers here are much higher, and scaling is good up to 48 cores!
That is just enough for the TCEC9 final system. Look at 22 cores versus 44 cores: almost double the nodes/sec.!!
The earlier test peaked with 64 cores at 72 million nodes/s; now 48 cores give 96 million nodes/s!!
And this is thanks to Mikael, with one single change in the source. Who will do better? ;)
Then we can move on to the next problem: how to make use of CPU 3 and CPU 4, an extra 48 cores that do nothing now?!
Imagine it keeps scaling like these first 48 cores!
Do I have to count only on Mikael? ;)
This compile should certainly go to TCEC9. Maybe I will try to contact Anton, so that he can try it and see
the difference in speed on their final system..
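A quick way to check the "almost double" remark is to divide the nodes/sec. figures by each other. The little sketch below does only that; the helper names are made up for this illustration, and the numbers are copied from the 22, 44 and 48 core runs above.

```python
# Nodes/second copied from the "Bench 1024 <cores> 2000 default time" runs above.
nps_by_cores = {22: 46367168, 44: 88121814, 48: 95957082}


def speedup(nps, base, target):
    """Throughput ratio between two core counts."""
    return nps[target] / nps[base]


def efficiency(nps, base, target):
    """Measured speedup divided by the ideal (linear) speedup target/base."""
    return speedup(nps, base, target) / (target / base)


# Compare the single-CPU run (22 cores) with the larger runs.
for cores in (44, 48):
    print(f"22 -> {cores} cores: "
          f"speedup x{speedup(nps_by_cores, 22, cores):.2f}, "
          f"efficiency {efficiency(nps_by_cores, 22, cores):.0%}")
```

An efficiency close to 100% means the extra cores are really being used; a value that drops sharply would point back at the NUMA problem.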
-----------------------------------------------------------------------
Again I had some free time left on the system, so I did some more tests.
Here I used different times per position; the other settings are the same as above. Every run shows the same big
difference in nodes/sec. (which proves to me that it works very well)
Bench 1024 22 3000 default time
===========================
Total time (ms) : 111134
Nodes searched : 5084586687
Nodes/second : 45751855
Bench 1024 44 3000 default time
===========================
Total time (ms) : 111139
Nodes searched : 9988630925
Nodes/second : 89875119
Bench 1024 22 4000 default time
===========================
Total time (ms) : 148161
Nodes searched : 6939900820
Nodes/second : 46840267
Bench 1024 44 4000 default time
===========================
Total time (ms) : 148149
Nodes searched : 13151531526
Nodes/second : 88772327
Bench 1024 48 4000 default time
===========================
Total time (ms) : 148227
Nodes searched : 14372201115
Nodes/second : 96960750
------------------------------------------------------------
Then with different Hash sizes; the results were a little lower
Bench 8192 48 2000 default time
===========================
Total time (ms) : 74131
Nodes searched : 6457869967
Nodes/second : 87114297
Bench 8192 48 3000 default time
===========================
Total time (ms) : 111131
Nodes searched : 10000089574
Nodes/second : 89984698
Bench 8192 48 4000 default time
===========================
Total time (ms) : 148176
Nodes searched : 12904412990
Nodes/second : 87088415
Bench 2048 48 4000 default time
===========================
Total time (ms) : 148155
Nodes searched : 13997014132
Nodes/second : 94475475
Bench 512 48 4000 default time
===========================
Total time (ms) : 148160
Nodes searched : 13651252173
Nodes/second : 92138581
Bench 512 22 2000 default time
===========================
Total time (ms) : 74130
Nodes searched : 3446013987
Nodes/second : 46486091
Bench 512 44 2000 default time
===========================
Total time (ms) : 74142
Nodes searched : 6366179189
Nodes/second : 85864681
----------------------------------------------------------
I also ran the new asmFish bmi2 130616, but its scaling is not good at all..
asmFish bmi2
go depth 26 8c
info depth 26 multipv 1 time 37001 nps 9548522 score cp 21 nodes
go depth 26 16c
info depth 26 multipv 1 time 23123 nps 19538363 score cp 16 node
go depth 26 24c
info depth 26 multipv 1 time 20535 nps 24635035 score cp 12 nodes
go depth 26 32c
info depth 26 multipv 1 time 29463 nps 21899786 score cp 15 nodes
go depth 26 48c
info depth 26 multipv 1 time 18359 nps 41925656 score cp 17 nodes
go depth 26 64c
info depth 26 multipv 1 time 27842 nps 33100524 score cp 26 nodes
go depth 26 80c
info depth 26 multipv 1 time 21907 nps 52996489 score cp 20 nodes
go depth 27 80c
info depth 27 multipv 1 time 25354 nps 57872847 score cp 17 nodes 1467308167
go depth 28 80c
info depth 28 multipv 1 time 43913 nps 49413512 score cp 15 nodes 2169895595
go depth 27 96c
info depth 27 multipv 1 time 41803 nps 44760056 score cp 10 nodes 1871104649
Ipman.
-------------------------------------------------------------------------------------------
11-06-2016
Today I got the chance to run some chess benches on a Xeon E7-8890 v4 system with 4 sockets!!
That means 4 CPUs x 24 cores = 96 cores or 192 threads!
Operating system : Windows Server 2012 R2
1 CPU = $7174 http://ark.intel.com/nl/products/family/93797/Intel-Xeon-Processor-E7-v4-Family
It was my first experience with such a system, and after a little searching for how to transfer files and copy & paste data I was ready to run some benches..
Engines that I used :
I compiled Stockfish from the latest source and set it to 256 cores
I compiled DON 100616, also from the latest source, with 256 cores
I got Komodo 1656 with 200 cores, thanks to Mark & Larry!
Houdini 4 Pro 4 B supports only 32 cores, but has NUMA (more on this later)
I had put a request on Fishcooking and got some good info about how best to run the benches, so thanks for this information!
The DON programmer Ehsan was also kind enough to help me out..
I will come back to this during the tests..
I had prepared all these bench commands in advance, and I wanted to do as many benches as possible in the time I got on this
great system. So I chose a fixed time for every bench, or with some engines I used go depth 24 or go depth 26, depending on
how long it took..
After my first contact, William had run some tests, and I saw that it did not work the way I wanted with these 192 threads.
The programmers also wanted to see results with HyperThreading off, so I decided to run the benches with HT off!
Luckily I told William this the day before, so that he could boot with HT off this morning (it takes a long time..)
To make things clear I have put some explanation in between the results, so you need to scroll a lot for these few tests I did ;)
Let me start with Stockfish:
Some people proposed that I use 8 GB hash, and I did this in every test. At the end I had some time left and wanted to see
what the engines do with different Hash sizes.
So first you see bench 8192 8 2000 default time = 8192 Hash, using 8 cores and 2 sec. for each position x 37
Stockfish has 37 positions in its bench test.
You can see the total time is almost always the same. The core counts I used were : 8, 16, 24, 32, 48, 64, 80 and 96 cores
The results with different Hash sizes were added in between afterwards, next to the run with the same number of cores.
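Why the total time is always about the same follows from simple arithmetic: 37 positions at a fixed time each. A small sketch of that calculation is below; the function name is only for illustration, and the observed totals are copied from runs in this report (2000, 3000 and 4000 ms per position).

```python
POSITIONS = 37  # number of positions in the Stockfish bench, as stated above


def expected_total_ms(ms_per_position, positions=POSITIONS):
    """Total bench time if every position uses exactly its fixed time."""
    return positions * ms_per_position


# (ms per position, reported "Total time (ms)") pairs copied from this report.
observed = [(2000, 74198), (3000, 111134), (4000, 148161)]

for ms_per_position, total in observed:
    expected = expected_total_ms(ms_per_position)
    print(f"{ms_per_position} ms/position: expected {expected} ms, "
          f"reported {total} ms, overhead {total - expected} ms")
```

The small overhead on top of the expected value is the same order of magnitude in every run, which is why the "Total time" line says little about how well an engine scales; the nodes/sec. line is the one to watch.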
Stockfish bench 8192 8 2000 default time
===========================
Total time (ms) : 74198
Nodes searched : 941792202
Nodes/second : 12692959
Stockfish bench 8192 16 2000 default time
===========================
Total time (ms) : 74135
Nodes searched : 2150714471
Nodes/second : 29010783
-----------------------------------------------------------------
Stockfish bench 8192 24 2000 default time
===========================
Total time (ms) : 74169
Nodes searched : 2996733841
Nodes/second : 40404128
Stockfish bench 512 24 2000 default time -> tried some other Hash values, to see if I get something better..
===========================
Total time (ms) : 74157
Nodes searched : 3106997530
Nodes/second : 41897562
Stockfish bench 1024 24 2000 default time -> you will see later that I use 1024 Hash with the other engines, because it gives the highest nodes/sec.!
===========================
Total time (ms) : 74148
Nodes searched : 3321030231
Nodes/second : 44789208
Stockfish bench 2048 24 2000 default time
===========================
Total time (ms) : 74141
Nodes searched : 3246180633
Nodes/second : 43783879
Stockfish bench 4096 24 2000 default time
===========================
Total time (ms) : 74124
Nodes searched : 3174197239
Nodes/second : 42822800
------------------------------------------------------------------
Stockfish bench 8192 32 2000 default time
===========================
Total time (ms) : 74136
Nodes searched : 3718498964
Nodes/second : 50157804
Stockfish bench 8192 48 2000 default time
===========================
Total time (ms) : 74165
Nodes searched : 4872528188
Nodes/second : 65698485
Stockfish bench 8192 64 2000 default time
===========================
Total time (ms) : 74145
Nodes searched : 5165559495
Nodes/second : 69668345
Stockfish bench 1024 64 2000 default time
===========================
Total time (ms) : 74154
Nodes searched : 5354535743
Nodes/second : 72208319 -> Peter told me I should see 70+ million ;)
Stockfish bench 8192 80 2000 default time
===========================
Total time (ms) : 74166
Nodes searched : 5211511508
Nodes/second : 70268202
Stockfish bench 8192 96 2000 default time
===========================
Total time (ms) : 74151
Nodes searched : 5264618852
Nodes/second : 70998622
Stockfish bench 1024 96 2000 default time
===========================
Total time (ms) : 74186
Nodes searched : 5379118902
Nodes/second : 72508544
Stockfish bench 16384 96 2000 default time
===========================
Total time (ms) : 74163
Nodes searched : 5040791490
Nodes/second : 67969088
With 2000 default time, Hash=1024 was clearly better.
Now something important!!
While these nodes/sec. look great, I saw a big problem during these tests..
But there is also good news!
First the problem: up to 24 cores everything runs fast. Above 24 cores, the more cores I add, the slower the test runs,
even though the total time reported at the end is almost the same. That number does not show the real picture: the most cores I ever saw running was 48.
You can also see that the nodes/sec. hardly change any more with 64, 80 and 96 cores, compared with the steps from 8 to 24 cores. So with 64, 80 or 96 cores selected, I saw only 48 cores running.
With 32 cores you might say the numbers still go up nicely, but they are already lower than they should be!
So the good news is: it is not an SMP problem, but NUMA is needed when running a system with multiple Intel CPUs!!
I say Intel because I don't know how it is with AMD; I have not tested that yet..
1 CPU = 24 cores: everything up to there goes great
2 CPUs or more need NUMA!!
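The plateau is easy to see when you look at how many extra nodes/sec. each added core brings. The sketch below is only an illustration of that calculation (the function name is invented here); the numbers are the Hash=8192 Stockfish results listed above.

```python
# Nodes/second from the "Stockfish bench 8192 <cores> 2000 default time" runs above.
nps = {8: 12692959, 16: 29010783, 24: 40404128, 32: 50157804,
       48: 65698485, 64: 69668345, 80: 70268202, 96: 70998622}


def gain_per_added_core(nps):
    """Return (from_cores, to_cores, extra nps per added core) for each step."""
    cores = sorted(nps)
    return [(a, b, (nps[b] - nps[a]) / (b - a))
            for a, b in zip(cores, cores[1:])]


for a, b, gain in gain_per_added_core(nps):
    print(f"{a:2d} -> {b:2d} cores: {gain:12.0f} extra nodes/sec. per added core")
```

From 48 cores upward the gain per added core falls to a small fraction of what the first steps deliver, which fits with only 48 cores actually running.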
Next the DON engine from Ehsan:
With DON I had to use 2000 movetime instead of 2000 default time to get a fixed-time bench!
Same thing here: up to 48 cores the nodes/sec. go up nicely, and then there is almost no change any more.
The total time is the same, but the test takes longer and longer to finish as more cores are added..
DON bench 8192 8 2000 movetime
=================================
Total time (ms) : 74195
Nodes searched : 712564065
Nodes/second : 9603936
---------------------------------
DON bench 8192 16 2000 movetime
=================================
Total time (ms) : 74175
Nodes searched : 1500633293
Nodes/second : 20230984
---------------------------------
DON bench 8192 24 2000 movetime
=================================
Total time (ms) : 74221
Nodes searched : 2092625949
Nodes/second : 28194526
---------------------------------
DON bench 8192 32 2000 movetime
=================================
Total time (ms) : 74242
Nodes searched : 2513426090
Nodes/second : 33854504
---------------------------------
DON bench 8192 48 2000 movetime
=================================
Total time (ms) : 74246
Nodes searched : 3574674169
Nodes/second : 48146353
---------------------------------
DON bench 8192 64 2000 movetime
=================================
Total time (ms) : 74248
Nodes searched : 3609990121
Nodes/second : 48620705
---------------------------------
DON bench 8192 80 2000 movetime
=================================
Total time (ms) : 74212
Nodes searched : 3634574998
Nodes/second : 48975569
---------------------------------
DON bench 8192 96 2000 movetime
=================================
Total time (ms) : 74219
Nodes searched : 3647528726
Nodes/second : 49145484
---------------------------------
With Hash=1024 I get higher nodes/sec., except with 48 cores, where it was lower
DON bench 1024 24 2000 movetime
=================================
Total time (ms) : 74204
Nodes searched : 2292225565
Nodes/second : 30890862
---------------------------------
DON bench 1024 32 2000 movetime
=================================
Total time (ms) : 74252
Nodes searched : 2594433993
Nodes/second : 34940930
---------------------------------
DON bench 1024 48 2000 movetime
=================================
Total time (ms) : 74197
Nodes searched : 3463283303
Nodes/second : 46676864
---------------------------------
Next Komodo from Mark & Larry:
Komodo bench commands:
----------------------
-setoption name hash value 1024 (to set Hash) , (128,256,512,1024,2048,4096,8192)
-setoption name threads value 8 (to set cores) , (8,16,24,32,48,64,80,96)-> HT Off
-go depth 24
With the time left I ran the tests again using go depth 26
H=Hash , C=cores
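For anyone who wants to repeat this, the three commands above can be generated for every setting instead of typing them by hand. The sketch below only builds the command text; the function name is hypothetical, and it assumes the leading "-" in the list above is a bullet and not part of the command (the Houdini console log later in this report shows setoption typed without it).

```python
def komodo_bench_commands(hash_mb, threads, depth):
    """Build the UCI commands for one run: set Hash, set threads, search to depth."""
    return [
        f"setoption name hash value {hash_mb}",
        f"setoption name threads value {threads}",
        f"go depth {depth}",
    ]


# Core counts used in this report with HT off.
CORE_COUNTS = (8, 16, 24, 32, 48, 64, 80, 96)

for threads in CORE_COUNTS:
    print(f"# H=1024 C={threads}")
    for command in komodo_bench_commands(1024, threads, 24):
        print(command)
```

The printed lines can be pasted into the engine console one run at a time.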
Komodo go depth 24 H=8192 C=8
info time 23895 nodes 176383325 nps 7381324 hashfull 124
Komodo go depth 24 H=8192 C=16
info time 10523 nodes 138581276 nps 13169325 hashfull 79
Komodo go depth 24 H=8192 C=24
info time 9112 nodes 149030645 nps 16355127 hashfull 52
Komodo go depth 24 H=8192 C=48
info time 23736 nodes 237036210 nps 9986052 hashfull 39
Komodo go depth 24 H=8192 C=64
info time 31560 nodes 393609855 nps 12471693 hashfull 51
-----------------------------------------------------------
Komodo go depth 26 H=8192 C=8
info time 25355 nodes 185176070 nps 7303248 hashfull 135
Komodo go depth 26 H=8192 C=16
info time 23215 nodes 287714964 nps 12393381 hashfull 158
Komodo go depth 26 H=8192 C=24 -> ran it twice, because it came out lower than with 16c
info time 71465 nodes 767250121 nps 10735952 hashfull 271
info time 86053 nodes 896317258 nps 10415819 hashfull 302
Komodo go depth 26 H=8192 C=20 -> so I tried 20 cores
info time 91149 nodes 1063403519 nps 11666604 hashfull 441
---------------------------------------------------------
Hash=1024 again gives higher nodes/sec.
Komodo go depth 26 H=1024 C=8
info time 71923 nodes 571277810 nps 7942805 hashfull 924
Komodo go depth 26 H=2048 C=8
info time 72453 nodes 538932168 nps 7438362 hashfull 889
Komodo go depth 26 H=1024 C=16
info time 48976 nodes 827701203 nps 16899827 hashfull 799
Komodo go depth 26 H=1024 C=24
info time 36725 nodes 663240265 nps 18059567 hashfull 602
Komodo go depth 26 H=1024 C=32
info time 52107 nodes 1131195820 nps 21708717 hashfull 451
Komodo go depth 26 H=1024 C=48
info time 33298 nodes 708926657 nps 21289807 hashfull 350
Komodo already shows a slowdown in nodes/sec. gain after 16 cores.. some work to do!
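The info lines pasted above can also be read by a small script instead of by eye. The sketch below is a minimal parser for the numeric fields that appear in these lines; the names are made up for this illustration and it is not a full UCI parser. A field whose value is missing, like the cut-off "nodes" at the end of some asmFish lines, is simply skipped.

```python
# Numeric fields that occur in the info lines of this report.
INT_FIELDS = {"depth", "seldepth", "multipv", "time",
              "nodes", "nps", "tbhits", "hashfull"}


def parse_info(line):
    """Return the numeric fields of one 'info ...' line as a dict of ints."""
    tokens = line.split()
    fields = {}
    # Walk over neighbouring token pairs: a known key followed by digits.
    for key, value in zip(tokens, tokens[1:]):
        if key in INT_FIELDS and value.isdigit():
            fields[key] = int(value)
    return fields


# Two Komodo lines copied from the go depth 26, H=1024 runs above.
samples = {
    16: "info time 48976 nodes 827701203 nps 16899827 hashfull 799",
    48: "info time 33298 nodes 708926657 nps 21289807 hashfull 350",
}

for cores, line in samples.items():
    info = parse_info(line)
    print(f"C={cores}: {info['nps']} nps, {info['nps'] / cores:.0f} nps per core")
```

Dividing nps by the core count makes the slowdown visible at once: the figure per core is much lower at 48 cores than at 16.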
And last, Houdini:
When I saw the problems with the slowdown and not all cores being used, I thought: yes, Houdini has NUMA.
But when I checked, it handles only 32 cores. Still, I thought maybe I can see a little difference when I go from
24 cores to 32 cores.
I used the same bench commands as for Komodo
Houdini go depth 24 H=8192 C=8
info multipv 1 depth 24 seldepth 51 score cp 10 time 46315 nodes 542050599 nps 11703000 tbhits 0 hashfull 474
Houdini go depth 24 H=8192 C=16
info multipv 1 depth 24 seldepth 47 score cp 12 time 22326 nodes 418218428 nps 18732000 tbhits 0 hashfull 386
Houdini go depth 24 H=8192 C=24
info multipv 1 depth 24 seldepth 53 score cp 6 time 27876 nodes 714840424 nps 25643000 tbhits 0 hashfull 614
Houdini go depth 24 H=8192 C=32
info multipv 1 depth 24 seldepth 49 score cp 11 time 34623 nodes 1086256131 nps 31373000 tbhits 0 hashfull 816
-----------------------------------------
Houdini go depth 24 H=1024 C=24
info multipv 1 depth 24 seldepth 54 score cp 7 time 27100 nodes 710782395 nps 26228000 tbhits 0 hashfull 1000
Hash=1024 would again give higher nodes/sec., but I had a problem when I wanted to set 32 cores,
until I checked and saw this:
Houdini 4 Pro x64
(c) 2013 Robert Houdart
info string 48 processor(s) found, POPCNT available
info string NUMA configuration with 4 node(s), offset 0
info string 128 MB Hash
info string No valid license found
setoption name threads value 32
setoption name threads value 24
info string 24 threads used
setoption name threads value 32
I have a license for every Houdini version I own, but I had not thought that when I copy the engine to another computer
the license would no longer be valid.
Also, the line "info string 32 threads used" did not appear when I pressed Enter, so I restarted Houdini every time, and once in a while it
did print "info string 32 threads used"; then I did the run.
It was only after that that I thought something was not right, and I saw later that my copy of Houdini was no longer licensed.
So I can say that these results are maybe not valid, as I know Houdini will not play at full strength without a valid license.
Now I'm hoping that Robert will allow enough cores in his upcoming Houdini 5, as it has NUMA!
It will be interesting to test this further, and I think more people will have a system with more than 1 CPU..
So the chess programmers have to think about including NUMA support in their chess engines if we want to profit and gain more nodes/sec.
on our systems!!
Again a big thanks to William H. for letting me run these tests! It's a pleasure to meet you ;)
Thanks to the programmers for the info and compiles that gave me the chance to run these tests, even if I couldn't really use all these cores.
But now we know what is needed: NUMA!
Ipman.
PS: I wrote this little report after testing, so it's possible I will have to adjust some of my explanations later on..
A little video link to the system I was using: has been removed..