Sebastian's videos have been such an incredible inspiration for me! They sparked my passion for programming my very own chess engine. And now, seeing him in action, making it even more efficient, is truly motivating me to strive for improvement and enhance my skills. Thank you so much, Sebastian! You've made my day incredibly special. Sending lots of love and gratitude from India. ❤
18:24 tl;dr The probability that the pawn end table improves on the old code is 67.3%, and the probability it makes things worse is 32.7%. The win rate with the pawn end table is probably below 51% (it is unlikely to be a significant improvement). Using Laplace's rule of succession, the estimated win rate using the pawn end table is (wins+1) / (wins+1 + losses+1) = 366 / (366 + 354) = 50.83%. However, to get the probability that the pawn end table represents an improvement, we need to find the probability density function (pdf) for the win rate and then integrate over the region representing an improvement. The prior for the win rate x is uniform on the interval [0,1]: prior(x) = 1 for 0 ≤ x ≤ 1 and 0 otherwise. Multiplying that prior by the binomial likelihood gives a Beta(1+365, 1+353) posterior, and integrating it above x = 0.5 gives the 67.3% quoted above.
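The integral that comment sets up has a closed form, so the 67.3% can be checked in one line (a quick sketch, assuming scipy is available):

```python
from scipy.stats import beta

# P(win rate > 0.5) under the Beta(1+365, 1+353) posterior, draws ignored
print(beta.sf(0.5, 1 + 365, 1 + 353))  # ≈ 0.673
```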
It's always a good day when Sebastian posts a video. And always a terrible month afterwards, as I get inspired to start a similar project and inevitably fail.
I really enjoy coding various things/projects, and watching your videos really is enjoyable, as I can see how various things work, watch an effective planning process, and do all of this with pleasant visuals and audio. The editing and overall content of these videos is really great, and I can't wait for more videos. Thank you for making such great content!
18:24 You wanted to know if this is better or just random. Well, I calculated the P-value(*) using a two-tailed binomial test(**) for every version you tested:

Match | Wins vs Losses | P | Better?
V2b vs V1 | 409 vs 270 | 1.1e-7 | YES
V3 vs V2b | 399 vs 387 | 69% | Random
V3_64 vs V2b | 386 vs 246 | 2.8e-8 | YES
V4 vs V3_64 | 433 vs 237 | 3.3e-14 | YES
V5b vs V4 | 337 vs 273 | 1.0e-3 | YES
V6 vs V5b | 393 vs 310 | 1.0e-4 | YES
V6b vs V5b | 298 vs 309 | 68% | Random
V6c vs V5b | 385 vs 276 | 2.6e-5 | YES
V7 vs V6c | 365 vs 353 | 68% | Random
V8 vs V7 | 415 vs 325 | 1.1e-7 | YES
V9 vs V8 | 347 vs 367 | 48% | Random
V9b vs V8 | 396 vs 341 | 4.7% | Random (debatable)
V10 vs V9b | 447 vs 275 | 1.6e-10 | YES
V11 vs V10 | 477 vs 221 | 1.6e-22 | YES
V12 vs V11 | 397 vs 238 | 2.9e-10 | YES

The notation "2.8e-8" means "2.8*10^(-8)", i.e. odds of 28 in a billion.
(*) What does the P-value mean? It is the chance we would still get this result if the players were equally good. A lower P means the result is more likely not random.
(**) How do we calculate that? Using a two-tailed binomial test. (If you already know how that works, you can stop reading now.) Our null hypothesis (which we hope to debunk) is that both players are equally likely to win. (I ignored all draws, and looked only at the games with a winner.) The P-value is the likelihood that we would still get this result, or something less likely, if our null hypothesis is true. So for the 365 wins and 353 losses, the P-value is the chance of getting either 365 wins and 353 losses, 365 losses and 353 wins, 366 wins and 352 losses, and so on. This is a "two-tailed" binomial test, and you can look up the formula for calculating that on Wikipedia. A low P-value means that it is unlikely (but still possible) that we would get this result if the players are equal. I decided that anything with P far below 5% counts as a real improvement (YES), and anything around 5% or above is too close to call (Random).
Nice explanation :) It's interesting to see how different people deal with draws - I kept win/draw/loss-stats separate, some people counted them as 0.5 of a win, and you've left them out completely! I personally think it's better to include them somehow, because in the case of 6vs7, the wins slightly increased, draws significantly decreased, and the losses slightly increased! In fact from my calculations, the draw-stats were the only one of the three that *was* significant!
I'm a stats student, and I also ran some of these calculations when he mentioned it. If someone doesn't mind explaining, why use a two-tailed test (the proportion of wins != 0.5) instead of a right-tailed test (the proportion of wins > 0.5)?
@@soupo-sandwich one-tailed is more logical here. I also think using a 1-prop z test and interval would be more appropriate, but I guess they'd yield similar results
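Each row of that table can be reproduced with scipy's exact binomial test; a quick sketch (assuming scipy >= 1.7):

```python
from scipy.stats import binomtest

# e.g. the V7 vs V6c row: 365 wins, 353 losses, draws left out
res = binomtest(365, n=365 + 353, p=0.5, alternative='two-sided')
print(res.pvalue)  # ≈ 0.68
# alternative='greater' gives the one-tailed test discussed in the replies
```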
I've been coding for a while and worked on a few big projects, but I've never really relied on bit manipulation to keep track of and manipulate data. This video was amazing, and it greatly increased my appreciation for binary operations and bit manipulation. Even though I understood what it was before, I never really liked it, but now I'm definitely going to make a point to use it in my next project. Fantastic stuff as always, can't wait for the next one.
"makes more sense than En Passant at least" En Passant was added because moving pawns 2 squares was added to speed up the early game. Originally pawns could only move one square, so it was impossible for pawns to "jump past" each other to avoid capture. En Passant was added to allow for pawns to retain their same amount of control while keeping the speed of the early game fast.
One thing you should implement is pondering. Pondering is when the engine uses the opponent's move time to think about moves the opponent could make and the best reply to each. This can help with prepopulating your search tree with possible moves, and allows you to search deeper during your own time if the opponent does make that move.
I want to present like you do; you make it look a lot easier than it is, while explaining the key areas and skipping all the trivial things. And letting your audience have a small snicker now and then.
Awesome video. Very entertaining and informative at the same time. I thought 20 minutes have passed and then the whole video ended. Really enjoyed the explanation on every addition. Keep it up!!
I feel the most major improvement to make here is having CAB care a lot more about the safety of its king, and the vulnerability of the opposing king.
I love how, as soon as I decide I want to start working on a chess engine, youtube recommends me 2 videos of someone completely wrecking anything I could imagine doing in my entire life, in 2 months
Wait, a one hour sequel to the video that inspired me to write my own Chess engine and eventually became my Bachelor thesis? HELL YEAH!:D
That's awesome, is your thesis available online? I'd love to take a look :)
And in the deleted comment I also admitted to misspelling you in my thesis' introduction, which I am forever ashamed of.
Also this new video of yours is amazing and massively increased my motivation again to continue work on my engine as well!
@@RabbleRousy Haha don't worry about the misspelling at all, and I'm happy to hear the video has boosted your motivation!
@@SebastianLague CODING GAME, WORK ON NOW I BEG YOU IT WAS SO COOL
I'm not sure how much support I'll get in this particular comment section, but
What a nerd
Please don't apologize for a part 3, I'd love to see it!
for part 3 I think that engine is already strong enough that he needs to actually use some SPRTs for testing and not fixed games tests :) Somewhere there is where you need to start something that every "serious" engine uses as a development tool nowadays.
Waiting for it
an hour long ? We are blessed, thank you Sebastian
An hour long and links to another 40 min video that you probably watched ages ago but sure why not, lets have another look.. perfection
@@biocode4478 exactly
Ngl the fact that a chessboard is 8x8 is a happy little coincidence that makes chess programming easier, since the 64 squares fit exactly into a 64-bit integer
Ancestors knew what they were doing 😮
I'm coding Janggi (Korean chess) and the fact it's *not* 8x8 makes me jealous of how beautiful western chess code gets to be.
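To make the coincidence concrete: one 64-bit integer holds the whole board, with bit rank*8 + file standing for a square. A tiny illustrative snippet (not from the video):

```python
# White's starting pawns: one bit per square, index = rank * 8 + file
white_pawns = 0
for file in range(8):
    white_pawns |= 1 << (1 * 8 + file)   # all eight pawns sit on rank 2 (index 1)

print(f"{white_pawns:#018x}")            # 0x000000000000ff00
```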
50% of why I love watching these videos is the code. The other half is the fantastic storytelling along the way. Incredible video editing (animating code edits, animations between code/variable changes and the actual impact of those changes in the final product, statistic tracking and animating, good pacing throughout the video, no chapter feels slow or boring). And I could keep going for a while. I'm curious what kind of tools you use to do this editing. It's really awesome and I hope you keep making these kinds of videos for a very long time!
He did some sort of “making of” a while ago
@@lucbloom for some unknown reason I missed his previous video. Thank you for mentioning it, it explains a lot!
agree, know absolutely nothing about coding but still watched all of it, spectacular storytelling and calm voice
@@Vespura_ what about the cats tho xd
@@csehszlovakze clearly the best part
18:25 An easy way to find if the bot has improved is performing a sign test.
We model the number of wins for v7 as a binomial variable X~B(n,p) with number of games(n) = 1000, and hypothesise that probability of a win(p) = 0.5.
If we treat every 2 draws as one loss and one win, we can find the probability of 365 + 282/2 = 506 wins.
Finding the probability that X >= 506 gives us a p-value of 0.364.
At a significance level of 0.05, this means we cannot conclude that the v7 bot is better than v6c and would need a larger sample of games to be sure.
TLDR: we cannot say there is a significant improvement
If the number of games is 1000, np and np are both > 10. Thus, we can model it with a normal model with a mean of np = 500 and a stdev of sqrt(0.5*0.5*1000) ≈ 15.8, rather than a binomial
@@soliform3485 you mean np and nq ?
@@typoilu3413 Yes my bad
Great explanations!
I wasn't sure how to deal with draws, but I suppose treating a draw as 0.5 of a win _kind of_ works? I just kept them separate :) I don't know whether equating a draw to a 50/50 win/loss really works tho? Surely draws should be their own 3rd category which is either left out or treated separately?
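For anyone wanting to reproduce the 0.364 above exactly, a quick scipy check (assuming scipy is installed):

```python
from scipy.stats import binom

wins, draws, losses = 365, 282, 353
n = wins + draws + losses           # 1000 games
score = wins + draws // 2           # every 2 draws = one win, one loss: 506
print(binom.sf(score - 1, n, 0.5))  # P(X >= 506) under p = 0.5, ≈ 0.364
```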
The method to calculate the probability of one engine being better than another based on game outcomes is called the Sequential Probability Ratio Test (SPRT). The CLI-based match runners that the chess programming community use to test their engines have this capability built-in, and can automatically run a test until the probability of improvement exceeds some desired bound. The most popular one is called cutechess-cli, which works with UCI chess engines
I agree that the obvious next step for this engine is to get a better idea of its rating with CCRL/cutechess-cli. Sadly SPRT can be slow unless your hardware is top-shelf. This is one of the reasons that I haven't been able to use it to improve my own engine (Homura).
@@redbedhed I wouldn't say that, the time controls for SPRT are much faster than what he is doing and multithreading is a flat N speedup in the number of cores. I have 6 logical cores and a sub-1000 game SPRT is reasonable to wait for, I can test one or two of those changes a day. Once the patches start gaining less (~10 elo) it starts taking too long though
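For the curious, the core of a plain Wald SPRT fits in a few lines. This is a simplified sketch on wins/losses only; real testers like cutechess-cli use an Elo-parameterised version that also models draws:

```python
from math import log

def sprt_llr(wins, losses, p0=0.50, p1=0.55):
    """Log-likelihood ratio of H1 (win rate p1) vs H0 (win rate p0)."""
    return wins * log(p1 / p0) + losses * log((1 - p1) / (1 - p0))

# Stopping bounds for 5% false-positive and false-negative rates
alpha = beta = 0.05
upper, lower = log((1 - beta) / alpha), log(beta / (1 - alpha))

llr = sprt_llr(365, 353)
verdict = ("H1 (better)" if llr >= upper
           else "H0 (not better)" if llr <= lower
           else "keep playing")
print(llr, verdict)  # the 365-353 result lands between the bounds
```

In a real test you would update the LLR after every game and stop as soon as it crosses either bound, which is what saves so many games compared to a fixed 1000-game match.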
Interesting. I would have just assumed that the wins are binomially distributed.
Then I could use some calculator to get the CDF for at least that many wins out of (num wins + num losses) trials assuming that the probability to win is 0.5.
That would give the p-value.
This one really deserves the title "Adventure" and is exactly what you need when you can't sleep at night because you keep thinking about your terrible maths exam from 2 days ago. Thank you :D
Problem is, mine is tomorrow 😂
@@richardueltzen3755 hope it goes well
Blame on you, Sebastian! I went down the rabbit hole of a chess engine from scratch of my own. I made bitboards, alpha-beta search, NN evaluation, Monte Carlo tree search and many other micro elements retrieved here and there as ideas. I wrote the code from zero 3 times, changing programming language as well as approach. After two months I am dreaming of bishops moving only on diagonal bits (don't know, in the dream a "diagonal bit" made total sense). In fact it has been a huge journey, and I am happy I had the stamina to overcome the several difficulties; the final result is definitely mediocre but totally mine, and this is enough. You started the interest in the topic I had since I was a child as a personal programming challenge. Very nice job pal. Thanks a lot
Been working on mine on and off for a few months now, just started on version 4. How did you get your magic numbers? My generator can't seem to find a single one: hours of randomly generating candidates with a specified number of 1-bits (or a random number of them), hundreds of millions of attempts, and not a single valid one.
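Not the video's code, but a minimal pure-Python sketch of the usual search: sparse random candidates, plus a collision test that allows constructive collisions (two blocker sets sharing an index is fine as long as they produce the same attack set; rejecting those makes valid magics vastly rarer, and is a classic reason a generator never finds one):

```python
import random

def rook_mask(sq):
    """Relevant-occupancy mask for a rook: its rays, edge squares excluded."""
    r, f = divmod(sq, 8)
    mask = 0
    for nr in range(r + 1, 7): mask |= 1 << (nr * 8 + f)
    for nr in range(1, r):     mask |= 1 << (nr * 8 + f)
    for nf in range(f + 1, 7): mask |= 1 << (r * 8 + nf)
    for nf in range(1, f):     mask |= 1 << (r * 8 + nf)
    return mask

def rook_attacks(sq, blockers):
    """Slow ray-walk attack generation, used as ground truth."""
    r, f = divmod(sq, 8)
    attacks = 0
    for dr, df in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nf = r + dr, f + df
        while 0 <= nr <= 7 and 0 <= nf <= 7:
            attacks |= 1 << (nr * 8 + nf)
            if blockers & (1 << (nr * 8 + nf)): break
            nr, nf = nr + dr, nf + df
    return attacks

def find_magic(sq, tries=1_000_000):
    mask = rook_mask(sq)
    bits = bin(mask).count("1")
    cases, sub = [], 0
    while True:                      # Carry-Rippler: every subset of the mask
        cases.append((sub, rook_attacks(sq, sub)))
        sub = (sub - mask) & mask
        if sub == 0: break
    for _ in range(tries):
        # Sparse candidates (few set bits) succeed far more often than uniform
        magic = random.getrandbits(64) & random.getrandbits(64) & random.getrandbits(64)
        table = {}
        for blockers, attacks in cases:
            idx = ((blockers * magic) & 0xFFFFFFFFFFFFFFFF) >> (64 - bits)
            if table.setdefault(idx, attacks) != attacks:
                break                # destructive collision: reject candidate
        else:
            return magic
    return None

print(hex(find_magic(0)))  # a magic for a1 (slow in pure Python, fast in C/C#)
```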
"I think this move is pretty good but... *considers alternatives you know what? f it. *blunders queen.
The "theoretical" best computer approach is to play out all possible continuations from a position to game completion, and choose the best continuing move. Humans can't do this, so we make generalizations instead. Having more pieces than our opponent will generally give us a greater ability to win, as will pushing passed pawns, delivering check, mobilizing our king in the end game, etc. Of course computers can't execute that "theoretical" best approach either, so (as represented in this video) we teach bots to use the generalizations that humans have come up with. I've sometimes wondered what it might look like to get computers to come up with generalizations of their own. Not only could this result in stronger bots, but it might create opportunities for bots to teach us strategies.
Enter, neural networks.
Not actually how modern engines work. They calculate promising lines very deeply, not every possible line
game tree is so massive engines will stop after a mediocre depth; pruning is very important otherwise engines will be extremely ineffective
What impresses me the most is how dedicated you are to tackle new ideas and then return later to improve them. Usually when I start a personal project I put a lot of time into it and then kind of abandon it and move on to something else
You are a true inspiration
I look forward to Chess Part 3!
Also, this is such a SPECTACULARLY good explanation of so many difficult concepts of chess engine design. The magic bitboards part especially. Well friggin' done!
You know today is going to be a good day if Sebastian uploads a new video 😊
Bruh, I was about to comment that 😂😂
Fax no prinrert
True
True, make me learn about stuffs
I agree
An hour long coding adventure featuring cats, that somehow makes me feel both relatively inadequate at coding but also motivates me to 'go do stuff!'... My day couldn't get any better!
Honestly thank you for these videos, the animation work is so clean and you can tell how much effort goes in
Can't wait for part 3!
Always a pleasure to watch another Coding Adventure.
RUclips needs more creators like you. This video was very helpful with helping me sleep. But I've heard it so many times that by now I know all the concepts you talk about by heart.
I am a computer science professor, and I teach heuristic search and Game AI each year. I am blown away by your animations and your detailed explanation of these concepts. I wish I had the time per lecture and video production skill to be able to make content like this for students.
27:51 - Sometimes adding calculations to the evaluation which theoretically improve things can cause your evaluation to be a bit slower, meaning fewer nodes / depth searched. There's no exact science to this, and this is why your evaluation is so important
28:16 - You can eliminate some of the conditionals that you have here by doing clever tricks like saying numIsolatedPawns += 1 - (int)Math.Min(1UL, friendlyPawnsBitboard & Bits.adjacentFilemask[pawnFile]). Doing the math directly on the result rather than using an if-statement helps eliminate slowdown due to branch misprediction. In practice this may not actually make the program any stronger, and the compiler may be smart enough to do this for you anyway, but it's something to think about.
Final comment: Did you tie-break scores with the number of moves performed? I did not see this anywhere in the video, but you definitely want to be taking 'quicker' checkmates than later ones due to the incomplete nature of the search.
do you consider your teaching subject harder than average?
The source code for the first video includes mate pruning, so he is looking for quicker mates
do you have a chess engine? If so, I'd love to read it! :)
@@Wariowa345 I can imagine that teaching search is difficult but very necessary... Many problems in AI are search problems.
The basic search algorithms are not too difficult to grasp... Depth-first, breadth-first, best-first, uniform-cost, A*, backtracking depth-first, minimax, alpha-beta, mcts.
However, customizing any one of these basic algorithms to make decisions in a particular environment (especially a stochastic environment [not computer chess] with multiple agents)... That's where things get extremely challenging.
This video, while excellent, lacks quite a bit of detail and leaves out quite a few chess programming techniques. Programming a computer to play chess at super-human level is a feat that often takes many years to achieve (assuming you aren't just cloning stockfish and changing a few lines).
Good hardware can speed the process up quite a bit. As others have mentioned in this thread, SPRT is the only way to really test if your engine has improved. SPRT tests often take many thousands of games to do this. If you can run games in parallel, the SPRT test will finish much faster. Good hardware is also needed to train neural networks in a timely manner.
I'm personally about to upgrade from a cheap laptop to a desktop (now that I'm no longer living on student loans lol).
My English is limited and this video is so fabulous. Thank you Sebastian! This programming + chess content is really interesting
Currently writing a study about coding my own chess engine. Your video was the first and most impactful one I watched to inspire myself for my study. And now I get to have part 2?! Thank you very much Sebastian!
Holy, I just finished watching the video and damn, I was hoping for copper but found a pile of gold!
Thank you for recommending that 30 chess bots suckerpinch video! It's really great for people who enjoy similar content and one of the most under-viewed videos on youtube. These two chess bot videos of yours have been the only things to scratch that same lovely itch for me, so thank you even more for making such amazing videos like these.
I think that might be my favorite video on youtube.
all this makes me realise just how terrifying stockfish is
holy fook.
*rook
i'm sure you've heard by now how amazing your work is. what i want to point out though, is how remarkable your patience is. Sticking through gigantic projects like these and keeping on through all the bugs and edge cases is an amazing feat!
At 31:58 this sounds like something that could utilize the "pdep" assembly instruction available on most modern x86-64 processors (part of the BMI2 extension). There's a C# binding for the instruction called ParallelBitDeposit. The purpose of the instruction is to take a 64 bit value you have, and a 64 bit mask representing the locations you want the bits from that value to be distributed over, and it deposits the bits from the value into the spots defined in the mask. Being able to do that in a single instruction sounds like it could heavily speed up that part of the application.
Oh nice, I wasn’t aware of that! That part of the calculation is computed ahead of time, so the speed doesn’t really matter, but still good to know.
There's also "pext", which can extract bits in the other direction. It can replace the magic bitboard calculation, provided a cpu supports the instruction.
@@user-dh8oi2mk4f are you saying that INTERCAL predicted the performance needs of chess engines decades in advance with its select operator?
your pfp is adorable!
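For anyone unfamiliar with the instructions, here is what they compute, as a slow pure-Python illustration (real engines use the single-instruction BMI2 versions):

```python
def pext(value, mask):
    """Gather the bits of `value` selected by `mask` into the low bits."""
    out, bit = 0, 0
    while mask:
        low = mask & -mask          # lowest set bit of the mask
        if value & low:
            out |= 1 << bit
        bit += 1
        mask &= mask - 1            # clear that bit
    return out

def pdep(value, mask):
    """Scatter the low bits of `value` into the positions set in `mask`."""
    out, bit = 0, 0
    while mask:
        low = mask & -mask
        if value & (1 << bit):
            out |= low
        bit += 1
        mask &= mask - 1
    return out

# pext can replace the (blockers * magic) >> shift index calculation:
# index = pext(occupancy, relevant_mask)
```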
This is really cool! I've been a chess nerd all my life so seeing this is like magic to me. Separately, do you have plans to continue working on your explaining computers series? And, if so I'd appreciate it if you would share with us! I really think that series is incredible. Thanks!
Hey, I’m happy you enjoyed the video! And yes, the computer series will definitely be back in the future - there’s still so much I want to explore there.
Hello, profile picture twin!
@@LimitedWard slightly different but thats cool
@@SebastianLague I highly recommend looking into neural networks and how they can be applied into your engine. It will probably be a lot of work and very very complex but I highly recommend it
@@mod6870 complete rewrite. I'm not saying its not worth it but it'd be starting over from scratch
18:24 The framing “probability that the new version is better than the old version” calls for a Bayesian approach. You could use a conjugate Dirichlet model with a weak flat prior to get a Dir(1 + 365, 1 + 282, 1 + 353) posterior distribution for the probability of Win/Draw/Loss. Defining “better” as “wins more than loses,” you can sample from the posterior and compute how often the probability of winning is greater than the probability of losing. With these numbers you get 67% so roughly 2:1 odds for V7 against V6c.
I integrated the binomial formula w.r.t. p and also got 67%
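A minimal sketch of that posterior check, assuming NumPy is available:

```python
import numpy as np

rng = np.random.default_rng(0)
# Posterior over (win, draw, loss) probabilities: flat prior + observed counts
posterior = rng.dirichlet([1 + 365, 1 + 282, 1 + 353], size=1_000_000)
p_better = (posterior[:, 0] > posterior[:, 2]).mean()
print(p_better)  # ≈ 0.67
```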
I don't often leave a comment but I just have to here.
Wow. Gosh I love watching this stuff. Watching a Sebastian Lague video is a fun activity all by itself. Thank you so much for the 1 hour long content.
I was waiting for this video!!! Thank you!!!
I would love a part 3! Also I loved how long this one was :)
50:22 Apology accepted my man. The first chess coding adventure was my favorite video of yours, I now give you explicit permission to make a third in the series.
10:48 this is such a good reaction gif. ori is clearly an excellent performer. she has mastered the art of less is more
I love your vids, I’ve been subbed for over a year and every single video I’ve enjoyed!
Love this video series. I did a chess game for my A level comp sci project (A level is a qualification taken at 18 in UK). Spent hours watching and rewatching your first vid as it was released right in the middle of that time. This vid was awesome too! Would love to see more in the future :)
Your release of the video certainly gave it a boost in rankings :D
best series ever, the storyline is perfect
I watched what is now part one last week as it is one of my favorite RUclips videos, how great that it now has a sequel!
This is an awesome adventure through the associated code! I really appreciate the cleverness of the use of bitboards to represent various aspects of game state, and the various ways you show of working with them through mathematical operations that stand in for more conventional approaches such as iteration loops. One thought that crossed my mind repeatedly was taking all the metrics you employ, turning them into inputs for a neural net, and utilizing machine learning to let it experiment with various combinations of weights, to arrive at optimal values. I realize this is much, much easier said than done, since it amounts to many dozens of inputs. Anyway, great video, as always!
Yeah, it seems really hard to get anywhere close optimal values just tweaking all those weights by hand, so I think some sort of automated tuning process (using neural networks or otherwise) would definitely be a good idea. I’m happy you enjoyed the video btw, thank you!
@@SebastianLague You could try reinforcement learning, it's not as heavy a subject as machine learning and I feel like it would fit well with your explanation style and how you bounce back and forth between code and examples
ERRATA: In this comment I used the phrase "machine learning" as opposed to reinforcement learning. As my peers below have repeatedly pointed out, reinforcement learning is part of machine learning; using "machine learning" to refer to neural networks specifically, as was my intention, was unduly imprecise and contributed to misinformation on this topic
@@Imperial_Squid you do realize that reinforcement learning is a variety of machine learning, right?
@@DanKaschel yeah of course, but when people talk about machine learning they most commonly mean _neural networks;_ that's the distinction I was drawing
I'm very aware ML is a hugely broad topic referring to anything from RL, NNs, deep learning, neural architecture search, etc etc (I'm a data scientist by trade and currently doing a PhD on multitasking neural networks so I've touched all of these topics to some extent!)
Came here to say this :}
As an applied mathematician, I can say something about 18:30. We can't precisely determine the probability of how much one bot is better than the other. However, we can estimate that one bot is better than the other with a certain level of confidence, for instance 95%, using statistical methods such as confidence intervals. (Or a better statistician can =) )
Me and ChatGPT of course:
To understand which chess bot performs better using statistics, given wins, draws, and losses, we can apply the concept of confidence intervals for the proportion of wins. A confidence interval gives an estimated range of values that is likely to include an unknown population parameter, in this case, the true winning proportion.
In your example, you have two bots that played 1000 games against each other. The first bot has 365 wins, 282 draws, and 353 losses. The winning rate p is estimated as p = x/n = 365/1000 = 0.365. We can use the Wilson score interval, a type of confidence interval, to estimate the range where the true winning rate lies, with a certain level of confidence.
The Wilson score interval for a binomial proportion (like the winning rate) is given by:
L = (2np + z^2 - z*sqrt(z^2 + 4np(1-p))) / (2*(n+z^2))
U = (2np + z^2 + z*sqrt(z^2 + 4np(1-p))) / (2*(n+z^2))
where p is the observed winning rate, n is the number of games, and z is the z-score which corresponds to the desired confidence level (e.g., for a 95% confidence level, z = 1.96).
Using these formulas for your data, the 95% confidence interval for the win rate of the first bot is approximately [0.336, 0.395].
For the second bot, which won 353 games, had 282 draws, and lost 365 games, the win rate would be p = x/n = 353/1000 = 0.353, and the 95% confidence interval is approximately [0.324, 0.383].
Notice that these confidence intervals overlap, suggesting that based on these results, there isn't a statistically significant difference in the performance of the two bots.
For there to be a statistically significant difference in performance, the confidence intervals of the two bots would need to not overlap. For example, if bot A won 400 games out of 1000, its 95% confidence interval would be roughly [0.370, 0.431], and if bot B won 320 games out of 1000, its interval would be roughly [0.292, 0.350]. Because those intervals do not overlap, we could assert with 95% confidence that bot A wins more frequently than bot B.
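To double-check those Wilson intervals, a small standard-library Python helper:

```python
from math import sqrt

def wilson_interval(wins, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    p = wins / n
    centre = (n * p + z * z / 2) / (n + z * z)
    half = z * sqrt(n * p * (1 - p) + z * z / 4) / (n + z * z)
    return centre - half, centre + half

print(wilson_interval(365, 1000))  # ≈ (0.336, 0.395)
print(wilson_interval(353, 1000))  # ≈ (0.324, 0.383)
```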
This is so so good as always, hoping you will make a part 3!
A random sample taken from {0, ½, 1} with equal probability of each has a mean of ½ and variance ⅙.
The sum of 1000 trials has mean 500 and variance 166⅔ (standard deviation 12.91).
Take your total score, subtract 500, and divide the result by 12.91 to get a z-score. Take that result and plug it into a p-value calculator online.
That is the probability that two equal engines would get a result at least that extreme. It's something like (but technically not the same as) the probability that one engine is better or worse than another engine.
EDIT: Using your observed results from 3:02, a better standard deviation to use would be 13.247.
The result at 18:35 gives a z-score of 0.4529 and a p-value of 0.6506, which is not at all statistically significant.
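The same recipe as a few lines of Python (using the adjusted standard deviation of 13.247 from the edit above):

```python
from math import erf, sqrt

wins, draws, losses = 365, 282, 353
score = wins + 0.5 * draws                   # total score: 506.0
sd = 13.247                                  # adjusted for the observed draw rate
z = (score - 500) / sd                       # ≈ 0.4529
p = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))   # two-sided normal p-value
print(z, p)                                  # ≈ 0.4529, ≈ 0.6506
```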
Your last video was super entertaining and informative... saw part 2 and I immediately clicked! Thanks for the vid :)
I love this
some months ago, inspired by the first video, I also decided to give writing my own engine a try.
you have no idea how much I needed that boost of confidence to keep going and know that I'm going in the right direction.
when I have some time and a stable build I will definitely make our engines play a game and let you know the result.
you are by far my favourite (coding) content creator and I hope you never stop uploading once in a while.
Yay, you're back! I really love watching your videos, it's always a surprise when a new one is released.
I have no clue what he’s saying but it’s nice to have this on in the background while I work on my own much less advanced coding adventures.
I have been working on a chess engine myself in Rust. I had played chess for a few years, but only your last video in this series made me excited to start. Great stuff!
Someone mentioned your channel in passing and I've been having such a good time watching your videos. Seeing you tackle these problems and clearly explain each approach with the code on-screen really helps make the whole thing feel more approachable.
I never thought about that, but I think you're my favorite RUclipsr among the hundreds I've watched.
1) Inspiration for coding
2) So much I learned
3) Such eye-pleasing graphics
Wow! Keep up the work 👍
Thank you for introducing me to Wicked Cinema, that music is outstanding
18:33 Statistician here:
As a rule of thumb, with 1000 observations, estimating a proportion of 50% has about 3% uncertainty up and down, so getting 500/500 means the underlying winning chance is between 47% and 53%. So any difference smaller than 30 more or less, I would disregard.
More precisely, and if we ignore draws, we have in this case a winning chance of somewhere between 47.18% and 54.48% so we are undecided for sure! (This is with 95% Confidence Interval).
In Python you can get this with `scipy.stats.beta.ppf([.025, .975], 1 + 365, 1 + 353)`
OMG, Sebastian. Even your code editor color theme is aesthetically pleasing.
I was thinking that XD
do you know what that theme is called?
@@unnamedbot2960 palenight on vs code
I don't know why, but this chess coding series is really inspiring to me. I look forward to the next vid :)
I really like this video, and as someone who also created a chess bot (much weaker) I know to appreciate how much work you put into this video.
I also want to add that the bot isn't overrated at all. I am 2600 in bullet and blitz and played some bullets against it and it outplayed me most of the time.
By the way I couldn't help but notice that you are also a decent chess player by your way of thinking
It’s not extremely overrated when playing against humans at tight time controls, no. But the only way to get an accurate rating for a chess engine is to pit it against other chess engines for many thousands of games.
I’m subscribed to a few hundred different RUclips channels, some inactive and others active.
Yours is the only one I have notifications set up, so I get notified when you upload a new video.
I absolutely love watching your videos. They are paced and explained to perfection, and you tackle some really interesting subjects.
By far my favorite RUclips channel.
Thank you!
Another simple improvement would be to use the “allowed moves” bitboards in your evaluation. You can combine it with the square scores as well and add all of the possible moves weighted by the square score. This would solve your “rook stuck in corner behind bishop” problems!
Very cool video!
great idea to promote opening up pieces
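Something like this rough sketch; `square_score` and the moves bitboard are hypothetical stand-ins for whatever the engine actually uses:

```python
def mobility_bonus(moves_bitboard, square_score):
    """Sum the square score of every square a piece can move to."""
    bonus, bb = 0, moves_bitboard
    while bb:
        sq = (bb & -bb).bit_length() - 1   # index of the lowest set bit
        bonus += square_score[sq]
        bb &= bb - 1                       # clear that bit
    return bonus
```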
oh wow, it's wonderful to see you coding on my home turf, ml & python!
50:25 "So I may need to subject you all to a Chess Part 3 in the future."
YES! My body is ready.
Fantastic video as always Sebastian! I've watched Tom7's video several times so it's awesome to see it called out!
wasn't expecting a second part of this. the first chess video of yours was already my favorite video and it was very detailed. i can't even imagine what's in this video :)
this is amazing, top 10 videos on youtube for sure.
I was really hoping you'd come back to this! Your previous video inspired me to start working on my own chess engine, which after many iterations is now the project I'm most proud of!
Waited for this for so long
It looks like you need to add king safety to the evaluation as well; it doesn't seem to be penalizing open space near the king. Great video btw, never knew where the entire hour went!
This was such a great sequel to my favorite video of yours! Your videos are so inspiring, keep it up :)
18:31 Not a statistician, but I think I can calculate the P-value (the chance that the test would have given the exact same results, or results even less evenly distributed, assuming winning and losing both have a probability of 0.5) anyways. I assume the games that are drawn were never played; I'm pretty sure that's an assumption you're allowed to make.
Before finding the P-value, I first want to explain how you would find the chance that exactly 365 out of 718 are won. This question is the same as asking 'suppose you threw a fair coin 718 times. What are the chances that it lands heads exactly 365 times?'. If you throw a fair coin 718 times, there are 2^718 possible sequences. Out of those sequences, 718! / ( 365! * (718-365)! ) contain heads exactly 365 times. Let me explain:
"
suppose you have 718 tickets, each with a unique value between 1 and 718. If you lay them down in a random order, there are 718! different possible sequences. If instead you paint over 365 tickets with yellow paint (analogous to getting heads in the coin analogy, or winning in the actual problem) before laying out the sequence, the number of different sequences becomes 718! / 365!, since for every new way of laying out a sequence, the yellow tickets would otherwise have had 365! different orderings. If you paint over the other 353 tickets with blue, the number of different sequences becomes 718! / 365! / 353!
"
Thus, the chance to get heads exactly 365 times is 718! / ( 365! * (718-365)! ) / 2^718, which roughly equals 0.0269.
Now, to calculate the P-value, we add up ((the chance to win 365 times out of 718) + (the chance to win 366 times out of 718) + ... + (the chance to win 718 times out of 718)) + ((the chance to lose 365 times out of 718) + (the chance to lose 366 times out of 718) + ... + (the chance to lose 718 times out of 718)) = 2 * ((the chance to win 365 times out of 718) + ... + (the chance to win 718 times out of 718)), which roughly equals 0.68. That is far too high to conclude that the newer version's win rate differs from 50%.
Let me know if anyone has any questions!
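For the curious, the whole calculation fits in a few lines of Python with exact integer arithmetic (no scipy needed; numbers from the comment above):
```python
from math import comb

n, wins = 718, 365  # draws excluded, as above

# Chance of exactly `wins` heads in n fair coin flips:
p_exact = comb(n, wins) / 2**n
print(f"P(exactly {wins} wins) = {p_exact:.4f}")  # ~0.0269

# Two-tailed P-value: by symmetry, 2 * P(X >= 365) for X ~ Binomial(718, 0.5)
p_value = 2 * sum(comb(n, k) for k in range(wins, n + 1)) / 2**n
print(f"P-value = {p_value:.2f}")                 # ~0.68
```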
my weekend was hitting a real low spot but sebastian lague's video just came up and saved the day :)
If a coding adventure comes out, you know you have to binge watch it immediately 😂
Please make another episode!! (Btw liked the length of the episode)
A sequel to my favourite coding adventure video!
You never disappoint
I've watched that first video along with that tom7 video countless times, and I'm so excited to see a sequel!
Very cool of you to narrate these experiments your cat made!
Watched the video, awesome as ever. I didn't even realize it was an hour long!
You've a very powerful talent Sebastian. I'm absolutely uninterested in chess or board games in general. Yet, when you make a video like this, my fingers start to itch to create my own chess engine. There are very few other videos as inspiring and motivating as yours. I hope you are aware of the impact your videos have. Thanks so much for making these.
every time you post i get immeasurably excited
Love these videos! It was very satisfying to watch it slowly get better and better 😄
I always look forward to your videos man.. This is a treat
I love this video concept so much!!
As a chess player and evolving programmer, you have officially just made my day!
btw PLEASE make a third video, this is so enjoyable to watch...
A follow up on one of the best coding videos. Thank you for your hard work and effort
i live for your cat b roll footage
oh my god i've been waiting for a followup for so long, this has made my day. you even inspired me to make my own chess engine, thank you!!
Man, your videos are production quality, and your narrations are well written and structured. Not sure why this isn't on MasterClass. Please do keep up the good work.
I really like your "code improvement" videos, you learn so much just from watching the first version, but then wait! There is more!
Sebastian's videos have been such an incredible inspiration for me! They sparked my passion for programming my very own chess engine. And now, seeing him in action, making it even more efficient, is truly motivating me to strive for improvement and enhance my skills. Thank you so much, Sebastian! You've made my day incredibly special. Sending lots of love and gratitude from India. ❤
18:24 tl;dr The probability that the pawn end table improves on the old code is 67.3%, and the probability it makes things worse is 32.7%. The win rate with the pawn end table is probably below 51% (it is unlikely to be a significant improvement).
Using Laplace's rule of succession, the estimated win rate using the pawn end table is
(wins+1) / (wins+1 + losses+1)
= 366 / (366 + 354)
= 50.83%
However, to get the probability that the pawn end table represents an improvement, we need to find the probability density function (pdf) for the win rate and then integrate over the region representing an improvement.
The prior for the win rate x is uniformly distributed on the interval [0,1]
prior(x) = 1 if 0 ≤ x ≤ 1, and 0 otherwise. After observing 365 wins and 353 losses, the posterior is proportional to x^365 * (1-x)^353, i.e. the Beta(366, 354) distribution; integrating it from 0.5 to 1 gives the 67.3% quoted above.
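That final integral is one line of scipy (a sketch; the 365/353 numbers come from the comment above):
```python
from scipy import stats

wins, losses = 365, 353
# Posterior win rate is Beta(wins + 1, losses + 1) under a uniform prior;
# the probability of a genuine improvement is the mass above 0.5:
p_better = 1 - stats.beta.cdf(0.5, wins + 1, losses + 1)
print(f"P(win rate > 50%) = {p_better:.1%}")  # ~67.3%
```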
It's always a good day when Sebastian posts a video. And always a terrible month afterwards, as I get inspired to start a similar project and inevitably fail.
I really enjoy coding various things/projects, and watching your videos really is enjoyable, as I can see how various things work, watch an effective planning process, and do all of this with pleasant visuals and audio. The editing and overall content of these videos is really great, and I can't wait for more. Thank you for making such great content!
18:24 You wanted to know if this is better or just random. Well, I calculated the P-value(*) using a two-tailed Binomial test(**) for every version you tested:
Match        | Wins vs Losses | P       | Better?
V2b vs V1    | 409 vs 270     | 1.1e-7  | YES
V3 vs V2b    | 399 vs 387     | 69%     | Random
V3_64 vs V2b | 386 vs 246     | 2.8e-8  | YES
V4 vs V3_64  | 433 vs 237     | 3.3e-14 | YES
V5b vs V4    | 337 vs 273     | 1.0e-3  | YES
V6 vs V5b    | 393 vs 310     | 1.0e-4  | YES
V6b vs V5b   | 298 vs 309     | 68%     | Random
V6c vs V5b   | 385 vs 276     | 2.6e-5  | YES
V7 vs V6c    | 365 vs 353     | 68%     | Random
V8 vs V7     | 415 vs 325     | 1.1e-7  | YES
V9 vs V8     | 347 vs 367     | 48%     | Random
V9b vs V8    | 396 vs 341     | 4.7%    | Random (debatable)
V10 vs V9b   | 447 vs 275     | 1.6e-10 | YES
V11 vs V10   | 477 vs 221     | 1.6e-22 | YES
V12 vs V11   | 397 vs 238     | 2.9e-10 | YES
The notation "2.8e-8" means "2.8*10^(-8)" that is 28 in a billion odds,
(*) What does the P-value mean?
It is the chance we would still get this result if the players were equally good. A lower P means the result is less likely to be random.
(**) How do we calculate that?
Using a two-tailed binomial test (If you already know how that works you can stop reading now)
Our null hypothesis (which we hope to debunk) is that both players are equally likely to win. (I ignored all draws, and looked only at the games with a winner)
The P-value is the likelihood that we would still get this result, or something less likely, if our null hypothesis is true. So for the 365 wins and 353 losses, the P-value is the chance of getting either 365 wins and 353 losses, 365 losses and 353 wins, 366 wins and 352 losses, and so on.
This is a "two-tailed" binomial test, and you can look up the formula for calculating that on Wikipedia.
A low P-value means that it is unlikely (but still possible) that we would get this result if the players are equal. I decided that if P < 1%, the result counts as non-random, which is why the 4.7% case is only marked as debatable.
Nice explanation :) It's interesting to see how different people deal with draws - I kept win/draw/loss-stats separate, some people counted them as 0.5 of a win, and you've left them out completely! I personally think it's better to include them somehow, because in the case of 6vs7, the wins slightly increased, draws significantly decreased, and the losses slightly increased! In fact from my calculations, the draw-stats were the only one of the three that *was* significant!
I also got 68% as P-value for the V6b vs V5b matchup
hi
i'm a stats student, and i also ran some of these calculations when he mentioned it. if someone doesn't mind explaining, why use a two-tailed test (the proportion of wins != 0.5) instead of a right-tailed test (the proportion of wins > 0.5)?
@@soupo-sandwich one-tailed is more logical here. i also think using a 1-prop z test and interval would be more appropriate but i guess they'd yield similar results
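For anyone who wants to reproduce the table without doing the sums by hand, scipy's built-in binomial test covers both variants discussed here (a sketch; the 365/353 row is V7 vs V6c from above):
```python
from scipy import stats

# V7 vs V6c: 365 wins, 353 losses, draws ignored
two_tailed = stats.binomtest(365, n=365 + 353, p=0.5, alternative='two-sided')
one_tailed = stats.binomtest(365, n=365 + 353, p=0.5, alternative='greater')
print(f"two-tailed p = {two_tailed.pvalue:.2f}")  # ~0.68 -> "Random"
print(f"one-tailed p = {one_tailed.pvalue:.2f}")  # ~0.34
```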
I've been coding for a while and worked on a few big projects, but I've never really relied on bit manipulation to keep track of and manipulate data. This video was amazing, and it greatly increased my appreciation for binary operations and bit manipulation. Even though I understood what it was before I never really liked it, but now I'm definitely going to make a point to use it in my next project. Fantastic stuff as always, can't wait for the next one.
"makes more sense than En Passant at least"
En Passant was added because moving pawns 2 squares was added to speed up the early game.
Originally pawns could only move one square, so it was impossible for pawns to "jump past" each other to avoid capture. En Passant was added to let pawns retain the same amount of control while keeping the early game fast.
One thing you should implement is pondering.
Pondering is when the engine uses the opponent's move time to think about the moves the opponent could make, and the best reply to each.
This helps prepopulate your search tree with likely positions, and lets you search deeper on your own time if the opponent does play the move you predicted.
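In outline it could look something like this (a minimal, self-contained sketch; `toy_search` stands in for a real engine search and is not from the video's code):
```python
import threading
import time

def toy_search(position, stop):
    """Placeholder search: deepens until told to stop. A real engine would
    fill its transposition table here, checking `stop` between iterations."""
    depth = 0
    while not stop.is_set() and depth < 64:
        time.sleep(0.01)  # pretend to search one ply deeper
        depth += 1
    return depth

def start_pondering(predicted_position):
    """Think on the opponent's clock by searching the position we expect."""
    stop, result = threading.Event(), {}
    thread = threading.Thread(
        target=lambda: result.update(depth=toy_search(predicted_position, stop)))
    thread.start()
    return stop, thread, result

# Usage: on a "ponder hit" the accumulated work is reused; on a miss,
# stop the thread and search the actual position from scratch.
stop, thread, result = start_pondering("position-after-predicted-move")
time.sleep(0.1)            # ...opponent is thinking...
stop.set(); thread.join()  # opponent moved; stop pondering
print("pondered to depth", result["depth"])
```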
i want a new episode of how computers work so bad! i will do anything
I want to present like you do; you make it look a lot easier than it is, while explaining the key areas and skipping all the trivial things. And letting your audience have a small snicker now and then.
I know literally nothing about coding, but I'm still watching all the way through bc this guy makes it entertaining
Awesome video. Very entertaining and informative at the same time. I thought 20 minutes had passed, and then the whole video ended. Really enjoyed the explanation of every addition. Keep it up!!
I feel the most major improvement to make here is having CAB care a lot more about the safety of its king, and the vulnerability of the opposing king.
Wow! I've rewatched the first part many times and now it's finally time for the sequel. Good job, I can't wait for the next part!!
An interesting challenge would be to make a non-random bad bot that feels like a human beginner.
Maybe give it a very low depth, or, if it finds a "killing move", a high probability that it won't play it.
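A toy version of that idea (the move strings and the blunder rate are made up for illustration):
```python
import random

def beginner_choice(legal_moves, best_move, blunder_rate=0.3):
    """With probability `blunder_rate`, ignore the engine's best move and
    play a random legal move instead, imitating a human beginner."""
    if random.random() < blunder_rate:
        return random.choice(legal_moves)
    return best_move

# Toy usage with made-up moves; a real bot would also cap its search depth.
print(beginner_choice(["e2e4", "d2d4", "g1f3"], best_move="e2e4"))
```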
I love how, as soon as I decide I want to start working on a chess engine, youtube recommends me 2 videos of someone completely wrecking anything I could imagine doing in my entire life, in 2 months
Loved this video, please more uploads just like this! :)
Beautiful video and amazing content. Thank you so much!