The Clever Way to Count Tanks - Numberphile

  • Published: 24 Nov 2024

Comments • 1.8K

  • @numberphile
    @numberphile 3 months ago +174

    See brilliant.org/numberphile for Brilliant and 20% off their premium service & 30-day trial (episode sponsor)

    • @rugby7381
      @rugby7381 3 months ago +11

      The video: 38 minutes ago
      The comment: 1 day ago
      *_time travel confirmed?_*

    • @rmsgrey
      @rmsgrey 3 months ago

      @rugby7381 It's standard for videos to be uploaded to YouTube some time before they go live to everyone, so the uploader and, not infrequently, patrons, channel members, or other privileged people who are given a link to the still-private video can comment on it before it's published.

    • @Electronieks
      @Electronieks 3 months ago

      @rugby7381 It was private yesterday

    • @Electronieks
      @Electronieks 3 months ago +3

      Send this video to Ukraine 🇺🇦

    • @jerredhamann5646
      @jerredhamann5646 3 months ago +2

      They likely used that method, but a less mathematical approach is also possible. First, you are going to be sending spies and spy planes to bases, storage yards, and depots; since a lot of these things are big and out in the open, and since you can only form as many tank units as you have tanks for, you likely have a decent count of the units they have at a given time. Second, if you know the enemy's serial numbering system, the rise in serial numbers on captured equipment over time tells you their production rate: if last month the highest serial numbers were in the low 1500s but now they are in the upper 1700s, it doesn't take a maths PhD to figure out you're looking at about 270 tanks. And since the serial numbers encode number, date, and location, they tell you something more important: the lag in the enemy's logistical system. If you know how long it takes the enemy to make and move stuff, you can predict movements and actions to some degree.

  • @dowesschule
    @dowesschule 3 months ago +5485

    You didn't just pull out the first and last, but also the middle tanks 15&16!

    • @AndreasHontzia
      @AndreasHontzia 3 months ago +159

      And 23. Illuminati!!!

    • @shripalmehta
      @shripalmehta 3 months ago +75

      there's a mathematician!

    • @docsigma
      @docsigma 3 months ago +212

      That’s Numberwang!

    • @tonelemoan
      @tonelemoan 3 months ago +31

      SPOILER ALERT!11

    • @ilonachan
      @ilonachan 3 months ago +111

      the luckiest draw at the unluckiest time!

  • @GrimOrdnance
    @GrimOrdnance 3 months ago +259

    I adore the fact that you left the initial pull in the video, because that is the truth in probabilities. I appreciate your videos!

  • @adsilcott
    @adsilcott 3 months ago +1411

    6:33 I love the way the turrets are pointing at their actual positions in the number line :)

    • @Denis_Bobrov
      @Denis_Bobrov 3 months ago +22

      Oh, I didn't notice it )

    • @LoveDoveDarling
      @LoveDoveDarling 3 months ago +35

      And how the treads are in motion on the tanks. Editor going above and beyond. Bravo!

    • @taeliantalittia612
      @taeliantalittia612 3 months ago +6

      4:47

    • @miketothe2ndpwr
      @miketothe2ndpwr 3 months ago +8

      It's such a little detail for nerds. Love it as well

    • @LoveDoveDarling
      @LoveDoveDarling 3 months ago +5

      @miketothe2ndpwr I don’t think it’s exclusive to nerds. It’s for anyone who pays attention to details to appreciate.

  • @courtney-ray
    @courtney-ray 3 months ago +512

    At 6:36 you were right on! The gap below your minimum observation WAS equal to the gap above the maximum observation and the true number of tanks!

    • @vez3834
      @vez3834 3 months ago +18

      Amazing accuracy!

    • @IDNeon357
      @IDNeon357 1 month ago

      The tank serial numbers were all encrypted by both the Allied and Axis powers, making this story entirely false.

    • @NTelling
      @NTelling 25 days ago +6

      @IDNeon357 He addresses that in the video. He said the encryption was cracked.

    • @SpydersByte
      @SpydersByte 24 days ago +3

      @IDNeon357 lol what? First of all, he said they deciphered the coding they were using, but also, how would you know this? And why do you say it like it's established fact when it clearly isn't?

    • @SpydersByte
      @SpydersByte 24 days ago

      Yeah, I'm surprised he didn't really point that out :D

  • @redryder3721
    @redryder3721 3 months ago +3463

    I know it's irrelevant, but there's the old joke about letting three sheep loose in a field, but first labelling them "1" "2" and "4" so the person rounding them up spends ages looking for the 3rd.

    • @agranero6
      @agranero6 3 months ago +68

      I read about this prank in the book Show Me How or More Show Me How.

    • @SunroseStudios
      @SunroseStudios 3 months ago +87

      it's vaguely relevant!

    • @SwedishNeo
      @SwedishNeo 3 months ago +180

      It would also make sense in this case, since the Germans wanted to give the appearance that they were building more tanks than they actually were. As such, they could have skipped a couple of numbers in their serials. But I guess it would create too much chaos for the German mind to handle. xD

    • @hakanl2585
      @hakanl2585 3 months ago +124

      MI5 officer Peter Wright wrote in his book Spycatcher that MI5 bugged the Soviet embassy in Ottawa. MI5 marked all the listening cables with numbers from 1 upward, but in case the Soviets found the cables, MI5 omitted some numbers, hoping the Soviets would almost have to tear down the embassy to find the missing ones.
      (The trick did not work: the Soviets had a spy within MI5 who told them how many cables there were and which numbers they carried, so they never searched for the omitted numbers.)

    • @h.a.9880
      @h.a.9880 3 months ago +130

      @SwedishNeo "New orders from Berlin: We are to skip a few serial numbers when imprinting parts, so our tank production looks bigger than it is..."
      - "But zat will bring dizorder to mein numbers!"

  • @funnygeeks8126
    @funnygeeks8126 3 months ago +638

    1:22 "after the war, the allies can go into those tank making factories"
    I like how knowing if our math was right is more important than having won the war.

    • @trueriver1950
      @trueriver1950 3 months ago +10

      Of course!

    • @nekrataali
      @nekrataali 3 months ago

      The Cold War immediately took over international politics following WWII. Once the Allies realize the math was correct, it changes how they both conduct espionage and counteract it.

    • @michaelwright2986
      @michaelwright2986 3 months ago +38

      Always preparing for the next one.

    • @vinterskugge907
      @vinterskugge907 3 months ago +64

      @michaelwright2986 Or, as they say, "The generals are always fully prepared for the previous war".

    • @mattiasthorslund6467
      @mattiasthorslund6467 3 months ago +36

      To the mathematicians, checking their math was the motivation to win the war

  • @art1099
    @art1099 3 months ago +4307

    No war thunder sponsor? Missed opportunity

    • @Nick-the-fox
      @Nick-the-fox 3 months ago +92

      This is targeting a different audience
      It's like an Opera GX sponsor on a non-gamer channel

    • @williamnathanael412
      @williamnathanael412 3 months ago +32

      What is war thunder

    • @Sjobling
      @Sjobling 3 months ago +312

      @williamnathanael412 If you'd typed that into Google instead of the YouTube comments, you'd have an answer immediately. But now, you have a sarcastic response 7 minutes later instead.

    • @serinat_1408
      @serinat_1408 3 months ago +71

      Right here I have a bag of German tanks! Do you know where you can also find German tanks? WAR THUNDER!!!!

    • @alexscriabin
      @alexscriabin 3 months ago +19

      @Nick-the-fox Dude, what is an "anti-gamer channel"? Is it just one that reports on game devs being overworked at FromSoft or that was anti-Gamergate ten years ago?

  • @bittencourt16
    @bittencourt16 2 months ago +27

    I've just run 10,000 simulations of this operation, with fewer than 100 tanks and between 10 and 50 draws. Using the maximum value as the total number of tanks gives an average error of 4, while using the maximum value + average gap gives an average error of 2.6. That method is roughly 1.5 times as precise!! Amazing!!!
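A simulation along these lines can be sketched in a few lines of Python. This is a minimal sketch, not the commenter's actual program: the fleet-size range of 60 to 99 is an assumption chosen so that up to 50 serials can always be drawn without replacement, and the names `simulate`, `err_max`, and `err_gap` are illustrative.

```python
import random

def simulate(trials=10_000, seed=0):
    """Compare two estimators of the fleet size on random draws."""
    rng = random.Random(seed)
    err_max = 0.0
    err_gap = 0.0
    for _ in range(trials):
        n = rng.randint(60, 99)                   # true number of tanks (assumed range)
        k = rng.randint(10, 50)                   # number of serials observed
        sample = rng.sample(range(1, n + 1), k)   # draw without replacement
        m = max(sample)
        err_max += abs(n - m)                     # estimator 1: the sample maximum
        err_gap += abs(n - (m + m / k - 1))       # estimator 2: maximum plus average gap
    return err_max / trials, err_gap / trials

avg_err_max, avg_err_gap = simulate()
print(f"mean |error| using max only:      {avg_err_max:.2f}")
print(f"mean |error| using max + avg gap: {avg_err_gap:.2f}")
```

The exact averages depend on the ranges chosen for the fleet size and the number of draws, so they need not match the 4 and 2.6 quoted above.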

  • @polyaddict
    @polyaddict 3 months ago +2396

    I love how British "they have a bit of a spy" is

    • @reidflemingworldstoughestm1394
      @reidflemingworldstoughestm1394 3 months ago +63

      It's not just a British thing. Sometimes I have myself a bit of a spy as well.

    • @diamondsmasher
      @diamondsmasher 3 months ago +75

      Personally, I have a bit of a lookey-loo

    • @dualcrocadile
      @dualcrocadile 3 months ago +20

      Sounds like a Karl Pilkington story

    • @Rubrickety
      @Rubrickety 3 months ago +9

      I'm glad I had a bit of a spy before making this exact same comment.

    • @bobknip
      @bobknip 3 months ago +8

      A bit of a stickybeak

  • @EchosTackyTiki
    @EchosTackyTiki 3 months ago +219

    In arms production it's fairly common for factories to assign serial number ranges to particular products in advance, so the serial number ranges having gaps within them is relatively normal. It's also normal for them to start production at something like 10,000 if they expect to make in the tens of thousands of that particular item; that way all the items are serialized, but they also maintain the same number of digits in their serial number for uniformity without using a bunch of leading zeros. Overrunning that serial range usually results in a letter prefix or suffix being added.

    • @halfsourlizard9319
      @halfsourlizard9319 3 months ago +3

      By what metric is that better than using leading zeros? Or, why the aversion to leading zeros? (Also, why not just use GUIDs? Fixed size, convey identity but no other information, never going to run out.)

    • @mnxs
      @mnxs 3 months ago +14

      @halfsourlizard9319 As for the GUIDs, because the use of serial numbers for arms predates the invention of GUIDs by 100+ years. So, in other words, tradition: why change when you already have a perfectly workable scheme?

    • @cidiousblack2136
      @cidiousblack2136 3 months ago

      @halfsourlizard9319 When creating records, people will often omit leading zeros when recording numbers, possibly out of laziness, possibly by convention. Forcing the leading digit to be a non-zero digit prevents this deletion from happening.
      Why care about leading zeros? The zeros still have meaning. For instance, the number of digits present can be helpful in indicating that a number in a record is a serial number specifically. Further, whenever number codes get concatenated it's important not to omit digits, or this will change the shape of the number code, e.g. if the serial number were a concatenation of year-month-number. Granted, concatenated codes should be dash-separated or similar, but if we can't trust the clerk to put the leading zeros on the number, why would I trust the clerk to bother writing dashes between numbers?

    • @AmiiboDoctor
      @AmiiboDoctor 3 months ago

      It's normal now... but it wasn't normal then

    • @gaiamission7200
      @gaiamission7200 3 months ago

      @AmiiboDoctor It was more normal then, actually. Sequential serialization is fairly rare

  • @fatsquirrel75
    @fatsquirrel75 3 months ago +1077

    Pointing out that lower numbers are more likely is such a good observation. Brady keeps highlighting his genius video after video.

    • @hylen26
      @hylen26 3 months ago +77

      I don't know about genius but he does ask some excellent questions.

    • @rjwiechman
      @rjwiechman 3 months ago +71

      As my late Father would have said, "Not cessinarily!". It is also true that more of the lower numbered tanks would have been destroyed or broken down and replaced and no longer in service.

    • @Yggdrasil42
      @Yggdrasil42 3 months ago +20

      Exactly. Another type of survivorship bias.

    • @freitchetsleimwor2406
      @freitchetsleimwor2406 3 months ago +5

      So the number line does not reflect a set of equally likely observations. Some of the serial numbers that are not yet observed are less likely to be observed than others.
      I think I am understanding this right, the not yet observed numbers between the maximum and minimum have a higher average probability of being observed than that of the numbers outside the bounds. And if the biases don't cancel each other out, the prediction is skewed. I'm sure this is a well known probability thing I'm just working this out

    • @boggisthecat
      @boggisthecat 3 месяца назад +5

      It’s a fairly obvious observation, I think. The mathematics being shown assumes that all objects appear at once, so no temporal complications. Presumably the mathematicians engaged in this work factor in the production dates where they were known.
      Another confounding problem is repair or rebuild. For example, Russia is taking old tanks and rebuilding them into modern configurations. So these tanks are not entirely produced from new - but serial numbering is going to be a mix of old and new, dependent upon components. (It’s very complicated in this case, because there are multiple variants and changes between foreign and domestic components. We know how many thermal sights Russia bought from Thales in France, but don’t know how many domestic equivalents are being produced, as an example. So if you get a Thales serial number it’s somewhat useful, but domestic ones require some time to aggregate the data. If you can’t capture enough data then it’s not going to work, but then there are other more obvious reasons for why the information isn’t necessarily helpful in this case.)
      ‘Spys’ typically rely upon stuff like observing rail shipments. This can be gamed (which Russia has a long history of doing, because they aren’t fools) to feed false information to your opponents, however. Serial numbers are much more solid, provided you can make sense of the systems being used. These are kept very secret, unsurprisingly.

  • @LeonMatthews
    @LeonMatthews 3 months ago +113

    For several of my clients we incremented the serial number by some prime, rather than by one, in order to obfuscate the output somewhat. It also gave us some degree of parity checking on serial numbers later. Silly, really, but fun.

    • @accountxabcdef
      @accountxabcdef 2 months ago +8

      I would use a hash function: append a secret number to the normal serial number, hash the result, and use the hash as the official serial number. Every unit then has its own official serial number, you hold the secret and can look up what the real serial number was, and nobody else can guess another valid number, even if they know every number (except the secret) and the algorithm used to create them.

    • @dafrandle
      @dafrandle 1 month ago

      @accountxabcdef
      I would use a UUID and convert it to numbers via a bespoke translation - just have a check to avoid the rare collision

    • @AndreyCizov
      @AndreyCizov 1 month ago

      isn't it quite easy to figure out that all numbers are incremented by a prime number?

    • @accountxabcdef
      @accountxabcdef 1 month ago

      @AndreyCizov
      You would need to see a few machines bought at the same time. You cannot rely on every number being used; there is often a gap when an updated version is introduced. And if the prime is changed at that point, have fun reverse-engineering it...
      There will be people who are able to spot it or even reverse-engineer it, but that number shouldn't be that great (it depends on the batch size, the amount sold to individual customers, and the price: the more expensive the product, the greater the reward for getting something free under warranty).
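The keyed-hash idea suggested in this thread could look roughly like the sketch below. It swaps the "secret appended to the serial, then hashed" construction for HMAC-SHA256, which is the standard way to key a hash; the key value, the 12-digit truncation, and the names `public_serial` and `lookup` are all assumptions made for illustration.

```python
import hashlib
import hmac

SECRET = b"replace-with-a-real-secret"   # hypothetical key, known only to the manufacturer

def public_serial(internal_serial: int, digits: int = 12) -> str:
    """Derive a non-sequential public serial from the internal counter."""
    msg = str(internal_serial).encode("ascii")
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    # Truncate the keyed hash to a fixed number of decimal digits.
    value = int.from_bytes(digest[:8], "big") % 10**digits
    return f"{value:0{digits}d}"

# The manufacturer keeps a table mapping each public serial back to the counter.
lookup = {public_serial(i): i for i in range(1, 1001)}
print(public_serial(1), public_serial(2))
```

Because the hash is truncated, two counters can in principle map to the same public serial, so a real system would still need the duplicate check mentioned elsewhere in the comments.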

  • @MuffinsAPlenty
    @MuffinsAPlenty 3 months ago +757

    Watching James Grime explain mathematics is such a joy.

    • @perplexedon9834
      @perplexedon9834 3 months ago +12

      All my homies love James Grime

    • @fariesz6786
      @fariesz6786 3 months ago +9

      he's just that fun mixture of adorable, approachable, nerdy, and just proficient in his job

    • @stapler942
      @stapler942 3 months ago +5

      Due to Siivagunner I have this mental image of him approaching menacingly to tell me about *e*.
      But I agree, he is a joy to watch.

    • @derhesligebonsaibaum
      @derhesligebonsaibaum 3 months ago +3

      yeah, he always seems to have so much fun doing it

    • @warp9988
      @warp9988 3 months ago

      Making Math awesome.

  • @bjrnstrottman5637
    @bjrnstrottman5637 3 months ago +21

    My first instinct was to use the Central Limit Theorem to assume that the sample mean would approximately equal the population mean. Since we know the distribution is uniform and the population mean of a population of size n is (n+1)/2, twice our sample mean minus one should approximate the population size.
    Here our sample mean was 17, so this method estimates the population size as 2(17) - 1 = 33.
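As a quick check of the arithmetic, here is a minimal sketch of the mean-based estimate, assuming serials run from 1 to N. The function name `estimate_from_mean` and the second sample are illustrative, not from the video.

```python
def estimate_from_mean(serials):
    """Estimate N as 2 * (sample mean) - 1, assuming serials run 1..N."""
    return 2 * sum(serials) / len(serials) - 1

first_draw = [1, 15, 16, 23, 30]
print(estimate_from_mean(first_draw))    # 33.0

# A hypothetical lopsided sample: the estimate falls below a serial already seen.
lopsided = [1, 2, 3, 30]
print(estimate_from_mean(lopsided))      # 17.0, although tank 30 was observed
```

The second call shows one weakness of the approach: nothing forces the estimate to be at least as large as the highest serial in the sample.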

  • @mark97199
    @mark97199 3 months ago +1011

    This only works if the serial numbers are sequential. Knowing this, the US named the third SEAL team "SEAL Team 6" to confuse Soviet intelligence.

    • @penfold-55
      @penfold-55 3 months ago +117

      And if you know where they start. For example, if the serial number was a date, this just wouldn't work (even though the numbers are sequential, they are not consecutive)

    • @AbstruseJoker
      @AbstruseJoker 3 months ago +68

      Dates would still reveal some info about how many tanks there are

    • @chickenwheel45
      @chickenwheel45 3 months ago +47

      He mentions that there's an encoding on top of this

    • @Sp4mMe
      @Sp4mMe 3 months ago +31

      Yeah, real world probably has a lot of further problems. Like what if one month all new tanks go to front X, one month they all go to front Y, and your information and rate of capture/observation is different, for example ... ?
      But then, you might also have some rough indications from observation planes or train schedules or something that might help correlate some gaps in your data. Of course, there might also be decoys and whatnot ... well, I'm sure a lot can be done there.

    • @BenjaminGatti
      @BenjaminGatti 3 months ago +16

      Serial numbers are by definition a subset of a series. You need to know the series.

  • @user-fi4zi5il9z
    @user-fi4zi5il9z 3 months ago +8

    His enthusiasm is so contagious and it's so cool! The formula is surprisingly simple!

  • @jameswkirk
    @jameswkirk 3 months ago +665

    A company I worked for made computers & peripherals and used 64-bit random serial numbers. They had multiple manufacturing sites, and calculated that the odds of generating two identical numbers were smaller than the odds of human bookkeeping errors in coordinating serial numbers across multiple product lines.

    • @ragnkja
      @ragnkja 3 months ago +160

      So, like YouTube assigning video IDs, they decided that it was faster and more accurate to just check for duplicates, because the probability of the same number being assigned twice in the time it takes to check if it has already been used is extremely small.

    • @SaHaRaSquad
      @SaHaRaSquad 3 months ago +109

      @ragnkja Even checking for duplicates would be unnecessary if cryptographic hashsums are used. The odds of getting randomly occurring collisions with them are so low that on average it would take much longer than the lifetime of the universe.

    • @Rivinwin
      @Rivinwin 3 months ago +14

      Lol, that's awesome. I love and hate it.

    • @Rivinwin
      @Rivinwin 3 months ago +51

      @SaHaRaSquad Yeah, treat a huge range of numbers as a domain, split it into segments, and assign a segment to each factory, i.e. a 64-bit number where the top 3 or 4 bits are specific to each factory; increment the value at each factory independently of each other per product, and assign a hash of that value as the product serial number 👍

    • @jurjenbos228
      @jurjenbos228 3 months ago +24

      Yep, if you use 64-bit numbers, the probability of a single collision only starts to rise after about 4 billion devices are manufactured. And even then: so what? Almost all numbers are unique.
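The "about 4 billion" figure is the square root of 2^64, which is where the birthday bound starts to bite. A minimal sketch using the standard birthday approximation follows; the function name `collision_probability` is illustrative, and the formula is an approximation rather than an exact count.

```python
import math

def collision_probability(n_devices: int, bits: int = 64) -> float:
    """Approximate P(at least one duplicate) among n random `bits`-bit serials."""
    space = 2 ** bits
    # Birthday approximation: 1 - exp(-n(n-1) / (2 * space))
    return -math.expm1(-n_devices * (n_devices - 1) / (2 * space))

for n in (10**6, 10**8, 2**32):
    print(f"{n:>13,} devices -> {collision_probability(n):.3e}")
```

Under this approximation, a million devices give a collision chance of a few in a hundred million, while 2^32 devices (about 4.3 billion) give roughly 39%.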

  • @b1oodzy
    @b1oodzy 3 months ago +89

    I thought I was smart with my calculation of (1+15+16+23+30)/5x2 = 34 but this guy pulls out a giant sheet of paper and introduces probabilities.

    • @Ryanmathewsc
      @Ryanmathewsc 2 months ago +12

      My mind went to the same place. As the sample size increases, the average should approach the median number. I wonder if the methods in the video offer a meaningful improvement over simply doubling the observed average.

    • @_..-.._..-.._
      @_..-.._..-.._ 1 month ago +1

      The x2 part didn’t make sense to me hmm 🤔

    • @b1oodzy
      @b1oodzy 1 month ago +10

      @_..-.._..-.._ The first part of the equation calculates the average, which is 17. To calculate the maximum you'd need to do x2 to get 34.

    • @Exaspatial
      @Exaspatial 1 month ago

      Same here

    • @felipea.barretto7503
      @felipea.barretto7503 1 month ago +2

      I did the same thing except I subtracted 1 to estimate 33. My reasoning is that if we have N tanks, all with equal probabilities, the expected average of the distribution is 1/N * (sum of 1 to N) = (N+1)/2 . Estimating this with the sample average μ, you get N = 2μ-1, which is why I subtracted the one.

  • @EXPLICITBG
    @EXPLICITBG 3 months ago +290

    Tanks for sharing

    • @volodyadykun6490
      @volodyadykun6490 3 months ago +2

      You know destroyers for bases, get ready for

    • @cubes_art7956
      @cubes_art7956 3 months ago +1

      Came here to say this.

    • @myc0p
      @myc0p 3 months ago +10

      I would like to extend my tanks to Ukraine 🇺🇦

    • @talananiyiyaya8912
      @talananiyiyaya8912 3 months ago +1

      Thanks*

    • @EXPLICITBG
      @EXPLICITBG 3 months ago

      @talananiyiyaya8912 r/woosh

  • @otaviodiniz5934
    @otaviodiniz5934 3 months ago +134

    Man, it's 11pm local time, I've been awake since 4am, my week was a rollercoaster, I'm mad about my job, I'm dealing with a woman that is getting on my nerves, my bank account is zeroed, I'm tired and pissed...
    But for some reason, his enthusiasm telling this story made me happy instantaneously.
    Thank you for this, God bless you and your beloved ones. Got a subscription.

    • @hydra8sk
      @hydra8sk 3 months ago +4

      Keep it up! Better times are ahead pal

    • @connorkapooh2002
      @connorkapooh2002 3 months ago +8

      Bro, in the future you will stumble upon your comment and you'll remember where you are at now in your life. You've made it this far, you'll keep going

    • @sirllamaiii9708
      @sirllamaiii9708 3 months ago +2

      You need money brother? Any way i can help?

    • @arjanab6227
      @arjanab6227 3 months ago

      @sirllamaiii9708 Such a kind man you are, bless you sir

    • @panthermodern6572
      @panthermodern6572 3 months ago

      Hope you're doing better now. And even if you're not, it's all gonna be alright ;)

  • @K_Forss
    @K_Forss 3 months ago +252

    My immediate thought was that the average of a random subset should be the same as the average of the whole, so the number of tanks should be twice the mean of the picked ones 2*(1+15+16+23+30)/5=34 for the first pick and 2*(3+10+15+18+24)/5=28 for the second. My guess is that they used multiple estimate methods and weighted the results depending on inherent uncertainties/errors of the methods

    • @sanandanojha2988
      @sanandanojha2988 3 months ago +21

      Yeah, that's exactly what I was thinking! Although, I suppose that it might be more susceptible to outliers than the average distance method...

    • @journeymantraveller3338
      @journeymantraveller3338 3 months ago +18

      Same argument applies to the median. You can also get 95% confidence intervals for the mean and the median.

    • @Mayur7Garg
      @Mayur7Garg 3 months ago

      Why twice?

    • @yurie2388
      @yurie2388 3 months ago +12

      @Mayur7Garg The average is roughly half of the total since you have both low and high numbers. Average tries to arrive at the middle point of the number set when all the numbers are unique and in series.
      (1+15+16+23+30)/5=17, which we know is too little since we have the number 30 in the series.

    • @Mayur7Garg
      @Mayur7Garg 3 months ago +8

      @yurie2388 Basically it stems from the fact that the median and the mean would be identical for such a series. So if you know the mean, then you can use it like a median to assume that the final number is at twice the distance. But in that case, using the median in the first step directly is more appropriate. Also, one issue that I have with all these solutions, including the one in the video, is that they do not seem to work if the serial numbers do not start from 1 but from, let us say, 100.
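On the last point, one standard workaround when the first serial is unknown is to estimate from the spread between the smallest and largest observed serials. The sketch below is an illustration under the assumption that serials are consecutive integers starting at some unknown value; it is not something the video presents. The function name `estimate_unknown_start` and the test fleet are hypothetical.

```python
from itertools import combinations

def estimate_unknown_start(serials):
    """Estimate fleet size when the first serial number is unknown."""
    k = len(serials)
    if k < 2:
        raise ValueError("need at least two observed serials")
    spread = max(serials) - min(serials)
    # Scale the observed spread up to account for the unseen gaps at both ends.
    return spread * (k + 1) / (k - 1) - 1

# Exhaustive check on a small hypothetical fleet: serials 100..111 (12 tanks),
# averaging the estimate over every possible sample of 4 serials.
fleet = range(100, 112)
estimates = [estimate_unknown_start(s) for s in combinations(fleet, 4)]
average_estimate = sum(estimates) / len(estimates)
print(average_estimate)
```

Averaged over all possible samples the estimate comes out at the true fleet size, although any single sample can be well off, and it needs at least two observations.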

  • @JS-mp7fy
    @JS-mp7fy 3 months ago +13

    I did this exact maths problem at high school in 1991, what a real blast from the past! Thank you!!!

  • @Limrasson
    @Limrasson 3 months ago +798

    His reaction to tank 30 immediately raised suspicion and I would have said "yeah, that's 30 tanks in the bag."

    • @dewhi100
      @dewhi100 3 months ago +60

      Yep "Tank 30, oh, hmm, interesting..."

    • @PixelPhobiac
      @PixelPhobiac 3 months ago +2

      🤣

    • @Alex-ff8si
      @Alex-ff8si 3 months ago +1

      300th like

    • @roffie
      @roffie 3 months ago +1

      30 got the dinks

    • @cubexyz199
      @cubexyz199 3 months ago +4

      I'm on the spectrum and I still cannot see it

  • @gustavakerman2566
    @gustavakerman2566 3 months ago +200

    Alternative title: Local British mathematician gets blindsided by sheer stupid luck

  • @macdofglasgow772
    @macdofglasgow772 3 months ago +60

    Excellent. I did laugh at the #1 and #30 thing. Always like Dr Grimes in these videos; I could listen to him just tell me interesting stuff all day.

    • @TheEvilCheesecake
      @TheEvilCheesecake 3 months ago

      It's just the one Grime actually

    • @chriswebster24
      @chriswebster24 3 months ago

      He was probably talking about him and his brother, together, the Dr. Grimes. His brother is a gynecologist.

  • @Wagon_Lord
    @Wagon_Lord 3 months ago +4

    I heard this story ages ago, but never understood how it worked. That "flipping the number line around" line makes so much sense; so simple once the trick's revealed. Lovely!

  • @jamesterwilliger3176
    @jamesterwilliger3176 3 months ago +551

    Spies be like "tank you very much" but the mathematicians be like "tanks but no tanks"

  • @caiocc12
    @caiocc12 3 months ago +4

    There's a thing called "format-preserving encryption" which can be used to make sequential numbers look random. The nice thing about it is that the encrypted number is in the same domain as the plain number (i.e. if the original numbers range from 0 to, say, 1 million, the encrypted numbers will also be in that range), so the attacker doesn't know they are encrypted and thinks it's just a plain sequential number. I've used that to protect against brute-forcing IDs on a system, while keeping the IDs short enough to be encoded as a barcode.
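The scheme described here is usually called format-preserving encryption. The sketch below illustrates the general idea with a small Feistel network plus cycle-walking; it is a toy for illustration only, not a vetted cipher and not the commenter's implementation (a real system would use a standardized mode such as NIST FF1). The key, round count, and the names `scramble`, `_permute`, and `_round_value` are assumptions.

```python
import hashlib
import hmac
import math

KEY = b"demo-key-change-me"  # hypothetical secret key

def _round_value(rnd: int, value: int, half: int) -> int:
    """Keyed pseudo-random round function built from HMAC-SHA256."""
    digest = hmac.new(KEY, f"{rnd}:{value}".encode("ascii"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % half

def _permute(x: int, half: int, rounds: int, inverse: bool) -> int:
    """One pass of a balanced Feistel network over [0, half * half)."""
    left, right = divmod(x, half)
    if not inverse:
        for rnd in range(rounds):
            left, right = right, (left + _round_value(rnd, right, half)) % half
    else:
        for rnd in reversed(range(rounds)):
            left, right = (right - _round_value(rnd, left, half)) % half, left
    return left * half + right

def scramble(x: int, domain: int, rounds: int = 4, inverse: bool = False) -> int:
    """Map x in [0, domain) to another value in [0, domain), reversibly."""
    if not 0 <= x < domain:
        raise ValueError("x outside domain")
    half = math.isqrt(domain - 1) + 1      # smallest half with half * half >= domain
    y = _permute(x, half, rounds, inverse)
    while y >= domain:                     # cycle-walk until we land back in the domain
        y = _permute(y, half, rounds, inverse)
    return y

public_id = scramble(12345, 1_000_000)
print(public_id, scramble(public_id, 1_000_000, inverse=True))
```

Every input in the range maps to a distinct output in the same range, so the scrambled IDs stay as short as the originals while no longer revealing the underlying sequence to a casual observer.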

  • @rPuck
    @rPuck 3 months ago +40

    Tanks for sharing!!!

  • @lindhe
    @lindhe 3 months ago +1

    James is so good! Always a great video when he's in. Also: he always looks happy, even when picking bad samples.

  • @molieros
    @molieros 3 months ago +179

    James: There are 30 German tanks in the bag.
    Chuikov: We were aware of that.

    • @Alex-ff8si
      @Alex-ff8si 3 months ago +1

      50th like + first reply

    • @rogerxiao4458
      @rogerxiao4458 3 months ago +3

      Krebs: That seems unlikely.
      (Downfall movie reference if you don't get it.)

    • @TheBrad574
      @TheBrad574 3 months ago

      Someone read Cornelius Ryan's The Last Battle and his interview with Chuikov.
      I just noticed someone mentioned Downfall too. The book is the source material.

  • @stco2426
    @stco2426 3 months ago +5

    Cool. When I was studying population biology we were given a task to work out the number of taxis in a city and we used the capture, mark, recapture method, using the taxi number, rather than marking anything. So, just noting the numbers in a given time (capture and 'mark') and then noting the numbers in a given period, which was later (recapture v not seen before). There are all sorts of sample to population complexities and improvements to the estimate with longer observations (but issues with recounts if the obs period is too long). Also, an improvement if a third count period is used.
    I wonder if there are any seminal capture, mark recapture examples that Numberphile might comment on and re-create on brown paper?
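For readers unfamiliar with capture, mark, recapture, a minimal sketch using the classic Lincoln-Petersen estimator is shown below. The comment does not say which estimator was used, so this choice is an assumption, and the taxi numbers are made up for illustration.

```python
def lincoln_petersen(first_pass, second_pass):
    """Estimate population size from two sets of observed ID numbers."""
    first, second = set(first_pass), set(second_pass)
    recaptured = len(first & second)          # IDs seen in both periods
    if recaptured == 0:
        raise ValueError("no overlap between passes; estimate is undefined")
    return len(first) * len(second) / recaptured

# Hypothetical taxi numbers noted in two observation periods.
period_1 = [3, 7, 11, 14, 19, 22, 25, 28]
period_2 = [2, 7, 9, 13, 16, 21, 25, 27, 29, 30]
print(lincoln_petersen(period_1, period_2))   # 40.0
```

Unlike the serial-number method, this estimate uses only which IDs repeat, not how large the numbers are, so it also works when IDs are not sequential.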

  • @TheDuckofDoom.
    @TheDuckofDoom. 3 months ago +26

    I just tell the German bookkeeper that I think his records are sloppy, and he shows me all of his work to prove me wrong.

    • @MrZauberelefant
      @MrZauberelefant 3 months ago

      That was in a movie, wasn't it?

    • @lukasskymuh5910
      @lukasskymuh5910 3 months ago +3

      This would never work! .... not unless he plays war thunder...

  • @eshed
    @eshed 3 months ago +19

    I used this method with serial numbers of accordions made in the late 30s by Hohner, a German company. Now I have a spreadsheet named "The German Accordion Problem" with more than 150 rows.

    • @dragoncurveenthusiast
      @dragoncurveenthusiast 3 months ago

      Cool!
      So, how many did they produce per month?

    • @eshed
      @eshed 3 months ago +2

      @dragoncurveenthusiast
      Unfortunately they didn't mark the month in the serial number, but fortunately they didn't restart every month either.
      That means I could estimate the total number of accordions with serial numbers between 1934 and 1940 to around 860000.

    • @crownhouse2466
      @crownhouse2466 3 months ago +1

      @eshed That's a lot of accordions

    • @JtotheAKOB
      @JtotheAKOB 2 months ago +1

      @eshed Are you sure they did not encode them, so that rival accordion producers could not estimate the number of accordions? :P

    • @eshed
      @eshed 2 months ago

      @JtotheAKOB I'm relatively certain.
      Out of the 150, I have ~20 serial numbers for which I also know the actual production date. If you plot the numbers vs the dates, you get a lovely almost linear (R^2=0.995) graph. The only way I can think of to get this relationship while preventing accurate estimates, would be to randomly skip numbers with a constant probability.

  • @MichaelDoornbos
    @MichaelDoornbos 3 months ago +61

    I love the "German Tank Problem." There's a great video on YouTube showing this method of counting the Commodore 1571 Disk Drives. Using this technique for "other real-world problems" is a fun exercise.

    • @Grunchy005
      @Grunchy005 3 months ago +8

      Upvote for Commodore 1571

    • @thekinginyellow1744
      @thekinginyellow1744 3 months ago +1

      Wow, not even from 8-bit guy!

    • @LuisRamos-jg1gf
      @LuisRamos-jg1gf 3 months ago

      What's the video called? 😊

    • @ampulka
      @ampulka 3 months ago +1

      found it: "How Many Commodore 1581 Disk Drives? The German Tank Problem"

  • @Ring_Zero
    @Ring_Zero 3 месяца назад +5

    We're using similar techniques with serial numbers to investigate production numbers for relatively rare camera models from the early 1970s.

  • @S1nwar
    @S1nwar 3 месяца назад +108

    4 8 15 16 23 42... he literally started drawing half the LOST numbers i was on the edge of my seat

    • @GunNNife
      @GunNNife 3 месяца назад +25

      Using the formula from this video, the Lost tank bag has 48 tanks.
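For anyone who wants to check that arithmetic, here is a minimal Python sketch of the "maximum plus average gap" formula from the video. The function name `estimate_total` is my own, and it assumes the serial numbers start at 1 and each observed number is distinct.

```python
def estimate_total(serials):
    """Maximum plus average gap: m + m/k - 1, assuming serials start at 1."""
    m = max(serials)   # largest serial number observed
    k = len(serials)   # number of observations
    return m + m / k - 1


# The Lost numbers: m = 42, k = 6, so 42 + 42/6 - 1
print(estimate_total([4, 8, 15, 16, 23, 42]))  # 48.0
```

The same function gives 35.0 for the video's first draw (23, 15, 16, 1, 30).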

    • @gabor6259
      @gabor6259 3 месяца назад

      4 8 15 16 23 42

    • @ethanbuttimer6438
      @ethanbuttimer6438 3 месяца назад +2

      Haha me too I literally just finished watching an episode

    • @Cerzus
      @Cerzus 3 месяца назад +2

      I was looking for this comment

    • @FilmscoreMetaler
      @FilmscoreMetaler 3 месяца назад

      ​@@GunNNife Too bad it's not 108

  • @1_in_8billion
    @1_in_8billion 7 дней назад

    Hey everyone, I just started learning how to use Octave and just for kicks I made a program to do this very estimation. (Thanks for sharing Numberphile, this is really neat stuff!) Here's the *script* if anyone wants to fiddle around with it: (I've added a percent error so you can see just how remarkably accurate this estimation is!)
    actualNumberOfTanks = ceil(rand*1000);
    disp(["actual number of tanks: ", num2str(actualNumberOfTanks)]);
    % never pick more tanks than actually exist
    numberOfPicks = min(ceil(rand*100), actualNumberOfTanks);
    disp(["number of picks: ", num2str(numberOfPicks)]);
    % draw distinct serial numbers (sampling without replacement)
    tankNumberPicks = randperm(actualNumberOfTanks, numberOfPicks);
    disp("tanks randomly selected: ");
    disp(tankNumberPicks);
    % estimate = largest serial seen + average gap between observed serials
    maxSerial = max(tankNumberPicks);
    estimatedNumberOfTanks = maxSerial + (maxSerial - numberOfPicks) / numberOfPicks;
    disp(["Estimated number of tanks: ", num2str(estimatedNumberOfTanks)]);
    % signed error relative to the true count, in percent
    percentError = round(((estimatedNumberOfTanks - actualNumberOfTanks)/actualNumberOfTanks)*100);
    disp(["percent error: ", num2str(percentError)]);
    %;D

  • @reedjasonf
    @reedjasonf 3 месяца назад +264

    The disgust in Dr. Grime's voice at 2:24 when he says "I'm NOT going to let you feel the weight of the bag! [Are you daft?]"

    • @reidflemingworldstoughestm1394
      @reidflemingworldstoughestm1394 3 месяца назад +39

      And rightfully so. Who gets to heft a German tank factory during a war?

    • @robinsparrow1618
      @robinsparrow1618 3 месяца назад +10

      the time code you put is after the moment you're talking about

    • @WofWca
      @WofWca 3 месяца назад +9

      2:16

    • @PeterNjeim
      @PeterNjeim 3 месяца назад +17

      ​@@robinsparrow1618this is a common phenomenon I've seen over the years. Someone will watch the video, after watching a funny part, they click pause, then copy the timestamp, forgetting that this time stamp is after the clip

    • @hdbrot
      @hdbrot 3 месяца назад +3

      ⁠​⁠​⁠@@PeterNjeimMaybe OP edits it in. Let‘s hope for the best :)

  • @adrianv.v.4445
    @adrianv.v.4445 Месяц назад +2

    When calculating the average gap, you should just count the difference between tanks (e.g., between 15 and 1, count that as 14). That way, what you get is the actual (asymptotically) un-biased estimator of the number of tanks. If we do it your way, we get: MAX + (MAX - k)/k = MAX * (1+1/k) - 1, which when we let k->infinity, MAX->Actual_value and therefore we get the Actual_value - 1. It can be also proven that the estimator is biased for any k, outputting smaller values than the real one.
    If we don't add that -k to the formula (that is, we count the gaps as just the difference), the estimator we get is MAX + MAX/k = MAX * (1+1/k), which is the actual un-biased estimator we should use in this case. One may call this the adjusted Maximum Likelihood Estimator (MLE). As you said in the video, the MLE is just the MAX (the number most likely to be right), but it is biased. What we did with this trick was, as you explained, correct it.
    A more standardized way to compute this correction would have been to calculate the Expected Value of the MLE we got, to then apply the necessary multiplicative correction. That is, if it is necessary at all (MLE might as well be unbiased itself). This is one of the most used methods for estimating stuff out in the real world (when we are able to get an MLE).
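For the discrete setup in the video, the expected value of the maximum can be worked out directly. This is only a sketch, and it rests on my assumption that k distinct serial numbers are drawn without replacement from 1..N. Writing m for MAX:

```latex
\begin{align*}
P(m = j) &= \frac{\binom{j-1}{k-1}}{\binom{N}{k}}, \qquad j = k, \dots, N \\
E[m] &= \sum_{j=k}^{N} j \, \frac{\binom{j-1}{k-1}}{\binom{N}{k}}
      = \frac{k}{\binom{N}{k}} \sum_{j=k}^{N} \binom{j}{k}
      = \frac{k \binom{N+1}{k+1}}{\binom{N}{k}}
      = \frac{k (N+1)}{k+1} \\
E\!\left[ m \left(1 + \tfrac{1}{k}\right) - 1 \right]
     &= \frac{k+1}{k} \cdot \frac{k (N+1)}{k+1} - 1 = N
\end{align*}
```

The middle line uses j·C(j-1, k-1) = k·C(j, k) and the hockey-stick identity. Under that assumption, the "- 1" is what makes the expectation come out to exactly N. The purely multiplicative form MAX * (1 + 1/k) is the unbiased one for a continuous uniform distribution on (0, N), which may be where the two versions get mixed up.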

  • @bryan-nz
    @bryan-nz 3 месяца назад +36

    Have you ever done a video on "Hyper Log Log"? We use it in massive data systems for efficiently estimating the number of unique values. It is very interesting, and freakily accurate.

    • @EricKay_Scifi
      @EricKay_Scifi 3 месяца назад

      I've used that in BigQuery. APPROX_COUNT_DISTINCT is great for figuring out new data.

  • @ismbks
    @ismbks Месяц назад

    i missed this guy a lot, i remember binge watching his entire channel when i was in high school, brings me back

  • @brmolnar
    @brmolnar 3 месяца назад +45

    Seal Team 6 is named that to imply that there are at least 5 other Seal Teams. At least this is the common rumor.

    • @Laotzu.Goldbug
      @Laotzu.Goldbug 3 месяца назад +11

      This is actually true (at least according to Richard Marcinko's autobiography). Now presently there are well over six SEAL Teams (8?) but when Marcinko created a specialist SEAL unit in 1980 there were only two other ones, and "Seal Team 6" was a deliberate attempt at deceiving the Soviets.

  • @jbeckh2
    @jbeckh2 3 месяца назад

    This episode was great. If you come across more war history examples, please post them. My son loves war history and was fascinated by this. This helps to understand why math is important.

  • @mladengavrilovic8014
    @mladengavrilovic8014 3 месяца назад +51

    it would also make sense to calculate the average of the samples and multiply it by 2 as the average of consecutive numbers starting at 1 would be about n/2 and the average of the samples would also approach the same value.

    • @timseguine2
      @timseguine2 3 месяца назад +17

      Close. The average of the observations is an estimator of the mean of the serial numbers in the bag. You got that much right. But the average serial number is (n+1)/2. So you have to double it and then subtract one.

    • @raiseer
      @raiseer 3 месяца назад +5

      Was my first idea, too. They did basically the same with extra steps :)

    • @rianfelis3156
      @rianfelis3156 3 месяца назад +6

      The reason for those extra steps is that you usually have padding around the serial numbers, like just start counting at 1500 because the 15 means something else, and the last two digits are sequential. Which they did touch on, but not a lot.

    • @halbronk7133
      @halbronk7133 3 месяца назад +10

      This is the method I thought of too, but it turns out that the numbers you find other than the max aren't relevant. However many tanks there are, finding 1, 2, 3, 4, and 30 is the same as finding 26, 27, 28, 29, and 30 (as long as the serial numbers start at 1).

    • @timseguine2
      @timseguine2 3 месяца назад +6

      @@halbronk7133 "the numbers you find other than the max aren't relevant": This isn't precisely true. They are relevant in the sense that they produce a valid estimate for the maximum. The problem is that it ignores relevant information that we know about the problem (that the numbers are sequential without gaps). And usually when you don't use some piece of information to derive your answer then it is possible to do better.

  • @YEASTY_COMMIE
    @YEASTY_COMMIE 3 месяца назад +5

    If you take the simpler formula of twice the average value of the tanks, it actually gives better prediction in this case (34 and 28, if I can still perform additions)

  • @Demasx
    @Demasx 3 месяца назад +26

    This feels like one of those widely usable maths that I won't be able to find an application for anytime soon... then when the time comes, I'll remember there's a solution but not what it is 😅 Bookmarking it now for that future occasion, haha

    • @EdMcF1
      @EdMcF1 Месяц назад

      Ukrainians might find it useful

  • @dattaprasadgodbole
    @dattaprasadgodbole 3 месяца назад

    Every part of this video - from finding out the numbers to objections raised - was brilliant. I love this video.

  • @Canzandridas
    @Canzandridas 3 месяца назад +6

    Somewhere deep within my brain I'm pleased with this video because Dr Grime always reminds me of the young folk who went to ww2 saying they were adults when they weren't and this video is about tanks

  • @indranilroy4822
    @indranilroy4822 Месяц назад

    I always find it fascinating how these equations can be derived after rigorous application of a simple general concept, like at the beginning of the video you can feel that the frequency of smaller numbers (hence more smaller gaps) would affect the estimate but the quantifying part takes time to visualize in its precise form

  • @romansanders
    @romansanders 3 месяца назад +12

    Apple serial numbers were sequential until about 5 years ago. They even contained information about which factory produced the item and when.

    • @MrZauberelefant
      @MrZauberelefant 3 месяца назад +5

      They still should, trackability is vital information.

  • @bastawa
    @bastawa 3 месяца назад

    That was brilliant! your initial picks are exactly why it was so hard for me to grasp probability at school until I realized it is about multiple events and doesn’t work that great for a single event

  • @impossiblemission4ce
    @impossiblemission4ce 3 месяца назад +45

    First Enigma, now these tanks. Sometimes it feels as though James is gearing up for a time travel mission.

    • @talananiyiyaya8912
      @talananiyiyaya8912 3 месяца назад

      Obviously not...

    • @_invencible_
      @_invencible_ 3 месяца назад +3

      @@talananiyiyaya8912 nice try, MI6

    • @sandekv
      @sandekv 3 месяца назад +5

      He is winding down from one. He went there, helped Britain win, and came back.

    • @jimmyzhao2673
      @jimmyzhao2673 3 месяца назад +1

      @@sandekv He's slowly revealing that to us.

  • @SoCalFreelance
    @SoCalFreelance 17 дней назад +2

    I would have rather learned the actual story of how the tank calculation was accomplished.

  • @aleksihermonen9017
    @aleksihermonen9017 3 месяца назад +61

    I was thinking about taking the average and doubling it. The idea being that the average would be approximately in the middle of the true number, so double the average would be close to the true number.

    • @PsychoMuffinSDM
      @PsychoMuffinSDM 3 месяца назад +5

      That's what I did, lol.

    • @xerkules2851
      @xerkules2851 3 месяца назад +4

      Same here. That method gives very similar estimates in these examples.

    • @TomVennix
      @TomVennix 3 месяца назад +8

      I think you can improve this estimate by subtracting 1 at the end, since the average of the numbers 1 up to and including N is (N+1)/2 rather than N/2. Denoting the sample average by X, your idea is that X should be approximately equal to (N+1)/2, which would imply that N is approximately equal to 2X-1.
      I'm actually curious to see how this performs (in general) compared to the method presented in the video.

    • @akshaj7011
      @akshaj7011 3 месяца назад

      That wouldn't work if the serial numbers didn't start from 1

    • @aleksihermonen9017
      @aleksihermonen9017 3 месяца назад +4

      @@akshaj7011 That's true, but the average gap wouldn't work either if it takes into account the gap from 0 to the first element.
      If the starting point were unknown, I would probably use standard deviation in the same manner.

  • @RichardJBarbalace
    @RichardJBarbalace 3 месяца назад +4

    I think there may be a simpler and more accurate way to do the estimation. My first thought gave estimates of 34 and 28 for the two trials, beating Brady's estimates of 35 and 27.8 both times compared to the actual number 30. Assuming "everything is equal and random" (i.e., a uniform distribution), just take the average of the tank numbers and double it. This also balances all the potential gaps.

    • @ChristopheSmet123321
      @ChristopheSmet123321 3 месяца назад +1

      That is certainly a valid method as well, also unbiased (meaning on average you will be spot on). However, the "maximum plus average gap" method is more efficient, i.e., it has a lower mean squared error: the squared difference to the actual N will on average be smaller than using your method. And that is what you want from an estimator!
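The mean-squared-error comparison is easy to try empirically. Below is a rough Python sketch; the function name, the choice of N = 100 and k = 5, and the trial count are all my own assumptions, and tanks are drawn without replacement from 1..N. It uses the "double the average, then subtract one" variant suggested elsewhere in the comments, since that is the unbiased form of the averaging idea.

```python
import random


def compare_estimators(n_tanks=100, k=5, trials=20_000, seed=0):
    """Return (MSE of max-based estimate, MSE of mean-based estimate)."""
    rng = random.Random(seed)
    population = range(1, n_tanks + 1)
    sq_err_max = 0.0
    sq_err_mean = 0.0
    for _ in range(trials):
        sample = rng.sample(population, k)       # k distinct serial numbers
        est_max = max(sample) * (1 + 1 / k) - 1  # maximum plus average gap
        est_mean = 2 * sum(sample) / k - 1       # twice the sample mean, minus one
        sq_err_max += (est_max - n_tanks) ** 2
        sq_err_mean += (est_mean - n_tanks) ** 2
    return sq_err_max / trials, sq_err_mean / trials


mse_max, mse_mean = compare_estimators()
print(f"max-based MSE:  {mse_max:.1f}")
print(f"mean-based MSE: {mse_mean:.1f}")
```

In runs like this the max-based estimator should show the clearly smaller squared error, in line with the comment above, even though either method can win on any single draw.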

  • @LudicrousTachyon
    @LudicrousTachyon 3 месяца назад +10

    For electronics with network cards, companies are assigned ranges of MAC addresses as they are supposed to be universally unique. The range could allow one to estimate the number of devices they sell.

    • @trueriver1950
      @trueriver1950 3 месяца назад +2

      Life, including the Y-T algorithm, is strange indeed

    • @stargazer7644
      @stargazer7644 Месяц назад

      The operative words here are "supposed to be". And nobody says they have to be assigned sequentially. Each organizationally unique identifier (OUI) can create 16 million unique MAC addresses. And you can have more than one OUI.

  • @heinaung6967
    @heinaung6967 3 месяца назад

    Thank you Brady for making these videos, every time I watch it motivates me to do my job better as an engineer/computer scientist

  • @JaniLaaksonen91
    @JaniLaaksonen91 3 месяца назад +23

    Would make a nice graph plotting your best guess of total tanks, pulling one tank at a time. Any time you get a new biggest number the plot would jump up, and when you get smaller numbers it will slowly descend as your average gap gets smaller. It would jerk up and down, approaching the actual total number.

    • @virt1one
      @virt1one 3 месяца назад +1

      agreed that would be nice to look at, though you'd want a larger set than 30. should start out as a line jumping up and down but rapidly smoothing out. After it calmed down a bit you could probably do a bit of "eyeball extrapolation" to get a more accurate estimate than the last prediction.
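The running estimate described in this thread can be sketched in a few lines of Python. The function name `running_estimates` is made up for illustration, and I am assuming the first sample from the video was drawn in the order 23, 15, 16, 1, 30; the values could then be fed to any plotting tool.

```python
def running_estimates(draws):
    """Estimate after each draw, using m + m/k - 1 (serials assumed to start at 1)."""
    estimates = []
    current_max = 0
    for k, serial in enumerate(draws, start=1):
        current_max = max(current_max, serial)   # jumps up on a new biggest number
        estimates.append(current_max + current_max / k - 1)
    return estimates


for k, est in enumerate(running_estimates([23, 15, 16, 1, 30]), start=1):
    print(k, round(est, 2))
```

With that draw order the estimate starts at 45, drifts down while no new maximum appears, then jumps to 35 when the 30 is drawn, which is the jerky behaviour described above.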

  • @monkeyboyDylan
    @monkeyboyDylan 2 месяца назад

    I came up with another way to estimate the number of tanks. Not sure which method is superior. First, I determined the general formula for the average from a set of numbers counting sequentially from 1 to N. I rearranged it to solve for N. Then you take your sample, determine the average and estimate N. Using the same samples as in the video I got N=33 and N=27. The average answer was 30!!
    I derived this as follows, writing A(x) for the average of the numbers 1 to x:
    A(1) = 1
    A(2) = 1.5
    A(3) = 2
    A(4) = 2.5
    A(x) = x - ((x - 1) * 0.5)
    = x - (0.5x - 0.5)
    = 0.5x + 0.5
    Setting Avg = 0.5x + 0.5 and solving for x:
    x = 2(Avg - 0.5)
    = 2Avg - 1
    (23, 15, 16, 1, 30)
    Avg = 17
    x = 33
    (3, 10, 15, 18, 24)
    Avg = 14
    x = 27

  • @betabenja
    @betabenja 3 месяца назад +6

    6:24 scary camera pan

  • @RAFAELSILVA-by6dy
    @RAFAELSILVA-by6dy 7 дней назад

    I got a solution to this problem in a maths challenge about eight years ago. My approach used conditional probabilities and the expected number of tanks in the bag would be:
    N = MAX(k -1)/(k-2)
    For five observations (k = 5), this gives N = MAX(4/3). This is higher than the average gap approach, which gives N = MAX(6/5) - 1
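The conditional-probability approach can be checked numerically. The Python sketch below sums the posterior for N directly, assuming a flat (improper) prior on N and k distinct serials drawn without replacement, so that the likelihood of seeing maximum m is proportional to 1/C(n, k) for n >= m. The function name and the truncation point `n_max` are my own choices. As far as I can tell, the closed form for this setup is (m - 1)(k - 1)/(k - 2), which is close to, but not quite the same as, the MAX(k - 1)/(k - 2) quoted above.

```python
from math import comb


def posterior_mean(m, k, n_max=20_000):
    """Posterior mean of N given max serial m from k draws (flat prior, truncated at n_max)."""
    total_weight = 0.0
    weighted_sum = 0.0
    for n in range(m, n_max + 1):
        w = 1 / comb(n, k)        # likelihood of max = m, up to a constant factor
        total_weight += w
        weighted_sum += n * w
    return weighted_sum / total_weight


m, k = 30, 5                       # first sample from the video
print(posterior_mean(m, k))        # numerical posterior mean
print((m - 1) * (k - 1) / (k - 2)) # closed form for comparison
print(m * (1 + 1 / k) - 1)         # average-gap estimate from the video
```

The sum only converges for k > 2, which matches the (k - 2) in the denominator.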

  • @Diekyl
    @Diekyl 3 месяца назад +17

    At first, I was perplexed about the method of estimating monthly production with just serial numbers, but I am glad they explained they had a way to decode the month and factory of the tank as well. I assumed some of these numbers must have been intentionally hidden or misleading.

    • @suit1337
      @suit1337 3 месяца назад +2

      no, they were just contracted to different manufacturers and sub-models (Ausführung) and were assigned specific number ranges
      the gearboxes, or more specifically the engines with the gear train attached, were often shared between different models; for example, the Panzer V Panther and Panzer VI Tiger shared the same engine platform, which differed only in minor details and power
      in the later stages of the war it was not uncommon to use what was in stock or repair tanks with parts from different models

  • @LeetHaxington
    @LeetHaxington 3 месяца назад +2

    Yeah a simple probability demo but this only works with the perhaps wild and reckless assumption that you’re dealing with linear incremented tanks.
    I didn’t know probability theory when I was 5, but even as a 5-year-old I would’ve been hesitant about writing such an obvious pattern on my secret tank production.
    Literally any information you provide can be used for something. Here it’s a serial number.
    Anyway, just for example, you could use powers of 2 as your serials. But they should have had some type of encryption for military use, even in WW2. And using WW2 logic they should’ve known to have a confusing and misleading numbering system, like a Fibonacci multiplier equation that hits every digit and looks linear until you get the full picture.
    This would let you spot fake serial numbers in tanks as a bonus in addition to concealment.
    Instead of counting anyway they should’ve been looking at supply in and out of factories. They should’ve known not enough metal was physically delivered for their answers.
    You kind of tailed off explaining why the spies were so bad

  • @pallavinavin4988
    @pallavinavin4988 3 месяца назад +12

    Love ur passion, professor

  • @meownezz
    @meownezz 3 месяца назад

    Information and mathematics once again showing their overwhelming and seemingly timeless relevance. 🙂

  • @PhilBoswell
    @PhilBoswell 3 месяца назад +13

    RUclips recommended me a short video by Hannah Fry about this very thing just this morning: I don't recall how old the video was but life is strange!

  • @shaun7163
    @shaun7163 3 месяца назад

    This guy is the absolute best and has been for years!

  • @aksela6912
    @aksela6912 3 месяца назад +18

    OK, what about this: As the sample size increases, the average of the sample will approach the average of the population, so let's estimate the average like that. For a uniform distribution starting at zero the maximum is simply two times the average, but in this example the minimum is one, so we'll just subtract one from our average. Using this method I get 32 and 28 tanks, respectively.

    • @cryme5
      @cryme5 3 месяца назад

      Or double the median. It would have been 32 and 30. Not sure which is usually closer, I feel like you need a Bayesian analysis with a prior.

    • @aksela6912
      @aksela6912 3 месяца назад

      Although these specific estimates have less error than the ones presented by James, on average his method will be better, at least for larger samples. I did some simulations, and for small samples, say three, it's pretty close, but James' method has a lot more bias.

    • @cryme5
      @cryme5 3 месяца назад

      ​@aksela6912 Funny thing is, no matter the prior you use, the posterior probability of N (the total number of tank) is just the prior truncated starting from M (the maximum of the observed serial numbers). In other words, a Bayesian answer, no matter the prior, should only depend on M (not even on the number of samples).

    • @aksela6912
      @aksela6912 3 месяца назад +2

      @@cryme5 For a uniform distribution the variance of the sample median will be greater than the variance of the sample mean, and as mean and median should be the same it will be better to use the one with less variance. I have to reiterate though, sample mean times two is a poor estimator, even if it feels more intuitive, and it feels like you're utilising the collected data better.

    • @EebstertheGreat
      @EebstertheGreat 3 месяца назад +1

      @@aksela6912 James's method is unbiased. If you observe n tanks and the maximum value you observe is m, then the minimum variance unbiased estimator is m + m/n - 1. Your estimator of twice the sample mean minus one is also unbiased, but its variance is higher. And it doesn't use the important information of the sample maximum, which means the estimate might actually give a value we _know_ is too small.

  • @mateodemicheli2420
    @mateodemicheli2420 3 месяца назад

    Awesome concept for a video. I love how you slowly explain each part of the puzzle and the graphs; it helped a lot. I'm subscribing right now

  • @GeekRedux
    @GeekRedux 3 месяца назад +8

    12:17 "But we broke that code, okay? That's another story." Well, now we've got to hear it! Enigma, or something else?

    • @TheBendermen
      @TheBendermen 3 месяца назад +1

      The Enigma machines were for coded communications, I think. I think he meant that the serial numbers themselves were coded; it isn't uncommon for different companies and factories to have different ways of doing things

  • @FayCarllyle
    @FayCarllyle 3 месяца назад +1

    The way we communicate with others and with ourselves ultimately determines the quality of our lives.

  • @MangoJones139
    @MangoJones139 3 месяца назад +4

    I really like Brady's talent for asking "good questions"

  • @jokoluna6978
    @jokoluna6978 3 месяца назад

    This video is brilliant! I knew about the story and always thought there was some really complicated math behind the scientists' work. Nicely explained, thanks! :)

  • @Xelopheris
    @Xelopheris 3 месяца назад +19

    I literally saw the Hannah Fry video about this yesterday and kind of assumed that this would be a Hannah Fry numberphile video.

  • @sabinrawr
    @sabinrawr 2 месяца назад

    Brady's final questions show amazing insight. My favorite anecdote involves SEAL Team Six. There was not a 5, they just used the number to make people think there were more teams. I don't know if this story is true, but I like it and it shows that you have to know the parameters of the numbers instead of assuming a sequence starting with 1.

  • @AloisMahdal
    @AloisMahdal 3 месяца назад +9

    I keep coming back to Brady's question at 11:41 -- if in my distribution, lower numbers are more likely, would there be an easy correction for that?

    • @forasago
      @forasago 3 месяца назад

      You would have to come up with a formula for how much more likely the lower numbers are and calculate some kind of upward bias out of that. I don't think the answer could be considered "easy", no.

    • @ArcaneOath
      @ArcaneOath 2 месяца назад +2

      For the purposes of war estimations, I suspect you'd find the opposite true, particularly as time goes on - the data would become skewed towards newer serials for everything, as older models were destroyed or made inoperable.
      Probably best to hash military serial numbers at manufacture time though, regardless.

  • @paladin656
    @paladin656 3 месяца назад

    I saw the thumbnail and thought this was going to be about counting tanks on the move in a column or formation, but the history tie-in made this really interesting. Thanks for this!

  • @spencerarmon4491
    @spencerarmon4491 3 месяца назад +7

    Would be cool to see the mathematical derivation of calculating the expected value of the tanks using an infinite sum of the probability at the beginning

    • @Last_Resort991
      @Last_Resort991 3 месяца назад

      It's not an infinite sum when N is finite. It has N elements

    • @spencerarmon4491
      @spencerarmon4491 3 месяца назад +1

      @@Last_Resort991to properly calculate the expected value, it would be an infinite sum from the max number seen to infinity

  • @beal_a
    @beal_a 3 месяца назад +3

    IIUC, this is also a problem where frequentist and bayesian techniques arrive at different answers. I'd love to see an explanation of that.

  • @fespa
    @fespa 3 месяца назад +4

    Another great and entertaining video. Thank you. I would love to read the paper about why the spies were so wrong.

    • @Jeff-jr4xw
      @Jeff-jr4xw 3 месяца назад +1

      Me too. I thought maybe they were being fed false information?

  • @bartz4439
    @bartz4439 3 месяца назад +1

    never stop your content! what a story!
    can you also cover how the serial number coding was broken, please?

  • @kleddit6400
    @kleddit6400 27 дней назад +5

    1:51 “Is that a German tank or?” *every tank enthusiast goes oof*

  • @Ojisan642
    @Ojisan642 2 месяца назад

    James Grime is really a fantastic educator.

  • @WAMTAT
    @WAMTAT 3 месяца назад +4

    Nothing better than James talking WW2

  • @willywodka
    @willywodka 3 месяца назад

    Love the honest enthusiasm on this formula!

  • @Eddy002
    @Eddy002 3 месяца назад +3

    I think the “failed” demo was perfect since you had to explain not only how it works, but also where the formula fails.
    Reminded me of school. The teacher would teach the easiest way to understand something, but then on a test it would be the hardest example/use of that formula. School failed, numberphile succeeded.

  • @matmar10
    @matmar10 2 месяца назад +1

    This is such a fun video on many levels.

  • @EXPLICITBG
    @EXPLICITBG 3 месяца назад +9

    “I will do one”
    Lo and behold, one he proceeded to do

  • @svenlima
    @svenlima 3 месяца назад +2

    It's the same question we posed as kids: "How do you count a herd of sheep?" - "You count the legs and divide the number by 4." At the time we found that funny.

  • @luketurner314
    @luketurner314 3 месяца назад +10

    10:24 speedrun

    • @I-md6mq
      @I-md6mq 3 месяца назад +4

      I saw this comment at 10:23, dang..

  • @filipgaming1233
    @filipgaming1233 2 месяца назад

    my first idea is that to assume that the range of serial numbers starts at 0 and you find the sample mean, then you can estimate the maximum value of the range by doubling the mean. assuming distribution is uniform, the mean of the distribution is at the midpoint, and doubling it gives an estimate of the maximum value ... my model did pretty well here: 34 and 29

  • @ventinor7451
    @ventinor7451 3 месяца назад +6

    Nothing like a James Grime Numberphile video.

  • @aivehn
    @aivehn 3 месяца назад

    Great overlap of math and history. Tanks a lot!

  • @sebastiandierks7919
    @sebastiandierks7919 3 месяца назад +6

    The British using spies, and the Germans orderly counting their gearboxes. How stereotypical! Nice presentation James Bond. Errr, Grime.

  • @axiezimmah
    @axiezimmah 3 месяца назад

    Before watching the video I had another method which I think works pretty well too.
    Because you're pulling random samples, the samples can be assumed to be distributed somewhat randomly along the whole range. So if you take the average of the numbers you have found, that can be assumed to be approximately the middle of the range. Multiply that by 2 and you should get close to the max of the range.

  • @nayhem
    @nayhem 3 месяца назад +3

    3:04 You lost the _Lost_ fans here.

  • @OsamaRana
    @OsamaRana 3 месяца назад

    This was a delight to watch, like all videos starring James.

  • @altf4218
    @altf4218 3 месяца назад +5

    Great video once again. Brilliantly explained by Dr Grime :D
    Now I'm tempted to do this with all the license plates I see on cars in a day, to estimate how many cars there are in my city. Although, I guess I would have to observe at least hundreds of license plates to get any meaningful results. I wonder also if you can construct statistical tests with p-values for these situations.

    • @TWX1138
      @TWX1138 3 месяца назад +4

      That may not work unless you know how the motor vehicle registration department issues plates, how geographically wide an area is covered by the issuing agency, what their policies on reissuing prior numbers after retirement of prior registrations are, and if they use any sub-ranges or have exclusion ranges.
      Where I live there are several license plate styles, and if one isn't applying for personalization of the text itself, each plate style has a different range of numbers associated with them. That sort of thing would need to be accounted for, in addition to the possibility that the plates are issued across the entire state.

    • @ragnkja
      @ragnkja 3 месяца назад +2

      @@TWX1138
      At least in Norway, it’s reasonably straightforward, since the two letters are fully dependent on where the vehicle is registered (or it being an electric vehicle, though nowadays electric vehicles can get geographical plates instead of E* plates if the first owner wants that), and the five digits are assigned consecutively (though do note that the first digit is never a 1).