Vsauce! Kevin here, and I’ve built a computer capable
of explaining how you get smarter. Out of these matchboxes, some colorful beads
and… Shrek. Real quick, before we get into the game we’re
about to play, I wanna tell you about the game that I’ve been playing. I partnered with Raid: Shadow Legends on this
video and if you follow me on Instagram you know that I’m a huge fan of RPGs. Well, Raid is the most immersive champion-collecting
experience you’ll get on a smartphone. It has a deep story, detailed graphics, giant
boss fights, and hundreds of champions to collect and customize. And you can play it free. So to check it out, download Raid using only
my link in the description to get 50,000 silver immediately and a free epic champion courtesy
of the dev team. Thanks again to them for supporting Vsauce2,
go check out their game, it’s amazing how far mobile gaming has come. Now let’s
get back to the inner workings of our game. Okay. 24 matchboxes, all filled with beads, and
covered in potential moves for the game we’re about to play and… this is our computer. Now, Shrek comes along and…wait. How is THIS a computer? Aren’t computers, just like, electronic
machines that run software? What IS a computer? Well, the earliest computer was YOU. Or… your ancestors. They used calculating machines like the abacus
to input information which output a result but we were the ones computing. The human operators of early calculating machines
were literally called “computers.” Okay, back to our matchboxes. Once we introduce a game board, Shrek and
the gang, this matchbox and bead setup processes our input, gives us an output and not only
that… it also learns. This is not just a computer, this is an artificial
intelligence machine capable of matching wits with the brightest minds humanity has to offer. At a game called Hexapawn. Here’s how. Hexapawn is based on chess — each player
has 3 pawns on a board with just 9 squares. The pieces move like chess pawns, too. They can go forward one space if that space
is unoccupied; if it is occupied by the opponent, then they can’t go forward. Sorry, Donkey. You can, however, move diagonally, but only
to take an opponent’s piece. There are three ways to win: Get a pawn to the other side of the board. Take all of your opponent’s pieces. Or leave your opponent without a possible
move, like a checkmate in chess. Our setup works like this: I’ve got 24 matchboxes
here, and each one corresponds to the position of pieces on the board during that round. I’ve got my Team Kevin pawns vs. the computer’s
Team Shrek. And do you know what that means? That means that we’ve officially turned
Hexapawn into: Shreksapawn. Alright. Let’s play. The human, that’s me, always goes first. Wait. Why? Because recreational mathematician Martin
Gardner said so. He actually created Hexapawn and its rules
as a simplified version of a 304-matchbox computer called MENACE. 15 years after helping the British break the
Nazis’ codes in World War II, Donald Michie invented MENACE to learn how to master Tic-Tac-Toe. And now 59 years later, I’m on YouTube playing
Shreksapawn. Since I go first, my moves occur in only the
odd-numbered rounds. 1, 3, 5 and 7. Therefore, the matchboxes are grouped by possible
Team Shrek moves in rounds 2, 4, and 6. One of us is guaranteed to win before Round
8. So, Team Shrek has no Round 8 moves. Each box contains one colored bead for each
potential move on that board position. So like this first box has a green, a blue,
and a purple. And I’ve cut a hole at the bottom of the box
that will only allow one bead at a time to fall out. So I’ll just shake the box and let one bead
out. And it’s purple. That means if my pawn was here and it was
Team Shrek’s move, Team Shrek would make the purple arrow move. Like this. If a blue bead had fallen out, then Team Shrek
would’ve made the blue arrow move. And if it was a green bead then Team Shrek
would’ve made the green arrow move. And taken my pawn. Okay so that’s how Team Shrek will move. Team Kevin will move however I want Team Kevin
to move because I’m Kevin and then we’ll play back and forth until there’s a winner. Alright, Round 1: Fight! I decide to move
Lord Farquaad forward. For Round 2 I now use this box to determine Team Shrek’s move. So we’ll give it a shake. Woah! Let’s try that again. And it’s the green move. So Donkey moves forward. Now it’s my turn and I decide that, look,
I can just take Princess Fiona when I move diagonally and win the game. That’s it. Now here’s the important part. When Team Shrek makes a losing move, I remove
that bead from the box. That way the computer can’t make the same
bad move the next time that this situation comes up. By removing its losing beads, the computer
learns to play better. When Team Shrek does win, then instead of
removing the bead I’ll just put the bead back in the box. Okay, I’m gonna play a bunch of rounds now
and I’ll keep track of wins over here, with a K when I win and I’ll write an S for a Shrek
win. Here we go. Okay, I’ve played 14 games. I started off winning a lot more than I was
losing… and then things changed. Out of the last 7 games, Team Shrek has won
6 of them. The computer is clearly getting better at
the game… but is it really learning? I mean, I’m just taking beads out of matchboxes.
How is that learning? What is learning? At the most basic level, learning is acquiring
new knowledge or a new skill, or modifying an existing behavior. Every time I take a bead out of a matchbox,
the computer loses a behavior that leads it to an outcome of failure. That increases the probability that the computer’s
move each round leads it to success — which in our case, is winning Shreksapawn. After a sufficient number of games, the computer
will evolve to play perfectly. My Team Shrek computer may not be thinking
on its own, but it is learning. And it can also learn in a different way. Removing beads is basically a form of learning
by punishment. When Team Shrek makes a bad move, I’m punishing
the computer for being wrong. I don’t have to worry about the computer
feeling bad about losing, these matchboxes aren’t gonna get frustrated and quit playing
and run away crying and slam the door in my face and tell me I’m not their real dad. But what happens if instead of punishing my
computer, I reward it? Instead of just putting the good play bead
back in the box when the computer wins, I could add another bead of the same color that
made the winning move. That would reduce the probability of a losing
bead appearing by increasing the probability of a matchbox generating a winning bead. The computer would still eventually reach
perfect play because I’ll still remove the losing beads, but it will take longer because
it’s winning more often. If it could feel, it would probably feel better
about winning more often along its longer journey toward perfection. So the fastest way to perfect play is by punishing
the computer’s mistakes. But the way to win as many games as possible
along the way is to reward its victories. To improve at Hexapawn, our matchbox computer
actually uses a type of genetic algorithm. It’s a way to solve problems and learn based
on natural selection. Based on the process that drives biological
evolution. The beads of learning in your life may be
refined by punishment. Put your hand on a hot stove once, and learn
that, “Ow! That’s painful.” So you remove the touch-hot-stove-bead from
your brain. They may also be augmented by rewards. “My parents bought me ice cream for getting
an A on my exam.” Add another get-good-grades-bead to your matchbox
head computer. Hexapawn is an obscure, academic game from
over 50 years ago, and you can make a matchbox computer that learns to win every time. But by allowing this matchbox computer full
of colored beads to learn, the player who’s learning a bit more about learning is… you. And as always, thanks for watching. If you wanna make your own matchbox, oh I
lost a bead, matchbox computer, download my template for free over at Twitter.com/VsauceTwo. That’s at Vsauce T, W, O. If you wanna watch more Vsauce2 videos, just
uh click over here, and if you aren’t subscribed to Vsauce2 then maybe you should uh, put a,
“subscribe to Vsauce2” bead in your brain. Wow. That was weirdly creepy.
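
The bead-removal scheme described in the transcript can be sketched in code. The Python below is a minimal sketch, not the video's or Martin Gardner's implementation. The board encoding (a tuple of nine cells, `S` for Team Shrek at the top, `K` for Team Kevin at the bottom), the function names, the randomly moving human, and the rule that an empty box counts as a resignation are all assumptions made for illustration. Only the computer's last move is punished after a loss, and the optional `reward` flag adds one extra bead for the last move after a win.

```python
import random

# Rows are listed top to bottom; 'S' (computer) moves down, 'K' (human) moves up.
START = ("S", "S", "S", ".", ".", ".", "K", "K", "K")

def moves(board, player):
    """Return every legal (source, destination) move for the given player."""
    step, foe = (-3, "S") if player == "K" else (3, "K")
    result = []
    for src, piece in enumerate(board):
        if piece != player:
            continue
        fwd = src + step
        if not 0 <= fwd < 9:
            continue
        if board[fwd] == ".":
            result.append((src, fwd))            # forward only into an empty square
        for dcol in (-1, 1):                     # diagonal moves must capture
            col = src % 3 + dcol
            if 0 <= col < 3 and board[fwd + dcol] == foe:
                result.append((src, fwd + dcol))
    return result

def play(board, move):
    """Return the board after moving a pawn from move[0] to move[1]."""
    src, dst = move
    cells = list(board)
    cells[dst], cells[src] = cells[src], "."
    return tuple(cells)

def has_won(board, player):
    """True if `player`, who just moved, reached the far row or left the foe stuck."""
    foe = "S" if player == "K" else "K"
    goal = board[0:3] if player == "K" else board[6:9]
    return player in goal or not moves(board, foe)   # no pieces also means no moves

def play_game(boxes, rng, reward=False):
    """Play one game against a random human; return 'K' or 'S' for the winner.

    `boxes` maps a board position to its list of beads (moves still allowed).
    """
    board, last = START, None                    # last = (box, bead) of the latest draw
    while True:
        board = play(board, rng.choice(moves(board, "K")))
        if has_won(board, "K"):
            box = []
        else:
            box = boxes.setdefault(board, moves(board, "S"))
        if not box:                              # human won, or the box is empty (resign)
            if last:
                last[0].remove(last[1])          # punish: take out the losing bead
            return "K"
        bead = rng.choice(box)                   # shake the box, let one bead fall out
        last = (box, bead)
        board = play(board, bead)
        if has_won(board, "S"):
            if reward:
                box.append(bead)                 # reward: add a bead of the same color
            return "S"

if __name__ == "__main__":
    rng = random.Random(1)
    boxes = {}
    results = [play_game(boxes, rng) for _ in range(50)]
    print("".join(results))                      # one letter per game, like the K/S tally
```

Because a bead is only removed after a move that let the human win, or that led to a box with no beads left, the boxes can shrink but never gain an illegal move. This sketch does not merge mirror-image positions, so it may create more boxes than the 24 used in the video.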

The Game That Learns

100 thoughts on “The Game That Learns”

  • March 19, 2019 at 12:18 am
    Permalink

    I wanted to respond to two types of comments that have appeared more than once. I read all the comments and really appreciate when you all dig deep into these topics.

    First, we're missing some matchboxes because we don't actually need them! Some matchboxes work for two scenarios — once for the board position they display, and also for the board position that is a mirror image of it. The computer learns both board positions at the same time, but yes, at first glance it appears as though I just left some out. Martin Gardner didn't think they were necessary, either.

    Second, Hexapawn is a much simpler version of chess, so terms like "checkmate" and "stalemate" aren't exactly the same. They're simpler, too. In chess, checkmate is achieved when your opponent's king is under attack and there is no way for your opponent to move without the king being captured. A stalemate occurs when a player who is not in check has no legal move. A stalemate results in a draw.

    So, when that occurs in Hexapawn, it has the trappings of a stalemate but has the result and the spirit of a checkmate — the win is awarded to the player who moves in a way that creates a stalemate for their opponent. Because the situation results in a win instead of a draw, I thought it was more appropriate to compare it to checkmate, though it may have been clearer to avoid the language of "checkmate" entirely.

    Reply
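
The mirror-image point in the comment above can be illustrated with a short sketch. It assumes a board stored as a tuple of nine cells, row by row, and the helper names `mirror`, `mirror_square`, and `canonical` are hypothetical. It is one possible way to let a position and its left-right reflection share a single matchbox, not a description of how the video's boxes were labelled.

```python
def mirror(board):
    """Flip a 3x3 board (tuple of 9 cells, row by row) left to right."""
    return tuple(board[row * 3 + (2 - col)] for row in range(3) for col in range(3))

def mirror_square(index):
    """Map a square index (0..8) to its left-right mirror image."""
    return index // 3 * 3 + (2 - index % 3)

def canonical(board):
    """Return (key, flipped): one shared key for a position and its mirror image.

    `flipped` is True when the key is the mirrored board, so that moves
    looked up under the key can be mirrored back before they are played.
    """
    flipped = mirror(board)
    if flipped < board:          # pick the smaller tuple as the shared key
        return flipped, True
    return board, False
```

A bead drawn from a shared box describes a move in the key's orientation. When `flipped` is True, both squares of that move would be passed through `mirror_square` before the move is made on the real board.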
  • June 27, 2019 at 7:55 am
    Permalink

    Everything was great until the sponsors

    Reply
  • June 27, 2019 at 9:22 am
    Permalink

    Natural selection doesn't produce anything, it "selects".

    Reply
  • June 28, 2019 at 12:35 pm
    Permalink

    Nobody:

    Stan twitter: SKKSKSKSKSKSK

    Reply
  • June 30, 2019 at 8:37 am
    Permalink

    This took too long, but if you want the template, here it is https://mobile.twitter.com/VsauceTwo/status/1107733737364770817

    Reply
  • July 2, 2019 at 6:17 am
    Permalink

    You know, wouldn't it be cool if the machine also learned from others' mistakes rather than just its own? Then perfecting and more wins could both happen… Bet the creator of Hexapawn didn't think of that!

    Reply
  • July 3, 2019 at 1:23 am
    Permalink

    10:27 he sounds like Hiccup from How to Train Your Dragon

    Reply
  • July 3, 2019 at 7:52 am
    Permalink

    WHAT HAS RAID NOT INFECTED!?

    Reply
  • July 3, 2019 at 11:53 am
    Permalink

    Bad mobile game

    Reply
  • July 5, 2019 at 12:17 am
    Permalink

    This was a brilliant video!!!! I was really excited about the Menace simulation that StandUpMaths did but this is WAY more practical. So SO happy you made this & YouTube recommended it! Definitely trying this out.

    Reply
  • July 5, 2019 at 3:30 am
    Permalink

    I was about to click to another video and then he said Shrek

    Reply
  • July 5, 2019 at 4:49 am
    Permalink

    Seriously all that for a Shrek pun?

    I love it

    Reply
  • July 5, 2019 at 7:03 pm
    Permalink

    Person: spills the tea
    Another person: SKSKSKSK

    Reply
  • July 5, 2019 at 7:04 pm
    Permalink

    Who else hates when Vsauce channels put too much emphasis on pauses?

    Reply
  • July 6, 2019 at 6:05 am
    Permalink

    What happens when a matchbox is empty? 404?

    Reply
  • July 8, 2019 at 8:59 am
    Permalink

    Am i the only one who understood Shrek‘s-a-porn?

    Reply
  • July 8, 2019 at 9:56 pm
    Permalink

    using the real hexapawn to show people how it's done
    vsauce: nah
    using shrekapawn to get more view…
    vsauce:stonk!!

    Reply
  • July 9, 2019 at 2:40 am
    Permalink

    Really interesting way of showing a brute-force depth-first search, although this algorithm doesn't work for difficult games like Go or chess because the state space is too large for brute-force algorithms.

    Reply
  • July 9, 2019 at 2:32 pm
    Permalink

    thank god you lost the third round

    Reply
  • July 9, 2019 at 2:33 pm
    Permalink

    If I put two computers against each other, will they end up with the same algorithm?

    Reply
  • July 10, 2019 at 6:05 am
    Permalink

    if only this was in harry potter

    Reply
  • July 10, 2019 at 6:50 am
    Permalink

    Actually in chess if your opponent can't move its check, checkmate is when it they move their king you won't get them

    Reply
  • July 10, 2019 at 10:06 am
    Permalink

    Next Video : WhAt Is A dUcK?

    Reply
  • July 10, 2019 at 8:28 pm
    Permalink

    WOW the only sauce vid my semi smart 11 year old brain can compute

    Reply
  • July 11, 2019 at 3:33 pm
    Permalink

    Shreksapawn
    Shreks a pawn

    Reply
  • July 11, 2019 at 4:13 pm
    Permalink

    “Add another get-good-Grades-bead to your matchbox head computer”

    If only it was that easy

    Reply
  • July 12, 2019 at 11:19 am
    Permalink

    Naughty kids gets the beads removed

    Reply
  • July 12, 2019 at 3:26 pm
    Permalink

    If you start off moving the middle piece forward you win every time though, the computer can't learn to react to that if you play it well.

    Reply
  • July 14, 2019 at 3:24 am
    Permalink

    It's all fun and games until Skynet.

    Reply
  • July 15, 2019 at 5:23 am
    Permalink

    Literally the best physical example of natural selection and computational evolution I've seen thus far, and I am tempted to do 3 things:

    1. Run a C++ script that does this
    2. Do this IRL
    3. See if humans/boxes and or C++ could do this on a much grander scale (i.e. full game of chess)

    For a full game of chess, I'm thinking the bead corresponds to a piece and all possible moves, but this would take years to truly make (more chess combinations than people on earth circa 1985 if I remember correctly) let alone truly have a winning "computer". On top of that, I would need literal THOUSANDS of INDIVIDUAL BEADS per Altoids tin (face it… match boxes won't work)… thank whatever all benevolent deity you believe in for C++ and actual electronic computers….

    Reply
  • July 15, 2019 at 1:25 pm
    Permalink

    I see something that I don't want to see and don't want to put on my skin

    Reply
  • July 15, 2019 at 8:02 pm
    Permalink

    I have a question, in retrospect I think the answer should be fairly obvious. Wouldn’t it just be faster to both reward and punish the computer to get it to learn faster? In my mind, the answer is a resounding yes. If you removed bad beads and add in good beads? I think it would learn twice as fast, wouldn’t it?

    Reply
  • July 16, 2019 at 9:51 pm
    Permalink

    if S wins, you could remove all other beads

    Reply
  • July 17, 2019 at 1:19 pm
    Permalink

    man. could you please, please don't pause when you are talking. it's so DUMMMMMMM when you do that.

    Reply
  • July 18, 2019 at 3:57 am
    Permalink

    What is your favorite game?

    9 year olds:fOrTnItE
    Kevin:S H R E K S A P A W N

    Reply
  • July 18, 2019 at 7:26 pm
    Permalink

    The title sounds like an SCP.

    Reply
  • July 19, 2019 at 1:09 am
    Permalink

    How about you set up one box set like this and another one for playing on the odd turns, and then play them against each other and see how quickly they learn? Or maybe adapt it to a slightly bigger game; say, how many matchboxes would be needed for a 4×4 Octapawn game?

    Reply
  • July 19, 2019 at 3:49 pm
    Permalink

    If you won the 3rd round, that would be interesting

    Reply
  • July 19, 2019 at 4:24 pm
    Permalink

    So. CHESS MATCHBOX AI NEXT TIME?

    Reply
  • July 19, 2019 at 5:00 pm
    Permalink

    nobody:
    VSCO girl: 7:34

    Reply
  • July 19, 2019 at 9:01 pm
    Permalink

    hey its simplified q learning

    Reply
  • July 20, 2019 at 9:12 am
    Permalink

    I was just waiting for Kevin to win 3 times in a row

    Reply
  • July 20, 2019 at 1:07 pm
    Permalink

    This is how I learn new things
    Edit : in a fun way

    Reply
  • July 20, 2019 at 9:17 pm
    Permalink

    This video is good because of shrek

    Reply
  • July 22, 2019 at 12:22 am
    Permalink

    Time for me to do this on a modern computer

    Reply
  • July 22, 2019 at 2:23 am
    Permalink

    With that win count system, good thing you didn't win 3 in a row…

    Reply
  • July 22, 2019 at 6:57 am
    Permalink

    I didn't expect to hear Synthetic Life in this video, but it's always welcome.

    Reply
  • July 23, 2019 at 3:00 pm
    Permalink

    I love machine learning

    Reply
  • July 23, 2019 at 11:09 pm
    Permalink

    Shrek-s-a-pawn

    Reply
  • July 24, 2019 at 2:58 pm
    Permalink

    I have a question: What would happen if two perfectly trained "computers" would play against each other?

    Reply
  • July 26, 2019 at 7:28 am
    Permalink

    Me: moves right pawn foward for first move

    Computer: NANI? WHAT DO I DO.

    Reply
  • July 27, 2019 at 9:30 pm
    Permalink

    This is the earliest use of Q learning

    Reply
  • July 27, 2019 at 10:36 pm
    Permalink

    Avoids KKK –> gets SS instead
    Damn…

    Reply
  • July 28, 2019 at 2:06 am
    Permalink

    basically q learning

    Reply
  • July 28, 2019 at 2:23 am
    Permalink

    When your mom is arguing with you Shrek isn't educational.

    Reply
  • July 28, 2019 at 4:07 pm
    Permalink

    Is dis like Q pr0gramign?

    Reply
  • July 29, 2019 at 7:25 pm
    Permalink

    5:10
    Kevin: I AM KEVIN.
    Kevin: Or am I?
    *weird vsauce music plays*

    Reply
  • July 29, 2019 at 11:57 pm
    Permalink

    O god I love this video

    Reply
  • July 30, 2019 at 12:41 pm
    Permalink

    This is just the plot of Shrek 5 Change my mind.

    Reply
  • July 30, 2019 at 7:09 pm
    Permalink

    S H R E K

    Reply
  • July 30, 2019 at 10:52 pm
    Permalink

    imagine if kevin won the 3rd game
    That would've been kk…

    Reply
  • July 31, 2019 at 3:12 am
    Permalink

    "Shreksapon"
    The world's sexiest game

    Reply
  • July 31, 2019 at 1:20 pm
    Permalink

    8:50 this got dark quite fast

    Reply
  • August 1, 2019 at 1:39 am
    Permalink

    Ah yes, the game that I was so ambitious to remake that I relearned JavaScript

    Reply
  • August 1, 2019 at 8:12 am
    Permalink

    7:54 you scared the heck out of me

    Reply
  • August 1, 2019 at 1:47 pm
    Permalink

    Me: Some-
    A chain of comments: Body once told me the world is gonna rule me-
    You: I ain't the sharpest tool in the shed

    Please continue:
    SOME BODY-

    Reply
  • August 1, 2019 at 4:57 pm
    Permalink

    Skillfully dodging 3 `K`'s in a row

    Reply
  • August 3, 2019 at 2:38 am
    Permalink

    this was made on my bday this year

    Reply
  • August 3, 2019 at 9:29 am
    Permalink

    Technically it can learn, but it's actually not learning. Let's say the computer makes a good move and the bead is removed: next time it could make a bad move because it can't make the good one again.

    Reply
  • August 4, 2019 at 10:50 am
    Permalink

    1:26

    Kevin: What IS a Computer?

    A computer is a machine or device that performs processes, calculations and operations based on instructions provided by a software or hardware program. It is designed to execute applications and provides a variety of solutions by combining integrated hardware and software compone.

    There you go, Happy?

    Reply
  • August 5, 2019 at 2:33 am
    Permalink

    I shall build a matchbox computer that can beat my friend at Yu-Gi-Oh!

    Reply
  • August 8, 2019 at 3:23 pm
    Permalink

    SHRREEEEEKKKK

    Reply
  • August 10, 2019 at 3:11 am
    Permalink

    in german computer translates to the literal meaning of calculator and calculator translates to the literal meaning of pocket calculator

    Reply
  • August 10, 2019 at 4:29 am
    Permalink

    10:26 that's hot that's hot

    Reply
  • August 11, 2019 at 1:18 am
    Permalink

    Do more gameplay pls

    Reply
  • August 14, 2019 at 11:19 pm
    Permalink

    I get smarter by watching shriek dank memes

    Reply
  • August 16, 2019 at 7:44 am
    Permalink

    1:27 people in 5034

    Reply
  • August 16, 2019 at 1:36 pm
    Permalink

    bruhxapawn

    Reply
  • August 16, 2019 at 8:07 pm
    Permalink

    Millionth view!

    Reply
  • August 17, 2019 at 12:44 am
    Permalink

    Can you imagine if Kevin got 3 wins in a row,,,,

    Reply
  • August 18, 2019 at 6:09 am
    Permalink

    Nobody:
    Kevin: WHAT IS LEARNING?!

    Reply
  • August 18, 2019 at 6:25 am
    Permalink

    Shrekmate

    Reply
  • August 19, 2019 at 2:45 am
    Permalink

    best anime episode

    Reply
  • August 19, 2019 at 3:16 am
    Permalink

    Don’t mean to be nitpicky, but 10:03 is actually slightly incorrect. The algorithm you used is Q-Learning, where you reward a computer for doing what you want and punish it for doing what you don’t want. The natural selection algorithm you described is more in line with NEAT, where multiple AI systems get tested at once, with the best ones “breeding” the next generation.

    Reply
  • August 20, 2019 at 2:30 am
    Permalink

    Somebody once told me the world was gonna roll me

    Reply
  • August 20, 2019 at 9:38 am
    Permalink

    What if two computers played with each other? Will they both become perfect? And what if a new computer played with a perfect computer?

    Reply
  • August 20, 2019 at 4:14 pm
    Permalink

    why are there only 2 moves for computer in second round and not three?

    Reply
  • August 22, 2019 at 5:53 am
    Permalink

    That "touch hot stove" bead…

    Reply
  • August 22, 2019 at 3:55 pm
    Permalink

    Imagine Kevin wins three times in a row…

    Reply
  • August 25, 2019 at 6:32 am
    Permalink

    Notice he was losing the third and sixth game on purpose :p

    Reply
  • August 26, 2019 at 7:50 pm
    Permalink

    You can make as many excuses and give us as many papers and theories as you want – we both know this is a game designed for an only child whose parents are busy.

    Reply
  • August 26, 2019 at 11:10 pm
    Permalink

    Why are you on YouTube? You should be a Science Teacher👨🏻‍🔬

    Reply
  • August 27, 2019 at 2:12 am
    Permalink

    6:49
    He got lucky.

    Reply
  • August 27, 2019 at 10:30 am
    Permalink

    2:58 Actually, that's a stalemate in chess, not a checkmate.

    Reply
