Ten years ago, computer vision researchers thought that getting a computer to tell the difference between a cat and a dog would be almost impossible, even with the significant advance in the state of artificial intelligence. Now we can do it at a level greater than 99 percent accuracy. This is called image classification (give it an image, put a label to that image), and computers know thousands of other categories as well.

I’m a graduate student at the University of Washington, and I work on a project called Darknet, which is a neural network framework for training and testing computer vision models. So let’s just see what Darknet thinks of this image that we have. When we run our classifier on this image, we see we don’t just get a prediction of dog or cat, we actually get specific breed predictions. That’s the level of granularity we have now. And it’s correct. My dog is in fact a malamute.

So we’ve made amazing strides in image classification, but what happens when we run our classifier on an image that looks like this? Well … We see that the classifier comes back with a pretty similar prediction. And it’s correct, there is a malamute in the image, but just given this label, we don’t actually know that much about what’s going on in the image. We need something more powerful.

I work on a problem called object detection, where we look at an image and try to find all of the objects, put bounding boxes around them and say what those objects are. So here’s what happens when we run a detector on this image. Now, with this kind of result, we can do a lot more with our computer vision algorithms. We see that it knows that there’s a cat and a dog. It knows their relative locations, their size. It may even know some extra information. There’s a book sitting in the background. And if you want to build a system on top of computer vision, say a self-driving vehicle or a robotic system, this is the kind of information that you want. You want something so that you can interact with the physical world.

Now, when I started working on object detection, it took 20 seconds to process a single image. And to get a feel for why speed is so important in this domain, here’s an example of an object detector that takes two seconds to process an image. So this is 10 times faster than the 20-seconds-per-image detector, and you can see that by the time it makes predictions, the entire state of the world has changed, and this wouldn’t be very useful for an application. If we speed this up by another factor of 10, this is a detector running at five frames per second. This is a lot better, but for example, if there’s any significant movement, I wouldn’t want a system like this driving my car.

This is our detection system running in real time on my laptop. So it smoothly tracks me as I move around the frame, and it’s robust to a wide variety of changes in size, pose, forward, backward. This is great. This is what we really need if we’re going to build systems on top of computer vision. (Applause)

So in just a few years, we’ve gone from 20 seconds per image to 20 milliseconds per image, a thousand times faster. How did we get there? Well, in the past, object detection systems would take an image like this and split it into a bunch of regions and then run a classifier on each of these regions, and high scores for that classifier would be considered detections in the image. But this involved running a classifier thousands of times over an image, thousands of neural network evaluations to produce detection. Instead, we trained a single network to do all of detection for us. It produces all of the bounding boxes and class probabilities simultaneously. With our system, instead of looking at an image thousands of times to produce detection, you only look once, and that’s why we call it the YOLO method of object detection.

So with this speed, we’re not just limited to images; we can process video in real time. And now, instead of just seeing that cat and dog, we can see them move around and interact with each other. This is a detector that we trained on 80 different classes in Microsoft’s COCO dataset. It has all sorts of things like spoon and fork, bowl, common objects like that. It has a variety of more exotic things: animals, cars, zebras, giraffes.

And now we’re going to do something fun. We’re just going to go out into the audience and see what kind of things we can detect. Does anyone want a stuffed animal? There are some teddy bears out there. And we can turn down our threshold for detection a little bit, so we can find more of you guys out in the audience. Let’s see if we can get these stop signs. We find some backpacks. Let’s just zoom in a little bit. And this is great. And all of the processing is happening in real time on the laptop.

And it’s important to remember that this is a general purpose object detection system, so we can train this for any image domain. The same code that we use to find stop signs or pedestrians, bicycles in a self-driving vehicle, can be used to find cancer cells in a tissue biopsy. And there are researchers around the globe already using this technology for advances in things like medicine, robotics. This morning, I read a paper where they were taking a census of animals in Nairobi National Park with YOLO as part of this detection system. And that’s because Darknet is open source and in the public domain, free for anyone to use. (Applause)

But we wanted to make detection even more accessible and usable, so through a combination of model optimization, network binarization and approximation, we actually have object detection running on a phone. (Applause)

And I’m really excited because now we have a pretty powerful solution to this low-level computer vision problem, and anyone can take it and build something with it. So now the rest is up to all of you and people around the world with access to this software, and I can’t wait to see what people will build with this technology. Thank you. (Applause)
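
The talk contrasts running a classifier on thousands of regions with a single network pass, and later shows the detection threshold being turned down to find more objects. A rough sketch of both ideas follows. Everything named here is hypothetical and for illustration only: the `Detection` type, the `filter_detections` and `sliding_window_evaluations` helpers, the scores, and the window sizes are assumptions, not Darknet's actual API or output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One predicted box: a class label, a confidence score, and pixel corners."""
    label: str
    score: float
    box: tuple  # (x_min, y_min, x_max, y_max); hypothetical layout


def filter_detections(detections, threshold):
    """Keep only detections whose confidence is at or above the threshold."""
    return [d for d in detections if d.score >= threshold]


def sliding_window_evaluations(image_w, image_h, window, stride):
    """Count classifier calls for a region-by-region detector at one scale."""
    cols = (image_w - window) // stride + 1
    rows = (image_h - window) // stride + 1
    return cols * rows


# Made-up output of a single forward pass over one frame.
frame_detections = [
    Detection("dog", 0.92, (40, 60, 300, 420)),
    Detection("cat", 0.81, (320, 150, 520, 400)),
    Detection("book", 0.34, (500, 20, 600, 90)),
    Detection("backpack", 0.18, (10, 10, 80, 120)),
]

strict = filter_detections(frame_detections, threshold=0.5)
relaxed = filter_detections(frame_detections, threshold=0.25)

print([d.label for d in strict])   # higher threshold: fewer, more confident boxes
print([d.label for d in relaxed])  # lower threshold: more boxes kept
# One 64-pixel window slid over a 640x480 image in 8-pixel steps:
print(sliding_window_evaluations(640, 480, window=64, stride=8))
```

With these assumed numbers, a single window size already needs several thousand classifier calls per image, whereas the single-pass approach produces all boxes from one evaluation and leaves only the cheap threshold filter to run afterwards.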

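The speed figures quoted in the talk (20 seconds, two seconds, five frames per second, 20 milliseconds) can be checked with simple arithmetic. The helper names below are hypothetical, and latencies are written in milliseconds as an assumption to keep the division exact.

```python
def frames_per_second(ms_per_image):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000 / ms_per_image


def speedup(old_ms, new_ms):
    """How many times faster the new latency is than the old one."""
    return old_ms / new_ms


# Per-image latencies mentioned in the talk, in milliseconds.
latencies_ms = {
    "early detector": 20_000,
    "10x faster": 2_000,
    "another 10x": 200,
    "real time": 20,
}

for name, ms in latencies_ms.items():
    print(f"{name}: {ms} ms/image = {frames_per_second(ms):.2f} fps")

print(speedup(latencies_ms["early detector"], latencies_ms["real time"]))
```

The 200 ms detector works out to the five frames per second mentioned in the talk, and going from 20 seconds to 20 milliseconds is the thousandfold speedup.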
How computers learn to recognize objects instantly | Joseph Redmon

100 thoughts on “How computers learn to recognize objects instantly | Joseph Redmon”

  • May 18, 2019 at 8:24 pm
    Permalink

    5:32 Detected a parrot as pizza.

    This is how the flesh-eating robots begin.

    Reply
  • May 18, 2019 at 9:27 pm
    Permalink

    calls it darknet come on the shadiest name you can think of

    Reply
  • May 18, 2019 at 10:28 pm
    Permalink

    this is amazing!

    Reply
  • May 19, 2019 at 12:24 am
    Permalink

    TELL THE DIFFERENCE WHETHER TO DEEM YOU AS A THREAT AND ERASE YOU

    Reply
  • May 19, 2019 at 12:27 am
    Permalink

    THIS IS THE GUY THAT IS GOING TO BE RESPONSIBLE FOR KILLING YOUR CHILDREN AND GRANDCHILDREN

    Reply
  • May 19, 2019 at 1:28 am
    Permalink

    Just because you can build something doesn't mean you should

    Reply
  • May 19, 2019 at 3:57 am
    Permalink

    Unless you are in some hacker conference there is no use of such red fancy fonts.

    Reply
  • May 19, 2019 at 6:11 am
    Permalink

    Such a bad presentation omg…
    I don't even know how anything he says has anything to do with the complication on the screen

    Reply
  • May 19, 2019 at 7:07 am
    Permalink

    i can't believe this appears in my feed after 2 years, mid 2019

    shame on you, yt

    Reply
  • May 19, 2019 at 7:44 am
    Permalink

    What if i suddenly start running faster?

    Reply
  • May 19, 2019 at 3:27 pm
    Permalink

    "I can't wait for the globalists to take over using this technology, thank you very much. *applause*"

    Reply
  • May 19, 2019 at 3:56 pm
    Permalink

    What kind of skateboard is this? 3:21

    Reply
  • May 19, 2019 at 7:37 pm
    Permalink

    The most amazing thing.. is not just to build things but also to share.

    Be sure you are great by sharing what you think will make our life better.

    Thank you for the video.

    Reply
  • May 19, 2019 at 8:57 pm
    Permalink

    yooooooooooo what is that theme? LMAOooooooooooooooooooooo

    Reply
  • May 19, 2019 at 9:02 pm
    Permalink

    This doesn't even say how they learn that but only that they do and that they do it fast (SYAC). Thanks for clickbaiting me, boo.

    Reply
  • May 20, 2019 at 3:52 am
    Permalink

    nice, just what i was waiting for, i need a robot to kill my stalker

    Reply
  • May 20, 2019 at 5:30 am
    Permalink

    check out the Chinese "sky net" system

    Reply
  • May 20, 2019 at 11:15 am
    Permalink

    But can they distinguish drawings?!

    Reply
  • May 20, 2019 at 2:00 pm
    Permalink

    this man is key to future

    Reply
  • May 20, 2019 at 2:03 pm
    Permalink

    please tell him to optimize it for the RPI

    Reply
  • May 20, 2019 at 4:21 pm
    Permalink

    3:22 skateboard.

    Reply
  • May 20, 2019 at 4:42 pm
    Permalink

    Damn that's kewl ..

    Reply
  • May 20, 2019 at 4:54 pm
    Permalink

    speechless..

    Reply
  • May 20, 2019 at 8:15 pm
    Permalink

    4:30 When you tell a joke, but no one catches on…

    Reply
  • May 20, 2019 at 8:19 pm
    Permalink

    -"Darknet"
    -All red and black
    -Logo resembles some sort of satanistic stuff
    -Allows for robots to detect objects quickly

    Seems not nefarious at all.

    Reply
  • May 20, 2019 at 9:18 pm
    Permalink

    Rescue Huawei ⛑️ Boycott Google 🚫

    Reply
  • May 21, 2019 at 1:26 am
    Permalink

    things government will take behind the stage and hide and run

    Reply
  • May 21, 2019 at 2:34 am
    Permalink

    (future windows 10 update) "integrated fixes for threat analysis and solution options" a week later half the world population is dead with computer overlords maximizing power resources.

    Reply
  • May 21, 2019 at 5:34 am
    Permalink

    But can it detect who is a good boy?

    Reply
  • May 21, 2019 at 9:39 am
    Permalink

    You can combine this system and AI so the computer can learn our world and evolve by itself, then imagine! Finally we can create an artificial human. A robot that has feelings like us and is so smart, can count quickly like calculators, can speak all languages like Google Translate, can easily memorize things, and also can make strategy 1000 steps ahead just like rendering.

    Reply
  • May 21, 2019 at 10:19 am
    Permalink

    All of this is done using Intelligent Design.

    Reply
  • May 21, 2019 at 10:30 am
    Permalink

    Elon Musk did this in his 3rd grade science fair.

    Reply
  • May 21, 2019 at 11:11 am
    Permalink

    It may detect the difference between a dog and a cat but they are as important to it as the book in the background. Need way more than realtime object detection before i let an AI make life or death decisions.

    Reply
  • May 21, 2019 at 12:05 pm
    Permalink

    Didn't Jian Yang create this in Erlich Bachman's incubator?

    Reply
  • May 21, 2019 at 10:32 pm
    Permalink

    You're a genius, dude

    Reply
  • May 22, 2019 at 12:55 am
    Permalink

    3:21 Skateboard 😂😂

    Reply
  • May 22, 2019 at 2:02 am
    Permalink

    but it doesn't know chair at 2:05

    Reply
  • May 22, 2019 at 4:31 am
    Permalink

    This is the guy with the MLP resume

    Reply
  • May 22, 2019 at 6:36 am
    Permalink

    4:44 big cat

    Reply
  • May 22, 2019 at 11:01 am
    Permalink

    Isn't this kind of "spam" in the form of a TED talk?

    Reply
  • May 22, 2019 at 2:41 pm
    Permalink

    "see food" you'll get this if you watch Silicon Valley 😂

    Reply
  • May 22, 2019 at 2:45 pm
    Permalink

    This guy is odd

    Reply
  • May 23, 2019 at 3:13 am
    Permalink

    This would have been far more valuable if he demystified the AI behind the tech so it's not so spooky to people, undoing some of the damage done by science/tech journalism. Missed opportunity. Instead, TED provides a platform for someone to flex on and on about their research paper. What is this twitter?

    Reply
  • May 23, 2019 at 9:09 am
    Permalink

    Just some good old wizards and warlocks trying to convince u into satanism … nothing new under "the sun"

    Reply
  • May 23, 2019 at 10:14 am
    Permalink

    3:23 detected a skateboard

    Reply
  • May 23, 2019 at 11:52 am
    Permalink

    Bro, please share the code 😯

    Reply
  • May 23, 2019 at 11:52 pm
    Permalink

    Ohh lovely Skynet junior 😀

    Reply
  • May 23, 2019 at 11:57 pm
    Permalink

    ohhh no no no on no… :/

    Reply
  • May 24, 2019 at 3:05 am
    Permalink

    It's awesome and u made it open source ….. But it must be used under controlled domain otherwise catastrophic events will occur in future… Otherwise train the classifier upto 99.99999999999999…one million 9's in the end .then only it's good to use in real time.

    Reply
  • May 24, 2019 at 1:38 pm
    Permalink

    nowadays.. yolo3 is updated. so powerful.. I think it's more convenient than SSD

    Reply
  • May 25, 2019 at 8:03 am
    Permalink

    why do i get the feeling that all those times i had to prove i wasnt a Robot……
    and then having to choose all the boxes with a picture of some kind of objects but only choose the boxes containing the 1 object…..

    wow would you look at that….instead of having to Pay Someone or Instead of doing the Hours themselves and having to Run the programming of it over and over themselves by teaching the computer how and what to focus on…..

    they had us do it for them….FOR FREE

    the outcome this technology……

    Reply
  • May 25, 2019 at 5:17 pm
    Permalink

    I'm still waiting for the explanation of how computers learn to recognize objects.

    Reply
  • May 28, 2019 at 3:59 pm
    Permalink

    More video show me please send me link your videos

    Reply
  • May 29, 2019 at 12:19 am
    Permalink

    I loved this demonstration, thanks for creating YOLO. I have done a project using YOLO to detect 3 classes (person, hat, vest); the idea is to use it to detect whether the workers on construction sites are wearing appropriate PPE or not. I trained it with a custom dataset and it is running on a Raspberry Pi with an Intel NCS2. You can see my results here: https://youtu.be/rFMc3FNQFL4

    Reply
  • May 29, 2019 at 11:54 pm
    Permalink

    Amazing! Thank you for open sourcing this. I will be using it as a part of my smart dorm room project I am building !

    Reply
  • June 2, 2019 at 8:29 pm
    Permalink

    This is how America should be with everything, just, I created this, here you go. But instead we got a bunch of greedy bastards holding the world back trying to make money on a good idea and if they never get what they want, the world loses in the tech.

    That is the kind of mind set that is dooming society. What they have done here is created something and let anyone have access so they can now create even more. That is what we need.

    Reply
  • June 2, 2019 at 8:32 pm
    Permalink

    Computers don't learn to recognize things instantly. The title is misleading. They teach the computer to recognize things and it just recognizes them. But if you took the stop sign and altered it, it wouldn't recognize it. That is because it isn't learning, it is programmed. It isn't AI. AI would be able to know it is a stop sign even when it looks nothing like one, just as an example.

    Reply
  • June 3, 2019 at 8:08 pm
    Permalink

    i am not sure if it's safe to be open source…

    Reply
  • June 4, 2019 at 1:56 am
    Permalink

    Anyone who can post the link for the github code ?

    Reply
  • June 4, 2019 at 9:37 am
    Permalink

    Computers do not recognise instantly; they recognise at the speed of the processor

    Reply
  • June 7, 2019 at 2:50 pm
    Permalink

    Look at – (1) `Picture of Dragon (Satanic Pic) in his Laptop , (2) software's name is `Darknet' , (3) The circle is similar to Illuminatic, (4) All are in red color …………. it is not a coincidence

    Reply
  • June 8, 2019 at 10:20 pm
    Permalink

    Wow….. me and this guy have the same last name

    Reply
  • June 11, 2019 at 5:09 am
    Permalink

    Finally a Ted talk actually works for something. Thank you local Thor. We appreciate technology.

    Reply
  • June 14, 2019 at 8:45 am
    Permalink

    Does anyone know if the source code is open source?

    Reply
  • June 15, 2019 at 2:08 am
    Permalink

    I like his desktop, what is his OS? and theme ?

    Reply
  • July 4, 2019 at 12:40 am
    Permalink

    5:45 on m'laptop

    Reply
  • July 7, 2019 at 1:18 pm
    Permalink

    So great that darknet comes with an open source licence !!

    Reply
  • July 15, 2019 at 4:52 am
    Permalink

    Youtube recognized me for watching related videos like these.

    Reply
  • July 16, 2019 at 6:18 pm
    Permalink

    I did not find answer for " How computers learn to recognise objects instantly " in the whole video … Just saw recognising objects instantly !!!!

    Reply
  • July 18, 2019 at 2:53 pm
    Permalink

    awesome…!!!!

    Reply
  • July 21, 2019 at 6:06 am
    Permalink

    This looks like mostly bullshit and is bullshit. How the f**k is he processing things so quickly on a normal pc.

    Reply
  • July 23, 2019 at 1:39 am
    Permalink

    for a second it thought he had an invisible skateboard but I like darknet all the same.

    Reply
  • July 24, 2019 at 5:09 am
    Permalink

    5:42 parrot as a teddy bear

    Reply
  • July 26, 2019 at 6:00 am
    Permalink

    I love it great

    Reply
  • July 28, 2019 at 1:20 pm
    Permalink

    I thought he is going to explain about cnn.

    Reply
  • July 30, 2019 at 11:01 pm
    Permalink

    not sure if darknet is the best name

    Reply
  • August 4, 2019 at 4:00 pm
    Permalink

    Nice!!!!!!!!

    Reply
  • August 8, 2019 at 12:45 am
    Permalink

    How can anyone like this video?
    Did you get an answer? This is just an ad for a very poorly named company…

    Reply
  • August 8, 2019 at 1:37 pm
    Permalink

    Highly appreciable work by this dude..
    Bro your darknet is so sophisticated and you open sourced it….you are a hero

    Reply
  • August 10, 2019 at 6:39 pm
    Permalink

    sound is broken somehow

    Reply
  • August 11, 2019 at 1:26 pm
    Permalink

    wonderful work👏👏👏👏

    Reply
  • August 11, 2019 at 2:27 pm
    Permalink

    Dude, I want that red, hellish interface 'theme' or whatever. What's it called?

    Reply
  • August 15, 2019 at 8:26 am
    Permalink

    can we recognize faces with that technique?

    Reply
  • August 24, 2019 at 7:55 am
    Permalink

    Can I get the link for the code ..

    Reply
  • September 3, 2019 at 3:35 pm
    Permalink

    amazing

    Reply
  • September 4, 2019 at 6:59 am
    Permalink

    how can we access the darknet technology ?

    Reply
  • September 18, 2019 at 3:49 am
    Permalink

    And yet the computer still can't tell the difference between a cat and a dog without having been trained on the specific differences to express beforehand, meaning the misleading example that motivated the talk still doesn't work. Else, a good talk.

    Reply
  • September 21, 2019 at 8:44 pm
    Permalink

    Mind blowing!

    Reply
  • September 22, 2019 at 11:06 am
    Permalink

    what if we want to target only one particular person and ignoring others???

    Reply
  • October 2, 2019 at 1:14 pm
    Permalink

    Watch its final result by me:
    https://youtu.be/F8-6kqiT0ps
    (TUTORIAL SOON)

    Reply
  • October 4, 2019 at 6:59 am
    Permalink

    I was looking for the Genius and I found the Alien.

    Reply
  • October 8, 2019 at 7:01 am
    Permalink

    only one question: How computers learn to recognize objects instantly?

    Reply
  • October 13, 2019 at 3:59 pm
    Permalink

    Easier to stalk your crush now – what a time to be alive.

    Reply
  • October 21, 2019 at 11:49 am
    Permalink

    You can have a look at an example here:
    https://predictivehacks.com/object-detection-with-yolo/

    Reply
  • October 31, 2019 at 7:36 am
    Permalink

    Wonderful explanation. Computer vision is one of the great challenges in robotics and autonomous vehicles; this algorithm will act most like the biological model of vision. It is going to make its mark in the pathology domain as well…

    Reply
  • November 2, 2019 at 10:11 am
    Permalink

    The terminators will definitely use some version of his code

    Reply
  • November 5, 2019 at 7:30 am
    Permalink

    Oh damn – it sort of pains me to see how people (the audience) struggle to see the awesomeness and achievement of this person. He probably could have lit a fart for more total applause.

    Reply
