Computer vision: letting cold machines understand a colorful world

By Tim Shaw, 2015-09-24 20:55

    In 2010, scientists from Stanford University, Princeton University, and Columbia University launched the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) to promote sustained progress in computer visual recognition. According to The New York Times, in the 2014 challenge the accuracy of computer systems at object recognition almost doubled, and the image classification error rate was cut in half.

    Building on this, a computer vision system developed by the Visual Computing Group at Microsoft Research Asia has recently achieved a breakthrough. According to a paper published by the team, the system was evaluated on the "ImageNet 1000 challenge" dataset, which contains approximately 1.2 million training images, 50,000 validation images, and 100,000 test images divided into 1,000 categories. The system reduced the recognition error rate to 4.94%, falling for the first time below the human error rate of about 5.1%. How did computer vision get here, and where is it going? Microsoft Research Asia researcher Sun talks about the past and present of computer vision:
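    To make the notion of a recognition error rate concrete, here is a minimal sketch. It assumes the figure is a top-5 error, as is customary for ILSVRC classification results: an image counts as correct if its true label appears among the system's five best guesses. The function name `top_k_error` and the toy labels are hypothetical and are not taken from the team's paper.

```python
# Toy illustration of a top-k classification error rate.
# The predictions and labels below are invented for the example.

def top_k_error(ranked_predictions, true_labels, k=5):
    """Fraction of images whose true label is missing from the top-k guesses."""
    misses = sum(
        1
        for guesses, truth in zip(ranked_predictions, true_labels)
        if truth not in guesses[:k]
    )
    return misses / len(true_labels)

# Hypothetical ranked guesses for four images (best guess first).
predictions = [
    ["cat", "dog", "fox", "wolf", "lynx"],
    ["car", "bus", "truck", "van", "tram"],
    ["cup", "bowl", "vase", "mug", "jar"],
    ["oak", "elm", "pine", "fir", "ash"],
]
labels = ["dog", "car", "plate", "ash"]

print("top-5 error:", top_k_error(predictions, labels, k=5))
print("top-1 error:", top_k_error(predictions, labels, k=1))
```

    Under this measure, a 4.94% error rate would mean that for roughly 5 images in 100 the correct category is absent from the five best guesses.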

    The worldwide hit "Interstellar" inspired countless people to explore the mysteries of the vast universe, and it left many with fond memories of TARS, the clever, lovable, witty and humorous intelligent robot. Hollywood films on artificial intelligence have long been popular with fans. With boundless imagination and dazzling special effects, filmmakers build a fantastical and wonderful future world that captivates audiences. Back in reality, however, the work of computer scientists seems to lag far behind the imagination of film artists. A movie is, after all, a movie: we still have a long way to go before we can build intelligent robots like those on screen, which understand the world around them, understand human language, and converse fluently with people.

    In "Interstellar", intelligent robots that can see, hear, and speak are loved by audiences. Image: still from "Interstellar"

    For a long time, enabling computers to see, hear, and speak has been the goal that my colleagues and I in the computer industry have pursued. Through more than 10 years of work in the field of computer vision, the idea of giving the computer a pair of eyes, so that it can understand this colorful world, has been an important force driving me along this challenging path. Although computers are not yet as smart as those shown in the movies, they have already achieved a great deal that is remarkable.

    How we see the world

    For humans, recognizing people seems to be a natural instinct: a newborn baby can imitate its parents within a few days. This instinct lets us tell one person from another using very few details; even in dim light we can still recognize a friend at the far end of a corridor. Yet this ability, which comes so easily to humans, is extremely difficult for computers. For a long time, computer vision technology was stuck. Before exploring further, let us first talk about how we see the world with our eyes.

    I believe everyone met the principle of pinhole imaging in middle school physics class. But the human eye is much more complex than a camera. When we observe an object, the eye glances about three times per second, and one of those glances is a fixation. When the photoreceptors of the retina sense the outline of a candle, an area known as the fovea actually records the shape of the candle in a distorted form.

    So the question arises: why is the world we see neither distorted nor deformed? The answer is simple. Humans have an all-purpose "converter" in the cerebral cortex, which converts the signals captured by our optic nerves into a faithful image. This "converter" can be simplified into four areas, which biologists call V1, V2, V4, and IT. Neurons in area V1 respond only to a small part of the whole visual field. For example, when a straight line is detected there, certain neurons become active. That line can belong to anything: it may be part of a table, part of the floor, or a stroke of a character in this article. With every glance of the eye, the activity of these neurons can change quickly.
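    The behavior described for V1 neurons, responding to a straight line inside a small patch of the visual field, can be loosely imitated with a small oriented filter. The sketch below is only an analogy and not a model from biology or from the article: the kernel values, the names `VERTICAL_LINE`, `respond`, and `response_map`, and the toy image are all assumptions made for illustration.

```python
# Toy "V1-like" unit: a 3x3 filter that becomes active when a vertical
# line runs through the middle of its small receptive field.

VERTICAL_LINE = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

def respond(image, kernel, row, col):
    """Response of one unit whose 3x3 receptive field starts at (row, col)."""
    total = 0
    for i in range(3):
        for j in range(3):
            total += image[row + i][col + j] * kernel[i][j]
    return max(total, 0)  # silent (0) or active (> 0)

def response_map(image, kernel):
    """Evaluate one unit for every 3x3 patch of the image."""
    rows = len(image) - 2
    cols = len(image[0]) - 2
    return [[respond(image, kernel, r, c) for c in range(cols)] for r in range(rows)]

# A 5x5 image with a vertical line in column 2.
# The line could be a table edge, a floorboard, or a pen stroke.
image = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

for row in response_map(image, VERTICAL_LINE):
    print(row)
```

    Only the units whose receptive field is centered on the line respond. Shifting the line by one pixel changes which units are active, which mirrors the remark that this activity changes with every glance.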

    The mystery lies in the IT area, at the top of this hierarchy in the cerebral cortex. Biologists found that wherever an object (such as a face) appears in the field of vision, certain neurons there stay in an active state. In other words, along the path from the retina to the IT area, the nervous system moves from recognizing subtle features to gradually recognizing whole objects. If computer vision could have such a "converter" as well, the efficiency of computer recognition would improve greatly. The way the human visual system operates offers inspiration for breakthroughs in computer vision technology.
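    The contrast between early units that depend on position and late units that stay active wherever the object appears can be sketched with two stages: local detectors followed by a pooling step. This is a deliberately simplified one-dimensional illustration. The pattern, the signals, and the function names are hypothetical, and taking a maximum over positions is just one common way to obtain tolerance to position.

```python
# Two-stage toy hierarchy: local detectors are position-sensitive,
# while the pooled unit is position-tolerant.

PATTERN = [1, 0, 1]  # hypothetical local feature the early stage looks for

def local_responses(signal, pattern):
    """Early stage: one detector per position, 1 if the pattern is there."""
    width = len(pattern)
    return [
        1 if signal[start:start + width] == pattern else 0
        for start in range(len(signal) - width + 1)
    ]

def pooled_response(signal, pattern):
    """Late stage: active if any early detector fired, wherever it was."""
    return max(local_responses(signal, pattern))

left = [1, 0, 1, 0, 0, 0, 0]    # feature at the left edge
right = [0, 0, 0, 0, 1, 0, 1]   # same feature at the right edge
blank = [0, 0, 0, 0, 0, 0, 0]   # no feature at all

print(local_responses(left, PATTERN))
print(local_responses(right, PATTERN))
print(pooled_response(left, PATTERN),
      pooled_response(right, PATTERN),
      pooled_response(blank, PATTERN))
```

    Different early detectors fire for `left` and `right`, yet the pooled unit is active for both and silent for `blank`.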

    Why can computers never "see clearly"?

    Although the mystery of human visual recognition has gradually been revealed, applying it directly to computers is not easy. We find that computer recognition is always "muddled": once the lighting or the viewing angle changes, the computer struggles to keep up with the environment and goes off course. For a computer, identifying one person across different environments is much harder than telling two people apart in the same environment. This is because researchers initially tried to treat the face as a template and used machine learning methods to learn the regularities of that template. However, although a face seems fixed, its appearance differs with angle, lighting, and makeup, so a simple template has difficulty matching every face.
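    The weakness of a simple template can be shown with a few made-up numbers. The sketch below compares tiny one-dimensional "images" by the sum of squared pixel differences, which is one basic form of template matching and not necessarily what the early researchers used. The pixel values and the helper names `ssd` and `brighten` are assumptions for illustration.

```python
# Naive template matching with a sum of squared differences (SSD).
# All pixel values are invented for the example.

def ssd(a, b):
    """Sum of squared pixel differences; 0 means a perfect match."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brighten(pixels, offset):
    """Simulate a lighting change by adding a constant to every pixel."""
    return [p + offset for p in pixels]

person_a = [10, 40, 40, 10, 30, 50]        # stored template of person A
person_b = [10, 40, 30, 20, 30, 50]        # a different person, same lighting
person_a_bright = brighten(person_a, 25)   # person A again, stronger light

print("A vs A under new light:", ssd(person_a, person_a_bright))
print("A vs B under same light:", ssd(person_a, person_b))
```

    With these numbers the same person under different lighting ends up much farther from the template than a different person under the same lighting, which is the failure described above.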

    Therefore, the core problem of face recognition is how to make the computer ignore the variations within the same person while still finding the differences between two people. In other words, images of the same person should look similar, and images of different people should look different.
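    One way to picture this goal is a representation in which the distance between two images of the same person is small and the distance between different people stays large. The sketch below uses mean subtraction to cancel a uniform lighting change. This is a stand-in chosen for simplicity, not the method used by the researchers, and the pixel values, the threshold, and the names `normalize` and `same_person` are hypothetical.

```python
# Toy face verification: compare images in a representation that ignores
# overall brightness, then decide with a distance threshold.

def ssd(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def normalize(pixels):
    """Subtract the mean so a uniform brightness change has no effect."""
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]

def same_person(a, b, threshold=100):
    """Declare a match when the normalized distance is below the threshold."""
    return ssd(normalize(a), normalize(b)) < threshold

person_a = [10, 40, 40, 10, 30, 50]
person_a_bright = [p + 25 for p in person_a]   # same person, stronger light
person_b = [10, 40, 30, 20, 30, 50]            # different person

print("A vs A (new light):", same_person(person_a, person_a_bright))
print("A vs B:            ", same_person(person_a, person_b))
```

    Real variations in angle, expression, and makeup are far harder to cancel than a constant brightness offset, which is why the representation usually has to be learned from data rather than written by hand.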