Alibaba's deep learning practice

By Marion Thompson,2015-03-06 06:25

    Alibaba's two-day "2015 Hangzhou Cloud Conference" recently concluded in Cloud Town, Hangzhou. Leaders from many fields and enterprises gathered there to discuss innovation and entrepreneurship in the new era and the future direction of cloud computing. More than 30 specialized themed forums made for an exceptionally rich program.

    In recent years, with the rapid growth of big data on the Internet, artificial intelligence technologies and applications have sprung up like mushrooms. Companies such as Google, Facebook, Alibaba, Tencent, and Baidu use them very widely, and many of these applications are built with deep learning methods. Alibaba has been working on deep learning for several years. The "2015 Hangzhou Cloud Conference" held a dedicated deep learning session and invited seven experts from fields including graphics, images, security, and voice to introduce how Alibaba uses deep learning, where it is applied, and the experience gained with the approach.

    Hua: large-scale image search, understanding, and face recognition based on deep learning

    Hua, a researcher in Alibaba's search business unit, gave a talk in the deep learning session of the "2015 Hangzhou Cloud Conference" titled "Large-scale image search, understanding, and face recognition based on deep learning". He reviewed the state of the art in image search and recognition and described in detail Alibaba's large-scale applications of image search and face recognition, including its technical output.

    Image retrieval started as early as the 1990s, and by 2000 visual search was being called a "sunset" project. The main difficulty is the "semantic gap": the difference between low-level image features and the high-level semantics understood by the human brain. Around 2006 hashing and inverted-index search methods appeared, but they did not solve the problem. In 2012 deep learning entered industry on a large scale and greatly helped visual search. Since January 2014, more than 100 computer vision (CV) related efforts have already appeared.

    Today, Baidu's image search has done a lot of work on knowledge graphs, but the relevance and coverage of its results still leave much room for improvement. Microsoft's Xiaoice offers a dog recognition feature, mainly recognizing dogs. Alibaba's Pailitao mainly targets the mobile client and searches for products.

    Key requirements of visual search

    Products are a rather special category, and visual search for products brings many challenges as well as many opportunities. Pailitao launched in 2014, and after continuous development its challenges can be summarized as follows:

    1. Higher expectations of relevance: users who search by taking a photo intend to spend money and hope to find the very same item;

    2. Higher coverage;

    3. Image quality varies a lot: users shoot in natural scenes, so there may be reflections and very complex backgrounds;

    4. Higher demands on system performance: in Pailitao today, photographing, fast object detection, and search all run in near real time over hundreds of millions of images, so the performance requirements are very high;

    5. Business metrics for products, such as daily unique visitors, conversion, and the number of items sold each day after a search with purchase intent; this is called a "demon-revealing mirror".

    Using deep learning for technological breakthroughs

    1. Relevance:

    A) Subject detection: backgrounds are complex and the main subject is often small, so both offline database construction and online search need to detect the main subject automatically;

    B) Classification and recognition: deep learning improves classification precision and accuracy;

    C) Face analysis: face detection, key point localization, and a DCNN model are used to estimate identity, gender, age, race, and an attractiveness score, in an attempt to improve relevance in image search through face analysis;

    D) Image features: clothes are not like cans or milk. Clothes are described by multiple attributes along several dimensions, for example round neck, short sleeves, cartoon print, white T-shirt. To handle this, deep learning is used to learn deep features, combined with local features.

    2. Coverage: the Internet serves as the knowledge base, with supervised learning and also unsupervised learning. The Internet and e-commerce platforms hold large amounts of data, but the data is also noisy, and noisy data is counterproductive, so it has to be cleaned. For example, using logs and user behavior information from the platform and from outside networks for model training and data cleaning can greatly improve data quality.

    3. Scalability: the index is built as a distributed index.

    4. User experience: take Pailitao as the showcase of visual search. Open the mobile Taobao app and tap the search box; in the latest "Double 11" version there is a camera icon beside the search box. Tap the camera icon to search by photo, either opening the camera or choosing an image from the photo album.
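
    The distributed index mentioned under scalability can be pictured with a small sketch. This is not Pailitao's design, which the talk does not describe. It assumes that each image is already reduced to a short binary hash code, that items are routed to shards by a checksum of their ID, and that results are ranked by Hamming distance; the names `Shard` and `ShardedIndex` are hypothetical.

```python
import zlib
from dataclasses import dataclass, field


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


@dataclass
class Shard:
    """One index partition; in a real system each shard would sit on its own server."""
    items: dict = field(default_factory=dict)  # item_id -> hash code

    def add(self, item_id: str, code: int) -> None:
        self.items[item_id] = code

    def search(self, query: int, k: int) -> list:
        # Local top-k: (distance, item_id) pairs, closest first.
        scored = sorted((hamming(query, code), item_id)
                        for item_id, code in self.items.items())
        return scored[:k]


class ShardedIndex:
    """Hypothetical scatter-gather index over several shards."""

    def __init__(self, num_shards: int):
        self.shards = [Shard() for _ in range(num_shards)]

    def add(self, item_id: str, code: int) -> None:
        # Assumed routing rule: a stable checksum of the ID picks the shard.
        shard_no = zlib.crc32(item_id.encode("utf-8")) % len(self.shards)
        self.shards[shard_no].add(item_id, code)

    def search(self, query: int, k: int) -> list:
        # Scatter the query to every shard, then merge the partial top-k lists.
        partial = [hit for shard in self.shards for hit in shard.search(query, k)]
        return sorted(partial)[:k]


index = ShardedIndex(num_shards=3)
index.add("a", 0b1010)
index.add("b", 0b1000)
index.add("c", 0b0101)
nearest = index.search(0b1010, k=2)
```

    Because every shard returns its own top k, merging the partial lists gives the same answer as a single large index, while each shard only scans its own slice of the data.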

    Hua closed with an outlook on the future and the development trends of image technology. Relevance, coverage, image quality, and system performance will remain challenges, and the metrics are strict and leave nowhere to hide. At the same time there are many opportunities: this is the era of big data and deep learning, of massive data and user behavior, and of smart devices. Finally, future image technology can be advanced along three dimensions: models, data, and users.

    Ma Zejun: Shenma Voice, the ultimate cloud voice search experience

    Besides pictures and images, sound, and especially speech, is a critical part of human communication, and in the mobile Internet era phones offer more ways to interact. Ma Zejun, head of speech recognition for Shenma Search in Alibaba's mobile business group, gave a talk titled "Shenma Voice, the ultimate cloud voice search experience".

    In recent years many technology companies at home and abroad have invested heavily in money and manpower to develop speech technology. Abroad there are Apple's Siri and Microsoft's Cortana, which mainly provide users with intelligent voice assistants. In China there are iFlytek, Baidu, and Chumenwenwen, which mainly invest in smart hardware, the Internet, online education, and so on. Shenma Voice aims to give users a better experience, so that search can be done by speaking rather than by hand.

    From 2000 to 2010 speech recognition accuracy improved very little. Deep learning took off around 2010, and neural networks raised speech recognition accuracy by 30%, which played a huge role in moving speech recognition from the laboratory into practical everyday use.

    Deep learning: characteristics, influence, and talent

    Deep learning adds two notable characteristics on top of traditional machine learning. First, it discovers features automatically: within the deep learning framework, researchers are encouraged not to hand-design many complicated features or make many manual design assumptions, but to let the model extract the signal that is needed from the raw signal. Second, the learned features are hierarchical, going from simple to complex and from specific to general, so deep learning introduces hierarchy and a high level of abstraction.

    In terms of influence, the term deep learning now appears everywhere because of the major progress it has brought in speech recognition, image recognition, and natural language processing. In 2013 MIT Technology Review selected its 10 breakthrough technologies of the year, and deep learning took first place without question.

    In terms of talent, many well-known deep learning researchers have been recruited by technology giants such as Facebook and Google.

    Applying deep learning to acoustic modeling

    In speech recognition, deep learning is mainly used for acoustic modeling, together with better optimization methods. Because speech is a continuous signal with strong causal dependencies, exploiting the sequential relationship between samples helps accuracy a great deal. Researchers therefore proposed sequence (SEQ) training, which takes the links between samples into account and adds higher-level language model and pronunciation dictionary information, and this greatly improves the whole model.

    Deep learning models are relatively large; the parameters of a big model can even reach the billion level. Training at that scale usually needs huge amounts of data, so large-scale learning algorithms matter a great deal, and distributed training algorithms help cope with big data.

    After training, it was found that of the tens of millions of parameters only about 30% have a decisive influence on the learning task. Keeping only this 30% compresses the model parameters and also improves the real-time performance of online decoding.
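
    The talk does not say how the decisive 30% of parameters is chosen. One common way to do this kind of compression is magnitude pruning, which keeps the weights with the largest absolute values and zeroes the rest. The sketch below assumes that rule; the function names and the default `keep_ratio` are illustrative, not taken from the talk.

```python
def prune_by_magnitude(weights, keep_ratio=0.3):
    """Zero every weight except the `keep_ratio` share with the largest magnitude."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    keep = max(1, round(len(weights) * keep_ratio))
    # Indices of the `keep` weights with the largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def to_sparse(weights):
    """Store only the surviving (index, value) pairs, which is where the compression comes from."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


layer = [0.1, -2.0, 0.05, 1.5, -0.3, 0.02, 0.7, -0.01, 0.2, 0.0]
pruned = prune_by_magnitude(layer, keep_ratio=0.3)
sparse = to_sparse(pruned)
```

    In practice a pruned model is usually fine-tuned afterwards so that the remaining weights can compensate for the removed ones.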

    Reflections on the boom in deep learning

    The recent boom in deep learning rests on several supports: powerful learning algorithms and powerful computing facilities. Deep learning analyzes, processes, and predicts from data by simulating the brain's mechanisms of classification, filtering, memory, and forgetting. Even a strong engine needs good fuel, and the fuel is massive data from real scenarios.

    Shenma has accumulated more than 7 billion samples and mainly uses three models, with the following characteristics:

    1. DNN (deep neural network): one of the earliest neural networks, a layered stack of simple nonlinear classifiers, good at capturing information.

    2. CNN: turns the feature design work that traditionally relied on domain experts into learned filters whose parameters are set automatically.

    3. RNN/LSTM: models the human mechanisms of memory and forgetting.

    Based on these characteristics, Ma Zejun believes deep learning will develop from perception toward cognition. Perception means replacing basic human senses such as sight and hearing in order to recognize and classify sound, images, and text. Cognition is higher-level intelligence relative to perception: memory, forgetting, attention, action, and decision making.

    Hua-ping Luo: NVIDIA GPU deep learning platform

    Besides algorithms and big data, deep learning also needs a GPU computing platform. In the computing industry, NVIDIA's hybrid GPU + CPU computing model has helped many computing platforms into the TOP500 rankings, and over the years the company has done a lot of work combining the GPU platform with deep learning. Hua-ping Luo, director of solution architecture and engineering at NVIDIA China, gave a talk titled "NVIDIA GPU deep learning platform".

    GPU computing adds a graphics card to a traditional x86 system to accelerate existing computation. The GPU has been a parallel processor since its birth, but its architecture is no longer what it used to be. The earliest GPUs used a pipeline architecture, with many pipelines in one GPU. Since 2007 the architecture has moved away from the traditional pipeline to a many-core computing architecture in which all work is done by general-purpose processors.

    GPU computing is now widely used in many fields, not only scientific research and education but also industry. In oil and natural gas, GPU clusters have become a standard computing platform: they process seismic waves to reveal the underground structure and find where the oil is. Speed has increased greatly compared with CPUs, key image rendering has improved greatly as well, and most importantly it is not only speed but also accuracy that has improved.

    The GPU deep learning platform in practice

    The most important engine for deep learning is computing power. Deep learning is not a new method; it already existed in the 1980s, but limited computing power and a lack of data kept it from producing effective results. Now, with GPUs and the big data produced by the Internet, deep learning can be applied far better.

    The use of GPUs in deep learning dates mainly from 2012, when the University of Toronto applied deep learning to image classification and recognition for the first time and raised the accuracy by 10%. Since then GPUs have gradually become widespread.

    Taken together, the core of deep learning is matrix computation, which is floating-point arithmetic. GPUs are very good at floating-point arithmetic and have very strong computing power: a single GPU can now reach 1.8 T of computing capacity, equal to what once took several cabinets of machines.
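
    To make the link between matrix computation and floating-point throughput concrete, here is a small sketch. Multiplying an m×k matrix by a k×n matrix costs 2·m·k·n floating-point operations by the usual convention (one multiply and one add per term). The constant `ASSUMED_PEAK_FLOPS` reads the "1.8 T" figure as 1.8 TFLOPS, which is an assumption, and `ideal_seconds` ignores memory traffic, so it is only a lower bound.

```python
def matmul_flops(m, k, n):
    """Floating-point operations for an (m x k) times (k x n) product."""
    return 2 * m * k * n


def matmul(a, b):
    """Naive reference product that also counts the operations it performs."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    ops = 0
    for i in range(m):
        for j in range(n):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]  # one multiply, one add
                ops += 2
    return out, ops


ASSUMED_PEAK_FLOPS = 1.8e12  # assumption: the talk's "1.8 T" taken as 1.8 TFLOPS


def ideal_seconds(flops, peak=ASSUMED_PEAK_FLOPS):
    """Best-case run time if the device sustained its peak rate."""
    return flops / peak


product, counted = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
big_layer_flops = matmul_flops(4096, 4096, 4096)
```

    The three nested loops are independent across `i` and `j`, which is why this workload maps so well onto the many parallel threads of a GPU.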

    NVIDIA's deep learning platform provides more than GPU cards. Behind the cards, NVIDIA works with system vendors on testing and validation so that GPUs run well in the data center. On the software side, it has developed many libraries for specific applications so that everyone can make good use of the GPU without complex programming, and it provides cluster management software for GPU clusters. It also supports the popular deep learning frameworks. For small users and start-ups it has developed a very simple graphical deep learning training tool that is easy to try. Finally, Hua-ping Luo said that you do not need to understand the GPU in order to use a GPU server for deep learning.

    It is worth mentioning that NVIDIA will adopt a new interconnect called NVLink, an independently developed communication protocol that is much faster than traditional PCIe, with a typical link speed of 80 GB per second. Memory bandwidth will also increase greatly: with 3D stacked memory, data access will reach 1 TB per second, almost four times faster than now. In the future, CPUs that support NVLink will be able to use it directly, and GPUs will be connected to each other through NVLink, which improves the speed of data exchange.
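
    A quick back-of-the-envelope sketch of what these figures imply. The 80 GB per second link speed and the "almost four times" ratio come from the talk; reading "1 TB" as 1 TB per second, and the 8 GB payload in the example, are assumptions made for illustration.

```python
NVLINK_GB_PER_S = 80.0          # from the talk
FUTURE_MEMORY_GB_PER_S = 1000.0  # assumption: "1 TB" read as 1 TB per second
SPEEDUP_OVER_CURRENT = 4.0       # "almost four times faster than now"


def transfer_seconds(num_bytes, gb_per_second):
    """Ideal time to move `num_bytes` at a sustained bandwidth in GB per second."""
    return num_bytes / (gb_per_second * 1e9)


# Memory bandwidth of current hardware implied by the two figures above.
implied_current_memory_gb_per_s = FUTURE_MEMORY_GB_PER_S / SPEEDUP_OVER_CURRENT

# Hypothetical example: moving 8 GB of data between two GPUs over one link.
seconds_over_nvlink = transfer_seconds(8e9, NVLINK_GB_PER_S)
```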

    Yong: intelligent data algorithms and their practical application in security

    Since its founding, Alibaba has put security and customer trust first and has paid special attention to security, in both its systems and its technology. Alibaba has also applied deep learning in the security field. Yong, chief engineer of Alibaba Security, gave a talk titled "Intelligent data algorithms and their practical application in security".

    Internet security problems are very complicated. Yong divides them into three dimensions:

    1. Information risk: if your Internet business involves information exchange, you must keep out illegal information; if it lists products, you must ban restricted goods; even if you only publish information, you must keep your website from being hacked or injected with trojans;

    2. Computer and network risk, such as attacks, intrusion, and trojan injection;

    3. Integrity risk in user behavior: users defraud each other or collude in fake operations.

    Big data capability and economies of scale form a virtuous cycle

    Yong said in his talk that security benefits from scale: although there are countless different Internet businesses, the security background and scenarios are the same, so one set of security capabilities can solve the security problems of many businesses.

    An Internet entrepreneur looking for a platform with big data capability should consider two questions:

    1. Does the platform have massive data? Alibaba's platform does: billions of products, billions of images in total, and millions of websites on Alibaba Cloud;

    2. Does it have the ability to process massive data? There are basically three points: data storage capacity, for which Alibaba's OSS provides a highly secure and reliable service; high-speed distributed computing power; and sufficient, efficient algorithms to extract the essence of huge amounts of data.

    With traditional algorithms, when data is scarce, hand-designed features work better than features learned by machine learning. But when data becomes plentiful, the expressive power of hand-designed features quickly saturates and is hard to improve further, while machine learning can keep learning more and better features, so results keep improving. This is the benefit of learning from big data: more data keeps paying off. In simple terms there are five methods: multimedia DNA, text and image recognition, face recognition, pornographic image recognition, and brand recognition.

    Yong concluded that security is a service with scale efficiency: with big data capability you gain strong security capability, which attracts more users and more data, creating a virtuous cycle. Security cannot be avoided: if you run a business on the Internet you cannot escape security risks, and those risks can be fatal. Finally, if you choose a platform or ecosystem on which to build your website, you will want it to have security capabilities. The Alibaba platform has strong security capabilities and the scale to make them grow stronger and stronger.

    Yong-pan Wang: character recognition based on deep learning

    Deep learning is also widely applied in character recognition. Yong-pan Wang, a technical expert from Alibaba's shared business division, gave a talk titled "Character recognition based on deep learning".

    Problems in character recognition with deep learning

    Character recognition problems fall into three classes depending on the application:

    1. Electronic documents, mainly long Weibo posts and scanned files. The text is neatly typeset and clear with uniform fonts, so recognition difficulty is low, but users demand a very high recognition rate, because in a complete article errors change the meaning. Users also want formatted output, paragraph by paragraph;

    2. Ad monitoring, represented by keyword detection. This has strong industry applications, mostly in advertising, and the main product photos on Taobao also carry a lot of text. This class has complicated backgrounds and layouts, sometimes with very unusual arrangements and stylized fonts. Users' output requirements are not high: telling them that a given keyword is present is enough;

    3. Digitizing forms such as invoices, delivery notes, and medical records, which users hope to bring online. This data has complex layouts and is often unclear, because many of the images are scans and most are photographed by users, so there is distortion from shooting.

    Implementing the deep learning character recognition pipeline

    Yong-pan Wang summarized the process in three steps:

    1. Layout analysis: users want formatted output. Telling them that there is a table and what it contains means more to them and helps later applications and data mining. The analysis also determines the quality of the data and what type it belongs to;

    2. Localization, divided into text localization and text correction. Backgrounds are usually complicated, with a lot of interference, so text localization may wrongly include non-text regions or miss some text regions. Text region correction finds the text regions more accurately;

    3. Once text localization has produced text blocks, segmentation and character recognition follow, with closed-loop correction via the voice recognition system.
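
    The talk does not describe how localization is implemented. As a point of reference, the sketch below shows a classical baseline that predates deep learning, the horizontal projection profile: rows that contain ink are grouped into text lines, and very thin spans are discarded as background noise, a crude stand-in for the correction step. The function name, the 0/1 image format, and the `min_height` threshold are all illustrative assumptions.

```python
def locate_text_rows(image, min_height=2):
    """Return (first_row, last_row) spans of consecutive rows that contain ink.

    `image` is a list of rows of 0/1 pixels, where 1 means ink. Spans thinner
    than `min_height` rows are dropped as noise.
    """
    spans = []
    start = None
    for y, row in enumerate(image):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                      # a text line begins
        elif not has_ink and start is not None:
            spans.append((start, y - 1))   # the text line just ended
            start = None
    if start is not None:                  # text runs to the bottom edge
        spans.append((start, len(image) - 1))
    return [(a, b) for a, b in spans if b - a + 1 >= min_height]


page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],   # isolated speck, filtered out as noise
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
text_lines = locate_text_rows(page)
```

    This baseline only works on clean, horizontally aligned scans, which is exactly why the complex backgrounds and distorted photos described above call for learned detectors.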

    How can efficiency be improved in the implementation? GPUs are used to improve efficiency and to meet users' needs for more real-time processing, such as online calls.

    Finally, what form does the product take? There are two forms of access: first, for users with real-time needs, online access is provided; second, when the data volume is large and real-time processing is required, a private cloud solution is offered. A high degree of automation in data classification is an advantage: users have a great deal of data, and if it cannot be graded quickly, communication becomes inconvenient. Data output formats are flexible and can be configured according to user requirements.

    Lai Junjie: using the GPU for deep learning research

    Using the GPU improves efficiency, but how should GPUs be used in deep learning research? Dr. Lai Junjie, director of NVIDIA's high performance computing department, shared his experience and understanding.

    More and more people use the accelerator concept for general-purpose computing and not only for graphics rendering. Traditionally the GPU is used for image rendering: drawing a picture involves computing a large number of triangles and the pixels they cover. Seen in terms of basic arithmetic operations, there are many simple, repeated operations, and these simple operations are handled by threads on the GPU.

    The essential difference between the GPU and the CPU is not some special floating-point or integer operation. The main difference is that a CPU aims to give users a short response time: when editing documents or browsing the web, it should respond to mouse and keyboard input as quickly as possible. For this goal single-thread performance matters most, so chip design devotes many units to single-thread performance. A lot of resources go to branch prediction or to extracting parallelism within a single CPU thread, so the share of ALUs, the units that actually perform floating-point and integer operations, is smaller than on a GPU.

    In addition, the GPU is designed to deliver very good graphics rendering performance, and graphics rendering tasks are not very different from computation in other fields or from general-purpose computation. Take convolution, for example two-dimensional convolution: the input image is a matrix, each element of the convolution kernel is multiplied by the corresponding element of the input image and the products are summed, giving one element of a two-dimensional output matrix. That is the convolution operation. It was first used heavily in graphics rendering. An example is a screenshot of Halo rendering, where the overall effect imitates a camera, with some distances blurred and others sharp. A designer or artist does not need to paint the blurred background by hand; a convolution can be applied to the rendered objects and people to achieve the overall effect.
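
    The two-dimensional convolution described above can be written in a few lines. This is a plain-Python sketch for illustration, not GPU code. It computes only the positions where the kernel fits entirely inside the image ("valid" mode) and, like most deep learning libraries, it does not flip the kernel. The 3×3 box blur kernel is an illustrative choice for the blur effect.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image`; each output is the sum of elementwise products."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            acc = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    acc += kernel[dy][dx] * image[y + dy][x + dx]
            row.append(acc)
        out.append(row)
    return out


# Averaging kernel: every output pixel becomes the mean of its 3x3 neighborhood.
BOX_BLUR_3X3 = [[1 / 9] * 3 for _ in range(3)]

small = conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
blurred = conv2d([[9] * 4 for _ in range(4)], BOX_BLUR_3X3)
```

    Every output element is computed independently of the others, so on a GPU each one can be assigned to its own thread.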

    Wang Cheng: the cloud platform

    Deep learning, the new wave of artificial intelligence, is a very attractive application of high performance computing and an increasingly important field; in the future, AI computation may generate more computational load than all other fields combined. Wang Cheng, a senior expert in the Alibaba Cloud business group, introduced the supercomputing platform in the cloud.

    Wang Cheng believes deep learning depends heavily on high performance computing, and that so far the cloud has lacked this capability. He gave a quantitative illustration: predicting a single picture takes 1.45 G of computation (also given as 3.45 G single-precision floating-point operations), and training on one picture, with the forward and backward passes, takes 4.4 G. What about training a whole model? With more than 200 pictures and 540,000 iterations needed for a model to converge, the whole computation comes to about 500 P. His conclusion: the cloud needs high performance computing.
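
    The order of magnitude of the roughly 500 P total can be checked with simple arithmetic. The sketch below assumes that the "more than 200 pictures" are processed in every iteration (a batch size of 200), which is an interpretation rather than something the talk states; under that reading the three figures multiply out to about 475 P operations, close to the quoted total.

```python
OPS_PER_IMAGE_TRAINING = 4_400_000_000  # forward + backward pass, "4.4 G" from the talk
IMAGES_PER_ITERATION = 200              # assumption: "more than 200 pictures" per iteration
ITERATIONS = 540_000                    # iterations to convergence, from the talk


def total_training_ops(ops_per_image, images_per_iteration, iterations):
    """Total operations for one training run under the stated assumptions."""
    return ops_per_image * images_per_iteration * iterations


total_ops = total_training_ops(OPS_PER_IMAGE_TRAINING, IMAGES_PER_ITERATION, ITERATIONS)
total_peta_ops = total_ops / 10**15
```

    A slightly larger batch than 200 would bring the result even closer to 500 P, which is consistent with the phrase "more than 200".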

    To this end, the cloud provides deep learning Docker images that can be used directly. For larger-scale users, Alibaba can open up its internal resource scheduling system, which rests on the same foundation as the country's supercomputing systems. On the cloud, users get the same security guarantees, fully integrated with Alibaba Cloud's security and other cloud products. A user's machines are completely isolated at the network level from each other and from other users' machines, and after use the disks are strictly erased to ensure that no data leaks. High performance computing works together with all of Alibaba Cloud's other products to serve Internet users.
