Deep Learning on Hadoop 2.0
The data science team in Boston uses sophisticated tools and algorithms to optimize business activities, and those activities rely on deep insights drawn from user data. Machine learning algorithms, which are widely used in data science, help us identify and exploit patterns in data. Obtaining insights from large-scale Internet data is a challenging task, so the ability to run algorithms at scale is a vital need. With the explosive growth of data and clusters of tens of thousands of machines, we need algorithms that can adapt to running in such distributed environments. Running machine learning algorithms in a general-purpose distributed computing environment poses its own set of challenges.
Here, we discuss how we implemented and deployed deep learning, a state-of-the-art machine learning framework, on a Hadoop cluster. We provide concrete details of how the algorithm was adapted to run in a distributed environment, and we present results from running the algorithm on standard datasets.
Deep Belief Networks
Deep Belief Networks (DBN) are graphical models obtained by iteratively training and stacking Restricted Boltzmann Machines (RBM) in a greedy, unsupervised manner. DBNs are trained to extract deep insights from the training data by modeling the joint distribution between the observed vector x and the hidden layers h^k, as follows:
Expression 1: DBN joint distribution
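For reference, the joint distribution defined by a DBN with hidden layers h^1 through h^l is standardly written as follows; this is the textbook form from the deep-learning literature, not reproduced from the original article:

```latex
P(x, h^1, \ldots, h^{l}) =
  \left( \prod_{k=0}^{l-2} P(h^k \mid h^{k+1}) \right) P(h^{l-1}, h^{l}),
  \qquad h^0 = x
```

Here P(h^k | h^{k+1}) is the conditional distribution of layer k given the layer above it, and P(h^{l-1}, h^l) is the joint distribution of the top two layers, which together form an RBM.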
The relationship between the input layer and the hidden layers can be observed in the following figure. At a high level, the first layer is trained as an RBM that models the raw input x. The input data is a sparse binary vector, representing, for example, a binarized digital image. Subsequent layers are trained using the data passed down from the earlier layers (samples or activations) as training examples. The number of layers can be determined empirically to obtain better model performance; DBNs support an arbitrary number of layers.
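The greedy layer-wise scheme described above can be sketched as follows. This is an illustrative outline in Python/NumPy, not the article's implementation; `greedy_layerwise` and `random_trainer` are hypothetical names, and `random_trainer` is a placeholder standing in for real unsupervised RBM training.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def greedy_layerwise(x, layer_sizes, train_layer):
    """Stack layers greedily: each layer trains on the previous layer's activations."""
    inputs = x
    weights = []
    for n_hidden in layer_sizes:
        W = train_layer(inputs, n_hidden)   # unsupervised training of one layer
        weights.append(W)
        inputs = sigmoid(inputs @ W)        # activations become the next layer's data
    return weights

def random_trainer(data, n_hidden):
    """Placeholder trainer: a random projection standing in for RBM training."""
    return rng.normal(0.0, 0.1, (data.shape[1], n_hidden))
```

Each call to `train_layer` sees only the representation produced by the layer below, which is what makes the procedure greedy and layer-local.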
Figure 1: DBN layers
The following code snippet shows how an RBM is trained. The input data supplied to the RBM is processed over a number of predefined epochs. The input data is divided into mini-batches, and the weights, activations, and deltas are computed for each layer.
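A minimal sketch of what such a training loop typically looks like, assuming NumPy and one-step contrastive divergence (CD-1); the function name, hyperparameters, and update rule here are illustrative standards from the literature, not taken from the article's own code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(data, n_hidden=64, epochs=5, batch_size=10, lr=0.1):
    """Train one RBM with CD-1 over mini-batches (shuffles `data` in place)."""
    n_visible = data.shape[1]
    W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
    b_v = np.zeros(n_visible)   # visible-unit biases
    b_h = np.zeros(n_hidden)    # hidden-unit biases
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            v0 = data[start:start + batch_size]
            # positive phase: hidden activations given the data
            h0 = sigmoid(v0 @ W + b_h)
            # negative phase: one Gibbs step to get a reconstruction
            h0_sample = (rng.random(h0.shape) < h0).astype(float)
            v1 = sigmoid(h0_sample @ W.T + b_v)
            h1 = sigmoid(v1 @ W + b_h)
            # contrastive-divergence deltas and parameter updates
            dW = (v0.T @ h0 - v1.T @ h1) / len(v0)
            W += lr * dW
            b_v += lr * (v0 - v1).mean(axis=0)
            b_h += lr * (h0 - h1).mean(axis=0)
    return W, b_v, b_h
```

In a full DBN, this routine would be applied once per layer, with each layer's hidden activations serving as the next layer's input data.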