Six big things deep learning and topological data analysis can do
Suppose you have a data set with a thousand columns and a million rows. No matter how you look at it, as small, medium or big data, you cannot see it as a whole. Zooming in or out will not make it fit on one screen. It is human nature that we understand things better when we can see them in their entirety. Is there a way to put all the data into a single picture, so that you can explore it the way you would explore a map?
Combining deep learning with topological data analysis can achieve this, and a good deal more.
1. It can create a graph of the data in a few minutes, in which each point is a data item or a group of similar data items.
Based on the correlations between data items and the patterns it has learned, the system places similar data items together. This gives the data a distinctive representation and a much clearer view into its structure. Each node in the visualization consists of one or more data points, and a link between two nodes represents high similarity between their data items.
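To make the idea of nodes and links concrete, here is a minimal sketch of a Mapper-style graph, a common construction in topological data analysis. It is not the tool described in this post. The function names, the lens (the first coordinate), the interval count, the overlap and the distance threshold `eps` are all assumptions chosen for illustration, and the points are made up.

```python
import math
from itertools import combinations

def single_linkage(points, idxs, eps):
    """Group indices whose points are chained together by distances <= eps."""
    clusters = []
    unvisited = set(idxs)
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters

def mapper_graph(points, lens, n_intervals=4, overlap=0.3, eps=1.0):
    """Return (nodes, edges). Each node is a set of point indices;
    an edge joins two nodes that share at least one point."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals or 1.0  # avoid zero width
    nodes = []
    for k in range(n_intervals):
        # Overlapping intervals over the range of the lens values.
        a = lo + k * width - overlap * width
        b = lo + (k + 1) * width + overlap * width
        idxs = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(single_linkage(points, idxs, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges

# Hypothetical toy data: two well-separated groups on a line.
points = [(0, 0), (0.5, 0), (1, 0), (1.5, 0), (2, 0),
          (10, 0), (10.5, 0), (11, 0)]
nodes, edges = mapper_graph(points, lens=lambda p: p[0])
for n, members in enumerate(nodes):
    print(n, sorted(members))
print(sorted(edges))
```

On this toy input the two groups end up in nodes that are not linked to each other, which is the kind of structure the picture is meant to reveal.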
2. It reveals patterns in the data that traditional business intelligence cannot identify.
Below is a case that shows how the algorithm identified two different groups of people purely by analyzing user behavior. The characteristic that distinguishes the yellow points from the blue points turned out to be gender: women and men.
If we analyze the type of behavior, we find that one group (men) mostly sends messages, while the other group (women) mostly receives them.
3. It can identify segmentation of the data at multiple levels.
Segments appear at a variety of levels, from high-level categories down to groups of nearly identical data items.
In the case of a Netflix data set, each data item is a movie. At the highest level are groups of music, children's, foreign and adult films. Mid-level segments contain further segments, from Indian and Hong Kong films to thrillers and horror movies. At the lowest level are groups of TV series, such as "universal housekeeper", "The Office", "Doctor Who" and so on.
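One simple way to picture segmentation at several levels is to group the same points with a loose and then a tight distance threshold. The sketch below uses single-linkage grouping on hypothetical 2-D points. It is not the Netflix data or the method used for it, and both thresholds are assumptions.

```python
import math

def segments(points, eps):
    """Single-linkage groups: points chained by distances <= eps
    end up in the same segment. Returns sorted lists of indices."""
    remaining = set(range(len(points)))
    result = []
    while remaining:
        seed = remaining.pop()
        group, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            group |= near
            frontier.extend(near)
        result.append(sorted(group))
    return sorted(result)

# Hypothetical items: two nearby pairs and one distant pair.
points = [(0, 0), (0, 1), (3, 0), (3, 1), (20, 0), (20, 1)]

coarse = segments(points, eps=5)    # high-level categories
fine = segments(points, eps=1.5)    # low-level groups
print(coarse)
print(fine)
```

Every fine segment lies inside exactly one coarse segment, which is what makes the levels nest.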
4. It can analyze any data: text, images, sensor data, and even audio.
Any data can be segmented and understood if it can be represented as a numeric matrix in which each row is a data item and each column is a parameter. The following are the most common use cases:
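As a side illustration of the matrix representation described above, the sketch below turns a few short texts into a numeric matrix of word counts, with one row per text and one column per word. The helper name `to_matrix` and the sample sentences are hypothetical, and word counts are only one of many possible encodings.

```python
from collections import Counter

def to_matrix(documents):
    """Return (vocabulary, matrix): one row per document,
    one column per distinct word, each cell a word count."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix

docs = ["the cat sat", "the dog sat down", "cat and dog"]
vocabulary, matrix = to_matrix(docs)
print(vocabulary)
for row in matrix:
    print(row)
```

Images, sensor readings and audio can be handled in the same spirit, for example by using pixel values or samples per time window as the columns.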
5. If you guide it, it can learn more complex dependencies.
Select a set of data items and group them, and the algorithm will find all related or similar data items. Repeat this process a few times and the neural network learns the differences between the groups, for example between texts about Mac hardware, PC hardware and general electronics.
A preliminary analysis of 20,000 articles belonging to 20 different topics produces a single dense point cloud (left). After a few iterations of deep learning, the algorithm can classify them with an error rate of only 1.2% (right).
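The guided loop described in this item (mark a few items, let the algorithm pull in similar ones, repeat) can be sketched with a much simpler stand-in than a neural network: nearest-centroid label propagation. The function names, the labels and the 2-D feature values below are hypothetical, and this sketch does not reproduce the 1.2% result.

```python
import math

def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def propagate_labels(matrix, seeds, rounds=3):
    """seeds maps row index -> label chosen by the user.
    Each round recomputes one centroid per label and assigns every
    row to the nearest centroid. User-provided labels always win."""
    labels = dict(seeds)
    for _ in range(rounds):
        groups = {}
        for i, label in labels.items():
            groups.setdefault(label, []).append(matrix[i])
        centers = {label: centroid(rows) for label, rows in groups.items()}
        labels = {
            i: min(centers, key=lambda lab: math.dist(row, centers[lab]))
            for i, row in enumerate(matrix)
        }
        labels.update(seeds)
    return labels

# Hypothetical 2-D features for six articles.
matrix = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
seeds = {0: "mac", 3: "pc"}  # the user marks one example of each group
labels = propagate_labels(matrix, seeds)
print(labels)
```

A real system would replace the centroid step with a trained network, but the interaction pattern is the same: a handful of user choices steer how the rest of the data is grouped.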
6. It can also learn without any supervision.
Deep learning and autoencoders mimic the activity of the human brain and can automatically identify high-level patterns in a data set. In the Google Brain project, for example, an autoencoder that "watched" ten million digital images captured from YouTube videos succeeded in learning to recognize human and cat faces:
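For readers unfamiliar with autoencoders, the following is a toy sketch: a linear autoencoder with a single hidden unit, trained by plain gradient descent to reconstruct its own input with no labels involved. It is nowhere near the scale or architecture of the Google Brain network. The data, the initial weights, the learning rate and the epoch count are all assumptions.

```python
def train_autoencoder(data, epochs=300, lr=0.05):
    """Train a linear autoencoder with one hidden unit.
    Returns (encoder weights, decoder weights, loss per epoch)."""
    n = len(data[0])
    w_enc = [0.1] * n  # input -> hidden unit
    w_dec = [0.1] * n  # hidden unit -> reconstruction
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            h = sum(w * xi for w, xi in zip(w_enc, x))      # encode
            err = [h * w - xi for w, xi in zip(w_dec, x)]   # reconstruction minus input
            total += sum(e * e for e in err)                # squared error
            # Gradients of the squared error, using the old decoder weights.
            grad_h = sum(2 * e * w for e, w in zip(err, w_dec))
            w_dec = [w - lr * 2 * e * h for w, e in zip(w_dec, err)]
            w_enc = [w - lr * grad_h * xi for w, xi in zip(w_enc, x)]
        losses.append(total)
    return w_enc, w_dec, losses

# Hypothetical unlabeled data: 3-D points that all lie on one line,
# so a single hidden unit is enough to describe them.
direction = (1.0, 0.5, -0.5)
data = [[t * d for d in direction] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]

w_enc, w_dec, losses = train_autoencoder(data)
print("first epoch loss:", losses[0])
print("last epoch loss:", losses[-1])
```

The network is never told that the points lie on a line. It discovers that one number per point is enough, which is the small-scale analogue of learning a high-level feature without supervision.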
I have been working with topological data analysis and deep learning, and have developed a set of tools that wraps these techniques in a user-friendly interface, letting people explore their data and discover hidden relationships in it. Go to this website.