Learning representations of networks
Learning distributed representations of data has proved very successful in many domains, such as speech, images, and natural language. In this project, our goal is to learn distributed representations of network data, which is ubiquitous in the real world and underlies a wide range of applications. Embedding networks in low-dimensional spaces is potentially useful for many tasks, such as visualization, node classification, link prediction, and recommendation. We proposed a large-scale information network embedding model called LINE, which is suitable for arbitrary types of information networks: undirected, directed, and/or weighted. The method optimizes a carefully designed objective function that preserves both the local and the global network structure. We also proposed an efficient optimization algorithm that is able to learn the embedding of a network with millions of vertices and billions of edges in a few hours on a single machine.
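The summary above does not spell out the objective, so the following is only a minimal sketch of the local-structure (first-order proximity) part, following the description in the LINE paper listed under Publications: edges are sampled in proportion to their weight, and each sampled edge is trained against a few negative vertices drawn from a degree^0.75 noise distribution. The function name `train_first_order`, its parameters, and the plain NumPy SGD loop are illustrative assumptions, not the released implementation, and the global-structure (second-order) term is omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_first_order(edges, weights, num_nodes, dim=8, neg=2,
                      steps=2000, lr=0.025, seed=0):
    """Hypothetical sketch: first-order proximity with negative sampling."""
    rng = np.random.default_rng(seed)
    emb = (rng.random((num_nodes, dim)) - 0.5) / dim  # small random init
    edges = np.asarray(edges)
    w = np.asarray(weights, dtype=float)

    # Edge sampling: pick edges with probability proportional to weight.
    p_edge = w / w.sum()

    # Noise distribution over vertices: weighted degree raised to 0.75.
    deg = np.zeros(num_nodes)
    np.add.at(deg, edges[:, 0], w)
    np.add.at(deg, edges[:, 1], w)
    p_noise = deg ** 0.75
    p_noise /= p_noise.sum()

    for _ in range(steps):
        i, j = edges[rng.choice(len(edges), p=p_edge)]
        negatives = rng.choice(num_nodes, size=neg, p=p_noise)
        targets = [(j, 1.0)] + [(k, 0.0) for k in negatives
                                if k != i and k != j]
        grad_i = np.zeros(dim)
        for t, label in targets:
            # Gradient step on log-sigmoid loss for a (vertex, target) pair.
            g = lr * (label - sigmoid(emb[i] @ emb[t]))
            grad_i += g * emb[t]
            emb[t] += g * emb[i]
        emb[i] += grad_i
    return emb

# Toy weighted, undirected network: a triangle plus one separate edge.
emb = train_first_order([(0, 1), (1, 2), (2, 0), (3, 4)],
                        [1.0, 1.0, 1.0, 2.0], num_nodes=5, dim=4, steps=200)
print(emb.shape)
```

Sampling edges by weight, rather than multiplying gradients by the weight, keeps the step size uniform across edges, which matters when weights vary over several orders of magnitude.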
Contact:
Jian Tang, Microsoft Research, jiatang@microsoft.com, tangjianpku@gmail.com
Publications:
- Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan and Qiaozhu Mei. LINE: Large-scale Information Network Embedding. In WWW’15. (Most cited paper of WWW’15)
- Jian Tang, Meng Qu and Qiaozhu Mei. PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks. In KDD’15.
- Jian Tang, Jingzhou Liu, Ming Zhang and Qiaozhu Mei. Visualizing Large-scale and High-dimensional Data. In WWW’16. (Best paper nomination 5/727)
- Meng Qu, Jian Tang, Jingbo Shang, Xiang Ren, Ming Zhang and Jiawei Han. An Attention-based Collaboration Framework for Multi-View Network Representation Learning. In CIKM’17 (ACM International Conference on Information and Knowledge Management), Singapore, Nov. 2017.