A Heterogeneous Network Representation Method Based on Variational Inference and Meta-Path Decomposition
Received date: 2020-06-18
Online published: 2021-06-01
Funding
Key Program of the National Natural Science Foundation of China (61936001); National Natural Science Foundation of China (61772096); National Key Research and Development Program of China (2016QY01W0200)
YUAN Ming, LIU Qun, SUN Haichao, TAN Hongsheng. A Heterogeneous Network Representation Method Based on Variational Inference and Meta-Path Decomposition[J]. Journal of Shanghai Jiao Tong University, 2021, 55(5): 586-597. DOI: 10.16183/j.cnki.jsjtu.2020.187
To address the problem that traditional meta-path random walks in heterogeneous network representation cannot accurately describe the heterogeneous network structure or capture the true underlying distribution of network nodes, a heterogeneous network representation method based on variational inference and meta-path decomposition, named HetVAE, is proposed. First, drawing on the idea of path similarity, a node selection strategy is designed to improve the meta-path random walk. Next, variational theory is introduced to effectively sample the latent variables of the original distribution. Then, a personalized attention mechanism weights the node vector representations of the different sub-networks obtained by decomposition, and the weighted representations are fused so that the final node vector representation carries richer semantic information. Finally, experiments on several network tasks are conducted on three real-world datasets, DBLP, AMiner, and Yelp, and the results verify the effectiveness of the model. On node classification and node clustering, compared with the baseline algorithms, Micro-F1 and normalized mutual information (NMI) improve by 1.12% to 4.36% and 1.35% to 18%, respectively, indicating that HetVAE can effectively represent the heterogeneous network structure and learn node vector representations that conform better to the true distribution.
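The three ingredients named in the abstract (a path-similarity score for choosing nodes, variational sampling of latent variables, and attention-weighted fusion of sub-network representations) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give HetVAE's exact node selection rule or attention form, so every function name, the toy author-paper matrix, and the dot-product attention scoring below are assumptions made for illustration. The path-similarity score follows the standard PathSim definition for a symmetric meta-path, and the sampling and KL terms follow the standard variational autoencoder formulation.

```python
import math
import random


def commuting_matrix(adj):
    """Count meta-path instances for a symmetric path such as Author-Paper-Author.

    adj[i][p] is 1 if node i is linked to intermediate node p (assumed toy input).
    The result M satisfies M[i][j] = number of path instances from i to j.
    """
    n = len(adj)
    return [[sum(a * b for a, b in zip(adj[i], adj[j])) for j in range(n)]
            for i in range(n)]


def pathsim(m, x, y):
    """Standard PathSim score: 2 * M[x][y] / (M[x][x] + M[y][y])."""
    denom = m[x][x] + m[y][y]
    return 2.0 * m[x][y] / denom if denom else 0.0


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL divergence between N(mu, sigma^2) and N(0, I)."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def attention_fuse(embeddings, query):
    """Fuse per-sub-network embeddings with softmax weights.

    Scores are dot products with a query vector; this scoring function is an
    assumption, not the attention form used in the paper.
    """
    scores = [sum(q * e for q, e in zip(query, emb)) for emb in embeddings]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [v / total for v in exps]
    dim = len(embeddings[0])
    fused = [sum(w * emb[d] for w, emb in zip(weights, embeddings))
             for d in range(dim)]
    return fused, weights


# Toy author-by-paper matrix (hypothetical data): 3 authors, 3 papers.
toy_adj = [[1, 1, 0],
           [1, 0, 0],
           [0, 0, 1]]
toy_m = commuting_matrix(toy_adj)
rng = random.Random(0)
z = reparameterize([0.0, 0.0], [0.0, 0.0], rng)
fused, weights = attention_fuse([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
print(pathsim(toy_m, 0, 1), z, fused, weights)
```

In a walk-based pipeline, a score such as `pathsim` could be used to bias which neighbor the walker visits next, and `attention_fuse` would be applied once per node across the representations learned on each decomposed sub-network; how HetVAE actually combines these steps is described in the full paper rather than in the abstract.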