Journal of Shanghai Jiao Tong University, 2021, Vol. 55, Issue (5): 586-597. doi: 10.16183/j.cnki.jsjtu.2020.187
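Special topics: Journal of Shanghai Jiao Tong University, collected special topics of the 12 issues of 2021; Journal of Shanghai Jiao Tong University, 2021 special topic "Automation Technology, Computer Technology"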
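Received:
2020-06-18
Online:
2021-05-28
Published:
2021-06-01
Corresponding author:
LIU Qun
E-mail: liuqun@cqupt.edu.cn
About the author:
YUAN Ming (1996-), male, born in Chongqing, master's student. His main research interest is network representation learning.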
YUAN Ming, LIU Qun, SUN Haichao, TAN Hongsheng
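Abstract:
Traditional meta-path random walks in heterogeneous network representation cannot accurately describe heterogeneous network structure and fail to capture the true underlying distribution of network nodes. To address these problems, HetVAE, a heterogeneous network representation method based on variational inference and meta-path decomposition, is proposed. Drawing on the idea of path similarity, the method first designs a node selection strategy to improve the meta-path random walk, and then introduces variational theory to sample the latent variables of the original distribution effectively. Finally, a personalized attention mechanism weights the node vector representations of the different sub-networks obtained by decomposition and fuses them, so that the final node vector representations carry richer semantic information. Experiments on several network tasks over three real-world datasets (DBLP, AMiner, and Yelp) verify the effectiveness of the model. On node classification and node clustering, compared with the baseline algorithms, Micro-F1 and normalized mutual information (NMI) improve by 1.12% to 4.36% and 1.35% to 18%, respectively, indicating that HetVAE can effectively represent heterogeneous network structure and learn node vector representations that better match the true distribution.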
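For orientation, the traditional meta-path-guided random walk that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the uniform-choice baseline only, not HetVAE's node selection strategy, which the abstract does not specify. The toy graph, the function name `metapath_walk`, and its parameters are hypothetical; the node type labels A, Pa, and C are borrowed from the meta-path names in Table 1.

```python
import random


def metapath_walk(neighbors, node_type, start, metapath, walk_length, rng):
    """Uniform meta-path-guided random walk (baseline scheme, hypothetical helper)."""
    if metapath[0] != metapath[-1]:
        raise ValueError("meta-path must start and end with the same node type")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path type")
    cycle = metapath[:-1]  # drop the repeated end type so the pattern can loop
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]  # node type required at the next position
        candidates = [v for v in neighbors[walk[-1]] if node_type[v] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))  # uniform choice among typed neighbors
    return walk


# Hypothetical toy graph: two authors, two papers, one conference.
node_type = {"a1": "A", "a2": "A", "p1": "Pa", "p2": "Pa", "c1": "C"}
neighbors = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "c1"],
    "p2": ["a2", "c1"],
    "c1": ["p1", "p2"],
}

rng = random.Random(0)
walk_apa = metapath_walk(neighbors, node_type, "a1", ["A", "Pa", "A"], 5, rng)
walk_apcpa = metapath_walk(neighbors, node_type, "a1", ["A", "Pa", "C", "Pa", "A"], 5, rng)
print(walk_apa)
print(walk_apcpa)
```

In this sketch, a similarity-based node selection rule would replace the uniform `rng.choice` call with a weighted choice over `candidates`.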
YUAN Ming, LIU Qun, SUN Haichao, TAN Hongsheng. A Heterogeneous Network Representation Method Based on Variational Inference and Meta-Path Decomposition[J]. Journal of Shanghai Jiao Tong University, 2021, 55(5): 586-597.
Table 1
Dataset description
Dataset | Relation (A-B) | No. of type-A nodes | No. of type-B nodes | No. of A-B links | No. of labels | No. of label classes | Meta-path | Avg. degree of type-A nodes | Avg. degree of type-B nodes | Avg. network degree |
---|---|---|---|---|---|---|---|---|---|---|
DBLP | Pa-A | 14376 | 14475 | 41794 | 4057 | 4 | APaA | 11.88 | 2.89 | 4.73 |
DBLP | Pa-C | 14376 | 20 | 14376 | | | APaCPaA | | 718.8 | |
DBLP | Pa-T | 14376 | 8920 | 114624 | | | APaTPaA | | 12.85 | |
AMiner | Pa-A | 13978 | 16543 | 52957 | 9726 | 8 | APaA | 4.79 | 3.20 | 2.05 |
AMiner | Pa-C | 13978 | 2152 | 13978 | | | APaCPaA | | 6.49 | |
Yelp | Bu-S | 2614 | 2 | 2614 | 2614 | 3 | BuSBu | 13.79 | 1370.0 | 9.22 |
Yelp | Bu-St | 2614 | 8 | 2614 | | | BuStBu | | 326.75 | |
Yelp | Bu-U | 2614 | 1286 | 30838 | | | BuUBu | | 23.98 | |
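As a quick consistency check, the DBLP average degrees in Table 1 can be reproduced from the node and link counts. The sketch below assumes one reading of the table that matches its values: the type-B average degree is the link count of a relation divided by its type-B node count, and the type-A average degree sums the links of all relations before dividing by the type-A node count. The variable and function names are illustrative.

```python
# Counts copied from the DBLP rows of Table 1:
# relation -> (type-A node count, type-B node count, A-B link count)
dblp = {
    "Pa-A": (14376, 14475, 41794),
    "Pa-C": (14376, 20, 14376),
    "Pa-T": (14376, 8920, 114624),
}


def avg_degree_b(relations):
    """Per-relation average degree of the type-B nodes (assumed: links / type-B nodes)."""
    return {rel: links / n_b for rel, (_, n_b, links) in relations.items()}


def avg_degree_a(relations):
    """Average degree of the shared type-A nodes (assumed: all links / type-A nodes)."""
    total_links = sum(links for _, _, links in relations.values())
    n_a = next(iter(relations.values()))[0]  # the type-A count is the same in every row
    return total_links / n_a


b_degrees = {rel: round(d, 2) for rel, d in avg_degree_b(dblp).items()}
a_degree = round(avg_degree_a(dblp), 2)
print(b_degrees)
print(a_degree)
```

Under these assumptions the rounded results agree with the DBLP entries 2.89, 718.8, 12.85, and 11.88 in the table.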
Table 2
Quantitative results of the node clustering task
Algorithm | DBLP NMI | DBLP ARI | AMiner NMI | AMiner ARI | Yelp NMI | Yelp ARI |
---|---|---|---|---|---|---|
Deepwalk | 0.5841 | 0.4960 | 0.3160 | 0.2227 | 0.2940 | 0.3179 |
Node2vec | 0.5401 | 0.4776 | 0.3081 | 0.2219 | 0.0105 | 0.0111 |
HIN2vec | 0.0124 | 0.0106 | 0.1670 | 0.0758 | 0.1353 | 0.1708 |
Metapath2vec | 0.6395 | 0.6369 | 0.2645 | 0.2083 | 0.3540 | 0.4047 |
HERec | 0.6844 | 0.7104 | 0.3230 | 0.2322 | 0.3511 | 0.4018 |
HAN | 0.5987 | 0.5929 | 0.0375 | 0.0165 | 0.3635 | 0.4255 |
HetVAErw | 0.7742 | 0.8329 | 0.3324 | 0.2234 | 0.3603 | 0.4016 |
HetVAEsk | 0.8173 | 0.8664 | 0.3446 | 0.2321 | 0.3593 | 0.4097 |
HetVAEcon | 0.7826 | 0.8351 | 0.3239 | 0.2660 | 0.3416 | 0.3917 |
HetVAE | 0.8540 | 0.9016 | 0.4025 | 0.3798 | 0.3761 | 0.4399 |
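The two clustering metrics reported in Table 2, normalized mutual information (NMI) and the adjusted Rand index (ARI), compare a predicted clustering with ground-truth labels. The sketch below is a plain-Python illustration, not the evaluation code used in the paper. It assumes NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist, and the source does not say which one was used) and uses the standard pair-counting form of ARI. Function names are illustrative.

```python
from collections import Counter
from math import comb, log


def nmi(labels_true, labels_pred):
    """NMI, normalized by the arithmetic mean of the two label entropies (assumed)."""
    n = len(labels_true)
    count_true = Counter(labels_true)
    count_pred = Counter(labels_pred)
    joint = Counter(zip(labels_true, labels_pred))

    def entropy(counts):
        return -sum((c / n) * log(c / n) for c in counts.values())

    mutual_info = sum(
        (c / n) * log((c * n) / (count_true[t] * count_pred[p]))
        for (t, p), c in joint.items()
    )
    denom = (entropy(count_true) + entropy(count_pred)) / 2
    # Both labelings are a single cluster: treat them as identical.
    return mutual_info / denom if denom > 0 else 1.0


def ari(labels_true, labels_pred):
    """Adjusted Rand index from pair counts; needs at least two samples."""
    n = len(labels_true)
    if n < 2:
        raise ValueError("ARI needs at least two samples")
    joint = Counter(zip(labels_true, labels_pred))
    sum_joint = sum(comb(c, 2) for c in joint.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)  # chance-level agreement
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_joint - expected) / (max_index - expected)


# Hypothetical labelings: the same partition with permuted cluster ids,
# and a partition that is independent of the ground truth.
truth = [0, 0, 1, 1]
same_partition = [1, 1, 0, 0]
crossed = [0, 1, 0, 1]
print(nmi(truth, same_partition), ari(truth, same_partition))
print(nmi(truth, crossed), ari(truth, crossed))
```

Both metrics are invariant to relabeling of cluster ids, which is why the permuted partition scores as a perfect match.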