Journal of Shanghai Jiao Tong University (Science), 2018, Vol. 23, Issue (4): 584-. doi: 10.1007/s12204-018-1957-2
YE Feiyue (叶飞跃), XU Xinchen (徐欣辰)
Published:
2018-08-02
Contact:
XU Xinchen (徐欣辰)
E-mail: xinchenxu8011802@gmail.com
Abstract: As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a two-layer sentence-word graph algorithm combined with keyword density to generate multi-document summaries, known as Graph & Keywordρ. Traditional graph methods for multi-document summarization consider the influence of sentences and words only across all documents, not within individual documents. Therefore, we construct multiple word graphs and extract the appropriate keywords from each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because word importance differs between documents, we propose using keyword density so that summaries provide rich content with a small number of words. The experimental results show that the Graph & Keywordρ method outperforms state-of-the-art systems when tested on the DUC 2004 data set.
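The abstract does not give the formulas behind Graph & Keywordρ, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that per-document keywords are the most frequent non-stopword terms, that keyword density is the share of a sentence's tokens that are keywords of its own document, that the sentence graph uses Jaccard word overlap with a PageRank-style iteration, and that the two signals are combined multiplicatively. All function names, the stopword list, and the parameters `top_k`, `damping`, and `budget_words` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "on"}


def tokenize(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def document_keywords(sentences, top_k=3):
    """Assumption: keywords are the top_k most frequent terms of ONE document."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    return {w for w, _ in counts.most_common(top_k)}


def keyword_density(sentence, keywords):
    """Assumption: density = keyword tokens / all tokens in the sentence."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w in keywords) / len(tokens)


def sentence_rank(sentences, damping=0.85, iterations=50):
    """PageRank-style scores on a sentence graph weighted by Jaccard overlap."""
    n = len(sentences)
    if n == 0:
        return []
    token_sets = [set(tokenize(s)) for s in sentences]
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            union = token_sets[i] | token_sets[j]
            if i != j and union:
                weights[i][j] = len(token_sets[i] & token_sets[j]) / len(union)
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(documents, budget_words=30):
    """Greedily pick high-scoring sentences without exceeding a word budget.

    `documents` is a list of documents, each a list of sentence strings.
    """
    candidates = []
    for doc in documents:
        keywords = document_keywords(doc)  # keywords extracted per document
        for sentence in doc:
            candidates.append((sentence, keyword_density(sentence, keywords)))
    ranks = sentence_rank([s for s, _ in candidates])
    # Assumption: graph rank is boosted multiplicatively by keyword density.
    scored = sorted(
        ((rank * (1.0 + density), s) for rank, (s, density) in zip(ranks, candidates)),
        reverse=True,
    )
    summary, used = [], 0
    for _, sentence in scored:
        length = len(sentence.split())
        if used + length <= budget_words:
            summary.append(sentence)
            used += length
    return summary
```

Calling `summarize(docs, budget_words=12)` on a list of tokenized-by-sentence documents returns the selected sentences in score order. Extracting keywords per document, rather than over the pooled collection, is the part that mirrors the abstract's point about individual documents; the scoring formula itself is a placeholder.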
YE Feiyue (叶飞跃), XU Xinchen (徐欣辰). Automatic Multi-Document Summarization Based on Keyword Density and Sentence-Word Graphs[J]. Journal of Shanghai Jiao Tong University (Science), 2018, 23(4): 584-.