Journal of Shanghai Jiaotong University (English Edition), 2011, Vol. 16, Issue 5: 567-570. doi: 10.1007/s12204-011-1190-8

• Article

Bit Stream Oriented Enumeration Tree Pruning Algorithm

QIU Wei-dong¹ (邱卫东), JIN Ling¹ (金凌), YANG Xiao-niu² (杨小牛), YANG Hong-wa² (杨红娃)

  1. School of Information Security Engineering, Shanghai Jiaotong University, Shanghai 200240, China
  2. National Science and Technology on Communication Information Security Control Laboratory, No. 36 Institute of China Electronics Technology Group Corporation, Jiaxing 314033, Zhejiang, China
  • Received: 2011-03-10 Online: 2011-10-29 Published: 2011-10-20
  • Corresponding author: QIU Wei-dong (邱卫东), E-mail: qiuwd@sjtu.edu.cn

Abstract: Packet analysis plays an important role in digital life, but
protocol analyzers are limited because they can only process data in a
known, predetermined format. This paper proposes a method for decoding raw
data in an unknown format. Such data can be cut into packets because packet
headers usually contain characteristic bit sequences, so the key to the
problem is finding those characteristic sequences. We present an efficient
method of bit sequence enumeration in which both the Aho-Corasick (AC)
algorithm and data mining methods are used to reduce the cost of the
process.

Key words: pattern matching, data mining, frequent set, frequent sequence,
association rule
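
The abstract gives no implementation details, so the following is only a rough sketch of what pruned bit sequence enumeration could look like, not the authors' algorithm. It assumes the raw data is available as a string of '0' and '1' characters, counts windows naively instead of using the Aho-Corasick algorithm, and borrows the frequent-sequence idea from data mining: a sequence cannot occur more often than its prefix, so any sequence whose prefix is infrequent is pruned without being counted. The function names, the `min_count` threshold, and the sample stream are all hypothetical.

```python
from collections import Counter


def count_windows(bits, length, allowed_prefixes=None):
    """Count every overlapping window of `length` bits.

    If `allowed_prefixes` is given, windows whose (length - 1)-bit prefix
    is not in that set are skipped (pruned) rather than counted.
    """
    counts = Counter()
    for i in range(len(bits) - length + 1):
        window = bits[i:i + length]
        if allowed_prefixes is None or window[:-1] in allowed_prefixes:
            counts[window] += 1
    return counts


def frequent_bit_sequences(bits, max_len, min_count):
    """Enumerate bit sequences level by level, pruning infrequent branches.

    Hypothetical helper: returns {sequence: occurrence count} for every
    sequence of 1..max_len bits that occurs at least `min_count` times.
    """
    frequent = {}
    survivors = None  # no restriction at length 1
    for length in range(1, max_len + 1):
        counts = count_windows(bits, length, survivors)
        level = {seq: n for seq, n in counts.items() if n >= min_count}
        if not level:
            break  # no frequent sequence at this length, so none at longer lengths
        frequent.update(level)
        survivors = set(level)
    return frequent


if __name__ == "__main__":
    # Made-up stream: the 5-bit pattern "10110" followed by varying 3-bit payloads.
    stream = "10110" + "000" + "10110" + "111" + "10110" + "010"
    found = frequent_bit_sequences(stream, max_len=5, min_count=3)
    print(found.get("10110"))  # 3
```

Because the pruning only discards sequences that cannot reach the threshold, the result matches a brute-force count of every window; the saving is in the number of candidates tracked at each length. A multi-pattern matcher such as Aho-Corasick could replace the naive window counting when many candidates have to be located in a long stream.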
