
Survey on P2P Traffic Identification

We have previously discussed lawful monitoring and source location for VoIP. In an H.323, softswitch, or IMS VoIP network, this can usually be done through signaling analysis. For P2P VoIP, however, and especially for encrypted P2P VoIP such as Skype, identifying the voice traffic is very difficult.

Traffic classification and identification are useful in both ISP and enterprise environments, for purposes such as:

  • Network planning and design
  • Security policy, such as lawful monitoring and blocking
  • QoS policy, such as rate limiting and prioritization
  • Pricing

Currently there are two kinds of P2P traffic identification algorithms: transport-layer based and payload based.

Transport-layer algorithms use only the IP and TCP/UDP headers. These algorithms are usually fast, may support online identification, are robust to packet loss and reordering, and may support asymmetric routing. They can separate P2P from non-P2P traffic, including encrypted P2P and P2P that uses unknown protocols. However, it is difficult to identify the exact P2P application accurately using transport-layer algorithms alone.

There are three sorts of transport-layer traffic identification methods:
# Port based. Different P2P applications use different default port numbers, so detecting P2P applications by their default ports is simple and quick. But as more and more P2P applications use random ports, or well-known ports such as 80 or 443, port numbers alone are no longer a reliable way to detect P2P traffic. Alok Madhukar and Carey Williamson pointed out that 30%~70% of Internet traffic would be classified as “unknown” using only port-based analysis [1]. (Their paper is an overview with more details.)
# Connection based. This analysis was first proposed by Karagiannis et al. [2]. Their analysis is flow based and focuses on the connection-level patterns of P2P applications, such as simultaneous use of TCP and UDP and particular connection patterns for {IP, port} pairs. Combining connection-based and port-based analysis gives very good false positive and false negative ratios (under 10%).
# Statistics based [5]. In the last two or three years, much research has been based on statistical flow attributes, such as the distribution of source and destination IP addresses and port numbers, packet length, packet inter-arrival time, and flow duration. These studies use analytical models of random variables, neural networks, machine learning techniques such as Bayesian classifiers, and other algorithms formerly used in IDS. Some papers report high accuracy (83%) [6].
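
To make the first two methods concrete, here is a minimal Python sketch that tries a port lookup first and falls back to a simplified form of the TCP/UDP heuristic: an IP pair that uses both TCP and UDP is flagged as likely P2P. The `Flow` record, the function names, and the small port table are assumptions for illustration only. The actual method in [2] also excludes legitimate applications that use both protocols (DNS, for example) and adds {IP, port} pattern checks, none of which is modeled here.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    """Hypothetical flow record built from IP and TCP/UDP headers only."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str  # "tcp" or "udp"


# Illustrative default ports only; a real deployment needs a maintained list.
DEFAULT_P2P_PORTS = {
    1214: "FastTrack",
    4662: "eDonkey",
    6346: "Gnutella",
    6881: "BitTorrent",
}


def classify_by_port(flow: Flow) -> Optional[str]:
    """Return an application name if either port is a known default."""
    for port in (flow.src_port, flow.dst_port):
        if port in DEFAULT_P2P_PORTS:
            return DEFAULT_P2P_PORTS[port]
    return None


def tcp_udp_pairs(flows):
    """Return the unordered IP pairs that were seen using both TCP and UDP."""
    protos = defaultdict(set)
    for f in flows:
        protos[frozenset((f.src_ip, f.dst_ip))].add(f.proto)
    return {pair for pair, seen in protos.items() if {"tcp", "udp"} <= seen}


def classify(flows):
    """Label each flow: port match first, then the TCP/UDP pair heuristic."""
    suspicious = tcp_udp_pairs(flows)
    labels = []
    for f in flows:
        app = classify_by_port(f)
        if app is not None:
            labels.append(app)
        elif frozenset((f.src_ip, f.dst_ip)) in suspicious:
            labels.append("p2p-unknown")
        else:
            labels.append("unknown")
    return labels


flows = [
    Flow("10.0.0.1", "10.0.0.2", 50000, 6881, "tcp"),   # default port
    Flow("10.0.0.1", "10.0.0.3", 50001, 40000, "tcp"),  # random ports, TCP
    Flow("10.0.0.3", "10.0.0.1", 40001, 50002, "udp"),  # same pair, UDP
    Flow("10.0.0.1", "10.0.0.4", 50003, 80, "tcp"),     # ordinary web flow
]
print(classify(flows))
```

Note that the fallback can only say "P2P or not". It cannot name the application, which matches the limitation of transport-layer methods described above.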

Payload-based solutions are based on application-level signatures of various P2P applications, which are obtained through protocol analysis. They therefore cannot detect P2P traffic that uses unknown protocols, although they may still identify encrypted P2P traffic. This technique has been implemented in many NIDS, such as Bro and Snort. The main drawback of this kind of approach is the computing power needed to classify the traffic, since it requires looking deep into the packet payload. Payload-based algorithms are very accurate, with low false positive and false negative ratios, but they may not support asymmetric routing and are usually not robust to packet loss, because losing a few packets may mean missing the entire signature. S. Sen, O. Spatscheck, and D. Wang wrote a good paper on payload-based solutions [3].
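
As a rough illustration of signature matching, the Python sketch below checks the first payload bytes of a connection against two well-known handshake prefixes: the BitTorrent handshake (a length byte of 19 followed by the string "BitTorrent protocol") and the Gnutella connect line. The `SIGNATURES` table and the `identify_payload` function are hypothetical names, and a production classifier such as the one described in [3] uses a much richer signature set.

```python
from typing import Optional

# Two well-known handshake prefixes; illustrative, not a complete signature set.
SIGNATURES = {
    "BitTorrent": b"\x13BitTorrent protocol",
    "Gnutella": b"GNUTELLA CONNECT/",
}


def identify_payload(payload: bytes) -> Optional[str]:
    """Return the application whose signature starts the payload, else None."""
    for app, signature in SIGNATURES.items():
        if payload.startswith(signature):
            return app
    return None


handshake = b"\x13BitTorrent protocol" + bytes(8)
print(identify_payload(handshake))        # matches the BitTorrent prefix
print(identify_payload(handshake[10:]))   # start of the stream missing: no match
```

The second call shows why packet loss hurts this approach: once the bytes carrying the signature are missing, the rest of the flow gives the matcher nothing to work with.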

[1] A. Madhukar and C. Williamson, “A Longitudinal Study of P2P Traffic Classification.”
[2] T. Karagiannis, A. Broido, M. Faloutsos, and K. Claffy, “Transport Layer Identification of P2P Traffic,” Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement (IMC 2004), pp. 121-134, Italy, October 2004.
[3] S. Sen, O. Spatscheck, and D. Wang, “Accurate, Scalable In-Network Identification of P2P Traffic Using Application Signatures,” Proceedings of the 13th International World Wide Web Conference, pp. 512-521, NY, USA, May 2004.
[4] A. McGregor, M. Hall, P. Lorier, and J. Brunskill, “Flow Clustering Using Machine Learning Techniques,” Passive & Active Measurement Workshop 2004, France, April 2004.
[5] M. Crotti, M. Dusi, F. Gringoli, and L. Salgarelli, “Traffic Classification through Simple Statistical Fingerprinting,” ACM SIGCOMM Computer Communication Review, Volume 37, Number 1, January 2007.
[6] D. Zuev and A. W. Moore, “Traffic Classification Using a Statistical Approach.”
