Title :
Efficient memory layout for packet classification system on multi-core architecture
Author :
Shaikot, S.H.; Min Sik Kim
Author_Institution :
Sch. of Electr. Eng. & Comput. Sci., Washington State Univ., Pullman, WA, USA
Abstract :
Packet classification is used primarily by network devices, such as routers and firewalls, to apply additional processing, such as packet filtering and Quality-of-Service (QoS), to specific subsets of network packets. In decision-tree-based packet classification systems, packets are classified by searching a tree data structure. Tree search poses significant challenges because it requires a number of unpredictable, irregular memory accesses. Since packet classification is a per-packet operation and memory latency (caused by cache and TLB misses) is considerable, any technique that reduces cache and TLB misses can be useful in practice for improving lookup time in packet classification. In this paper, we present an efficient memory layout for the tree data structure that ensures data moves optimally among the levels of the memory hierarchy on general-purpose processors. In particular, for a given node size, the proposed layout minimizes the number of cache lines (and memory pages) accessed, resulting in fewer cache and TLB misses. This reduction directly improves lookup performance. A decision tree laid out in the proposed manner can also exploit the strong computing power of multi-core architectures by leveraging data- and thread-level parallelism. Experimental results on two different state-of-the-art processors show that applying the proposed memory layout to the packet classification tree data structure achieves significant performance improvements (40-55% faster) and near-linear speedup (3.8× on quad cores) on multi-core architectures.
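The abstract does not give the paper's exact layout algorithm, but the general idea it describes can be illustrated with a minimal, hypothetical sketch: store each parent node contiguously with its two children in a cache-line-aligned block, so a root-to-leaf search touches fewer cache lines than a plain breadth-first (BFS) array layout. The node and cache-line sizes below are assumptions for illustration only.

```python
# Illustrative sketch only -- not the paper's algorithm. Idea: pack each
# parent plus its two children into one cache-line-aligned block so that a
# root-to-leaf lookup crosses roughly half as many cache lines as a BFS layout.

NODES_PER_LINE = 4  # assumption: a 64-byte cache line holds four 16-byte nodes


def blocked_slots(height):
    """Map nodes of a complete binary tree (implicit numbering: children of
    node i are 2i+1 and 2i+2) to array slots, packing each parent and its
    two children into one line-aligned block (padded to the line size)."""
    slot_of = {}
    total = (1 << height) - 1
    next_block = [0]

    def visit(root, levels_left):
        if levels_left <= 0 or root >= total:
            return
        base = next_block[0] * NODES_PER_LINE  # start of a fresh cache line
        next_block[0] += 1
        slot_of[root] = base
        if levels_left >= 2:
            children = (2 * root + 1, 2 * root + 2)
            for off, c in enumerate(children, start=1):
                slot_of[c] = base + off        # same line as the parent
            for c in children:                 # grandchildren start new blocks
                visit(2 * c + 1, levels_left - 2)
                visit(2 * c + 2, levels_left - 2)

    visit(0, height)
    return slot_of


def lines_touched(path, slot_of):
    """Distinct cache lines read along a root-to-leaf search path."""
    return len({slot_of[n] // NODES_PER_LINE for n in path})
```

For a 12-level tree and an all-left search path, a BFS layout (`slot_of[n] = n`) touches 10 distinct cache lines, while this blocked layout touches 6 -- roughly the halving one expects from grouping two tree levels per line, consistent with the reduction in cache misses the abstract describes.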
Keywords :
cache storage; decision trees; multi-threading; multiprocessing systems; table lookup; tree data structures; tree searching; QoS; TLB misses; accessed cache lines; data-level parallelism; decision tree based packet classification system; firewalls; general purpose processors; memory accesses; memory hierarchy; memory latency; memory layout; memory pages; multicore architecture; near-linear speedup; network devices; network packets; node size; packet filtering; per-packet operation; performance improvements; quality-of-service; routers; state-of-the-art processors; thread-level parallelism; tree data structure; tree search;
Conference_Title :
Global Communications Conference (GLOBECOM), 2012 IEEE
Conference_Location :
Anaheim, CA
Print_ISBN :
978-1-4673-0920-2
Electronic_ISSN :
1930-529X
DOI :
10.1109/GLOCOM.2012.6503501