An Efficient Method for Parallel Implementation of the Hierarchical-Trie Packet Classification Algorithm on the Graphics Processing Unit

Title in another language

An Efficient Method for Parallel Implementation of H-Trie Packet Classification Algorithm on GPU

Authors

Rafiei, Milad (Bu-Ali Sina University, Hamedan), Abbasi, Mahdi (Bu-Ali Sina University, Hamedan), Nassiri, Mohammad (Bu-Ali Sina University, Hamedan)

Number of pages

From page

181

To page

196

Keywords

Packet classification, graphics processing unit (GPU), CUDA, memory hierarchy, complexity, performance, hierarchical trie algorithm

Persian abstract (translated)

Packet classification is a fundamental operation in network processors. In this process, incoming packets are classified into specific flows by matching them against a set of filters. Software implementations of classification algorithms are cheaper and more scalable than hardware implementations, but they are slower. In this paper, we use the parallel processing capability of graphics processors to accelerate the hierarchical trie packet classification algorithm, and we propose several scenarios based on the architecture of their global and shared memories. The implementation results confirm the computed time and memory complexities. They also show that the scenarios that split the filter set into sub-trees no larger than the shared memory and copy them into it perform worse than the scenario that keeps the entire data structure in global memory. The performance of these scenarios improves as the number of sub-trees and duplicated filters decreases. In addition, a scenario that can fit the hierarchical trie and its corresponding filter set into shared memory without partitioning is the best scenario. The experimental results show that the throughput obtained in this scenario improves by up to 2.1 times over existing methods on the same GPU.

English abstract

Abstract: Packet classification is a fundamental process in network processors. In this process, incoming packets are classified into distinct flows by matching them against a set of filters. Software implementations of packet classification algorithms, although cheaper and more scalable than hardware implementations, are slower. In this paper, we use the parallel processing capabilities of graphics processors to accelerate the Hierarchical-Trie packet classification algorithm and propose different scenarios based on the architecture of their global and shared memories. The results of implementing these scenarios confirm the computed time and memory complexities and show that the scenarios that divide the filter set into sub-trees no larger than the shared memory and copy them into it perform worse than the scenario that keeps the entire data structure in global memory. The performance of these scenarios improves as the number of sub-trees and duplicated filters decreases. Moreover, a scenario that can keep the hierarchical trie and its corresponding filters in shared memory without any partitioning is the best scenario. The experimental results show that, on the same GPU, this scenario attains a throughput of approximately 2.1 times compared to the
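
For readers unfamiliar with the data structure named in the abstract, the following is a minimal CPU-side sketch of a two-field hierarchical trie, written in Python for readability. It is not the authors' GPU implementation, which this record does not reproduce. It assumes the textbook form of the structure: a binary trie over source prefixes whose nodes link to binary tries over destination prefixes, with filters stored in the second-level tries. The names `Node`, `insert`, and `classify`, the bit-string address format, and the rule that the smallest filter id has the highest priority are all assumptions made for illustration.

```python
class Node:
    """One binary-trie node (hypothetical layout for illustration)."""
    def __init__(self):
        self.child = [None, None]  # children for bit 0 and bit 1
        self.nxt = None            # root of the destination trie (source-trie nodes only)
        self.rules = []            # filter ids stored here (destination-trie nodes only)

def _descend(root, prefix):
    """Follow, creating nodes as needed, the path spelled by a bit-string prefix such as '10'."""
    node = root
    for bit in prefix:
        b = int(bit)
        if node.child[b] is None:
            node.child[b] = Node()
        node = node.child[b]
    return node

def insert(root, rule_id, src_prefix, dst_prefix):
    """Store a filter (src_prefix, dst_prefix); an empty string is a wildcard."""
    src_node = _descend(root, src_prefix)
    if src_node.nxt is None:
        src_node.nxt = Node()
    _descend(src_node.nxt, dst_prefix).rules.append(rule_id)

def _nodes_along(root, bits):
    """Yield every existing node on the path that `bits` traces, starting with the root."""
    node = root
    yield node
    for bit in bits:
        node = node.child[int(bit)]
        if node is None:
            return
        yield node

def classify(root, src_bits, dst_bits):
    """Return the smallest matching filter id (assumed highest priority), or None."""
    best = None
    for src_node in _nodes_along(root, src_bits):
        if src_node.nxt is None:
            continue  # no filter uses this source prefix
        # Search the destination trie attached to this matching source prefix.
        for dst_node in _nodes_along(src_node.nxt, dst_bits):
            for rule_id in dst_node.rules:
                if best is None or rule_id < best:
                    best = rule_id
    return best

# Toy filter set over 4-bit addresses (made-up values).
root = Node()
insert(root, 0, "00", "11")
insert(root, 1, "0", "1")
insert(root, 2, "", "")  # matches every packet

result_a = classify(root, "0010", "1100")  # matches filters 0, 1 and 2
result_b = classify(root, "0110", "1000")  # matches filters 1 and 2
result_c = classify(root, "1000", "0000")  # matches only the wildcard filter
```

Each lookup reads the tries without modifying them, so independent packets can be classified concurrently. Where the trie and filter data reside (global or shared memory) is the design variable the abstract's scenarios compare.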

Year of publication

1395 (Solar Hijri calendar)

Journal title

Tabriz Journal of Electrical Engineering (مهندسي برق دانشگاه تبريز)

PDF file

7447435

Link to this record

https://search.isc.ac/dl/search/defaultta.aspx?DTC=8&DC=1008464