
Publications

Curriculum-related Papers

Postgraduate Coursework Program using Network Processors at Osaka University
We introduce a postgraduate coursework program where students learn to use network processors. This is part of the Master's curriculum for Osaka University's Graduate School of Information Science and Technology (IST). First we briefly explain the Intel IXP1200 network processor and its evaluation kit, the ENP-2505, which are used in this program. We next describe the content of the network-processor coursework in detail.

A Network Project Course Based on Network Processors
A difficult problem in networking courses is to find hands-on projects that have the right balance between the level of realism and complexity. This is especially true for projects that focus on the internal functionality of routers and other network devices. We developed a capstone course called "Network Design and Evaluation" that uses a network processor-based platform for networking projects. This platform is more realistic than traditional approaches based on software emulation environments or PC-based routers running Unix, but it is significantly less complex to work with than real commercial routers or even PC-based routers. We are currently teaching this course for the third year, and our experience has been extremely positive. Students enjoy the realism of the platform and learn a lot not only about the internal operation of the network, but also about network configuration and management.

Experience/Application-related Papers

Performance Evaluation and Improvement of Algorithmic Approaches for Packet Classification

Packet classification is crucial to the implementation of several advanced services that require the capability to distinguish traffic in different flows, such as firewalls, intrusion detection systems and many QoS implementations. Although hardware solutions, such as TCAMs, provide a high search rate, they do not scale to large rule-sets. Instead, some of the most promising algorithmic research exploits the data redundancy in real-life rule-sets to improve high-performance packet classification. In this paper, we provide a general framework for discerning the relationships and distinctions within the design space of existing packet classification algorithms. We study several of the best-known algorithms in depth, such as RFC, HiCuts and HyperCuts, according to this framework, and suggest an improved scheme for each algorithm. All the algorithms we studied, along with their improved versions, are objectively assessed using both real-life and synthetic rule-sets. The C source code we wrote for these algorithms is publicly shared on our web-site.

FPL-3: towards language support for distributed packet processing
The FPL-3 packet filtering language incorporates explicit support for distributed processing into the language. FPL-3 supports not only generic header-based filtering, but also more demanding tasks, such as payload scanning, packet replication and traffic splitting. By distributing FPL-3 based tasks across a possibly heterogeneous network of processing nodes, the NET-FFPF network monitoring architecture facilitates very high speed packet processing. Results show that NET-FFPF can perform complex processing at gigabit speeds. The proposed framework can be used to execute such diverse tasks as load balancing, traffic monitoring, firewalling and intrusion detection directly at the critical high-bandwidth links (e.g., in enterprise gateways).

S2I: a Tool for Automatic Rule Match Compilation for the IXP Network Processor
In this paper we propose a software architecture that facilitates the use of the IXP1200 network processor in network monitoring applications that perform intrusion detection. The proposed software architecture consists of a simple, yet very efficient run-time infrastructure that distributes work inside the IXP1200, and the S2I compiler, a tool that generates efficient C code from human readable input files. The proposed software architecture achieves hand-coded speed, and facilitates the employment of the IXP1200 network processor in network intrusion detection systems.

Considering Processing Cost in Network Simulations
In many network simulations and models the cost of processing a packet is considered negligible or is overly simplified. The functionality of routers is steadily increasing and complex processing of packet payloads is being implemented (deep packet classification, encryption, and content transcoding). We show two examples where processing cost can contribute a significant portion of the overall packet delay. To enable a more precise consideration of processing delay, we present a tool called NPEST (Network Processing Estimator). NPEST is a framework on top of which packet processing functionality can be implemented and simulated using an actual processor simulator. NPEST can be programmed in C and greatly simplifies the implementation and simulation process as compared to using network processor simulators. The results derived from NPEST can either be used directly or be aggregated into processing statistics for network simulations. We present such results for two prototype applications: IP forwarding and IP security. We also show a comparison between the results obtained from NPEST and an Intel IXP1200 network processor.

Design, Implementation, and Evaluation of Active Video-Quality Adjustment Method for Heterogeneous Video Multicast
By introducing video-quality adaptation mechanisms into intermediate network equipment using active network technologies, we can provide users with video distribution services that take into account client-to-client heterogeneity in terms of available bandwidth, performance of client systems, and user preferences for video quality.

Implementation and Evaluation of Video-Quality Adjustment for Heterogeneous Video Multicast
By introducing video-quality adaptation mechanisms into intermediate network equipment using active network technologies, we can provide users with video distribution services that take into account client heterogeneity in terms of available bandwidth, performance of client systems, and user preferences for video quality. In this paper, we implement the low-pass filter, a quality adjustment technique for real-time multicasting of MPEG-2 video, on an Intel IXP1200 network processor-based network node. We applied the filter to video streams passing through the node and evaluated its practicality and applicability in terms of accuracy of video rate adaptation, variation of video quality, and filtering throughput. From the results of the evaluation experiments, we demonstrate that the implemented video-quality adjustment mechanism has sufficient rate adaptation capability, and that the low-pass filter can be accelerated through parallel processing.

Lightweight Network Support for Scalable End-to-end Services
Some end-to-end network services benefit greatly from network support in terms of utility and scalability. However, when such support is provided through service-specific mechanisms, the proliferation of one-off solutions tends to decrease the robustness of the network over time. Programmable routers, on the other hand, offer generic support for a variety of end-to-end services, but face a different set of challenges with respect to performance, scalability, security, and robustness. Ideally, router-based support for end-to-end services should exhibit the kind of generality, simplicity, scalability, and performance that made the Internet Protocol (IP) so successful. In this paper we present a router-based building block called Ephemeral State Processing (ESP), which is designed to have IP-like characteristics.

Fast Classification of XML with Network Processors
As XML documents become steadily more prevalent for data storage and communication, there is a growing need for routing XML messages in a network. Ideally one would be able to route XML packets at line speed. An important technology for complex packet routing is the so-called network processor, which has a set of small "microengines" classifying incoming packets. This paper describes an attempt to implement XML packet classification on the Intel IXA network architecture.

Network Enhanced Distributed Services
Programmable network elements can provide open access to in-network capabilities. This allows migration of processing from complex endpoints to simpler distributed network elements. This decreases system complexity as compared with a pure end-to-end architecture. Flexible service design and deployment may leverage these network elements through network-based security, network-based sorting, the sequencing of messages from multiple sources to a single destination, transaction processing, and in-stream execution of program “capsules” [1] injected into the network.

Building a Robust Software-Based Router Using Network Processors
Recent efforts to add new services to the Internet have increased interest in software-based routers that are easy to extend and evolve. This paper describes our experiences using emerging network processors---in particular, the Intel IXP1200---to implement a router. We show it is possible to combine an IXP1200 development board and a PC to build an inexpensive router that forwards minimum-sized packets at a rate of 3.47Mpps. This is nearly an order of magnitude faster than existing pure PC-based routers, and sufficient to support 1.77Gbps of aggregate link bandwidth.
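The two figures quoted in this abstract can be related by simple arithmetic. The sketch below assumes that "minimum-sized" means a 64-byte Ethernet frame and that preamble and inter-frame gap are not counted; neither assumption is stated in the abstract, and the function name is illustrative only.

```python
# Assumption (not stated in the abstract): a minimum-sized packet is a
# 64-byte Ethernet frame, with preamble and inter-frame gap excluded.
MIN_PACKET_BYTES = 64


def aggregate_gbps(packets_per_second, packet_bytes=MIN_PACKET_BYTES):
    """Convert a packet forwarding rate into aggregate bandwidth in Gbps."""
    bits_per_second = packets_per_second * packet_bytes * 8
    return bits_per_second / 1e9


rate = aggregate_gbps(3.47e6)
print(f"3.47 Mpps of 64-byte packets is about {rate:.3f} Gbps")
```

Under these assumptions the result is roughly 1.777 Gbps, which is in line with the 1.77Gbps quoted above.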

Evaluating Network Processors in IP Forwarding
This paper evaluates the performance of emerging network processors---in particular, designs that employ multiple hardware contexts to hide memory latency---in constructing IP routers. Such processors are designed to forward minimum-sized IP packets at line speeds, with the advantage (over ASIC-based solutions) of being programmable. However, programming such network processors involves two challenges. The first is how to effectively employ the multiple contexts in a way that fully utilizes the memory bandwidth. The second is how to allow the network processor to be programmed dynamically (so it can support new functionality) without violating the processor's tight timing constraints. This paper addresses both of these challenges on a prototype board that uses the IXP1200 network processor.

Realizing Network Emulator System with Intel IXP1200 Network Processor
In this paper, we show the design and implementation issues of a network emulator system with the Intel IXP1200 network processor. We first briefly introduce the architecture of the Intel IXP1200 and its characteristics, including the hardware I/O processing mechanisms for packet forwarding. We next describe the functions required for the network emulator system, such as buffering algorithms and link delay/loss emulation mechanisms, and show implementation issues for those functions.

Queue Management for QoS Provision Build on Network Processor
A network processor is a programmable processor specially designed and optimized for network computing. To some extent, it can be viewed as a tightly coupled multiprocessor system because of its architecture of multiple on-chip processors, buses and other key components. A network processor combines the flexibility of software with the high performance of hardware. The design and development of networking systems using network processors is an emerging field that offers numerous challenges and opportunities. This paper presents the design and implementation of a queue management module that combines buffer management and packet scheduling for QoS provision, uses Intel’s network processor, and follows the relative differentiated service model.

A Model for the Integration of Buffer Management and Packet Scheduling
Buffer management and packet scheduling are two of the key mechanisms for QoS provision and are tightly related, since they represent the enqueuing and dequeuing operations respectively. This paper proposes a novel model called QVI (Quality Vector Index) for the integration of buffer management and packet scheduling. QVI follows the proportional differentiation service model and makes the dropping and scheduling decisions through a synthetic objective function that considers loss rate, delay and bandwidth simultaneously.

Integrated Performance Evaluation Criteria for Network Traffic Control
Performance evaluation criteria are one of the most important issues in the design of network traffic control mechanisms and algorithms. Because network traffic control has multiple performance objectives, the evaluation criteria must include multiple performance metrics applied simultaneously; such criteria are called integrated performance evaluation criteria. In this paper, we analyze various performance metrics of network traffic control, and propose three integrated performance evaluation criteria.

Modeling and performance analysis of QoS-aware load balancing of Web-server clusters
This paper introduces mechanisms to correlate contents and priorities of incoming HTTP requests used for server process scheduling with the load balancing policies for Web-server clusters. This approach enables both load balancing and Web quality of service (QoS). Another contribution is a modeling and analysis technique based on stochastic high-level Petri net methods for QoS-aware load balancing. We propose an approximate analysis technique to reduce the complexity of the model.

Dynamic Partial Buffer Sharing scheme: Proportional Packet Loss Rate
In this paper, we propose a new packet loss control mechanism, the Dynamic Partial Buffer Sharing (DPBS) scheme. With the DPBS scheme, the thresholds are dynamically adjusted at run time based on packet loss behavior. As a result, the DPBS scheme avoids the parameter-setting problem and can easily achieve the desired relative packet loss ratio. In addition, the DPBS scheme can achieve higher buffer utilization because it adapts to network fluctuation. Simulation results show that DPBS performs better than the SPBS scheme under the same traffic conditions.
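The idea of adjusting a partial-buffer-sharing threshold at run time can be illustrated with a minimal sketch. The abstract does not give the paper's adaptation rule, so the rule below (move the threshold by one slot whenever the measured low-to-high loss ratio is off target) is an assumption made for illustration, as are the class name, the two traffic classes and all parameters.

```python
class DynamicPBS:
    """Illustrative two-class partial buffer sharing with a moving threshold.

    High-priority packets are admitted until the buffer is full; low-priority
    packets are admitted only while occupancy is below the threshold.
    The adaptation rule is a hypothetical stand-in, not the paper's algorithm.
    """

    def __init__(self, capacity, threshold, target_ratio, step=1):
        self.capacity = capacity          # total buffer slots
        self.threshold = threshold        # admission limit for low priority
        self.target_ratio = target_ratio  # desired low/high loss-rate ratio
        self.step = step                  # threshold adjustment per drop
        self.occupancy = 0
        self.arrivals = {"high": 0, "low": 0}
        self.drops = {"high": 0, "low": 0}

    def loss_rate(self, cls):
        arrived = self.arrivals[cls]
        return self.drops[cls] / arrived if arrived else 0.0

    def enqueue(self, cls):
        """Return True if the packet is admitted, False if it is dropped."""
        self.arrivals[cls] += 1
        limit = self.capacity if cls == "high" else self.threshold
        if self.occupancy >= limit:
            self.drops[cls] += 1
            self._adapt()
            return False
        self.occupancy += 1
        return True

    def dequeue(self):
        if self.occupancy > 0:
            self.occupancy -= 1

    def _adapt(self):
        # Compare measured low-priority loss with the target multiple of
        # high-priority loss and nudge the threshold accordingly.
        low, high = self.loss_rate("low"), self.loss_rate("high")
        if low > self.target_ratio * high:
            self.threshold = min(self.capacity, self.threshold + self.step)
        elif low < self.target_ratio * high:
            self.threshold = max(0, self.threshold - self.step)
```

A caller would invoke `enqueue("low")` or `enqueue("high")` on each arrival and `dequeue()` on each departure; the threshold then drifts toward a value at which the measured loss ratio approaches `target_ratio`, without being tuned by hand.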

Analysis and Research on Network Processor (paper in Chinese)
Nowadays the most salient trend in networking is the increase in data rates, accompanied by a significant effort to develop new protocols and services. However, traditional network devices, which are based on custom logic or pure software, can hardly satisfy both performance and flexibility requirements. To overcome this obstacle, parallel and programmable network processors have been introduced into the processing paths of routers (switches). Moreover, network processors, which are built on ASIP technology and optimized for network applications, combine the characteristics of hardware and software solutions. Network processors extend the classic store-and-forward pattern, which makes room for complex QoS control and payload processing. This article introduces the related research work from two aspects, system and application. We then analyze the system issues and challenges of network processors. Finally, we speculate on the future evolution of network processors and associated research.

A Fast Packet Classification Algorithm Based on Classifier’s Characteristic Applying to Multi-fields
Performing classification quickly on multiple fields is known to be difficult and has poor worst-case performance. In this paper, we present a solution to the problem of rapidly classifying packets. Our approach, named PCBCC (Packet Classification Based on Classifier Characteristic), is based mainly on the classifier’s characteristics; it offers high speed, multi-dimensional classification and modest memory requirements, and targets network processors. We have implemented this algorithm on the Intel IXP1200 network processor and provide performance evaluation results.

Research on Performance Evaluation Criteria for IP Network Traffic Control (paper in Chinese)
Performance evaluation criteria are one of the most important issues in the design of network traffic control mechanisms and algorithms. Because network traffic control has multiple performance objectives, the evaluation criteria must include multiple performance metrics applied simultaneously; such criteria are called integrated performance evaluation criteria and are the research subject of this paper.

Attacking DDoS at the Source
Distributed denial-of-service (DDoS) attacks present an Internet-wide threat. We propose D-WARD, a DDoS defense system deployed at source-end networks that autonomously detects and stops attacks originating from these networks. Attacks are detected by the constant monitoring of two-way traffic flows between the network and the rest of the Internet and periodic comparison with normal flow models. Mismatching flows are rate-limited in proportion to their aggressiveness.

A Taxonomy of DDoS Attacks and DDoS Defense Mechanisms
This paper proposes a taxonomy of distributed denial-of-service attacks and a taxonomy of the defense mechanisms that strive to counter these attacks. The attack taxonomy is illustrated using both known and potential attack mechanisms. Along with this classification we discuss important features of each attack category that in turn define the challenges involved in combating these threats. The defense system taxonomy is illustrated using only the currently known approaches. The goal of the paper is to impose some order into the multitude of existing attack and defense mechanisms that would lead to a better understanding of challenges in the distributed denial-of-service field.

A Source Router Approach to DDoS Defense
Distributed denial-of-service attacks present a great threat to the Internet, and existing security mechanisms cannot detect or stop them successfully. The problem lies in the distributed nature of the attacks, which engages the power of a vast number of coordinated hosts. The response to the attack also needs to be distributed, but cooperation between administrative domains is hard to achieve, and security and authentication of participants incur high cost. We propose a DDoS defense system deployed at source-end networks that autonomously detects and stops the attacks originating from those networks. Attacks are detected by monitoring two-way traffic flows between the network and the rest of the Internet. Monitored flows are periodically compared with predefined models of normal traffic, and those flows classified as part of a DDoS attack are rate-limited. We evaluate the performance of our system in a realistic testbed.

On the Performance of Multithreaded Architectures for Network Processors
With the ever-increasing performance and flexibility requirements seen in today’s networks, we have seen the development of programmable network processors. Network processors are used both in the middle of the network, at nodes composing the backbone of the Internet, as well as at the edges of the network in enterprise class routers, switches, and host network interfaces. The application workloads for these processors require significant processing capacity but also exhibit packet-level parallelism.

A Benchmarking Methodology for Network Processors
Due to the heterogeneity of network processor architectures and constantly evolving network applications, it is currently a challenge to compare or evaluate network processors. In this paper, we present principles of a benchmarking methodology that aim to facilitate the realistic evaluation and comparison of network processors.

Workloads for Programmable Network Interfaces
Network equipment vendors are increasingly incorporating a programmable microprocessor on network interfaces to meet the performance and functionality requirements of present and emerging applications in parallel with market demand. This study identifies some properties of programmable network interface (PNI) workloads and their execution characteristics on modern high-performance microprocessor architectures including aggressive superscalar, fine-grain multithreaded, and simultaneous multithreaded architectures.

INTEL IXP1200 NETWORK PROCESSOR AND DIGITAL VIDEO BROADCASTING
The objective of this Bachelor's Thesis was to explore the Intel IXP1200 network processor, which is the best-known network processor at the moment. Its internal design and programming principles were to be examined. Digital Video Broadcasting was the other research subject, considered from the network processor point of view. Example projects were used with the IXP12EB Ethernet Evaluation Kit, which is an example system with 10/100 Mbps and gigabit Ethernet ports, implemented on top of the Intel IXP1200 network processor. The study shows how to use the evaluation system.

Implementing Address Assurance in the Intel IXP Router
Many desirable network security features require new functionality in routers. Using programmable routers is an attractive approach to testing such security functionality and may serve as an easy path to deploying it. This paper describes an Intel IXP implementation of a protocol designed to help combat IP source address spoofing. It describes the problem of IP spoofing, presents a protocol that helps handle spoofing, and discusses why the Intel IXP 1200 router is an attractive platform for developing the protocol. The paper also presents the design of the IXP implementation and gives performance results that confirm the suitability of the IXP for this task.

Tools and Architecture-related Papers

A Re-configurable Component Model for Programmable Nodes
Recently developed networked services have been demanding architectures that accommodate an increasingly diverse range of application requirements (e.g. mobility, multicast, QoS), as well as system requirements (e.g. specialized processing hardware). This is particularly crucial for architectures of network systems where the lack of extensibility and interoperability has been a constant struggle, hindering the provision of novel services. It is also clear that to achieve such flexibility these systems must support extensibility and re-configurability of the base functionality subsequent to the initial deployment. In this paper we present a component model that addresses these concerns. We also discuss the application of the component model in network processor based programmable networking environments and discuss how our approach can offer a more deployable, flexible and extensible networking infrastructure.

NETKIT: A Software Component-Based Approach to Programmable Networking
While there has already been significant research in support of openness and programmability in networks, this paper argues that there remains a need for generic support for the integrated development, deployment and management of programmable networking software. We further argue that this support should explicitly address the management of run-time reconfiguration of systems, and should be independent of any particular programming paradigm (e.g. active networking or open signaling), programming language, or hardware/operating system platform. In line with these aims, we outline an approach to the structuring of programmable networking software in terms of a ubiquitously applied software component model that can accommodate all levels of a programmable networking system from low-level system support, to in-band packet handling, to active networking execution environments to signaling and coordination.

A Globally-Applied Component Model for Programmable Networking
We argue that currently developed software frameworks for active and programmable networking do not provide a truly generic approach to the development, deployment, and management of services. Furthermore, current systems are typically targeted at a particular level of the programmable networking design space (e.g. at low-level, in-band, packet forwarding; or at high-level signaling) and/or at a particular hardware platform. In addition, most existing approaches, while they may address the initial configuration of systems, neglect dynamic reconfiguration of running systems. In this paper we present a reflective component-based approach that addresses these limitations. We show how our approach is applicable at all system levels, can be applied in heterogeneous hardware environments (specifically, commodity PC-based routers and network processor-based routers), and supports both initial configuration and dynamic reconfiguration. We especially address the latter point; we show the viability of our approach in (re)configuring services on an Intel IXP1200 network processor-based router.

Genesis Kernel on IXP1200
There has been growing interest in network processor technologies that are capable of processing network traffic at line rate. Programming network processors, in order to install new data paths at run time, is a challenge. A good programming model is determined by physical constraints, performance requirements and complexity. Network processors are architecturally different from general-purpose processors. They are designed to maximize packet-processing throughput by exploiting greater hardware parallelism and hiding memory latency. Yet network processors burden the programmers with architectural details and certain inflexibilities.

The Genesis Kernel: A Programming System for Spawning Network Architectures
Currently, the design, deployment and refinement of new network architectures is a manual, ad-hoc and time-consuming process. We present the design, implementation and evaluation of the Genesis Kernel, a programming system that automates the life cycle process for the creation, deployment, management, and architecting of network architectures. We discuss our experiences in building a spawning network that is capable of creating distinct virtual network architectures on-demand.

Programmable Networks
A number of important innovations are creating a paradigm shift in networking leading to higher levels of network programmability. These innovations include the separation between transmission hardware and control software, availability of open programmable network interfaces and the accelerated virtualization of networking infrastructure. The ability to rapidly create, deploy and manage new network services in response to user demands is a key factor driving the programmable networking research community. The goal of programmable networking is to simplify the deployment of network services, leading to networks that explicitly support the process of service creation and deployment. This chapter examines the state-of-the-art in programmable networks.

Integration of Scheduling Real-Time Traffic and Cell Loss Control for ATM Networks
In this paper, new integrated schemes of scheduling real-time traffic and cell loss control in high speed ATM networks are proposed for multiple priorities based on variable queue length thresholds for scheduling and the Partial Buffer Sharing policy for cell loss control. In our schemes, the queues for buffering arriving cells can be constructed in two ways: one individual queue for each user connection, or one physical queue for all user connections. The proposed schemes are considered to provide guaranteed QoS for each connection and cell sequence integrity for virtual channel/path characteristics.

Speed up the Responsiveness of Active Queue Management System
As an enhancement mechanism for end-to-end congestion control, AQM (Active Queue Management) can maintain a smaller queuing delay and higher throughput by purposefully dropping packets at intermediate nodes. Compared with the RED algorithm, the PI (Proportional-Integral) controller for AQM designed by C. Hollot improves stability, but tuning the controller parameters through trial and error is unscientific, and the transient performance of the PI controller is imperfect; for example, the regulating time is too long. To overcome this drawback, this paper proposes a PID (Proportional-Integral-Differential) controller to speed up the responsiveness of the AQM system.
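For readers unfamiliar with how a PID controller maps queue length to a drop probability, the following is a generic textbook discrete-time sketch. It is not the controller design from this paper: the gains, the reference queue length, the sampling interval and the class name are all hypothetical, and no anti-windup or parameter tuning method is included.

```python
class PidAqm:
    """Generic discrete PID controller producing a packet drop probability.

    The error is the difference between the current queue length and a
    reference queue length; the output is clamped to the range [0, 1].
    All parameter values are supplied by the caller and are illustrative.
    """

    def __init__(self, kp, ki, kd, q_ref, dt):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.q_ref = q_ref      # target queue length (packets)
        self.dt = dt            # sampling interval (seconds)
        self.integral = 0.0
        self.prev_error = 0.0

    def drop_probability(self, queue_len):
        error = queue_len - self.q_ref
        self.integral += error * self.dt                 # integral term state
        derivative = (error - self.prev_error) / self.dt  # differential term
        self.prev_error = error
        p = self.kp * error + self.ki * self.integral + self.kd * derivative
        return min(1.0, max(0.0, p))


controller = PidAqm(kp=0.01, ki=0.001, kd=0.002, q_ref=100, dt=0.01)
for q in (100, 140, 180, 150, 110):
    print(q, round(controller.drop_probability(q), 4))
```

The differential term reacts to the rate of change of the queue, which is the component a PI controller lacks and the usual reason a PID design can respond faster to sudden load changes.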

Active SANs: Hardware Support for Integrating Computation and Communication
This paper explores the view that the SAN network infrastructure can be an active computational entity capable of supporting certain classes of data intensive computations effectively during communications. The performance is achieved via the use of Field Programmable Gate Arrays (FPGAs) in the network interfaces (NIs).

The Utility of Feedback in Layered Multicast Congestion Control
Layered multicast is a common approach for dissemination of audio and video in heterogeneous network environments. Layered multicast schemes can be classified into two categories - feedback-based and feedback-free - depending on whether or not the scheme delivers feedback to the sender of the multicast session. Advocates of feedback-based schemes claim that feedback is necessary to match the heterogeneous receiver capabilities efficiently. Supporters of feedback-free schemes believe that feedback introduces significant complexity and that a moderate number of additional layers can balance any benefit that feedback provides.

Addressing Heterogeneity and Scalability in Layered Multicast Congestion Control
In this paper, we design SIM, a protocol that integrates three distinct mechanisms – Selective participation, Intra-group transmission adjustment, and Menu adaptation – to solve the general multicast congestion control problem. We argue that only a solution that includes elements of each mechanism can scale and adapt to heterogeneity in network and receiver characteristics.