A modest proposal to help speed up & scale up the Linux networking stack
Linux has some of the fastest networking code available; in side-by-side tests it significantly outperforms commercial systems like Windows XP or Solaris. But system performance is increasingly tied to network performance. Low-speed peripherals like printers long ago moved off the I/O bus and onto an Ethernet. Today it looks like primary storage is about to make the same jump -- 10Gb Ethernet is likely to be faster, cheaper and far more flexible than device-specific interconnects like SATA. And the southbridge isn't the only PC part being replaced with a network. To cope with steadily increasing memory bandwidth but constant memory latency, CPU designers are being forced to evolve from simple uniprocessors to multi-core multiprocessors. Techniques like cluster and grid computing can make effective use of these new compute resources, but they require that the northbridge, and everything else on the die, be viewed as a network.

If networking is to play an increasingly central role in kernel operation, is the current kernel network architecture up to the job? We have doubts. Although the best of its breed, the Linux kernel networking architecture closely follows a design first laid out in the early 80s (as do those of Windows, Solaris, MacOS, ...). While this design met the needs of its day, and many days beyond, it has some fundamental flaws. By re-architecting a small part of the networking code -- the network device drivers and the network buffer model -- it's possible to get dramatic increases in performance (we have measured 2x to 4x improvements in network-intensive grid and web application benchmarks) and in system scalability (e.g., the new architecture has *no* locks in any part of the packet send/receive path). It is easy to convert existing drivers to the new model, and the result is both simpler and smaller (e.g., a 30% reduction in size for the 2.6.11 e1000 driver).
And since the new architecture is completely backward compatible, conversion can be done incrementally. This talk will describe the new architecture and the implementation we currently have running in 2.4 and 2.6 kernels. We'll show detailed performance studies of the existing and new code, and we'll talk about how it could be deployed and some of the benefits it could bring to both applications and kernel internals.
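The abstract doesn't spell out how a send/receive path can run with no locks at all, so here is a minimal, hypothetical sketch of the standard technique that makes it possible: a single-producer/single-consumer ring buffer, where the producer owns the head index and the consumer owns the tail, so ordinary atomic loads and stores (no spinlocks, no mutexes) are enough. All names here (`struct channel`, `channel_put`, `channel_get`, `RING_SLOTS`) are invented for illustration and are not taken from the talk's actual implementation.

```c
/* Illustrative lock-free single-producer/single-consumer ring.
 * Assumption: exactly one producer thread calls channel_put() and
 * exactly one consumer thread calls channel_get(). */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SLOTS 256u  /* power of two, so indices wrap with a mask */

struct channel {
    _Atomic uint32_t head;   /* written only by the producer */
    _Atomic uint32_t tail;   /* written only by the consumer */
    void *slot[RING_SLOTS];  /* packet (or buffer) pointers */
};

/* Producer side: enqueue a packet pointer. Returns 1 on success,
 * 0 if the ring is full (caller can drop or apply back-pressure). */
static int channel_put(struct channel *ch, void *pkt)
{
    uint32_t head = atomic_load_explicit(&ch->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&ch->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS)
        return 0;
    ch->slot[head & (RING_SLOTS - 1)] = pkt;
    /* release: the slot write is visible before the new head value */
    atomic_store_explicit(&ch->head, head + 1, memory_order_release);
    return 1;
}

/* Consumer side: dequeue the oldest packet, or NULL if empty. */
static void *channel_get(struct channel *ch)
{
    uint32_t tail = atomic_load_explicit(&ch->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&ch->head, memory_order_acquire);
    if (tail == head)
        return NULL;
    void *pkt = ch->slot[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&ch->tail, tail + 1, memory_order_release);
    return pkt;
}
```

The design choice worth noting is the ownership split: because each index has exactly one writer, neither side ever needs to take a lock, which is how a per-CPU or per-socket channel can keep the whole packet path lock-free.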
I've been hacking on TCP/IP for far longer than I care to remember. I spent 25 years at Lawrence Berkeley National Laboratory as head of the network research group, then two years at Cisco as chief scientist, before leaving to co-found Packet Design LLC. I'm currently chief scientist at two Packet Design spinouts, Packet Design Inc. and Precision I/O.

Coauthor: Bob Felderman, CTO, Precision I/O Inc.