BPF Analysis II

| Filter Rule | Processing Time |
| --- | --- |
| no filter | 469.231535129662 us |
| “icmp” | 524.893505343705 us |
| “icmp and host 127.0.0.1” | 367.231761850006 us |
| “icmp and host 127.0.0.1 and ‘ip[6] = 64’” | 598.401917125821 us |
| “icmp and host 127.0.0.1 and ‘ip[6] = 64’ and ‘ip[2:2] > 1’” | 649.767235335359 us |

Processing delay sampling (needs smoothing, of course): The generated OPCODES:...
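To make the most complex rule above more concrete, here is a minimal Python sketch of what its predicates test on a raw IPv4 header. This is not the generated BPF program; the helper names (@ip_field@, @matches@) and the sample header are made up for illustration, and the sketch assumes it is handed the IPv4 header directly, without a link-layer header in front.

```python
import struct

LOCALHOST = bytes([127, 0, 0, 1])

def ip_field(header: bytes, offset: int, size: int = 1) -> int:
    """Read `size` bytes at `offset` of the IP header, like ip[offset:size]."""
    return int.from_bytes(header[offset:offset + size], "big")

def matches(header: bytes) -> bool:
    """Hypothetical stand-in for:
    icmp and host 127.0.0.1 and 'ip[6] = 64' and 'ip[2:2] > 1'
    """
    is_icmp = ip_field(header, 9) == 1                # protocol field, 1 = ICMP
    is_host = LOCALHOST in (header[12:16], header[16:20])  # source or destination
    df_only = ip_field(header, 6) == 64               # flags byte is exactly 0x40 (DF)
    has_len = ip_field(header, 2, 2) > 1              # total length field
    return is_icmp and is_host and df_only and has_len

# Sample 20-byte IPv4 header: version/IHL, TOS, total length, id,
# flags/fragment offset, TTL, protocol, checksum (left 0), src, dst.
sample = struct.pack("!BBHHHBBH4s4s",
                     0x45, 0, 84, 0, 0x4000, 64, 1, 0, LOCALHOST, LOCALHOST)
no_df = struct.pack("!BBHHHBBH4s4s",
                    0x45, 0, 84, 0, 0x0000, 64, 1, 0, LOCALHOST, LOCALHOST)

print(matches(sample), matches(no_df))
```

Each additional predicate is one more load and compare on the packet, so a longer generated instruction sequence is to be expected as the rule grows.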

June 12, 2010 · 2 min · Hagen Paul Pfeifer

Urgent Pointer Standard and Real World Implementation

Stumbling over the Urgent Pointer code in @tcp_recvmsg()@ led me to read some specs. Urgent data allows the sender to signal the receiver that “urgent data” of some form has been placed into the packet. The receiver, on the other hand, is itself responsible for handling this condition. If it is not handled, the data is silently ignored. Therefore, it must be negotiated at a higher level that urgent data is transmitted and properly handled at the receiver side....

June 12, 2010 · 2 min · Hagen Paul Pfeifer

Skipping Adobe Flash

The posted flash advisory list is really long, so I tried to update the player. But unfortunately Adobe skipped their 64 bit “support”:http://labs.adobe.com/technologies/flashplayer10/64bit.html (what a piece of software is this anyway - in 2010?), which actually means I had no chance to run flash any more! 32 bit compat mode - no thank you; buggy software - no thank you. I installed the dev version of “firefox”:http://nightly.mozilla.org/webm/ with webm support, which works great....

June 12, 2010 · 1 min · Hagen Paul Pfeifer

BPF Optimizer

I started to analyse the BPF optimizer. Several tcpdump options helped me: “-d” to dump the generated instructions and “-O” to disable the packet-matching code optimizer (normally only useful if you suspect a bug in the optimizer). With my modified kernel (I will post the kernel patch after I have reworked the tracing ring buffer implementation) and these tcpdump options, we are now able to analyse exactly how the optimizer works....

June 10, 2010 · 2 min · Hagen Paul Pfeifer

BPF Filter Complexity versus Execution Time

Today, after some time-killing IETF debates, I started to analyze the in-kernel BPF filter execution time for different BPF filters, starting with no filter, which is translated into a single @BPF_RET|BPF_K@ OPCODE, up to some more complex instructions. The average execution time lies somewhere around 300ns for no filter and somewhere above 350ns for a simple ICMP filter with 17 CPU instructions on my x86_64 (excluding call overhead). The next image illustrates this (statistically sampled data):

June 10, 2010 · 1 min · Hagen Paul Pfeifer

IETF TCPM Historicize

Lars Eggert posted a Draft today in which RFC1106, RFC1110, RFC1145, RFC1146, RFC1263, RFC1379, RFC1644 and RFC1693 are declared historic documents. But RFC1146 - TCP Alternate Checksum Options - was not superseded by a new standard, nor is it defective in any sense. These two arguments are normally the reasons why an RFC is declared historic. In my eyes neither applies to this standard. In the debate Lars argued that the already assigned code points do remain assigned....

June 9, 2010 · 2 min · Hagen Paul Pfeifer

Network Stack Hash Table Memory Consumption

I stumbled across the default hash size for the different hash tables used in the network subsystem. Hash tables are used as an efficient container in different network subsystems - compared to, let's say, list containers. The optimal complexity of a hash table is O(1); this means access in constant time, no matter how many entries are in the container. This optimum is a theoretical value and requires a hash bucket fill level of at most ~60 percent as well as good hashing algorithms....
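As a rough illustration of the fill-level argument, the following Python sketch estimates how many buckets are needed to stay at or below ~60 percent fill and how much memory the bucket array alone would take. The function names are hypothetical, and two things are assumptions not taken from the text: that the bucket count is rounded up to a power of two, and that each bucket head is one 8-byte pointer (as on x86_64). Actual kernel defaults are sized differently.

```python
import math

def buckets_for(entries: int, max_fill: float = 0.60) -> int:
    """Smallest power-of-two bucket count keeping the fill level <= max_fill."""
    needed = max(1, math.ceil(entries / max_fill))
    return 1 << (needed - 1).bit_length()   # round up to a power of two (assumption)

def fill_level(entries: int, buckets: int) -> float:
    """Average number of entries per bucket."""
    return entries / buckets

def table_bytes(buckets: int, head_size: int = 8) -> int:
    """Memory for the bucket array only; head_size = 8 is an assumption."""
    return buckets * head_size

entries = 1000
buckets = buckets_for(entries)
print(buckets, round(fill_level(entries, buckets), 3), table_bytes(buckets))
```

The trade-off is visible directly: a lower fill level keeps lookups close to constant time, but the bucket array is paid for even when the table is nearly empty.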

June 8, 2010 · 2 min · Hagen Paul Pfeifer

BPF Opcode Analysis

The following paragraphs explain the correlation between filter rules provided by any PCAP based filter program, the resulting intermediate OPCODE representation and the kernel side interpretation. The most brilliant logic within PCAP is not the sniffing functionality nor the dump file format; it is rather the optimization logic: eliminating useless calculations, generating efficient instructions, skipping possible IPv4 options, jumping over IPv6 extension headers and so on. At the same time the optimizer must be able to eliminate useless/duplicate expressions like “IP AND IP” (this is the most trivial example, but it can be quite complex)....

June 8, 2010 · 2 min · Hagen Paul Pfeifer

Back In Munich

Three wonderful days in Paris/France: nice people, great food and gorgeous historic sites. A really lovely place on earth!

June 6, 2010 · 1 min · Hagen Paul Pfeifer

TCP Minimum RTO

Actually $user reported a regression: he measured an RTO of less than 200ms via tcpdump. Normally this is not possible because Linux bounds the minimum to 200ms. So let's see what the actual trace offers. “RFC 2988”:http://tools.ietf.org/html/rfc2988 specifies that the minimum TCP Retransmission Timeout (RTO) SHOULD be 1 second. The relatively large value was selected to keep TCP conservative and avoid spurious retransmissions. The RFC was written back in 2000 and things have changed....
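To see how much the lower bound matters, here is a minimal Python sketch of the RTO calculation as described in RFC 2988 (SRTT and RTTVAR with alpha = 1/8 and beta = 1/4), with the minimum as a parameter. The class name and the 1 ms clock granularity are assumptions for illustration; this is the RFC formula, not the Linux estimator, which differs in detail.

```python
class RtoEstimator:
    """RTO per RFC 2988, section 2, with a configurable lower bound."""
    ALPHA = 1 / 8
    BETA = 1 / 4

    def __init__(self, min_rto: float, granularity: float = 0.001):
        self.min_rto = min_rto          # 1.0 per the RFC, 0.2 as Linux bounds it
        self.granularity = granularity  # clock granularity G (assumed 1 ms)
        self.srtt = None
        self.rttvar = None

    def update(self, rtt: float) -> float:
        """Feed one RTT sample (seconds) and return the resulting RTO."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first, using the old SRTT
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.granularity, 4 * self.rttvar)
        return max(rto, self.min_rto)

# A 10 ms path: the computed value is well below either bound,
# so the minimum decides the timeout.
print(RtoEstimator(min_rto=1.0).update(0.010))
print(RtoEstimator(min_rto=0.2).update(0.010))
```

On low-latency paths the computed value is dominated by the lower bound, which is why the choice between 1 second and 200ms makes such a difference in practice.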

June 4, 2010 · 2 min · Hagen Paul Pfeifer