Skeleton Parser For DCCP and TIPC

The netsend RX and TX code paths for the DCCP and TIPC protocols have been reimplemented. They were adjusted to fit the new parser engine within netsend. For the small patch, see: svn diff -r 177:178 https://svn.berlios.de/svnroot/repos/netsend/trunk

July 12, 2008 · 1 min · Hagen Paul Pfeifer

CLI options parsing

I reworked the netsend command-line parsing code. It now supports more network-specific user input. See the following netsend help output:

Usage: netsend [OPTIONS] PROTOCOL MODE { COMMAND | HELP }
OPTIONS := { -T FORMAT | -6 | -4 | -n | -d | -r RTTPROBE | -P SCHED-POLICY | -N level -m MEM-ADVISORY | -V[version] | -v[erbose] LEVEL | -h[elp] | -a[ll-options] }
PROTOCOL := { tcp | udp | dccp | tipc | sctp | udplite }
MODE := { receive | transmit }
FORMAT := { human | machine }
RTTPROBE := { 10n,10d,10m,10f }
MEM-ADVISORY := { normal | sequential | random | willneed | dontneed | noreuse }
SCHED-POLICY := { sched_rr | sched_fifo | sched_batch | sched_other } priority
LEVEL := { quitscent | gentle | loudish | stressful }

July 1, 2008 · 1 min · Hagen Paul Pfeifer

rdtscll cleanup netsend

We removed rdtscll support from netsend for three reasons: userland header files don’t define the macro anymore (sanitized header files); its functional value is limited (ACPI modes and the increased time/cycle time); and nobody used it ;-)

June 12, 2008 · 1 min · Hagen Paul Pfeifer

Point Of Interest

Through some additional performance measurements I noticed some interesting plateaus. Two points are of interest: the first at ~200 bytes and the second at 600 bytes (the x scale is denoted in DWORD size (uint32_t)). Also quite interesting: the long time needed to “warm” the cache, TLB, etc. (BTW: we are talking about microseconds.)

June 7, 2008 · 1 min · Hagen Paul Pfeifer

mlockall versus Latency

New measurements for the previously mentioned increase in latency when you lock pages into physical memory. As you can see, the violet line (without memory locking) shows superior latency behavior (lower is better). In addition, there is one negative peak with mlockall(), although there shouldn’t be. After all, mlockall() prevents the worst-case scenario, which is the principal concern for real-time applications. On the other hand, it introduces a small overhead - but why?...

May 4, 2008 · 1 min · Hagen Paul Pfeifer

mlock and SCHED_FIFO

mlock() allows you to lock the (currently or subsequently used) address space into physical memory. It therefore disables paging for the selected memory (or for all memory with mlockall()). Real-time programs often use this ability to pin their memory and avoid unpredictable situations. Think about a welding or laser robot with swapped-out memory … mlockall() is the big brother of mlock(): you specify that all currently mapped (MCL_CURRENT) or future touched (MCL_FUTURE) pages are locked....
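
The two calls described above can be sketched as follows. This is a minimal sketch, not netsend code: the helper names lock_all_memory() and lock_buffer() are made up for illustration, and whether the calls succeed depends on RLIMIT_MEMLOCK and the privileges of the process.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: lock all currently mapped pages and all pages
 * mapped in the future. Returns 0 on success, otherwise the errno value
 * (often EPERM or ENOMEM when RLIMIT_MEMLOCK is too small). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno;
        fprintf(stderr, "mlockall: %s\n", strerror(err));
        return err;
    }
    return 0;
}

/* Hypothetical helper: lock only the pages that cover one buffer.
 * Returns 0 on success, otherwise the errno value. */
int lock_buffer(const void *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return mlock(buf, len) == 0 ? 0 : errno;
}
```

A real-time program would typically call something like lock_all_memory() once at startup, before entering its time-critical loop, and treat a failure as a configuration problem rather than ignore it.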

May 4, 2008 · 1 min · Hagen Paul Pfeifer

recv return constraints

If you call recv() to fetch data, the syscall blocks as long as nothing arrives. But what are the constraints for tcp_recvmsg() to return? This post focuses on the major factors (other aspects such as OOB data, signals and non-blocking I/O are ignored here): the timeout and the amount of data. tcp_recvmsg() first calculates the timeout via sock_rcvtimeo(), which returns noblock ? 0 : sk->sk_rcvtimeo - in the common case therefore sk->sk_rcvtimeo....
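
From userspace, the timeout factor is controlled with the SO_RCVTIMEO socket option. The following is a minimal sketch under stated assumptions: the helper names are made up for illustration, and an AF_UNIX socketpair stands in for a TCP connection to keep the example short, so tcp_recvmsg() itself is not exercised here. The same option can be set on a TCP socket.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the receive timeout of a socket in milliseconds. */
int set_recv_timeout_ms(int fd, long ms)
{
    struct timeval tv;

    assert(ms >= 0);
    tv.tv_sec = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Hypothetical demo: call a blocking recv() on a socket whose peer never
 * sends anything. Returns 0 if data arrived, otherwise the errno value
 * observed (EAGAIN/EWOULDBLOCK once the timeout expires). */
int recv_on_idle_socket(long timeout_ms)
{
    int sv[2];
    char buf[16];
    int result = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return errno;

    if (set_recv_timeout_ms(sv[0], timeout_ms) != 0)
        result = errno;
    else if (recv(sv[0], buf, sizeof(buf), 0) < 0)
        result = errno;   /* save errno before close() can change it */

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Without the setsockopt() call, the same recv() would block until the peer sends data or closes the connection.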

May 4, 2008 · 1 min · Hagen Paul Pfeifer

MSG_WAITALL and recv

The recv() function has a rarely used flag: MSG_WAITALL. It tells the syscall not to return before length bytes have been read. The problem is that normally nobody knows how much data the peer will send. So if you rely on a particular amount of data and the data isn’t sent, the call blocks indefinitely! On the other hand, a programmer must handle this kind of failure anyway, because a simple read() on a socket can also block forever....
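
A small sketch of the flag in use, assuming an AF_UNIX socketpair as a stand-in for a real peer. The names recv_exact() and waitall_demo() are made up for illustration. The peer closes its end after sending, so the demo cannot block forever; it also shows that MSG_WAITALL can still return fewer bytes than requested when the stream ends early.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for exactly len bytes. The call can
 * still return less on end-of-stream, on an error, or when a signal
 * interrupts it, so the return value must always be checked. */
ssize_t recv_exact(int fd, void *buf, size_t len)
{
    assert(buf != NULL);
    return recv(fd, buf, len, MSG_WAITALL);
}

/* Hypothetical demo: the peer sends peer_bytes (at most 8) single bytes and
 * closes its end; the receiver requests 8 bytes with MSG_WAITALL.
 * Returns the number of bytes received, or -1 on a setup error. */
ssize_t waitall_demo(size_t peer_bytes)
{
    static const char payload[8] = { 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h' };
    int sv[2];
    char buf[8];
    ssize_t n;
    size_t i;

    if (peer_bytes > sizeof(payload))
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    for (i = 0; i < peer_bytes; i++) {
        if (send(sv[1], &payload[i], 1, 0) != 1) {
            close(sv[0]);
            close(sv[1]);
            return -1;
        }
    }
    close(sv[1]);                       /* peer is done: end of stream */

    n = recv_exact(sv[0], buf, sizeof(buf));
    close(sv[0]);
    return n;
}
```

If the peer kept its end open after sending only four bytes, the recv() call in this sketch would wait for the remaining four, which is exactly the blocking behavior described above.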

May 2, 2008 · 2 min · Hagen Paul Pfeifer

SIMD++, SSE4 and SIMD16

The SSE4 programming reference is out - an opportunity to study the improvements. There are 47 new instructions (plus 7 additional ones for the Nehalem microarchitecture), which mainly focus on multimedia acceleration: MPSADBW (sum of absolute differences), PHMINPOSUW (minimum search: find the minimum uint16_t among eight elements; if you invert the source you get a fast max() ;-), ROUND (round floating-point types) and other instructions. There are also dot product calculation, a load hint instruction (MOVNTDQA) to store aligned data in a small data-set, packed integer format conversions (conversion to wider data types), IEEE 754 compliance operations....

April 26, 2008 · 1 min · Hagen Paul Pfeifer

GCC Optimization Aliasing

In talks I have often noted that, for the most part, pointer casts are dumb and futile. The C standard states that accessing an object through a pointer cast to an incompatible type is “undefined behavior”: float a = 23.f; uint32_t b = *(uint32_t *)&a; Depending on the gcc version this leads to different results in b. Sometimes a is interpreted as a uint32_t (probably what you want) - sometimes the result is 0;...
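
One well-defined way to get the bit pattern of the float is to copy its bytes with memcpy() instead of casting the pointer. This is a minimal sketch, not necessarily the approach the full post recommends: the function name float_bits() is made up for illustration, and the value noted in the comment assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the object representation of a float as a
 * uint32_t. memcpy() is allowed to copy between unrelated types, so no
 * aliasing rule is violated; optimizing compilers usually turn the call
 * into a plain register move. */
uint32_t float_bits(float f)
{
    uint32_t u;

    assert(sizeof(u) == sizeof(f));   /* assumption: float is 32 bits wide */
    memcpy(&u, &f, sizeof(u));
    return u;
}

/* With an IEEE 754 single-precision float, float_bits(23.f) is 0x41B80000. */
```

Compiling the original cast with -fno-strict-aliasing is another way to get the expected result from gcc, but that relies on a compiler option rather than on the language.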

April 24, 2008 · 2 min · Hagen Paul Pfeifer