Slashdot Mirror


User: thorpej

thorpej's activity in the archive.

Stories: 0
Comments: 11

Comments · 11

  1. Re:Are they girls at these parties? on NetBSD Celebrates Its 10th Anniversary · · Score: 1

    Well... the party in San Francisco is being held at the 21st Amendment (a nice restaurant and brewery down by Pac Bell Park). We're going to have some great food, some great beer, and eat a NetBSD birthday cake.

    Basically, a group of friends who have known and worked with each other for a decade now are going to get together and have a great time.

    It's been quite a past 10 years ... and I personally am looking forward to hacking NetBSD for the next 10!

  2. Re:Linux has better zerocopy TCP, and here's why on Zero-Copy TCP and UDP Output in NetBSD · · Score: 1

    Regarding sendfile(2)'s semantics ... thanks for correcting me on that. (I must admit it has been a while since I looked at specific sendfile(2) implementations.)

    Regarding Tx DMA alignment for cheap PCI cards... I have written a fair number of Ethernet drivers over the past several years, and I can't think of any chip that required any more than 4-byte alignment of the DMA buffer. The ones that spring to mind are the RealTek 81x9, Xircom X3201, and the VIA "Rhine" series... oh, and e.g. the Alchemy Au1000 built-in Ethernet MAC also requires 4-byte alignment.

    But by far the common case is that the chip can DMA from an arbitrary byte location in memory. I don't really consider this a "special" feature. Rather, I consider chips that can't do this to be "crippled".

    (Note, it IS fairly common for Ethernet chips to have stupid limitations on the *receive* side, specifically 4-byte alignment of the Rx buffer, which means the IP header ends up misaligned after the 14-byte Ethernet header. This is truly annoying, since software has to copy data to fix it up.)
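
    To make the arithmetic concrete, here is a minimal C sketch, not taken from any driver. The function and macro names are made up for illustration, and the 2-byte pad is an assumption about how software might lay out the fix-up copy: with the frame starting on a 4-byte boundary the IP header sits at offset 14, which is 2 bytes past a boundary, so copying the frame to offset 2 puts the IP header at offset 16.

```c
#include <stddef.h>
#include <string.h>

#define ETHER_HEADER_BYTES 14  /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define IP_ALIGN_PAD        2  /* assumed pad: 2 + 14 = 16, a multiple of 4 */

/* How many bytes past a 4-byte boundary the IP header lands when the
 * Ethernet frame starts at byte offset frame_start of an aligned buffer. */
size_t ip_header_misalignment(size_t frame_start)
{
    return (frame_start + ETHER_HEADER_BYTES) % 4;
}

/* Hypothetical software fix-up: copy the received frame into dst at
 * offset IP_ALIGN_PAD so the IP header becomes 4-byte aligned.
 * dst must be 4-byte aligned and hold at least len + IP_ALIGN_PAD bytes.
 * Returns a pointer to the (now aligned) IP header inside dst. */
unsigned char *realign_frame(unsigned char *dst,
                             const unsigned char *src, size_t len)
{
    memcpy(dst + IP_ALIGN_PAD, src, len);
    return dst + IP_ALIGN_PAD + ETHER_HEADER_BYTES;
}
```

    With these definitions, ip_header_misalignment(0) is 2 and ip_header_misalignment(2) is 0. The memcpy() in realign_frame() is exactly the extra copy that the Rx alignment restriction forces on software.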

    Regarding the 3 things you have to do to do zero-copy. NetBSD's virtual memory system has a generic framework for handling "loaning" of pages from one VM object to another. The uvm_loan facility is currently used by pipe(2) and socket writes (new with my changes). That said:

    1. Yes, we support full correctness of the written data. If another thread (or the same thread) touches the page before the kernel is finished with it, the loan is considered broken, and a copy-on-write fault is taken to resolve the situation.

    2. This is really the same question as (1). The pages aren't marked as "write in progress", per se, but are marked as "loaned" (loans can be used for things other than just outbound I/O, although that is the most obvious use of the facility).

    3. Yes, there is TLB manipulation traffic. This is why you pick a threshold for using the loaning facility. Doing it for small writes would be stupid, since the copy would be less overhead than the TLB traffic. That said, even in an MP system, it's not too bad, since NetBSD uses explicit barrier operations for low-level VM operations (so that the machine-dependent "pmap" module can coalesce TLB operations if it would be beneficial to do so). In any case, the expense of TLB shootdown traffic is largely an implementation issue.
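
    The threshold decision in (3) might look roughly like the following C sketch. Everything here is hypothetical: the names, the enum, and passing the cutoff as a parameter are illustrations rather than NetBSD's actual code, and the comment above does not say what the real threshold is.

```c
#include <stddef.h>

/* The two ways a socket write could move user data (illustrative names). */
enum write_path {
    PATH_COPY,  /* copy the bytes into kernel buffers */
    PATH_LOAN   /* loan the user's pages and pay for the TLB traffic */
};

/* Pick a path for a write of len bytes. loan_threshold is an assumed
 * tunable: below it the copy is taken to be cheaper than the TLB work. */
enum write_path choose_write_path(size_t len, size_t loan_threshold)
{
    return (len >= loan_threshold) ? PATH_LOAN : PATH_COPY;
}
```

    With an assumed cutoff of 4096 bytes, a 100-byte write takes the copy path and a 64 KB write takes the loan path.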

  3. Re:Overlapped IO / Win32 on Zero-Copy TCP and UDP Output in NetBSD · · Score: 1

    POSIX defines an asynchronous I/O interface, "aio", and it is implemented in several Unix variants. It pretty much has the semantics you describe.
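
    As a rough illustration of the interface, the sketch below queues a write with aio_write(), waits for it with aio_suspend(), and collects the result with aio_error() and aio_return(), all of which are standard POSIX calls. The wrapper and demo function names and the /tmp path are assumptions made for this example; on some systems the program must also be linked with -lrt.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Queue an asynchronous write of len bytes at the given file offset,
 * then block until it finishes. Returns the byte count, or -1 with
 * errno set. (Hypothetical wrapper, not part of POSIX.) */
ssize_t async_write_and_wait(int fd, const void *buf, size_t len, off_t offset)
{
    struct aiocb cb;
    const struct aiocb *pending[1] = { &cb };

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = (void *)buf;            /* aio_write() only reads from it */
    cb.aio_nbytes = len;
    cb.aio_offset = offset;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;

    if (aio_write(&cb) != 0)
        return -1;

    /* The caller could do unrelated work here while the write proceeds. */

    while (aio_error(&cb) == EINPROGRESS)
        aio_suspend(pending, 1, NULL);   /* retried if interrupted */

    int err = aio_error(&cb);
    ssize_t n = aio_return(&cb);         /* must be called exactly once */
    if (err != 0) {
        errno = err;
        return -1;
    }
    return n;
}

/* Self-check: write a message asynchronously to a temporary file and
 * read it back synchronously. Returns 0 on success, -1 on failure. */
int demo_async_write(void)
{
    static const char msg[] = "hello, aio";
    char back[sizeof(msg)];
    char path[] = "/tmp/aio_demo_XXXXXX";
    int ok = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file lives until fd is closed */

    if (async_write_and_wait(fd, msg, sizeof(msg), 0) == (ssize_t)sizeof(msg) &&
        pread(fd, back, sizeof(back), 0) == (ssize_t)sizeof(back) &&
        memcmp(msg, back, sizeof(msg)) == 0)
        ok = 0;

    close(fd);
    return ok;
}
```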

  4. Re:what about zero copy on receive? on Zero-Copy TCP and UDP Output in NetBSD · · Score: 2, Insightful

    A zero-copy receive path is a significantly harder problem to solve.

    Basically, devices DMA into host memory. These buffers must be preallocated, since you never know when a packet might arrive, and when it does, it needs to go into memory immediately, since the temporary storage in the Ethernet MAC itself is quite small.

    When the data arrives, we still don't know which application it is for. We have to parse headers, etc. to determine that. And once we do, we have a buffer that is:

    1. Not page-aligned.
    2. Not page-rounded.

    This makes it very difficult to "page flip" the buffer into userspace.
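
    The two conditions can be written down as a small C check. This is only a sketch: the function names are hypothetical, and the page size is passed in (and assumed to be a power of two) rather than queried from the system.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr lies exactly on a page boundary. */
bool is_page_aligned(uintptr_t addr, size_t page_size)
{
    return (addr & (page_size - 1)) == 0;
}

/* True if len is a non-zero whole number of pages. */
bool is_page_rounded(size_t len, size_t page_size)
{
    return len != 0 && (len & (page_size - 1)) == 0;
}

/* A buffer is only a candidate for page flipping if it both starts on
 * a page boundary and covers whole pages. */
bool can_page_flip(uintptr_t addr, size_t len, size_t page_size)
{
    return is_page_aligned(addr, page_size) && is_page_rounded(len, page_size);
}
```

    For example, with 4096-byte pages, a 1500-byte packet at address 0x1000e fails both tests, while an 8192-byte buffer at 0x10000 passes.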

    The Trapeze/IP project at Duke implemented a zero-copy receive for FreeBSD, but it required special modifications to the firmware on the Alteon ACEnic Gig-E interfaces they were using. Those interfaces aren't even manufactured anymore, and there are essentially no Gig-E interfaces on the market today which allow you to hack the firmware in such a way. So, their solution pretty much can't be used unless you have full control over the hardware that's going into your device (i.e. it's pretty much of use only to people building embedded systems from scratch).

    Therefore, in the absence of another solution, you are forced to perform at least one copy on the receive side: from the interface's receive buffer into a page-aligned/page-sized buffer in the socket. Once you have that, however, you *can* page-flip into user space, and since the copy across the protection boundary is usually more expensive than a copy within the same address space, there's still some benefit that can be realized.

  5. Re:Linux has better zerocopy TCP, and here's why on Zero-Copy TCP and UDP Output in NetBSD · · Score: 1

    Yes, Linux does have sendfile(2) while NetBSD does not. No argument there. However, sendfile(2) has some issues:

    1. It only works for sending complete files, and I seem to recall that it closes the file at the end of the transaction. This doesn't work for e.g. Samba servers, which need to send chunks of files.

    2. sendfile(2) doesn't work for data sourced from something other than a file. Consider a database server which maintains a memory-resident cache. Or consider piping output from a command, say, dump(8), over the network. Or consider the case of an iSCSI target device, which might have mmap'd a chunk of disk/file blocks to serve on-demand.

    My change works for those 2 (important!) cases above.
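
    From the application's side, both cases are just ordinary write(2) calls on a socket. The sketch below sends part of an mmap'd file that way; the function names, the /tmp path, and the use of a local socketpair(2) in the self-check are assumptions for illustration. Whether the kernel actually loans the pages instead of copying them depends on the kernel and on the size of the write, so nothing in this code guarantees zero-copy.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send len bytes of the file open on file_fd, starting at byte offset
 * off, out through sock_fd using mmap(2) plus write(2). off must be a
 * multiple of the page size. Returns bytes sent, or -1 on error. */
ssize_t send_file_chunk(int sock_fd, int file_fd, off_t off, size_t len)
{
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, file_fd, off);
    if (map == MAP_FAILED)
        return -1;

    size_t done = 0;
    while (done < len) {
        ssize_t n = write(sock_fd, (const char *)map + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;                /* interrupted: try again */
            munmap(map, len);
            return -1;
        }
        done += (size_t)n;
    }
    munmap(map, len);
    return (ssize_t)done;
}

/* Self-check: build a two-page temporary file ('A' page, then 'B' page),
 * send 1024 bytes of the second page over a local socket pair, and
 * verify what arrives. Returns 0 on success, -1 on failure. */
int demo_send_chunk(void)
{
    enum { CHUNK = 1024 };
    char path[] = "/tmp/chunk_demo_XXXXXX";
    char got[CHUNK];
    int sv[2];
    int ok = -1;
    long page = sysconf(_SC_PAGESIZE);

    int file_fd = mkstemp(path);
    if (file_fd < 0)
        return -1;
    unlink(path);                        /* file lives until fd is closed */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        close(file_fd);
        return -1;
    }

    char *fill = malloc((size_t)page);
    if (fill != NULL) {
        memset(fill, 'A', (size_t)page);
        ssize_t w1 = write(file_fd, fill, (size_t)page);
        memset(fill, 'B', (size_t)page);
        ssize_t w2 = write(file_fd, fill, (size_t)page);
        free(fill);

        if (w1 == page && w2 == page &&
            send_file_chunk(sv[0], file_fd, (off_t)page, CHUNK) == CHUNK) {
            size_t have = 0;
            ssize_t n;
            while (have < CHUNK &&
                   (n = read(sv[1], got + have, CHUNK - have)) > 0)
                have += (size_t)n;
            if (have == CHUNK && got[0] == 'B' && got[CHUNK - 1] == 'B')
                ok = 0;
        }
    }

    close(sv[0]);
    close(sv[1]);
    close(file_fd);
    return ok;
}
```

    Note that send_file_chunk() sends an arbitrary chunk and leaves the file open, and the same write(2) loop would work unchanged on a buffer from a memory-resident cache.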

    That is not to say that NetBSD won't get a sendfile(2)-like mechanism in the future (it will probably get a splice(2) system call, which allows you to hook together 2 arbitrary file descriptors, one source, one sink -- essentially a generalization of sendfile(2)).

    It's also worth noting that the NetBSD zero-copy TCP/UDP transmit path doesn't require anything special from a device driver; the driver simply needs to be able to DMA from arbitrary memory addresses. And even if a device can't do this, you have still reduced memory bandwidth consumption by eliminating the copy from user space to kernel space.

  6. Mr Kettle, meet Mr Pot! on IP Theft in the Linux Kernel · · Score: 4, Flamebait

    Check out the very first revisions of the Linux compatibility module in FreeBSD. It looks quite a lot like the NetBSD Linux compatibility module of the same vintage, which was written by Frank van der Linden and committed to the NetBSD source tree (which was the first public release of that code) -- yet all the files say Soren Schmidt at the top. Amazing!

  7. Re:Erm... on NetBSD Ported to AMD x86-64 (Sledgehammer) · · Score: 2

    It's quite common for system software bring-up to happen on simulators before the chips come out. In a lot of ways, it's a much more desirable approach than *having* to deal with buggy, early-rev. hardware.

  8. Re:Alpha != Athlon on NetBSD/Alpha goes multiprocessor · · Score: 2

    NetBSD/i386 MP does work, but is not in the main source trunk yet -- it's on a development branch, waiting for some things to be ready for prime-time before it's merged. MP support for NetBSD/i386 will be in the NetBSD 1.6 release. The i386 port had a bit more of a challenge than I did on the Alpha port -- interrupt handling for MP systems is *totally* different than UP systems on the IA-32.

  9. Re:I really don't think you meant multiuser on NetBSD/Alpha goes multiprocessor · · Score: 5

    Actually, it is just what I meant. While uniprocessor kernels for NetBSD/alpha have run well for a number of years, I had only been able to get multiprocessor kernels into single-user mode previously. If you have never done low-level debugging of a multiprocessor capable kernel, then you probably don't know just how big of a milestone this is.

  10. Re:Name suggestion: FRESH on SSH Claims Trademark Infringement by OpenSSH · · Score: 1

    With regard to the suggested alternative name "FRESH" for "OpenSSH", I would like to point out that there is already another free SSH protocol implementation called "FreSSH". See http://www.fressh.org/.

  11. Re:No, he doesn't have to do so on SSH Claims Trademark Infringement by OpenSSH · · Score: 1

    With regard to the suggested name "FreSH" as an alternative to the name "OpenSSH", I would like to point out that there is already another free SSH protocol implementation called "FreSSH". See http://www.fressh.org/.