Slashdot Mirror


User: TTK+Ciar

TTK+Ciar's activity in the archive.

Stories
0
Comments
133

Comments · 133

  1. Re:GPLv3 should bing a provision on fork limitatio on Free Software Foundation Begins Rewriting the GPL · · Score: 1

    Why do people think forks are bad?
    When two people have two different goals why should we try and force them to work together?

    Hobbesian bias; division of authority and focus is considered "bad" by those who buy into Hobbesian philosophy. Hobbes posited that having everyone work in the same direction, under the same governing authority was best, and anything that detracted from that unity was bad.

    Hobbesian bias is extremely popular, and explains a great many seemingly nonsensical attitudes in politics, IT, religion, et al.

    Personally, though, I agree with you -- different people developing a project in slightly different directions to more-optimally satisfy significantly different needs can be desirable. Hobbesian bias in such situations is little more than obstructionism.

    -- TTK

  2. Re:Great subject.... on Introverts Have More Brain Activity? · · Score: 1

    In some ways I agree with you.

    I agree with your assessment of our representatives. I agree also that the average joe has dubious tastes and marginal intelligence. Having either of these doofs anywhere near my tax money is totally unfair to me, too.

    If one accepts the tenets of democratic government, however, then having average joes in charge (or, perhaps more perfectly, having a population of politicians in charge in which intelligence is represented in the same proportion as it exists in the governed population) is completely fair to the average joe majority. At least in theory, the resulting government is similar to what we'd get if every citizen had the opportunity to examine and vote on every aspect of governance.

    Now, even so, it's unfair to the average joes of the country because their representatives are pampered, privileged playboys, and therefore unlike them. But making them highly intelligent pampered, privileged playboys would only make those representatives even less like them.

    If you are wanting to ditch the notion of representative government (which is what it sounds like; correct me if I'm wrong), then getting intelligent, competent, and benign people to run the country starts to sound like a better idea. The main problem I have with it is that such systems seem to always degenerate into governments as bad as (or worse than) what we have now. At least by voting in new people every four years we keep things from getting too bad .. which is a damn scary thought, considering how bad things have already gotten.

    In short, yes, representative government sucks, and by your own arguments direct democratic government also sucks, but do you have a better system to propose? I'm personally a proponent of self-government -- social and financial organization in the absence of centralized authorities (ie, anarchy), but it's terribly unpopular. Most citizens desire a government that will take care of things for them and give them "free" goods and services. The democratic republic government seems to be better for that than the alternatives we've seen tried.

    All I'm saying is that if we're determined to stick with representative government, then making the representatives much smarter than the people they're ostensibly representing is a step in the wrong direction.

    -- TTK

  3. Re:Great subject.... on Introverts Have More Brain Activity? · · Score: 1

    There should be an IQ requirement to hold office.

    I'm not so sure that's a good idea.

    A democratic republic is a government by the people, for the people, and the people understandably want (and deserve) representatives who accurately represent their attitudes, their values, and their perspectives.

    The average joe's IQ is, by definition, 100 .. but how can a politician with an IQ of, say, 140 possibly share the same perspectives and values as that average joe? The more intelligent mind finds more patterns, makes different connections, perceives different meanings in the world around them. People who are significantly more intelligent effectively live in a different world from everyone else.

    People in a democratic republic aren't supposed to be governed by politicians who live in a different world. The whole idea is that the politicians are drawn from the same world, same country, and in some cases, same state, same county, or even same city. This results in an approximation of a government run by the same people that it governs, just with a lot fewer hands to count when a vote comes up. The people need to relate to their representatives. They need to identify with them. They need to feel a kinship.

    Now, maybe they only feel that kinship because they're too dumb to realize that the rich, pampered, millionaire aristocrats from which we pull our candidates have as little in common with the average joe as java does with javascript, but that's a different problem, and not one which can be solved by electing intelligent rich, pampered, millionaire aristocrats.

    High intelligence makes for better doctors, engineers, and scientists .. but, just in my opinion, not representatives. It's a problem that requires different ingredients to solve.

    -- TTK

  4. Re:can you say, "circumstantial evidence"? on Google Searches Used in Murder Trial? · · Score: 1

    Have you ever heard of the term "circumstantial evidence"? Google it.

    Or, even better, Research It! :-)

    -- TTK

  5. OCA and PG scratching each others' backs on Human-Powered Internet Archive Book Project · · Score: 2, Insightful

    The focuses of OCA and PG are really quite different: PG is most interested in preserving the essential information of a book (ie, its text), while OCA's interest is in preserving the form of the book (ie, its fonts, page format, coloration, even down to the yellowing of the pages). That having been said, there's a lot each can do for the other (and has!).

    The Archive has archived most of PG's material, because even though the Books department of The Archive is focussed mostly on preserving books, The Archive as a whole is interested in preserving just about any information it can, and the PG data is definitely of interest.

    When The Archive's Scribe software processes the book images into its various formats (jpg, djvu, pdf, flippy, et al), it OCRs the book's text. This text then becomes part of generating some of the other formats. It will be really trivial for PG to obtain this text for any book it wants to incorporate into its dataset.

    qv: intlepisode00jamearch. The interesting files here are intlepisode00jamearch.txt which is just the OCR'd text, and intlepisode00jamearch_djvu.xml which is the OCR'd text with layout information (which has been useful to me in developing software which auto-corrects some OCR errors -- where the text is on the page often offers valuable hints for choosing the right heuristic for guessing the right text).

    A quick side note on the differences between Google's and OCA's efforts that I haven't seen talked about much -- Google's main advantages in their bookscanning efforts are their wealth and fame, while The Archive's main advantages are experience, familiarity, and scanning technology.

    Traditional book-scanning technologies are expensive and slow (which makes doing a lot of books, fast, that much more expensive, because you have to hire more people to do more books in parallel), but Google has enough money to throw at the problem that this is less of an issue. Google's fame means they can bring powerful partners onboard with a smile and a handshake, including some of the most prestigious libraries in the nation.

    The Archive has been involved in scanning books and making them available online for several years now (qv The Million Books Project). This experience has shaped the processes used in the acquisition and scanning of books, as well as the technology used in their storage, indexing, and presentation. Furthermore, libraries around the world have grown familiar with The Archive over the years. That, and The Archive's good track record, make it a powerful rallying point for partnerships and alliances, and have given it more experience in facilitating such relationships. Finally, partially due to the limits of existing book-scanning solutions, and partially due to The Archive's limited budget, it has facilitated the development of two independent low-cost, reliable, high-quality book-scanning systems: The Scribe (developed in-house at The Archive) and the Kirtas Robot (developed at Kirtas, a Canadian company).

    Many of the books scanned for the Million Book Project using traditional scanning methods are really lousy, sometimes to the point of being unreadable. These new scanning systems dramatically improve the quality of the end product, while equally dramatically reducing the cost-per-page. This means that more scanning systems can be purchased for more libraries (avoiding the per-library capital outlay problem), and more books can be scanned more quickly within a given budget.

    Obviously, Google and OCA can benefit from co-operation, as each has a lot to offer the other. I'd be surprised if Google didn't join the OCA, eventually, if for no other reason than to gain access to the books of the >100 OCA

  6. Infuriating on How Would You Improve SQL? · · Score: 3, Insightful

    Out of all the annoying issues, I've pulled out the most hair over unique id's, and INSERT vs UPDATE.

    Most SQL implementations give you some way of assigning unique id's to newly INSERT'ed rows. It would be nice if there were a standardized way, but that's a side issue. Once rows have unique id's, you can identify rows to be updated by id. This is very fast and simple.

    Except .. in order to find out what id the DBMS has assigned a row, I usually have to follow my INSERT with a SELECT, to read the id column. Slow and annoying. Sometimes the DBMS I am working with takes a few seconds to perform the INSERT, and ten minutes to perform the SELECT. That takes it beyond an optimization issue, and into a workability issue.

    Also, if I do not yet know if a data record has been INSERT'ed, and I need to either UPDATE the existing record or INSERT a new one (say, with just a new timestamp), then I need to either attempt an UPDATE and then fall back on INSERT if the UPDATE fails (ew!) or attempt a SELECT and either INSERT or UPDATE depending on whether it returned any rows (ew!).

    If SQL came up with a standardized way to associate unique id's with newly INSERT'ed rows, it would be very, very nice if the id column(s) assigned were returned to the client in the same packet as the message confirming that the INSERT succeeded. Nearly zero additional overhead, neat, fast, and easy.

    To solve the UPDATE/INSERT issue, I'm less sure. Say, for instance, that I have a daemon which periodically scans the filesystems in a cluster of machines, and it wants to UPDATE the "exists" column of a given row identified by a ( hostname, mountpoint, path, filename) tuple with the current time, if that row already exists, or INSERT a whole new row for that file if it does not exist. Perhaps there could be a "WRITE" command which is just like INSERT but overwrites a row if it already exists? That seems like the wrong solution, too. In the meantime, I play with caches of hashes to unique id's and lose more hair.
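
    For illustration, here is a minimal sketch of how both complaints can look in one particular engine, using Python's bundled sqlite3 module. It is not a standardized SQL answer. The table layout, the function names (open_db, insert_file, touch_file) and the seen_at column (standing in for the "exists" timestamp above) are hypothetical, and the ON CONFLICT ... DO UPDATE clause is SQLite-specific syntax that assumes SQLite 3.24 or newer.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical 'files' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE files (
            id         INTEGER PRIMARY KEY,
            hostname   TEXT NOT NULL,
            mountpoint TEXT NOT NULL,
            path       TEXT NOT NULL,
            filename   TEXT NOT NULL,
            seen_at    REAL NOT NULL,
            UNIQUE (hostname, mountpoint, path, filename)
        )
        """
    )
    return conn


def insert_file(conn, hostname, mountpoint, path, filename, seen_at):
    """Plain INSERT; the driver hands back the new id without a follow-up SELECT."""
    cur = conn.execute(
        "INSERT INTO files (hostname, mountpoint, path, filename, seen_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (hostname, mountpoint, path, filename, seen_at),
    )
    return cur.lastrowid


def touch_file(conn, hostname, mountpoint, path, filename, seen_at):
    """INSERT a new row, or refresh seen_at if the tuple already exists."""
    conn.execute(
        "INSERT INTO files (hostname, mountpoint, path, filename, seen_at) "
        "VALUES (?, ?, ?, ?, ?) "
        "ON CONFLICT (hostname, mountpoint, path, filename) "
        "DO UPDATE SET seen_at = excluded.seen_at",
        (hostname, mountpoint, path, filename, seen_at),
    )
```

    The unique constraint on the four-column tuple is what lets the engine decide between the INSERT and UPDATE paths in a single statement, so the client needs neither a preliminary SELECT nor a cache of ids.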

    -- TTK

  7. More than one .. on Building a Massive Single Volume Storage Solution? · · Score: 1

    We have a few redbox racks running in SF now. The Archive grows by about 25TB or 30TB a month, and all of our new storage is Petabox racks of redboxes. We are also retiring some of our aging whitebox systems and replacing them with redboxes, copying their contents over onto the newer media (and md5-doublechecking the contents before we let go of the old box).
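
    The md5 double-check mentioned above can be sketched roughly as follows. This is only an illustration of the idea, not The Archive's actual tooling; the function names md5_of and copies_match are hypothetical, and the sketch assumes both copies are reachable as local file paths.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(old_path, new_path):
    """True if the file on the old box and its copy on the new box hash identically."""
    return md5_of(old_path) == md5_of(new_path)
```

    Only when copies_match holds for every file would the old box be released.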

    -- TTK

  8. Re:Petabox on Building a Massive Single Volume Storage Solution? · · Score: 1

    Capricorn pre-installs our setup (archive.org) on the redboxes they sell us (slightly customized Debian, reiserfs3 on each disk, no filesystem abstraction, wrapped in rsync modules for access and Alexa UDP locator for indexing), but last I heard they will negotiate with customers to install whatever OS they want, and I presume with any abstraction solution they want (RAID at the per-node level, or not; OpenAFS for cluster-wide abstraction, whatever). So if you want FreeBSD/OpenAFS, that's doable, or if you want Windows, I'm sure they'll accommodate you there too.

    The disks in the redboxes may appear to miss the optimal space-per-unit-price ratio, but really when you factor in physical compactness, data-per-unit-heat, these disks' robustness, and data-per-complete-system, they're very very good for a mass data warehousing solution, where physical space and power/cooling requirements count. The VIA C3's and VIA motherboards are extremely low-power systems, but iirc the disks make up the majority of each node's wattage appetite. It's a good system, and CR (the chief hardware engineer, and founder of Capricorn) has reason to be proud.

    BTW, if someone has the desire and money to build a data warehousing cluster, but not the expertise, I'm available for contracting gigs.

    -- TTK

  9. Re:mod parent up on Choosing Interconnects for Grid Databases? · · Score: 1

    Since you're one of the second type of responder

    Oh, am I now? Funny, before I made the post to which you replied, I made this other post first, because I thought I didn't emphasize this alternative enough there.

    You're off-base, though, in claiming that hiring a professional network engineer would cost more money than developing the necessary expertise in-house. The time and resources it takes to learn how to do it yourself will cost your business more in the long run. If it's a labor of love, though, that's a different matter -- picking up new skills because you want to know how to do things yourself is highly commendable (especially if you can do it on someone else's dime).

    -- TTK

  10. mod parent up on Choosing Interconnects for Grid Databases? · · Score: 1

    CC could have worded it more nicely, but his underlying message is spot-on. If you don't have the necessary expertise in-house, hiring a professional is faster and less expensive than growing the expertise yourself, and you'll probably end up with a better-running system.

    -- TTK

  11. This is really application-specific on Choosing Interconnects for Grid Databases? · · Score: 5, Informative

    In my own experience, fully switched Gig-E was sufficient for operating a high performance distributed database. The bottlenecks were at the level of tuning the filesystem and hard drive parameters, and memory pool sizes. But that was also a few years ago, when the machines were a lot less powerful than they are now (though hard drives have not improved their performance by all that much).

    Today, high-end machines have no trouble maxing out a single Gig-E interface, but unless you go with PCI-Express or similarly appropriate IO bus, they might not be able to take advantage of more. That caveat aside, if Gig-E proved insufficient for my application today, I would add one or two more Gig-E interfaces to each node. There is software (for Linux at least; not sure about other OS's) which allows for efficient load-balancing between multiple network interfaces. 10Gig-E is not really appropriate, imo, for node interconnect, because it needs to transmit very large packets to perform well. A good message-passing interface will cram multiple messages into each packet to maximize performance (for some definition of performance -- throughput vs latency), but as packet size increases you'll run into latency and scheduling issues. 10Gig-E is more appropriate for connecting Gig-E switches within a cluster.

    The clincher, though, is that this all depends on the details of your application. One user has already suggested you hire a professional network engineer to analyze your problem and come up with an appropriate solution. Without knowing more, it's quite possible that single Gig-E is best for you, or 10Gig-E, or Infiniband.

    If you're going to be frugal, or if you want to develop expertise in-house, then an alternative is to build a small network (say, eight machines) with single channel Gig-E, set up your software, and stress-test the hell out of it while looking for bottlenecks. After some parameter-tweaking it should be pretty obvious to you where your bottlenecks lie, and you can decide where to go from there. After experimentally settling on an interconnect, and having gotten some insights into the problem, you can build your "real" network of a hundred or however many machines. As you scale up, new problems will reveal themselves, so incorporating nodes a hundred at a time with stress-testing in between is probably a good idea.

    -- TTK

  12. Re:Why PDF? on Yahoo Competes with Google in Book Scanning · · Score: 1

    You're right, sorta. The djvu format is better than PDF for scanned books in most respects. Looks better, compresses better (and compresses by default), decompresses + renders faster while using less memory, more easily transformed to/from other formats due to availability of high-quality open source and free tools, etc. The Internet Archive's books collection has several books archived in djvu format.

    The downside is that most users do not have a djvu reader installed on their computers, and even though it's trivial to download and install djview for free, most people will not bother. The Internet Archive more or less solves this problem with a java applet which turns users' web browsers into djvu readers. This should work for other content providers as well, except nobody knows about it, so everyone stops at "oh no, nobody has a viewer installed". The end.

    On a slightly different note, though, PDF isn't that bad. It's an open format, and even though most people seem to think Acrobat is the only viewer, there are others like xpdf, which is faster, more stable, and easier to use than Acrobat (though not as fully-featured).

    -- TTK

  13. Re:Companies should Get Original on Yahoo Competes with Google in Book Scanning · · Score: 1

    You mean "First the Internet Archive, then Google, then Yahoo". The Million Books Project predates Google's bookscanning efforts by a few years.

    -- TTK

  14. Re:The big disadvantage on Clustering vs. Fault-Tolerant Servers · · Score: 1

    Clustering has a MAJOR problem going with it. Clustering requires applications to be written specifically to support clustering. All sorts of libraries have been written to "make this process easier", but one thing's for sure : it will require a recompile

    This is not true at all for many of the most-common cluster applications. Framework software exists which "gangs together" a pool of servers, each of which can run ordinary, non-cluster-aware software. No need to write code, no need for a recompile. Please qv: keepalived, for one example.

    -- TTK

  15. Absolutely right on Clustering vs. Fault-Tolerant Servers · · Score: 3, Informative

    Clustering provides you with Fault Tollerant OS/Applications. A single server with tons of redundant bits, doesn't help you if the OS or Applications that it servers get borked.

    This is dead-on correct. For example, if a CGI hits a problematic state where it eats a lot of memory, putting the server into a state where it's swapping, then it takes longer to service each http transaction, which means more httpd transactions queue up, which means more memory gets allocated, which means more swapping .. rendering the machine useless for a little while (until a sysadmin or a bot notices the state and either restarts the httpd or kills a few select processes). If we were running this on one mammoth server with lots of redundant bits, then 100% of our web service capacity would be down in the interim. But since we run a pool of ten http servers under keepalived/IPVS, we only lose 10% of our capacity during that time.

    Other reasons I've traditionally preferred clustering: easy to incrementally scale up infrastructure (no big buy-in in the beginning to get the server which can be expanded), fully parallel resources (an independent memory bus, an independent IO bus, two independent CPU's, an independent network card, and a few independent disks for each server, as opposed to a mammoth shared bus on a leviathan crossbar, which will inevitably run into contention), and more flexibility in how resources are divided amongst mutually exclusive tasks.

    One of those reasons is getting less relevant -- point-to-point bus technologies like HyperTransport and PCI-Express are inexpensively replacing the "one big shared bus" with a lot of independent busses, transforming the server into a little cluster-in-a-box. It is a positive change IMO, and shifts the optimal setup away from the huge cluster of relatively small machines, and towards a more moderately-sized cluster of more medium-sized cluster-in-a-box machines.

    The price of licenses is, IME, rarely an issue (in my admittedly limited career -- I don't doubt that it's relevant to many companies) because the places I've worked for have tended to use primarily free-as-in-beer (and often free-as-in-speech) open source solutions. What is more of an issue, IME, is the necessity of staffing yourself with cluster-savvy sysadmins and software engineers. Those of that ilk tend to be a bit rare and expensive, and difficult to keep track of. It takes a distributed systems professional to look at a distributed system and understand what is being seen, and this makes it easy to bend the spec or juggle the schedule on the sly, or run skunkworks projects outright. By contrast, the insanely redundant, mondo-expensive uberserver was created and programmed by very smart hardware and software specialists so that your IT staff doesn't need to be so specialized. This makes useful talent easier to acquire, and understanding the system closer to the reach of mere mortals.

    Just my two cents
    -- TTK

  16. Have to disagree with the author on a few points on Trouble With Open Source? · · Score: 1

    The author makes some good observations, but twists them to appear more all-encompassing than they are. Also, he states some partial or total untruths.

    Any software that they write, irrespective of whether it is during or outside normal working hours, legally belongs to their employer.

    This may be true in England, but it certainly is not true in America. In America, software developed by the employee outside of business hours, using only the personal property of the employee (no company computer, software, or networks involved) is the personal property of the employee, barring any contractual stipulations to the contrary.

    Self-employed and contract software engineers are not usually bound by employer's IP rights but are unlikely to be strongly motivated to write OSS code unless they can earn a living from doing so, and the unpaid volunteer nature of OSS development tends to rule out this possibility.

    This may only be anecdotal, but I am, at times, a self-employed engineer (though, currently trying to juggle an employeeship with the Archive while running my contracting gig part-time). If anything, being self-employed has encouraged me to open-source parts of my work, since I am then free to use it when employed by another company (just as I would use any other third-party OSS library or tool). Tools I develop as an employee on company time are not similarly available to me when I am employed by a different company. Often I have wished I could reach back into the Flying Crocodile codebase and use some of the message-passing code developed there, but I can't, because I don't own it. When The Sausalito Group dissolved and the VP and I co-founded Hardpoint Intelligence to support TSG's stranded clients, I had to redevelop all of the necessary technology from scratch before we could legally provide our services -- except for that technology which was already open-sourced (MySQL, Apache, Perl, Linux).

    we seem to have forgotten that peer review is, or should be, part of the normal software engineering process anyway

    Of course it is, and every single company I've worked for as a programmer (except TSG, where I was the only programmer) used peer review as a means of double-checking code before it was deployed. But in no case were there ever more than four engineers performing this review, and more often it was only one or two engineers. Even a relatively obscure OSS project can attract more peer reviews and bugfixers than this -- when I wrote the Orcus ICB client in 1997, six fellow OSS developers leapt in, finding and fixing bugs and adding secondary features. The only time in my entire career I've gotten similar support from an employer was when I was project lead at Flying Crocodile, and had five engineers working for me. Most engineers have to work in the industry for many years before they are eligible for a project lead position, whereas any competent engineer with a cool project can attract comparable (in my case, superior) manpower no matter where they are in their career.

    and good software needs a strong architectural vision which the community-based method of software development does not foster.

    There was a slashdot article a while back which promoted a study someone did which agrees with my anecdotal experience, that most large OSS projects have one or a few core engineers who share a vision, develop the architecture, and write the "meat" of the code, and then anywhere from a few dozen to several hundred non-core engineers who fix bugs and add secondary features (much like what happened with Orcus, just on a larger scale). The non-core engineers are in flux, and will wax and wane with popular interest in the project, but the core engineers persist and change little. (And when there is a change in the core group, you can bet slashdot will run an article about it!) ;-)

    there would appear to be a distinct lack of imagination in OSS projects. The open source community has so far

  17. This is true for Slackware! on Slackware Linux 10.2 Released · · Score: 2, Informative

    lack of good automatic package management, [..] lack of all the advanced stuff like Project Utopia

    By omitting nonessential bells and whistles, Patrick Volkerding doesn't have to waste his time and energy QA'ing them. He puts more QA hours into features essential to the operation of a production server, instead. This is of critical importance. QA effort cannot entirely eliminate the bugs and incompatibilities within and between packages, but the more hours are spent doing it the closer the distribution can get to this ideal form. Stability and security are the most essential characteristics of a production server.

    lack of newbie-friendly administration tools

    Don't need them. You may be right that their absence has prevented newcomers from adopting Slackware, though. It would be nice if more companies based their services on Slackware machines -- their services would be more robust, my skills would be more in demand :-) and it would result in more third-party QA'ing of Slackware packages. But I can't bring myself to care too much because the more popular Slackware has become over the years, the more packages Patrick has agreed to incorporate into the distribution to satisfy a wider audience. "More packages" is bad because ...

    the relatively small selection of official packages

    "More packages" is bad because the number of relations between packages increases in proportion to the square of the number of packages, and the number of incompatibilities between packages is proportional to this number of relations. The smaller the package set, the more effective Patrick's QA hours are at weeding out incompatibilities in the distribution as a whole. In fact I think Slackware has gotten somewhat overbloated with packages, and would welcome a little trimming of the fat. (Of course, what I consider fat might be necessary to someone else's business, so perhaps it's best that this is left up to Patrick, who gets a more gestalt picture.)

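    To put rough numbers on the quadratic growth claimed above, a small sketch follows. It assumes, purely for illustration, that every unordered pair of packages counts as one potential relation; the function name pairwise_relations is hypothetical.

```python
def pairwise_relations(n_packages):
    """Number of unordered package pairs: n * (n - 1) / 2, which grows as n squared."""
    return n_packages * (n_packages - 1) // 2


# Doubling the package count roughly quadruples the pairs to QA.
for n in (100, 200, 400):
    print(n, pairwise_relations(n))
```

    Under that assumption, going from 100 to 200 packages raises the pair count from 4950 to 19900, so the same QA hours are spread over about four times as many potential interactions.
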
    As an aside, I suspect what is hurting Slackware's wider adoption the most are its de-emphasis on desktop environments (it actually does pretty well at this, just not as well as some other distributions) and the popular misconception that the newest possible version of software is necessarily the best. In my experience, the decision to press a distribution into production service is often driven by what the IT elite at the company have running on their desktops. (This is more true in small companies, and less true in larger companies, where issues like availability of support by contract are more important. Though, here too Slackware comes up short.) Since Slackware holds little appeal to the desktop user, it does not take advantage of this vector. Also, since Patrick follows the sound, traditional practice of selecting for inclusion only those versions of software which are stable, the software which ships with Slackware is usually not the newest. If you look at the Slackware changelog, you can see various notes of the form "foo version x.y.13 exhibited such-and-such problems, reverted back to foo version x.y.12". Which is the way it should be done.

    Inserting gratuitous plug here for my Code of Engineering.

    -- TTK

  18. Re:see no evil, hear no evil, talk no evil.. on Five Reasons Not to Use Linux · · Score: 1

    Are all of those CAD packages compatible with AutoCAD? I mean *really* compatible? AutoCAD is the standard.

    No, not all of them are. Some of them feature certain compatibilities with AutoCAD, like translation to/from different specification languages, but frankly I glossed over those when I was shopping for CAD software for myself. AutoCAD compatibility wasn't important to me. Someone else will have to speak to this, or you can look yourself.

    2. Revit. [..] So does Linux offer a similar package?

    I haven't used Revit, but I have used BRL-CAD, and BRL-CAD can do most of what you have mentioned. BRL-CAD has oodles of features, but they aren't as application-specific as Revit's sound, and they probably aren't as well integrated with the user interface.

    As it has been pointed out in another post in this thread, CAD started on UNIX, and there are still top-quality commercial CAD tools available for UNIX. I do not know if any of them are available for Linux. It was not my intention to claim that the free / open-source CAD tools available for Linux are better than the $5K CAD tools available for Windows, merely that they are there, and that they are useful to the professional engineer.

    If you really want to spend money on your software, I invite you to look at the commercial CAD tools available for Linux via freshmeat.net -- some of them look very nice, judging by their screen shots and features lists. I have not personally used them because I found what I was looking for in BRL-CAD, which is very well-suited to modelling laminated composite structures (and spinning them around, looking at them from various angles, etc -- which is not very useful to me for my application, but would be more useful if I were modelling a house or landscape).

    -- TTK

  19. Re:see no evil, hear no evil, talk no evil.. on Five Reasons Not to Use Linux · · Score: 3, Informative

    Where's the CAD/CAM software?

    Well, aside from the 43 CAD packages (some free, some open source, some commercial) trivially accessible through freshmeat.net, there is also BRL-CAD, the recently open-sourced CAD software used by the Army Research Laboratory to model and upgrade the Abrams battletank, and other systems.

    There is also CAM software available, CNCsr being one example, used for control of CNC (Computer Numerical Control) devices (lathes, mills, routers, plasma cutters, etc).

    There are other, highly valid criticisms of this author's thesis, but the lack of engineering tools isn't one of them. The main source of Linux's strength, IMO, is that it is used by professionals (mainly engineers) to get real work done, and this use drives the direction of its development, and the development of the software running on the platform. In many cases, it is the same engineers using the software that develop the software. This naturally results in software which is highly suited to practical everyday (albeit specialized) use.

    -- TTK

  20. Re: Reliable TCP/IP stack? on Best TCP/IP Stack Implementation? · · Score: 1

    The symptoms are weird select() failures (falsely indicating that socket fd's have data available for reading), connect() failures, and spontaneously dropped connections. I have been able to reliably replicate these problems with a program which simply forks off 32 child processes and enters a select()/read()/accept() loop, while each child process opens several SOCK_STREAM connections to the parent and writes data to them as select() indicates the connections are available for writing.

    The perl source is here, if you can stomach the rather disgusting code structure (or lack thereof -- I wrote it as throw-away code, and haven't gotten around to rewriting it). The TCP failures were first seen in a native C program, so it's not a perl issue. I've replicated it under 2.2.16, 2.4.18, 2.4.21, and 2.4.25-1.
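
    The test shape described above can be sketched in a few lines of Python. This is a scaled-down illustration, not the original perl or C program: the function name run_stress, its parameters, and the defaults are all hypothetical, and it uses the loopback interface with a kernel-assigned port. The parent accepts and reads in a select() loop; each forked child opens several SOCK_STREAM connections and writes as select() reports them writable.

```python
import os
import select
import socket


def run_stress(n_children=4, conns_per_child=3, payload=b"x" * 1024):
    """Fork children that each open several TCP connections to the parent.

    Returns the total number of bytes the parent read. All names and
    defaults here are illustrative assumptions, not the original test.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(128)
    addr = listener.getsockname()

    pids = []
    for _ in range(n_children):
        pid = os.fork()
        if pid == 0:  # child process
            status = 1
            try:
                listener.close()
                conns = [socket.create_connection(addr)
                         for _ in range(conns_per_child)]
                pending = {c: payload for c in conns}
                while pending:
                    # Write only to sockets select() reports as writable.
                    _, writable, _ = select.select([], list(pending), [])
                    for c in writable:
                        sent = c.send(pending[c])
                        pending[c] = pending[c][sent:]
                        if not pending[c]:
                            del pending[c]
                            c.close()
                status = 0
            finally:
                os._exit(status)  # never fall through into the parent's code
        pids.append(pid)

    expected = n_children * conns_per_child
    clients = set()
    closed = 0
    total = 0
    while closed < expected:
        readable, _, _ = select.select([listener, *clients], [], [], 10.0)
        if not readable:
            break  # nothing happened for 10 seconds: give up
        for s in readable:
            if s is listener:
                conn, _ = listener.accept()
                clients.add(conn)
            else:
                data = s.recv(65536)
                if data:
                    total += len(data)
                else:  # empty read means the child closed its end
                    clients.discard(s)
                    s.close()
                    closed += 1

    for s in clients:
        s.close()
    listener.close()
    for pid in pids:
        os.waitpid(pid, 0)
    return total
```

    On a healthy stack, run_stress(4, 3, b"x" * 1024) should return 4 * 3 * 1024; a shortfall or a timeout would be the kind of symptom worth chasing. Pushing toward the load described above means raising n_children and conns_per_child, bearing in mind that Python's select.select() refuses descriptors at or above FD_SETSIZE (typically 1024).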

    -- TTK

  21. Re:Reliable TCP/IP stack? on Best TCP/IP Stack Implementation? · · Score: 1

    Thank you. I've been using select(), and will give epoll() a try (which appears to be available only under 2.6.x).

    -- TTK

  22. Reliable TCP/IP stack? on Best TCP/IP Stack Implementation? · · Score: 2, Informative

    To me the best network stack is one that can handle many simultaneous open sockets without problems. Performance is of secondary importance after robustness. I understand a stack will at least stall out when it tries to do more than the hardware can support, but it should pick right back up where it left off when sufficient resources are available again.

    I love Linux, and I've standardized on it as my platform of choice, but I have run into some problems with 2.4's network stack when >1000 sockets were simultaneously open and active, problems that don't go away until the system is rebooted. I've devised workarounds, but I'd rather not.

    I still need to stress-test 2.6 .. been putting it off because I don't trust early minor-revision releases, they tend to be buggy. But from what I've read it's about ready for consideration.

    But is there something better? What is the most scalable, reliable TCP/IP stack out there? Is there something that will let me open 10,000 sockets and hammer at them all at once without coming apart like wet tissue paper?

    Since I'm going to be stress-testing 2.6, I'll probably do FreeBSD and Solaris10 at the same time. Does anyone have other contenders to suggest? Not necessarily something that screams like a mofo on one socket or five, but rather something that will never, ever misbehave.

    -- TTK

  23. disclaimer on Wayback Archives as a Law Tool · · Score: 3, Informative

    I do not speak for The Archive. The above post should not be considered to reflect the official position of The Archive. It is purely my own personal opinion, and it was uttered under the influence of painkillers (I had my wisdom teeth yanked out of my jaw Wednesday, qv my Slashdot journal entry). Else I probably would have refrained -- talking about this at all while there's a court case pending was probably a really stupid idea, and I (usually) know better.

    -- TTK

  24. Re:And this is a big deal why? on Wayback Archives as a Law Tool · · Score: 5, Interesting

    For one, it's not quite as verifiable. Who is to say, for example, that someone with access to the Wayback servers couldn't put their own content and dates on there, and then use that as "evidence" for some suit?

    I don't know how (if?) it's regulated; any insights into this?

    I work at The Archive. There are only two people, three at most, with the expertise and access to pull something like this off, and if someone tried Brad would almost definitely notice. There are checks in place to detect bitrot in the web archive, and altering older ARCs to include new information would be detected as bitrot and flagged for closer attention. They would then be compared against the copies in our sister organization's data cluster in Europe, and possibly also compared against the copies in the datacenter in Egypt.

    To make it work, you'd pretty much have to get Brad to play along, and he is fanatical about the integrity of the web data. I don't think you could pay him enough to do it, and he doesn't have any sons or daughters you could kidnap for blackmail.

    How one would go about demonstrating all of this in court, though, I do not know. IANAL.
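
    The general idea of checking a stored file against a recorded digest and then against replicas can be sketched as follows. This is only an illustration of the technique, not a description of The Archive's actual tooling: the choice of SHA-256, the function names digest and audit, and the replica labels are all assumptions made for the example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest of a blob (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def audit(local: bytes, recorded_digest: str, replicas: dict) -> str:
    """Compare a local copy against its recorded digest, then against replicas.

    `replicas` maps a hypothetical site label to that site's copy of the file.
    """
    if digest(local) == recorded_digest:
        return "ok"
    # The local copy no longer matches: look for an intact copy elsewhere.
    intact = sorted(site for site, copy in replicas.items()
                    if digest(copy) == recorded_digest)
    if intact:
        return "flagged: intact replica at " + ", ".join(intact)
    return "flagged: no copy matches the recorded digest"
```

    An altered file fails the digest check in exactly the same way a decayed one does, which is why tampering would surface as apparent bitrot; the replicas then show which copy changed.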

    -- TTK

  25. Re:Internet, meet Hobbes on U.N. To Govern Internet? · · Score: 1

    I see, so it is ok for a 'libertarian' to force their belief that they own 'what they discover' on all others, whereas it is not ok for a socialist to force their belief on others that society owns what they discover. This is blatant hypocrisy.

    You have misquoted me, sir. I said it would be immoral for me to try to force my beliefs on anyone else. You have dishonestly inserted "[socialist]" into your quotation, where it is inappropriate.

    -- TTK