Slashdot Mirror


User: Animats

Animats's activity in the archive.

Stories
0
Comments
14,273

Comments · 14,273

  1. Re:There are three types of files. on ZFS Gets Built-In Deduplication · · Score: 1

    There's more than there used to be due to the rise of the small-database-as-library, such as sqlite.

    Yes, and they tend to use common libraries, which need to be file-system aware if their data is to recover properly across crashes.

    There are programs which are database-like but not ACID-like databases, like ".zip" file managers. Those should open for exclusive use when writing. Reading a .zip file being written by another program is generally disappointing.

  2. Re:There are three types of files. on ZFS Gets Built-In Deduplication · · Score: 1

    The main corner case in your suggested "unit file" implementation is where someone is overwriting a file too large for the filesystem to contain two copies of it. You have to truncate when this happens to fit the new one, you can't just keep the old one around until it's replaced.

    At that point, the program creating the huge file should get an I/O error, and the old copy should be intact. If you're creating files that big, you usually check the available space before writing the file, as installers have done for many years now. You may have to delete the old file first. UCLA Locus did this in the 1980s, incidentally, and some of that machinery went into AIX. (Locus had unusual file system semantics. If you started to overwrite a file, you created a new file which shared blocks with the old one in a copy-on-write sense. The new file appeared to other readers when you closed the file or called "commit()". If you called "revert()" or the program crashed or the network disconnected, the file reverted to its previous state.)

    As for your "managed files" case, that won't work for all database approaches.

    Torvalds has written about how databases should talk to file systems. Databases and file systems need to know something about each other. There's posix_fadvise() and fsync() as well as the open modes, and the use of any of those generally indicates that it's a managed file.

  3. There are three types of files. on ZFS Gets Built-In Deduplication · · Score: 5, Interesting

    I'd argue that file systems should know about and support three types of files:

    • Unit files. Unit files are written once, and change only by being replaced. Most common files are unit files. Program executables, HTML files, etc. are unit files. The file system should guarantee that if you open a unit file, you will always read a consistent version; it will never change underneath a read. Unit files are replaced by opening for write, writing a new version, and closing; upon close, the new version replaces the old. In the event of a system crash during writing, the old version of the file remains. If the writing program crashes before an explicit close, the old file remains. Unit files are good candidates for unduplication via hashing. While the file is open for writing, attempts to open for reading open the old version. This should be the default mode. (This would be a big convenience; you always read a good version. Good programs try to fake this by writing a new file, then renaming it to replace the old file, but most operating systems and file systems don't support atomic multiple rename, so there's a window of vulnerability. The file system should give you that for free.)
    • Log files. Log files can only be appended to. UNIX supports this, with an open mode of O_APPEND. But it doesn't enforce it (you can still seek) and NFS doesn't implement it properly. Nor does Windows. A log file opened for reading should be guaranteed to read exactly out to the last write. In the event of a system crash during writing, log files may be truncated, but must be truncated at an exact write boundary; trailing off into junk is unacceptable. Unduplication via hashing probably isn't worth the trouble.
    • Managed files. Managed files are random-access files managed by a database or archive program. Random access is supported. The use of open modes O_SYNC, O_EXCL, or O_DIRECT during file creation indicates a managed file. Seeks while open for write are permitted, multiple opens access the same file, and O_SYNC and O_EXCL must work as documented. Unduplication via hashing probably isn't worth the trouble and is bad for database integrity.
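
    The write-new-then-rename workaround mentioned under unit files can be sketched as follows. This is a minimal illustration, not what a file system would do internally: the function name replace_unit_file and the ".tmp-" prefix are made up for the example, and it assumes the temporary file and the target sit on the same file system.

```python
import os
import tempfile


def replace_unit_file(path: str, data: bytes) -> None:
    """Write a complete new version, then swap it in with a single rename.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives next to the target so that the final
    # rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new contents to disk first
        # After this call, opening `path` yields the new version; before
        # it, the old version. A crash before it leaves the old file intact.
        os.replace(tmp_path, path)
    except BaseException:
        # The writer failed before the swap: discard the partial file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "page.html")
        replace_unit_file(target, b"version 1")
        replace_unit_file(target, b"version 2")
        with open(target, "rb") as f:
            print(f.read())
```

    This covers a single file only. Replacing several files as one atomic step is the case the rename trick does not handle.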

    That's a useful way to look at files. Almost all files are "unit" files; they're written once and are never changed; they're only replaced. A relatively small number of programs and libraries use "managed" files, and they're mostly databases of one kind or another. Those are the programs that have to manage files very carefully, and those programs are usually written to be aware of concurrency and caching issues.

    Unix and Linux have the right modes defined. File systems just need to use them properly.
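
    As a rough sketch of the log-file mode, here is an append through an O_APPEND descriptor. The helper name append_record is hypothetical, and the behavior shown is what a local POSIX file system provides; it says nothing about NFS.

```python
import os
import tempfile


def append_record(path: str, record: bytes) -> None:
    """Append one newline-terminated record using an O_APPEND descriptor.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # The seek is accepted (nothing forbids it), but with O_APPEND the
        # offset is moved to end-of-file before each write, so the record
        # below still lands after the existing data.
        os.lseek(fd, 0, os.SEEK_SET)
        os.write(fd, record + b"\n")
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        log = os.path.join(d, "events.log")
        append_record(log, b"first")
        append_record(log, b"second")
        with open(log, "rb") as f:
            print(f.read())
```

    Nothing here gives the crash guarantee asked for above (truncation only at a write boundary); that part would have to come from the file system.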

  4. Re:In Defense of Artificial Intelligence on IT Snake Oil — Six Tech Cure-Alls That Went Bunk · · Score: 4, Interesting

    Having taken several courses on AI, I never found a contributor to the field that promised it to be the silver bullet -- or even remotely comparable to the human mind.

    Not today, after the "AI Winter". But when I went through Stanford CS in the 1980s, there were indeed faculty members proclaiming in print that strong AI was going to result from expert systems Real Soon Now. Feigenbaum was probably the worst offender. His 1984 book, The Fifth Generation (available for $0.01 through Amazon.com) is particularly embarrassing. Expert systems don't really do all that much. They're basically a way to encode troubleshooting books in a machine-processable way. What you put in is what you get out.

    Machine learning, though, has made progress in recent years. There's now some decent theory underneath. Neural nets, simulated annealing, and similar ad-hoc algorithms have been subsumed into machine learning algorithms with solid statistics underneath. Strong AI remains a long way off.

    Compute power doesn't seem to be the problem. Moravec's classic chart indicates that today, enough compute power to do a brain should only cost about $1 million. There are plenty of server farms with more compute power and far more storage than the human brain. A terabyte drive is now only $199, after all.

  5. It's GMail's long-term storage that's the problem on An Inbox Is Not a Glove Compartment · · Score: 1

    It's not an inbox problem. It's a GMail long-term storage problem. It was settled in United States v. Councilman that the Electronic Communications Privacy Act applied to messages in "temporary storage". This decision

    Also, this was a search with a court-issued search warrant. The question being litigated is whether the service provider has to tell the customer about the warrant.

  6. Using satellite imagery on Find DARPA's Balloons, Win $40K · · Score: 1, Informative

    Here's a sample image. Yes, that's from orbit.

    Each satellite images about 1 million km^2 per day, so in 250 days, they can image the entire planet at high resolution. But they'll do the populated parts of the US more often (they can aim the cameras for each pass), so they will pick up many of the balloons.

    Microsoft Bing is buying all the data, so it's going on line. The data rate is about 50GB/hour. Start programs looking for red dots.

  7. "Infinite Music" is coming. Musicians will hate it on The Golden Age of Infinite Music · · Score: 2, Informative

    We already have a form of "infinite music" - DJing. But so far, DJs can't do very much to the music. They can play with timing and mixing, and maybe do some scratching, but that's about it.

    Now look at Vocaloid 2. Load up a singer model, a lyrics file, and a MIDI file, and out comes reasonably good music. (It's in Japanese; this was the #1 program for sale on Amazon Japan for a while.)

    Currently, building a singer model for Vocaloid requires about a week of work by the singer. Working backwards from existing music to a vocal tract model and a style model isn't yet available. But as machine learning techniques progress, that problem should be solved.

    When a DJ has the option to play any song with any musicians, then we'll have infinite music. The day may come when musicianship will be an archaic art like calligraphy and oratory.

    (Even better, the RIAA can't stop it. These are "covers", even though they're machine-generated. You have to pay the small statutory royalty to the composer, and you owe the musician and the recording company nothing.)

  8. "Lost Decade" - Not on Microsoft's Lost Decade · · Score: 4, Insightful

    Microsoft's revenues nearly tripled from $23B to $58B on Ballmer's watch.

    And this was a "lost decade"?

    General Motors had a lost decade. Microsoft did not.

  9. Wind usually not a problem. on What Happened To the Bay Bridge? · · Score: 2, Informative

    It's surprising that they had trouble there. That's a big, stiff truss span, with lots of cross-bracing. Those usually don't have serious wind problems. (The Tay Bridge disaster was, of course, one involving a truss bridge. But it was badly designed and very badly fabricated.) The worst case for wind is a long, narrow, thin span. The Tacoma Narrows Bridge collapsed through that kind of failure, and the Golden Gate Bridge was vulnerable to it. In 1951, during high winds, the Golden Gate Bridge deflected enough that one side of the roadbed was 11 feet higher than the other. Stiffening trusses were added under the span. (These are big trusses, each over 20' high, but the bridge is so huge that few people noticed the retrofit.)

    In the 1989 quake, the Bay Bridge had an upper deck section break at the joint between the high truss span and the lower spans. That was an impedance mismatch - the two sections oscillated in different ways, and the stress at the transition point was enough to break bolts. When the Bay Bridge was designed in the 1930s, those problems weren't well understood, and could not yet be simulated.

    The problem seems to be that the quick fix for the crack was underdesigned. That was recognized within days, and a second fix was under construction.

    The damaged eyebar could be replaced, but that requires fabricating a new eyebar and some specialized tooling to take off the load from that whole eyebar chain during repair. This span will be torn down in a few years, when the new span is finished, so that may not be worth it.

  10. What engineering is really about. on What Happened To the Bay Bridge? · · Score: 4, Interesting

    No, Slashdot is mostly made up of computer janitors.

    I do get that feeling now and then.

    Many years ago, I went to a serious engineering school. There, the final exam in a course in structural engineering was this:

    At the final exam, each student had to design a link to attach two pins some distance apart. There were obstacles between the pins and the link had to go around them. The design was to be for a specified grade of aluminum and had to support a specified load. Students knew in advance what the exam would be, except for where the obstacles would be. For the exam, you sat at a drafting table, and turned in a drawing.

    The link you designed was then machined out of aluminum by a machinist. It was put in a testing machine and placed under the specified load. If the link broke, you failed the course.

    If the link didn't break, it was weighed. Lower weights yielded higher grades for the course.

    This is how good structural engineers are trained. (I'm not one. I was in EE/CS, and we had a different make-or-break exam.)

  11. More promising approaches have been tried on Fixing Bugs, But Bypassing the Source Code · · Score: 1

    There are more promising approaches, mostly involving some form of checkpointing. The idea is that when an error is detected, you go back to the previous checkpoint at which things were going well, determine what input caused the problem, reject that input, and continue forward from there. In some cases, you have a second, different program checking the outputs from the first. This sort of thing has been used in telephony, and Tandem, before HP acquired it, was big on this sort of thing.
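
    The checkpoint-and-reject idea can be sketched in a few lines. This is only an illustration under simple assumptions: the whole program state fits in one object that can be deep-copied, a failure shows up as an exception, and the names CheckpointedProcessor and add_ratio are invented for the example.

```python
import copy


class CheckpointedProcessor:
    """Applies inputs to a state object, rolling back when an input fails."""

    def __init__(self, handler, initial_state):
        self.handler = handler      # hypothetical: mutates state, may raise
        self.state = initial_state
        self.rejected = []          # failing inputs, kept for later analysis

    def feed(self, item) -> bool:
        checkpoint = copy.deepcopy(self.state)  # last known-good state
        try:
            self.handler(self.state, item)
            return True
        except Exception as exc:
            # Go back to the checkpoint, reject the input, and carry on.
            self.state = checkpoint
            self.rejected.append((item, repr(exc)))
            return False


def add_ratio(state, item):
    """Example handler: updates the count before the step that can fail."""
    state["count"] += 1
    state["total"] += 100 // item   # raises ZeroDivisionError for item == 0


if __name__ == "__main__":
    proc = CheckpointedProcessor(add_ratio, {"count": 0, "total": 0})
    for value in [4, 0, 5]:
        proc.feed(value)
    print(proc.state, proc.rejected)
```

    The rejected list is where the second idea starts: the collected failing inputs are the raw material for a model of what triggers the bug.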

    The clever thing to do is to collect the failing cases and, from them, build a model of what triggers the bug. There's been work on automated crash dump analysis that does this sort of thing, at both HP and Microsoft.

  12. Re:Block posts to Usenet via Google on jQuery Dev Bemoans Overwhelming Spam On Google Groups · · Score: 1

    I'm not sure where Usenet ever got into the discussion, actually. Neither TFS nor TFA ever mentions it.

    The original article was "jQuery Dev Bemoans Overwhelming Spam On Google Groups". Google Groups are mostly just a front-end to USENET. The jQuery group, though, isn't exported to USENET, having been created from the Google Groups side.

  13. Download Microsoft "Autoruns" and turn stuff off on Who Installs the Most Crapware? · · Score: 4, Informative

    Autoruns, by Mark Russinovich at Microsoft, gives you a complete checklist of everything that's started at bootup or login, with checkboxes that turn each item off. This is worth running just to see what's in there. You may turn too much off and break something, but you can run Autoruns again and turn it back on.

    There's plenty of stuff worth turning off, like those useless programs that keep polling to see if Adobe Acrobat or Sun Java came out with a new version. Some of those programs are too aggressive, too. Adobe's poller seems to try to re-associate PDF files with Acrobat, after I'd changed the ".pdf" association to launch Sumatra PDF.

    It's annoying that even legitimate updaters seldom schedule themselves as periodic tasks, which Windows does well and which have no overhead when they're not running. No, they have to have their own little executable in memory.

  14. What's science done for us lately? on Study Says US Needs Fewer Science Students · · Score: 1, Insightful

    What's science done for us lately? The high-energy physics people aren't any closer to a clear theory of how physics works down at the bottom than they were thirty years ago. They're just confused in a different way. There hasn't been a major breakthrough in nuclear power since the first nuclear plant came on line over half a century ago. The rocket scientists are doing worse than they did in the '50s and '60s. Aircraft are about the same size as 30 years ago, and a little slower. On the medical side, life expectancy hasn't gone up by much in fifty years, although more of the problems of old people can be patched for a while now. Materials are a little better; plastics are slightly better than in the 1950s, and we have carbon fiber golf clubs now. Big deal. Yes, computers and phones are much better. Semiconductors are far better.

    Business has recognized this, and doesn't put money into basic R&D any more. The big wins aren't there. Maybe science, like oil, has peaked. We've made the easy discoveries.

    So why put more money into science?

  15. Block posts to Usenet via Google on jQuery Dev Bemoans Overwhelming Spam On Google Groups · · Score: 3, Informative

    Maybe the answer is to block posts to USENET that come in via Google. That seems to be the source of the trouble.

    Looking at the newsgroup "comp.lang.python", all the spam seems to be coming in via "posting.google.com" with GMail return addresses. Bulk-created phony gmail accounts are such a source of spam that they should be blocked until Google gets their act together. At this point, we have to view GMail like Hotmail, another free email account system made useless by spammers.

    Hotmail is widely blocked. Next, Gmail?

  16. Re:Will anyone care? on Film Studios May Block DVD Rentals For One Month · · Score: 1

    Does anyone actually "line up" to see a movie when it's first released anymore?

    Standing in line is so last-cen.

    I remember going to the last Star Wars movie (not the cartoon) on the day it opened, almost a decade ago. We were expecting long lines, and so was the theater; they had extra security guards and ushers. There was no line, the theater wasn't full, and the movie sucked. And this was in the theater complex next to SGI and Google HQ.

    I haven't had to stand in line to get into a theater in this decade. Movie theaters used to deliberately create lines by selling tickets at too few windows. They don't dare do that now. Now, if you see a line at a theater, you think, "Hey, let's go home and download it instead".

  17. Re:Not stereoscopic on Android Phone Turned Into Virtual Reality Goggles · · Score: 2, Interesting

    This rig isn't stereoscopic and therefore isn't a pair of "virtual reality goggles" in the classic sense.

    In outdoor scenes, you can't tell anyway. Stereoscopy only matters for objects out to a meter or so. The change in relative position of near and distant objects is a more powerful cue than stereoscopy. (You don't have enough information from Google StreetView images to do that anyway.)

    When Jaron Lanier demoed his original virtual reality system to me in the 1980s, he mentioned that once one of the two SGI machines driving the two head-mounted displays had gone down, so they just piped the same image to both displays and nobody noticed.

    I'd like to see a modern VR system with a frame rate of 120FPS or better and a lag of no more than two frame times between input from head motion to image display. That might actually not suck.

  18. Re:HA is a solution in search of a problem. on What is the Current State of Home Automation? · · Score: 1

    The reason there's no 'good' home automation products is because there's not enough demand, pure and simple. At the end of the day, HA is 99% bling and maybe 1% utility.

    Agreed. The place where building automation is underused, though, is in places that have meeting rooms - schools, universities, churches, offices, and hotels. Meeting rooms need the full set of sensors for HVAC control - movement, heat, CO2, CO, temperature, and humidity. These are in place in many buildings today, but not in enough of them.

    Meeting rooms have the property that the people load changes suddenly. A completely empty room can have its lights turned out, or at least very low, and the air change rate can be cut very low. When people enter a room, the lights come up, dampers open, and blower speeds increase. The CO2 measurement is an indicator of people load; when CO2 goes up, blower speeds have to go up to increase the air change rate. CO content indicates smokers, which means more air has to be drawn from the outside. And, of course, the system is measuring these parameters for outside air, so outside air can be used to heat, cool, humidify, or dehumidify, as required.
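
    The control logic described above might look roughly like this. It is a toy sketch, not a real building controller: every threshold, fraction, and name here (RoomReading, Command, control, the ppm limits) is an assumption chosen for illustration, and the outside-air temperature and humidity logic is left out.

```python
from dataclasses import dataclass


@dataclass
class RoomReading:
    occupied: bool   # from the motion sensor
    co2_ppm: float   # indicator of people load
    co_ppm: float    # indicator of smokers


@dataclass
class Command:
    lights_on: bool
    blower_fraction: float       # 0.0 to 1.0 of full blower speed
    outside_air_fraction: float  # 0.0 to 1.0 damper opening


def control(reading: RoomReading,
            co2_floor: float = 400.0,     # assumed "empty room" level
            co2_ceiling: float = 1200.0,  # assumed "full room" level
            co_limit: float = 9.0) -> Command:
    """Map one set of sensor readings to lights, blower, and damper settings."""
    if not reading.occupied:
        # Empty room: lights out, air change rate cut very low.
        return Command(False, 0.1, 0.1)
    # Scale people load from CO2, clamped to the range 0..1.
    load = (reading.co2_ppm - co2_floor) / (co2_ceiling - co2_floor)
    load = min(1.0, max(0.0, load))
    blower = 0.25 + 0.75 * load
    # CO above the limit means drawing as much outside air as possible.
    outside = 1.0 if reading.co_ppm > co_limit else 0.25 + 0.5 * load
    return Command(True, blower, outside)


if __name__ == "__main__":
    print(control(RoomReading(occupied=False, co2_ppm=420.0, co_ppm=0.0)))
    print(control(RoomReading(occupied=True, co2_ppm=1000.0, co_ppm=0.0)))
```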

    This alone makes meeting rooms much more pleasant, while cutting energy use. But there's no "bling factor" to this. It's invisible to almost everyone.

    The next time you enter a conference room with a group in a modern building, listen for blowers winding up to speed and the whir of motors moving dampers. Somewhere, a microcontroller and network are quietly doing their job.

  19. Popular with phishers on Geocities Shutting Down Today · · Score: 1

    Geocities was very popular with phishers who needed hosting on a domain too popular to blacklist. We maintain a list of major domains being exploited by active phishing scams, and Geocities is in the #2 position for length of time on the list. Over the last few months, the number of phishing sites hosted on Geocities has slowly declined. Today, on Geocities' last day, there is only one left.

    With Geocities out of action, Piczo.com (hosting/social networking for teens) and Fortunecity.com (general-purpose free hosting) become the top hosting services favored by phishers. Most of the Piczo phishing sites seem to be aimed at getting Habbo login credentials. There is apparently a whole racket which breaks into Habbo accounts to steal virtual furniture.

    (We finally have all the big players off that list. When we started, Yahoo, Microsoft, Google, and eBay were all on that list. They've all been fixed. The "short URL" sites are now all very aggressive about killing off phishing links; they don't want to get on spam blacklists. Most of the remaining sites on the list are modest sites run by people who have no idea what's going on with their site. The oldest entry on that list, hoseo.ac.kr, is a Korean university. Someone broke into their email system last year and put a phishing site on port 8080. Their webmaster mailbox is full, but we've tried to reach them by other means and may eventually reach someone with a clue.)

  20. Router fairness on A Possible Cause of AT&T's Wireless Clog — Configuration Errors · · Score: 4, Informative

    TCP measures round trip time, and doesn't need packet loss to tell it that the round trip time is long. The retransmit interval will go up appropriately. TCP will behave reasonably with a long round trip time. If you're trying to do a bulk transfer, there's nothing wrong with this. The problem comes when short messages and bulk transfers are sharing the same channel. The short messages can spend too much time in the queue.
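
    For reference, the way the retransmit interval follows the measured round trip time can be sketched with the standard smoothed-RTT estimator from RFC 6298. The class name RtoEstimator and the parameter defaults are choices made for this example; real stacks add details such as timer backoff and rules about which segments may be timed.

```python
class RtoEstimator:
    """Retransmission timeout estimator in the style of RFC 6298."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4

    def __init__(self, clock_granularity: float = 0.001, min_rto: float = 1.0):
        self.srtt = None      # smoothed round trip time, seconds
        self.rttvar = None    # round trip time variation, seconds
        self.g = clock_granularity
        self.min_rto = min_rto
        self.rto = 1.0        # initial timeout before any measurement

    def sample(self, rtt: float) -> float:
        """Feed one RTT measurement (seconds) and return the new timeout."""
        if self.srtt is None:
            # First measurement.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated with the old SRTT, then SRTT is updated.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = max(self.min_rto, self.srtt + max(self.g, self.K * self.rttvar))
        return self.rto


if __name__ == "__main__":
    est = RtoEstimator()
    # A queue building up in the path shows as steadily longer round trips.
    for rtt in [0.2, 0.5, 1.0, 2.0, 2.0]:
        print(rtt, est.sample(rtt))
```

    No packet loss is involved in any of this; the timeout rises purely because the measured round trips get longer.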

    The solution is reordering the packets, not dropping them. That's what "fair queuing" is about. It may be worthwhile to implement fairness at the port-pair level, rather than the IP address level, at entry to the air link. Then low-traffic connections will get through faster.
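
    A stripped-down version of per-connection fairness is sketched below. It is plain packet-count round robin, a simplification of fair queuing that ignores packet sizes; the class name and the flow key (source port, destination port) are assumptions for the example, and a real implementation would key on addresses as well.

```python
from collections import OrderedDict, deque


class RoundRobinScheduler:
    """Serves one packet per flow in turn instead of strict arrival order."""

    def __init__(self):
        # Flow key -> queue of waiting packets, in service order.
        self.flows = OrderedDict()

    def enqueue(self, src_port: int, dst_port: int, packet) -> None:
        key = (src_port, dst_port)
        self.flows.setdefault(key, deque()).append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if nothing is queued."""
        if not self.flows:
            return None
        key, queue = next(iter(self.flows.items()))
        packet = queue.popleft()
        del self.flows[key]
        if queue:
            self.flows[key] = queue  # flow goes to the back of the rotation
        return packet


if __name__ == "__main__":
    sched = RoundRobinScheduler()
    for i in range(1, 5):
        sched.enqueue(40000, 80, f"bulk-{i}")  # bulk download
    sched.enqueue(40001, 80, "short-1")        # short interactive message
    while (pkt := sched.dequeue()) is not None:
        print(pkt)
```

    Nothing is dropped. The short message is sent after one bulk packet rather than after all of them, which is the reordering effect described above.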

    "Quality of service" can help, but it's not a panacea. The network layer can't tell which of the TCP connections on port 80 is highly interactive and which is a bulk download, other than by traffic volume.

    (I used to do this stuff.)

  21. Why are there still game retailers? on Game Retailers Facing Digital Distribution Transition · · Score: 4, Interesting

    Record stores are dead. Video rental stores other than a few major chains are dead. Why should game stores stick around? The only one near me is a tiny one next to a Cartridge World (ink, not ammo).

    The A titles may still justify some shelf space at WalMart, but I don't see any remaining need for standalone game stores.

  22. What would happen if Microsoft turned it off on What If They Turned Off the Internet? · · Score: 5, Insightful

    All it would take is one really bad Windows Update to turn off 70% of the Internet.

    Question for Homeland Security: who has access to the master signing key for Windows Update? Who does the background check on those people?

  23. The opposition on Mandatory H1N1 Vaccine For NY Health Workers Suspended · · Score: 0, Flamebait

    Most of the opposition in New York seems to be coming from some nutcase who runs an embroidery firm, organizes GOP "Tea Parties", and rants about vaccines and autism.

    Vaccines are safer than most over the counter medications. The US already has over 1000 swine flu deaths this year, and we're not even into winter yet. Getting vaccinated is definitely a statistical win. Getting medical personnel vaccinated is essential; they are going to encounter infected patients, and they can transmit the disease to others weakened by other illnesses.

    General Charles Krulak (former Commandant of the United States Marine Corps, and one of the best ones) wrote this about the USMC mandatory anthrax vaccination program: "As we continue to broaden this program, I want to make you aware of a phenomenon we have observed: reluctance to take the anthrax vaccination is inversely proportional to the distance the marine is from the fighting hole. No marines engaged in Desert Thunder refused the vaccine." No nonsense there.

  24. Another bogus materials-science article on NCSU's Fingernail-Size Chip Can Hold 1TB · · Score: 4, Insightful

    This is yet another of those articles where somebody did something vaguely promising in materials science, and it's immediately being touted as if it were a product.

    They're not talking about a "chip" at all. The material they've produced sounds more like something that might work as a disk surface. "Under these conditions the Ni-MgO system behaves as a perfect paramagnet." It's not clear what you'd use as a read/write head, even if they can create a surface of "nanodots".

  25. The dumbing down of Google on Google Partners With Twitter For Search · · Score: 3, Insightful

    Over the past two years, it seems that Google has been redesigning their search system for dumber and dumber users. They now seem to be targeting the room-temp IQ crowd.

    Google used to just suggest spelling corrections. Now, it applies them. If you don't want spelling correction, you must put the search term in quotes. This leads to results like the one for "ndia intellectual property", where NDIA is the National Defense Industrial Association. Google gives back mostly results about "India", not "NDIA". This happens on all searches where the term searched is near a common word.

    Then there's the missing word problem. It used to be that if you searched for several words, all the words had to be present. That's no longer true. Google will return results it likes that don't contain some of the words. If you want to insist that a word be present, you have to quote it.