Why just one wiki for Africa?... This is where Africa usually gets the shaft: it's treated as a whole; any effort usually benefits the populous/popular countries
In the case of Swahili, I think they're a lot closer to the true reason when mentioning Internet access. It's not that no one has Internet access at all - you'd be surprised who has an email address and what places have an Internet café. But it costs maybe 1,000 Tanzanian shillings (~$0.75) per hour. Tanzania's GDP per capita is $700 a year, so an hour of Internet access costs the "mean person" 40% of his money for that day. I think that GDP figure's deceptive because many of the tribespeople don't even use money during an average day, so let's quadruple it. An hour of Internet access then takes 10% of your money for the day. You're still not going to be sitting down at the computer pumping out wiki article after wiki article. The people who can afford to are all fluent in English. It's an official language of Kenya, Tanzania, and Uganda. Many of the schools teach in it, and people are eager to practice using it.
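For anyone who wants to check the arithmetic, here is a quick sketch. The dollar figures are the ones from the comment; spreading the annual figure over a 365-day year is my assumption, and the function name is made up for illustration.

```python
# Figures from the comment above; the 365-day year is an assumption.
HOURLY_COST_USD = 0.75        # roughly 1,000 Tanzanian shillings
GDP_PER_CAPITA_USD = 700.0    # per year


def share_of_daily_income(annual_income_usd, hourly_cost_usd=HOURLY_COST_USD):
    """Fraction of one day's income spent on one hour of Internet access."""
    daily_income = annual_income_usd / 365
    return hourly_cost_usd / daily_income


baseline = share_of_daily_income(GDP_PER_CAPITA_USD)
quadrupled = share_of_daily_income(4 * GDP_PER_CAPITA_USD)

print(f"at the quoted GDP figure: {baseline:.0%} of a day's income")
print(f"at four times that:       {quadrupled:.0%} of a day's income")
```

The first figure comes out at about 39%, which rounds to the 40% quoted; the second is just under 10%.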
On the other hand, after OLPC gets into East Africa (not soon, I fear), there will be many, many people with plenty of computer time. They'll be able to download articles, modify them offline, and upload new revisions later. If they find a Swahili Wikipedia valuable, it will take off.
It doesn't even make sense. Nor through inaction permit any human being to be harmed? What does that mean? You have to modify it to stop harm? The code isn't allowed not to prevent harm? You're obligated to use this program if it would in any way prevent harm to someone?
It's a poor paraphrase of Asimov's First Law of Robotics (Asimov, the science fiction writer). Weighing myriad courses of action for the least harm - or least possible harm, or least expected value of harm (probability), or least average harm (among many people), or however you interpret it in more complex situations - is a tremendously difficult thing. In his books, imprinting it into the minds of robots paralyzed them by indecision in many situations. So the idea of successfully prosecuting someone for not following it is laughable. I don't think these people understand his books or the legal system.
They compare it to the law against commercial use of Amateur Radio:
He says some might think an attempt to prevent military use might be "too idealistic" and would not work in practice, but he references the world of ham radio, whose rules specify that the technology is not to be used commercially. "Surprisingly enough, this rule is respected by almost every ham operator."
...but that law is relatively unambiguous. If you're proposing through the amateur bands the exchange of goods or services for money, you're engaging in commerce and thus breaking the law. I can't think of any situation in which this First Law of Robotics-like clause is easy to interpret - there's always the argument that by killing this person, you prevented him/her from killing others or causing some greater harm.
The problem isn't that floating point numbers are inherently problematic, the problem is that we typically use them by converting base-10 numbers to them, doing a bunch of calculations and then converting them back to base 10. [...] Bottom line: If you care about getting the same results you'd get in base 10, do your work in base 10. This is why financial applications should not use floating point numbers.
Base-10 is one problem; floating-point is another. If you are certain that base-10 floating point is what you want for financial calculations, use IEEE 754r floating-point numbers when it's finalized. But floating point numbers, by design, only maintain a certain number of significant figures. In many financial calculations (say, current bank balances; possibly not long-term stock market estimates), you always want precision down to the nearest cent or tenth of a cent, regardless of how big the number is. That's a job for fixed-point, base-10 arithmetic. (Luckily, libraries for this exist, and IEEE 754r also introduces standardized fixed-point base-10 numbers.)
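A minimal sketch of both problems, using Python's standard decimal module as one example of such a library. Caveat: decimal is really base-10 floating point with a configurable precision (28 digits by default), so pinning results to cents with quantize only approximates true fixed-point, and a balance needing more digits than the context allows would raise an error. The helper name to_cents is made up for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")


def to_cents(amount):
    """Pin a Decimal amount to the nearest cent (banker's rounding)."""
    return amount.quantize(CENT, rounding=ROUND_HALF_EVEN)


# Problem 1, base 10: 0.1 and 0.2 have no exact binary representation.
float_sum_is_exact = (0.1 + 0.2 == 0.3)
decimal_sum_is_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

# Problem 2, significant figures: a large binary float drops the cent.
float_keeps_cent = (1e16 + 0.01 != 1e16)
balance = to_cents(Decimal("10000000000000000.00") + Decimal("0.01"))

print(float_sum_is_exact, decimal_sum_is_exact)
print(float_keeps_cent, balance)
```

The binary float fails both checks, while the Decimal balance keeps its trailing cent.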
You might be able to get away with floating-point if the mantissa's wide enough for your largest number[*], but why mess around? Fixed-precision arithmetic is easier to understand, so it's the right tool for the job. Save floating-point numbers for the people who need them, who put in the effort to tweak formulas to minimize errors like catastrophic cancellation, and who use them appropriately:
knowingly accept an unknown precision and accuracy (might be okay for an OpenGL-based game)
use interval arithmetic to flag problems (say, scientific calculations that can be tweaked and rerun if a problem is revealed)
actually prove that they're within defined error bounds (the only way for real-time critical applications, like life-sustaining systems or ones that affect a lot of money)
[*] - Maybe; I don't remember all the properties of IEEE 754 floating-point numbers off the top of my head, which is exactly my point.
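The footnote's question can be checked rather than remembered. A rough sketch, assuming Python floats are IEEE 754 doubles (true on mainstream platforms, but an assumption here) and that amounts are stored as whole cents:

```python
import sys

# Significand width of this platform's float; 53 for an IEEE 754 double.
MANTISSA_BITS = sys.float_info.mant_dig

# Every integer up to 2**MANTISSA_BITS is exactly representable; past that, gaps appear.
limit_cents = 2 ** MANTISSA_BITS

exact_at_limit = float(limit_cents - 1) + 1.0 == float(limit_cents)
cent_lost_past_limit = float(limit_cents) + 1.0 == float(limit_cents)

print(MANTISSA_BITS, limit_cents)
print(exact_at_limit, cent_lost_past_limit)
```

With a 53-bit significand the limit is 9,007,199,254,740,992 cents, or about 90 trillion dollars, and that only holds if you store integer cents rather than fractional dollars. Having to work that out at all rather supports the "why mess around?" argument.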
though for 64bit goodness you'll probably have to throw another flag in there.
That's more important. If I had an Athlon-64 with 12 GiB of RAM, I'd much rather use 64-bit addressing to cleanly use the whole thing rather than segmentation games with chunks of 2, 3, or 4 GiB. (32-bit = 4 GiB; Linux uses...I think the top GiB for the kernel.)
Here's a question: is it right for us to stop it? This appears to be a natural weakness of Tasmanian devils. The article states:
Pearse noted that inbreeding, and the resulting lack of genetic diversity, may make Tasmanian devils particularly susceptible to this type of infection.
and so an unsuccessful species is dying out, as has happened many times in the past. Now humans are around to stop it (the government is quarantining them; there's even talk of cloning, should the entire population die), but is it beneficial to tamper with nature in this way?
That's because in general, public transit in the USA is at a 3rd-world standard.
We've got a long way to go before we can match the third world's quality of public transportation. In Kenya, there are swarms of matatus (15- and 25-passenger minibuses, not much larger than a van) everywhere. (Equivalently, Tanzania has dala-dalas.) I'm not sure about their exact fuel efficiency, but I'd suspect that (when they're properly maintained) it's significantly better than a full bus. And they're always full or nearly so, so the per-passenger efficiency is good.
In Nairobi, they are going to stop issuing new 15-passenger matatu licenses because they're causing too much congestion. Think about that for a second - there are so many of these van-sized vehicles, each with 15 people in it, that they're filling up the roads. They need more 25-passenger ones. We could not get so many people to use public transportation in the United States.
Matatus are convenient. They have routes everywhere, and there are enough of them on the road that you can just start walking where you want to go and within five minutes one will be by to pick you up.
Downsides: many matatu drivers are suicidal, and matatus transporting fat Americans would necessarily be lower-capacity.
Hopefully, the engineer who designed this hybrid drive has, at a minimum, integrated an LCD counter and a tiny speaker into the drive. The counter shall display the running total of the number of writes to the flash memory. The tiny speaker shall beep like crazy when the total exceeds 99900.
More likely, they'd use SMART to let you monitor the flash's health programmatically, just as they use for the rest of the drive. This is far from the only designed limitation of modern hard drives, and SMART is a nice system for tracking them. Makes sense to keep using it.
As someone else mentioned, it'd need to be more than a simple counter - the 100,000 writes lifetime estimate is per-sector. I don't know if it's practical or necessary to maintain such a large counter array. Decent wear leveling may be enough to ensure this limit is not reached.
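To make the wear-leveling idea concrete, here is a toy model. It is not how any real drive firmware works: the class and method names are hypothetical, it ignores the data remapping a real controller needs, and it keeps exactly the per-sector counter array whose practicality is in question above.

```python
WRITE_LIMIT = 100_000  # per-sector lifetime estimate quoted above


class WearLeveledFlash:
    """Toy model: every write goes to the least-worn physical sector."""

    def __init__(self, sector_count):
        self.write_counts = [0] * sector_count

    def write(self):
        # Pick the sector with the fewest writes so far (ties go to the lowest index).
        sector = min(range(len(self.write_counts)),
                     key=self.write_counts.__getitem__)
        self.write_counts[sector] += 1
        return sector

    def worst_sector_wear(self):
        return max(self.write_counts)


flash = WearLeveledFlash(sector_count=8)
for _ in range(1000):
    flash.write()

print(flash.worst_sector_wear(), "writes on the most-worn sector")
print(flash.worst_sector_wear() < WRITE_LIMIT)
```

Spreading 1,000 writes over 8 sectors leaves each with 125, where hammering a single sector would have put all 1,000 on it.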
My point is that there is no real advantage building extra tools to search for simple errors, while still not being able to combat real problems, like infinite loops, incorrect conditions, class cast exception, null pointer exception. ATM has relevance to all of the above-mentioned bug types, so there.
I got that. My point was that you don't need to determine if an arbitrary program is error-free in all of those categories to have a useful tool. If static checkers can point out a bunch of errors in a huge program and a bunch of places they're suspicious of, then they are useful. And they can. Did you follow that Stanford Checker link? It detected all of the classes of errors you mentioned. Sure, it missed many more errors of those classes, and actual humans had to give it some project-specific knowledge for that to happen, but it seems like a good investment - it saves more time than it takes.
Hey, I didn't say these bug detector tools were completely useless, but I would rather see people write good unit tests and think through their code than rely on bug detectors.
I'd rather see them put the time into whatever tool seems to be helping them most at the moment. And then when it reduces their debugging time, they can use the surplus to try other things. You seem to be making a lot of assumptions - that doing something perfectly is useful but doing it well is not, that the time available for program verification is constant, etc.
One would think that out of all people, IBM staff would be familiar with the ATM or the Halting Problem.
First, I don't see the relevance, so I suspect you don't understand the halting problem[*]. You've probably heard it phrased as in the Wikipedia article: "a general algorithm to solve the Halting problem for all possible program-input pairs cannot exist. We say that the halting problem is undecidable over Turing machines." You've probably heard arguments reducing other problems to the Halting problem and concluded that automated reasoning about programs is impossible.
By some word juggling and de Morgan's laws, Wikipedia's statement is equivalent to "there exist program-input pairs for which the halting problem cannot be solved with a general algorithm." I'd say those should be quite rare if you're doing "reasonable" things.
The common examples of undecidable programs involve "finding counterexamples to famous conjectures in number theory". I'm guessing your programs don't do that. There generally should be straightforward indicators of your programs' progress - input file pointers, size of internal data structures, iterators. If IBM Research put all their effort into the Halting problem and their program still couldn't tell if your program terminates, it might be because your program is screwed up. Thus, "our program can't tell if your program halts for all input" probably means "rethink your algorithm", just as "your program does not halt for input X" does. The same goes for other properties which can be reduced to the halting problem.
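Here's the kind of "straightforward indicator" I mean, as a small sketch. The function is hypothetical, and it checks the indicator at run time purely for illustration; a static checker would try to establish the same thing without running the code.

```python
def drain(worklist):
    """Process a worklist while checking a termination measure each step."""
    processed = []
    measure = len(worklist)  # non-negative integer "progress indicator"
    while worklist:
        processed.append(worklist.pop())
        new_measure = len(worklist)
        # A strictly decreasing non-negative integer can only take finitely
        # many steps, so this loop must halt.
        assert 0 <= new_measure < measure
        measure = new_measure
    return processed


result = drain([3, 1, 2])
print(result)
```

If you can't name a measure like this for one of your own loops, that's the "rethink your algorithm" signal.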
It'd still be a hard thing to do, though. I think most existing checker tools examine only one function at a time, and I'm not aware of any attempts to do anything so sophisticated even at that level.
the bugs that this article is talking about are the simples ones.
Simple but common, which makes them great bugs for static analysis. But if you want fancier examples, look at these bugs the Stanford Checker (now Coverity) found in the Linux kernel.
Also wouldn't this 'static bug detection' be unnecessary if Java was a strong typed language? The idea of casting is of-course a powerful one, but it is this idea that is probably responsible for the most non-business related bugs in the code
Java is a strongly-typed language. If you cast something incorrectly, you get a ClassCastException. The runtime knows the type of every object. You may have meant "totally statically-typed language". In any case, the answer's still no - the System.gc() example that you found too simple is an obvious counterexample.
In any case, I would rather see people do something than nothing, so I guess bug detectors better than no bug detectors, but in reality I would rather have the developers write good unit-tests.
They're another way to find bugs, and the bugs they find are not a subset of those found by unit tests. There are a lot of classes of bugs that can't be found easily with unit tests (race conditions!). There are a lot of environments in which it is difficult to write unit tests (embedded code, kernel code, GUI code). And fundamentally unit tests require you to come up with the input that breaks it. If there's a case you never considered at all, they just won't have the same value as a second pair of eyes on your algorithm, which these static analysis tools effectively are.
Does this sort of reasoning sound familiar?
"Does it work in the normal case? Yup. If the list members, commas, and trailing null come to BUFSIZE-1 characters? Yup. BUFSIZE? Yup.... What if an individual element overflows this smaller buffer? Oh, it can't, because that's over log(INTMAX) by
Now, the only way this would be interesting would be if the worm / virus / trojan installed the virtualization software, moved the existing OS to a virtual machine and faked the names of all the interfaces (NIC, IDE controller, etc). If you can do that, VMWare really wants to talk to you.
Can someone please mod this guy down for not reading the article? It says:
"The idea behind Blue Pill is simple: your operating system swallows the Blue Pill and it awakes inside the Matrix controlled by the ultra thin Blue Pill hypervisor. This all happens on-the-fly (i.e. without restarting the system) and there is no performance penalty and all the devices," she explained
There's no need to fake the hardware; presumably it just lets the operating system use the real stuff.
I've said it before, and I'll say it again: it's time to start encrypting everything. Just one question...anyone out there familiar with the current legality of crypto in Canada?
That's insufficient. The NSA is doing traffic analysis, which is not affected by encryption and authentication. An encrypted connection to a busy shared tunnel server would help somewhat, but even so, if they can see all the traffic on the network, they can make the connection between "encrypted data from A->B at time t" and "new connection from B->C at time t+e" for small values of e, especially if there are full A<->B and B<->C conversations for similar timespans.
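A bare-bones sketch of that timing correlation, to show how little it needs. The event format, node names, and timestamps are all invented for illustration; I have no idea what real traffic-analysis tooling looks like. Note that it never looks at payloads, which is why encryption doesn't help.

```python
def correlate(inbound, outbound, max_delay_ms):
    """Pair each inbound A->B event with outbound B->C events that start
    within max_delay_ms afterwards.

    inbound and outbound are lists of (timestamp_ms, peer) tuples seen at B.
    """
    matches = []
    for t_in, src in inbound:
        for t_out, dst in outbound:
            if 0 <= t_out - t_in <= max_delay_ms:
                matches.append((src, dst, t_out - t_in))
    return matches


# Hypothetical observations at relay B: who connected in, who B connected out to.
inbound = [(10_000, "A"), (42_500, "D")]
outbound = [(10_050, "C"), (30_000, "E"), (42_520, "F")]

links = correlate(inbound, outbound, max_delay_ms=100)
print(links)
```

On this made-up data it links A to C and D to F, and leaves E unmatched.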
To defeat their analysis, you'd need something much more sophisticated - say, a mesh of customers who don't want to be monitored. They'd have to tunnel data through randomly chosen[*] peers with multiple hops, send some noise to impair analysis, and even add delays before relaying. It'd be a lot harder for them to tell that node A asked node B to relay stuff if nodes C and D are also having a conversation with node B with similar timing. And you'd certainly need to consider what happens if some nodes are infiltrated. It's a hard problem, and even if done well, there's a price to pay for the added security.
[*] maybe with some locality; there's a security/speed trade-off.
the point is that this principle was known and advocated by R. T. Jones starting in the early 1950s, but it took this long for the dim-wits to realize that, well, maybe building the optimum craft is a good idea even if it looks funny
RTFA. It didn't take them this long to realize that. You might as well have pointed out that Leonardo da Vinci drew an airplane in the 15th century and asked why the dimwits aren't getting around to building one until now. My answer would be the same - they have built them before now; this is a refinement.
I'm not sure when the first oblique-wing aircraft was made, but another poster pointed out that NASA built the AD-1 swivelling, all-wing aircraft in 1982. Presumably it's the same plane that the article describes here:
This is not the first attempt at an oblique-wing aircraft. SpaceShipOne creator Burt Rutan designed a switch-wing plane with NASA in 1979. But the slanted wings made the craft hard to fly -- when the pilot pulled the nose up, the plane would roll to one side.
It turns out that you can't just prove that the shape is efficient at supersonic speeds; you have to actually innovate to address problems like this. Northrop Grumman says that they can not only do that, they can step it up a notch by building a long-flying, swivelling, all-wing, unmanned stealth bomber capable of flying at Mach 2. That was not proposed in the early 1950s, much less actually flown in combat. Yeah, it's an incremental improvement. So's the newest 64-bit AMD chip you're drooling over, or whatever sort of shiny thing it is that you actually like.
Baldrson:
Even so they can't get around to it until 15 years in the future. The situation would be laughable if it weren't so tragic.
RTFA. They're planning to have a design in a year and a half. They're planning to actually fly the thing in 2010. They're planning to have mass produced it and flown it in combat in 2020. That's a little different than "they can't get around to it until 15 years in the future".
I bought a pill of adderol once from a friend of mine in my sophomore year at college. I had linear algebra and EM physics finals the next morning. I've never concentrated that hard in my life. I was going from about 11:00pm to 7:00am straight (with regular smoke breaks) at the library, and my linear final was at 7:50. I nailed it too.
As long as we're trading anecdotes, I skipped class for six weeks before my linear algebra final, then nailed it. [*] No drugs, no studying. For whatever reason (my natural talent in mathematics? low standards? the professor letting us use TI-89s to check our work?), I found the class and test really easy.
On the other hand, E&M was the real deal. Challenging material, demanding (but great) professor. I went to class, I studied, and I was proud when I got As on those tests.
My point is that anecdotal evidence is worthless. You felt more focused while studying. But was your studying actually more effective? Or were your finals simply as easy for you as my linear one was for me? What grade would you have gotten if you hadn't taken any drug? What grade would you have gotten if you'd taken a placebo? It's impossible to know.
Has anyone actually done any real scientific studies of the effects of these pills on healthy people? Our brains are complicated. While it seems reasonable at first to say you felt more focused, therefore you were more focused, therefore you were more effective, that's actually quite a leap. There are many drugs out there that will make you feel more effective, only for you to discover afterward that your work was crap. Does a pill that turns an ADD patient into a "normal" person turn a normal person into a superperson? If even more of some chemical in our brains makes us even more focused and intelligent, why didn't natural selection increase the dosage? What's the catch?
[*] Okay, 98/100...forgot to normalize an eigenvector...though MathWorld says now that they don't have to be normalized, so I want my two points back.
Re:Useless to all but theoraticians (on The Art of SQL)
This book might be good for THEORY, but for actually getting useful and applicable information, the review leaves me wondering who would be a worthwhile reader.
That's funny, because I was just thinking it's odd that this book has no theory in it at all. At least in the review I saw no mention of the definition of ACID, the compromises at different transaction isolation levels, Codd's 12 rules for relational databases, Codd's original notation for relational algebra and relational calculus (of which SQL is an approximation), or normal forms.
And it turns out that this theory is useful and applicable. If you haven't caught on yet, I'm disappointed by this omission. A lot of people write horrible systems because they do not understand transactions, how to normalize a database schema, or why constraints are so important.
SQL is implemented differently in all of the environments I have encountered it (yeah, I'm not a PRO, just a hacker, so don't hate on me.) Those environments are MS SQL, MySQL, FoxPro, and MS Access. I think I messed around with PostgreSQL. Maybe a few others.
Point is nothing is really transferable and even basic syntax varies widely as do optimizations and 'the best way to do x'
If you need specifics on RDBMS implementations, look at this comparison website. It's not that long, and it basically fills in the gaps left by this book.
You can usually write standard SQL statements and run them on PostgreSQL, MS SQL Server, Oracle, and DB2. You can certainly come up with Oracle statements that don't run on PostgreSQL - e.g., by using their alternate syntax for left joins that predates standardization - but presumably this book teaches you the standard stuff. That's all you need in most situations, and it's all they can give you without having to update the book every six months.
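As an illustration of "the standard stuff", here is a standard-syntax left outer join. The tables are hypothetical, and I'm running it through SQLite via Python's sqlite3 module only because it is self-contained; SQLite isn't one of the systems named above, and I have not run this exact script against them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name VARCHAR(50) NOT NULL);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES author(id),
        title VARCHAR(100) NOT NULL
    );
    INSERT INTO author (id, name) VALUES (1, 'Ann');
    INSERT INTO author (id, name) VALUES (2, 'Bob');
    INSERT INTO book (id, author_id, title) VALUES (1, 1, 'First');
""")

# Standard join syntax: authors with no books still appear, with a NULL title.
rows = conn.execute("""
    SELECT a.name, b.title
    FROM author a
    LEFT OUTER JOIN book b ON b.author_id = a.id
    ORDER BY a.name
""").fetchall()
conn.close()

print(rows)
```

The author with no book still comes back, paired with a NULL (None in Python), which is what distinguishes the outer join from an inner one.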
Microsoft Jet SQL (of Access fame) has a few cosmetic differences in syntax. (IIRC, quoting is different.) If that's enough to seriously set you back, you'd be in trouble even if the book did duplicate all the examples for you.
MySQL is the only real oddball, and even they are starting to learn that this SQL thing is useful after all. If you want to work with older MySQL installations, get a book on MySQL, throw out any knowledge you have of how to do things properly, and give up on portability altogether. Peculiarities in its performance characteristics made projects like phpBB do bizarre things like maintain parallel table structures for each forum in a messageboard. That's totally against the relational model, and there are lots of consequences...
No, please replace UML in everyones minds with User Mode Linux so never again will the horror that is Unified Modeling Language be forced upon us.
I actually like Unified Modeling Language when used appropriately.
If you hate UML, you've probably only seen class diagrams. I hate them, too. I find that they tend to get out of sync with reality and aren't that easy to read anyway (why put the complete list of methods in with the inheritance structure? too much information!). Tools that generate code from them only make the synchronization worse - better to generate diagrams from code, since the code is what definitely changes as needed. I much prefer doxygen output - it's always in sync and has easy-to-read inheritance diagrams with textual lists of methods and full documentation right there which is kept with the source code.
But class diagrams aren't all there is. Take a look at the other UML diagram types on this reference card. In particular, I find UML sequence diagrams are the best way to describe race conditions. State diagrams, activity diagrams, and use case diagrams have their place, too.
Its hard to train them to learn Objective-C or any other language they are used to since all of their CS skills are bound to a single language.
Then they have no CS skills. In fundamental areas like the design and analysis of data structures and algorithms, language is almost[*] totally irrelevant. I'm not sure what CS you could possibly know that is bound to a single language.
I'm told there are actual good people in India, but the few I've dealt with (Java and C programmers) are like this. They sort of know a language or two, but there's nothing else there. There are also plenty of people like this in the United States, unfortunately.
[*] - There are some differences, like that you can't mutate stuff in a pure functional language, at least without a crazy optimizer. But certainly not between, say, C# and Java - both imperative, same GC system, VM model, object model. Same language, slightly different syntax.
The article said: Men involved in fight clubs often carry bottled-up violent impulses learned in childhood from video games, cartoons and movies, said Michael Messner, a University of Southern California sociology and gender studies professor.
DragonWriter said: It is poor (though typical) reporting that these types of claims are reported simply as "so-and-so says", but it saves journalist from having to have any knowledge of or do any research in the field they are covering, they can simply find the nearest person with a degree or job in a superficially relevant field, and get a quote, and go home for the day. If they are particularly ambitious, they'll get two conflicting quotes from different experts, to show "balance".
PCM2 said: I see. And so, in your opinion, not-poor reporting would presumably involve the reporter spending the next six years getting an advanced degree in psychiatry and then stating his own opinion?
I can't speak for DragonWriter, but I'd like to see evidence for a wild claim like that. Perhaps a reference to a peer-reviewed study. I don't have much respect for sociologists or gender studies professors. I can't think off-hand how to perform a well-controlled experiment that would determine if what this guy said is true, so I bet he couldn't either.
My company has a successful MediaWiki installation, and I love it. All our technical teams (engineering, QA, system administration) are using it.
I've put into it design documentation, instructions for accessing our other services (e.g. Subversion repositories), troubleshooting tips, sequence diagrams of various race conditions, you name it. I try to periodically dump everything in my notes directory into the wiki. The effort of cleaning it up means I'll understand it later, having it on the wiki server means it's backed up regularly, and as a bonus, other people see it and don't need to ask me as many questions, so I can spend more time developing. And it gives people a way to still get answers when I'm off bicycling through Africa.
But collaboration technology like MediaWiki or bugzilla only works when people use it. There are always some people who won't play with others. If I put information on the wiki, they'll come bug me for it anyway. If I tell them it's on the wiki, they still won't read it. If I give them information verbally and specifically ask them to put it on the wiki, they won't do it. And then they wonder why I ignore their emails...
> So that if the filesystemn crashes you can keep right on, er... what now?
You can:
keep right on trucking, maybe. If it's the NFS driver, then your network filesystem went down. Okay, you can still use the rest of the system. If it's the mouse driver, the network stuff still works. There are lots of areas of the kernel that might not be critical for what you're doing.
produce a decent bug report - "the vfat filesystem process crashed" - and be confident that it's a bug in either the filesystem process or the microkernel. [1] The TCP driver couldn't have possibly corrupted anything in the filesystem driver unless there's a bug in the microkernel. In fact, you can even restart that process and try different things to reproduce it, without rebooting the whole system. Maybe even script actions. As is, you can kinda sorta do stuff like that with Xen or UML, but this way's easier.
develop the kernel stuff more easily, for the same reasons that make bug reports easier.
[1] - Someone did bring up the point that you can do stuff like tell the hardware (in your non-privileged device driver) to do a DMA transfer to memory you don't own. So the memory protection is weak without applying it to DMA transfers, which apparently some hardware does.
> Check out mkLinux and L4Linux. The efforts are made. The userland linux service on a microkernel is a reality.
I've never understood the point of systems that plop a monolithic kernel on top of a microkernel. Sure, if there's a bug in the monolith, the microkernel doesn't crash. But who cares? There was only one thing running on top of it, and it isn't running properly anymore.
The point of a microkernel design is to compartmentalize things like filesystem drivers and the TCP layer, as the article said. How does this approach accomplish that?
> I see that micorkernel systems are terribly slow and not significantly more stable than monolithic systems.
> I don't have any experience with QNX, so I can't debate the performance of that OS.
Might I ask what microkernels you are familiar with? If they are all research systems, then the lack of stability you observed doesn't surprise me. Research systems don't get the many eyes of a system like Linux. If their VM system is faulty, they'll still fail. But using a microkernel design is a fool-proof way of preventing the NFS driver from corrupting memory throughout the kernel, so there are obviously ways in which it improves stability. A system which had both of these aspects would be more stable than one that has only one or neither.
volatile - causes a read or write out to main memory, ie, not the local CPU cache.
Not even that, actually.
In C, it tells the compiler that the read or write to memory can't be reordered. If you do a read, it has to get it from memory right then, rather than reusing one from before that it might have stuck in a register. It doesn't tell the CPU anything about synchronizing its cache or executing the instruction in order, however. You've gotta have both.
In Java, it actually depends on the version of the language: compare the third edition of the language specification (Java 1.5, I believe) with the second edition (the first is the same, so Java 1.1-1.4). The second edition appears to say what you said, but I don't buy it. Look at this article by a bunch of Java synchronization experts on double-checked locking. In particular, this sentence:
The consensus proposal extends the semantics for volatile so that the system will not allow a write of a volatile to be reordered with respect to any previous read or write, and a read of a volatile cannot be reordered with respect to any following read or write.
This change might have made it into the third edition. The second and first read like it provides this guarantee, but if these guys say not, then I'm not going to be depending on that without reading all of the chapter on thread interactions (not just the one section on volatile), reading everything they say, and doing some experiments. If that means my software runs 0.5% slower because I have more synchronization overhead than I need, then so be it.
Java will give each thread its own cache of variables to prevent deadlocking on concurrent modifications.
This is nonsense. Java has no per-thread cache of variables. I don't even know what you are misdescribing. I can only guess that it's one of these things:
Each processor has a cache of memory. It's for speed, not preventing concurrency bugs. In fact, it frequently causes bugs - when one processor reads a memory location to which another has recently written, the cache can mean that you get stale data. Of course, there are ways of handling this correctly (memory barriers) but they're worthless if you don't understand them and use them properly. The short version is that, in Java, if you don't have a particular lock that you're consistently holding when you're accessing any particular memory location that ever could be touched by more than one thread, your code is almost certainly wrong.
Java has the concept of thread-local data, but it's not a cache, it's not transparent (if you don't know what it's called, you can't possibly be using it), and it's generally a last resort for working around code that was designed to use globals back before threading. (In C, errno is the canonical example.)
Java's memory allocators tend to put small objects into blocks allocated in larger chunks per-thread, which I guess you could call a per-thread cache of free memory. Again, strictly for speed.
... you should always strive to push locks down as far in the code as possible....
Increasing the complexity of your locking and making it more likely you'll cause deadlocks.
From your first sentence, I suspect you don't know what deadlocks are. Textbook definition: thread A has resource 1 and is waiting for resource 2; thread B has resource 2 and is waiting for resource 1. Consequently, both threads hang forever. Typically the resources are two mutexes (a.k.a. binary semaphores, a.k.a. the things you synchronize on in Java). If you have two threads that can acquire the same two mutexes in opposite orders, you have a potential deadlock.
The more mutexes you have, the more likely you'll cause a deadlock if you don't know what you're doing. If you have one big mutex for your whole system, deadlocking on mutexes is impossible. You should probably stick to that design until you research concurrency a little more.
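If you do end up with more than one mutex, the usual way to rule out the textbook deadlock is to make every thread acquire them in the same global order. Here's a rough sketch of that idea; the Account class, its id field, and the demo method are all hypothetical, and it assumes ids are unique.

```java
// Hypothetical example of lock ordering; assumes every Account has a unique id.
class Account {
    final int id;       // used only to define a global lock order
    private long cents; // guarded by this Account's monitor

    Account(int id, long cents) {
        this.id = id;
        this.cents = cents;
    }

    long balance() {
        synchronized (this) {
            return cents;
        }
    }

    // Always locks the lower-id account first, so two threads transferring in
    // opposite directions can never each hold one lock while waiting for the other.
    static void transfer(Account from, Account to, long amount) {
        if (from == to) {
            return;
        }
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.cents -= amount;
                to.cents += amount;
            }
        }
    }

    // Two threads transfer in opposite directions at the same time.
    // Returns the final balances {a, b}.
    static long[] demo() {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) transfer(a, b, 1);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) transfer(b, a, 2);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new long[] { a.balance(), b.balance() };
    }
}
```

If transfer simply locked from and then to, the two threads in demo would take the locks in opposite orders, which is exactly the potential deadlock described above.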
Oh, come on, name me one major hollywood movie with more realistic IT in it.
More realistic than that confused nonsense? That's easy. Firewall did pretty well. (If you haven't seen it, stop reading now; I'll probably spoil it.) They used Ethereal (right after I'd spent a week at UNH-IOL looking at that screen). Some sort of pattern-based IDS. I think all of the technology mentioned in the film was real, down to the dog-collar GPS tracking system. The scanner/iPod hack was implausible, but not impossible. So if you forgive them a couple plot devices (like that "diagnostic" in which they scroll through all the account balances on console), they did pretty well.
There isn't. Just skimming the list, I see Afrikaans, Swahili, Kongo, Somali, and Luganda.
In the case of Swahili, I think they're a lot closer to the true reason when mentioning Internet access. It's not that no one has Internet access at all - you'd be surprised who has an email address and what places have an Internet café. But it costs maybe 1,000 Tanzanian shillings (~ $0.75) per hour. Tanzania's GDP per capita is $700 a year, so an hour of Internet access costs the "mean person" 40% of his money for that day. I think that GDP figure's deceptive because many of the tribespeople don't even use money during an average day, so let's quadruple it. An hour of Internet access then takes 10% of your money for the day. You're still not going to be sitting down at the computer pumping out wiki article after wiki article. The people who can afford to are all fluent in English. It's an official language of Kenya, Tanzania, and Uganda. Many of the schools teach in it, and people are eager to practice using it.
On the other hand, after OLPC gets into East Africa (not soon, I fear), there will be many, many people with plenty of computer time. They'll be able to download articles, modify them offline, and upload new revisions later. If they find a Swahili wikipedia valuable, it will take off.
It's a poor paraphrase of Asimov's First Law of Robotics (Asimov, the science fiction writer). Weighing myriad courses of action for the least harm - or least possible harm, or least expected value of harm (probability), or least average harm (among many people), or however you interpret it in more complex situations - is a tremendously difficult thing. In his books, imprinting it into the minds of robots paralyzed them by indecision in many situations. So the idea of successfully prosecuting someone for not following it is laughable. I don't think these people understand his books or the legal system.
They compare it to the law against commercial use of Amateur Radio:
...but that law is relatively unambiguous. If you're proposing through the amateur bands the exchange of goods or services for money, you're engaging in commerce and thus breaking the law. I can't think of any situation in which this First Law of Robotics-like clause is easy to interpret - there's always the argument that by killing this person, you prevented him/her from killing others or causing some greater harm.
Base-10 is one problem; floating-point is another. If you are certain that base-10 floating point is what you want for financial calculations, use IEEE 754r floating-point numbers when it's finalized. But floating-point numbers, by design, only maintain a certain number of significant figures. In many financial calculations (say, current bank balances; possibly not long-term stock market estimates), you always want precision down to the nearest cent or tenth of a cent, regardless of how big the number is. That's a job for fixed-point, base-10 arithmetic. (Luckily, libraries for this exist. Note that the decimal formats in IEEE 754r are floating-point, not fixed-point.)
You might be able to get away with floating-point if the mantissa's wide enough for your largest number[*], but why mess around? Fixed-precision arithmetic is easier to understand, so it's the right tool for the job. Save floating-point numbers for the people who need them, who put in the effort to tweak formulas to minimize errors like catastrophic cancellation, and who use them appropriately:
[*] - Maybe; I don't remember all the properties of IEEE 754 floating-point numbers off the top of my head, which is exactly my point.
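A quick sketch of the difference, using only the standard library. The MoneyDemo class and its method names are mine, purely for illustration. It shows binary floating point failing at both ends (a sum of small decimal fractions, and a cent added to a huge balance) next to two fixed-precision alternatives: integer cents in a long, and java.math.BigDecimal.

```java
import java.math.BigDecimal;

// Hypothetical demo class; only the java.math and java.lang calls are real APIs.
class MoneyDemo {
    // Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
    // so this comparison is false.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // Near 1e16 adjacent doubles are 2 apart, so an added cent is rounded away
    // and this is false as well.
    static boolean centSurvivesInDouble() {
        return 1e16 + 0.01 != 1e16;
    }

    // Fixed-point with integer cents: exact at any magnitude a long can hold,
    // and addExact throws ArithmeticException on overflow instead of wrapping.
    static long addCents(long a, long b) {
        return Math.addExact(a, b);
    }

    // Arbitrary-precision decimal arithmetic from the standard library.
    static BigDecimal addDecimal(String a, String b) {
        return new BigDecimal(a).add(new BigDecimal(b));
    }
}
```

For example, MoneyDemo.addDecimal("0.10", "0.20") gives exactly 0.30, and adding one cent to a balance of 10^18 cents with addCents still changes the balance by exactly one cent.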
That's more important. If I had an Athlon-64 with 12 GiB of RAM, I'd much rather use 64-bit addressing to cleanly use the whole thing rather than segmentation games with chunks of 2, 3, or 4 GiB. (32-bit = 4 GiB; Linux uses...I think the top GiB for the kernel.)
Here's a question: is it right for us to stop it? This appears to be a natural weakness of Tasmanian devils. The article states:
and so an unsuccessful species is dying out, as has happened many times in the past. Now humans are around to stop it (the government is quarantining them; there's even talk of cloning, should the entire population die), but is it beneficial to tamper with nature in this way?
We've got a long way to go before we can match the third world's quality of public transportation. In Kenya, there are swarms of matatus (15- and 25-passenger minibuses, not much larger than a van) everywhere. (Equivalently, Tanzania has dala-dalas.) I'm not sure about their exact fuel efficiency, but I'd suspect that (when they're properly maintained) it's significantly better than a full bus. And they're always full or nearly so, so the per-passenger efficiency is good.
In Nairobi, they are going to stop issuing new 15-passenger matatu licenses because they're causing too much congestion. Think about that for a second - there are so many of these van-sized vehicles, each with 15 people in it, that they're filling up the roads. They need more 25-passenger ones. We could not get so many people to use public transportation in the United States.
Matatus are convenient. They have routes everywhere, and there are enough of them on the road that you can just start walking where you want to go and within five minutes one will be by to pick you up.
Downsides: many matatu drivers are suicidal, and matatus transporting fat Americans would necessarily be lower-capacity.
More likely, they'd use SMART to let you monitor the flash's health programmatically, just as they use for the rest of the drive. This is far from the only designed limitation of modern hard drives, and SMART is a nice system for tracking them. Makes sense to keep using it.
As someone else mentioned, it'd need to be more than a simple counter - the 100,000 writes lifetime estimate is per-sector. I don't know if it's practical or necessary to maintain such a large counter array. Decent wear leveling may be enough to ensure this limit is not reached.
I got that. My point was that you don't need to determine if an arbitrary program is error-free in all of those categories to have a useful tool. If static checkers can point out a bunch of errors in a huge program and a bunch of places they're suspicious of, then they are useful. And they can. Did you follow that Stanford Checker link? It detected all of the classes of errors you mentioned. Sure, it missed many more errors of those classes, and actual humans had to give it some project-specific knowledge for that to happen, but it seems like a good investment - it saves more time than it takes.
I'd rather see them put the time into whatever tool seems to be helping them most at the moment. And then when it reduces their debugging time, they can use the surplus to try other things. You seem to be making a lot of assumptions - that doing something perfectly is useful but doing it well is not, that the time available for program verification is constant, etc.
First, I don't see the relevance, so I suspect you don't understand the halting problem[*]. You've probably heard it phrased as in the Wikipedia article: "a general algorithm to solve the Halting problem for all possible program-input pairs cannot exist. We say that the halting problem is undecidable over Turing machines." You've probably heard arguments reducing other problems to the Halting problem and concluded that automated reasoning about programs is impossible.
By some word juggling and de Morgan's laws, Wikipedia's statement is equivalent to "for any general algorithm, there exist program-input pairs for which it cannot solve the halting problem." I'd say those pairs should be quite rare if you're doing "reasonable" things.
The common examples of undecidable programs involve "finding counterexamples to famous conjectures in number theory". I'm guessing your programs don't do that. There generally should be straightforward indicators of your programs' progress - input file pointers, size of internal data structures, iterators. If IBM Research put all their effort into the Halting problem and their program still couldn't tell if your program terminates, it might be because your program is screwed up. Thus, "our program can't tell if your program halts for all input" probably means "rethink your algorithm", just as "your program does not halt for input X" does. The same goes for other properties which can be reduced to the halting problem.
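To illustrate the contrast, here's a small sketch with two loops (the class and method names are mine). The first has an obvious progress indicator: an index that strictly increases toward a fixed bound. The second is the Collatz iteration, a standard example of a loop whose termination for every positive input is an open conjecture in number theory.

```java
// Hypothetical example class contrasting an easy and a hard termination argument.
class Termination {
    // Easy case: `i` strictly increases and is bounded by values.length,
    // so the loop runs exactly values.length times.
    static long sum(int[] values) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Hard case: whether this loop halts for every n >= 1 is the Collatz
    // conjecture. There is no simple quantity that obviously shrinks.
    // (For very large n the arithmetic could also overflow a long.)
    static int collatzSteps(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        int steps = 0;
        while (n != 1) {
            n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
            steps++;
        }
        return steps;
    }
}
```

A checker that gives up on collatzSteps is telling you something true about the code; one that gives up on sum would just be a bad checker.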
It'd still be a hard thing to do, though. I think most existing checker tools examine only one function at a time, and I'm not aware of any attempts to do anything so sophisticated even at that level.
Simple but common, which makes them great bugs for static analysis. But if you want fancier examples, look at these bugs the Stanford Checker (now Coverity) found in the Linux kernel.
Java is a strongly-typed language. If you cast something incorrectly, you get a ClassCastException. The runtime knows the type of every object. You may have meant "totally statically-typed language". In any case, the answer's still no - the System.gc() example that you found too simple is an obvious counterexample.
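A minimal sketch of what I mean by the runtime knowing the type of every object (CastDemo is a made-up name). The cast compiles because the static type is Object, but the runtime checks the object's actual class and refuses.

```java
// Hypothetical demo: a cast the compiler accepts but the runtime rejects.
class CastDemo {
    // Returns true if the bad cast was caught at run time.
    static boolean badCastThrows() {
        Object o = "hello"; // static type Object, runtime type String
        try {
            Integer i = (Integer) o; // compiles fine, fails when executed
            return false;            // never reached
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

So the type error is caught, just not statically, which is the distinction between "strongly typed" and "totally statically typed".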
They're another way to find bugs, and the bugs they find are not a subset of those found by unit tests. There are a lot of classes of bugs that can't be found easily with unit tests (race conditions!). There are a lot of environments in which it is difficult to write unit tests (embedded code, kernel code, GUI code). And fundamentally unit tests require you to come up with the input that breaks it. If there's a case you never considered at all, they just won't have the same value as a second pair of eyes on your algorithm, which these static analysis tools effectively are.
Does this sort of reasoning sound familiar?
There's no need to fake the hardware; presumably it just lets the operating system use the real stuff.
That's insufficient. The NSA is doing traffic analysis, which is not affected by encryption and authentication. An encrypted connection to a busy shared tunnel server would help somewhat, but even so, if they can see all the traffic on the network, they can make the connection between "encrypted data from A->B at time t" and "new connection from B->C at time t+e" for small values of e, especially if there are full A<->B and B<->C conversations for similar timespans.
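As a toy illustration of that kind of timing correlation, here's a sketch that needs nothing but timestamps, so encryption doesn't enter into it. The class name, method name, and the idea of representing traffic as arrays of millisecond timestamps are all my own simplifications, not how any real analysis system works.

```java
// Hypothetical toy model: traffic is reduced to arrays of event timestamps.
class TimingCorrelation {
    // Counts inbound (A->B) events at time t that are followed by at least one
    // outbound (B->C) event at time u with 0 <= u - t <= e.
    static int countCorrelated(long[] inboundMillis, long[] outboundMillis, long e) {
        int matches = 0;
        for (long t : inboundMillis) {
            for (long u : outboundMillis) {
                long delay = u - t;
                if (delay >= 0 && delay <= e) {
                    matches++;
                    break; // count each inbound event at most once
                }
            }
        }
        return matches;
    }
}
```

If most of A's packets to B are matched by a B->C event within a small e, and that holds up over a whole conversation, the observer has a good guess at who A is really talking to.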
To defeat their analysis, you'd need something much more sophisticated - say, a mesh of customers who don't want to be monitored. They'd have to tunnel data through randomly chosen[*] peers with multiple hops, send some noise to impair analysis, and even add delays before relaying. It'd be a lot harder for them to tell that node A asked node B to relay stuff if nodes C and D are also having a conversation with node B with similar timing. And you'd certainly need to consider what happens if some nodes are infiltrated. It's a hard problem, and even if done well, there's a price to pay for the added security.
[*] maybe with some locality; there's a security/speed trade-off.
Baldrson said:
RTFA. It didn't take them this long to realize that. You might as well have pointed out that Leonardo da Vinci drew an airplane in the 15th century and asked why the dimwits aren't getting around to building one until now. My answer would be the same - they have built them before now; this is a refinement.
I'm not sure when the first oblique-wing aircraft was made, but another poster pointed out that NASA built the AD-1 swivelling, all-wing aircraft in 1982. Presumably it's the same plane that the article describes here:
It turns out that you can't just prove that the shape is efficient at supersonic speeds; you have to actually innovate to address problems like this. Northrop Grumman says that they can not only do that, they can step it up a notch by building a long-flying, swivelling, all-wing, unmanned stealth bomber capable of flying at Mach 2. That was not proposed in the early 1950s, much less actually flown in combat. Yeah, it's an incremental improvement. So's the newest 64-bit AMD chip you're drooling over, or whatever sort of shiny thing it is that you actually like.
Baldrson:
RTFA. They're planning to have a design in a year and a half. They're planning to actually fly the thing in 2010. They're planning to have mass produced it and flown it in combat in 2020. That's a little different than "they can't get around to it until 15 years in the future".
Maybe so, but you've totally missed the point of my post. It's subtle, but try rereading the paragraph that starts with the words "my point is".
As long as we're trading anecdotes, I skipped class for six weeks before my linear algebra final, then nailed it. [*] No drugs, no studying. For whatever reason (my natural talent in mathematics? low standards? the professor letting us use TI-89s to check our work?), I found the class and test really easy.
On the other hand, E&M was the real deal. Challenging material, demanding (but great) professor. I went to class, I studied, and I was proud when I got As on those tests.
My point is that anecdotal evidence is worthless. You felt more focused while studying. But was your studying actually more effective? Or were your finals simply as easy for you as my linear one was for me? What grade would you have gotten if you hadn't taken any drug? What grade would you have gotten if you'd taken a placebo? It's impossible to know.
Has anyone actually done any real scientific studies of the effects of these pills on healthy people? Our brains are complicated. While it seems reasonable at first to say you felt more focused, therefore you were more focused, therefore you were more effective, that's actually quite a leap. There are many drugs out there that will make you feel more effective, only for you to discover afterward that your work was crap. Does a pill that turns an ADD patient into a "normal" person turn a normal person into a superperson? If even more of some chemical in our brains makes us even more focused and intelligent, why didn't natural selection increase the dosage? What's the catch?
[*] Okay, 98/100...forgot to normalize an eigenvector...though MathWorld says now that they don't have to be normalized, so I want my two points back.
That's funny, because I was just thinking it's odd that this book has no theory in it at all. At least in the review I saw no mention of the definition of ACID, the compromises at different transaction isolation levels, Codd's 12 rules for relational databases, Codd's original notation for relational algebra and relational calculus (of which SQL is an approximation), or normal forms.
And it turns out that this theory is useful and applicable. If you haven't caught on yet, I'm disappointed by this omission. A lot of people write horrible systems because they do not understand transactions, how to normalize a database schema, or why constraints are so important.
If you need specifics on RDBMS implementations, look at this comparison website. It's not that long, and it basically fills in the gaps left by this book.
You can usually write standard SQL statements and run them on PostgreSQL, MS SQL Server, Oracle, and DB2. You can certainly come up with Oracle statements that don't run on PostgreSQL - e.g., by using their alternate syntax for left joins that predates standardization - but presumably this book teaches you the standard stuff. That's all you need in most situations, and it's all they can give you without having to update the book every six months.
Microsoft Jet SQL (of Access fame) has a few cosmetic differences in syntax. (IIRC, quoting is different.) If that's enough to seriously set you back, you'd be in trouble even if the book did duplicate all the examples for you.
MySQL is the only real oddball, and even they are starting to learn that this SQL thing is useful after all. If you want to work with older MySQL installations, get a book on MySQL, throw out any knowledge you have of how to do things properly, and give up on portability altogether. Peculiarities in its performance characteristics made projects like phpBB do bizarre things like maintain parallel table structures for each forum in a messageboard. That's totally against the relational model, and there are lots of consequences...
I actually like Unified Modeling Language when used appropriately.
If you hate UML, you've probably only seen class diagrams. I hate them, too. I find that they tend to get out of sync with reality and aren't that easy to read anyway (why put the complete list of methods in with the inheritance structure? too much information!). Tools that generate code from them only make the synchronization worse - better to generate diagrams from code, since the code is what definitely changes as needed. I much prefer doxygen output - it's always in sync and has easy-to-read inheritance diagrams with textual lists of methods and full documentation right there which is kept with the source code.
But class diagrams aren't all there is. Take a look at the other UML diagram types on this reference card. In particular, I find UML sequence diagrams are the best way to describe race conditions. State diagrams, activity diagrams, and use case diagrams have their place, too.
Then they have no CS skills. In fundamental areas like the design and analysis of data structures and algorithms, language is almost[*] totally irrelevant. I'm not sure what CS you could possibly know that is bound to a single language.
I'm told there are actual good people in India, but the few I've dealt with (Java and C programmers) are like this. They sort of know a language or two, but there's nothing else there. There are also plenty of people like this in the United States, unfortunately.
[*] - There are some differences, like that you can't mutate stuff in a pure functional language, at least without a crazy optimizer. But certainly not between, say, C# and Java - both imperative, same GC system, VM model, object model. Same language, slightly different syntax.
The article said: Men involved in fight clubs often carry bottled-up violent impulses learned in childhood from video games, cartoons and movies, said Michael Messner, a University of Southern California sociology and gender studies professor.
DragonWriter said: It is poor (though typical) reporting that these types of claims are reported simply as "so-and-so says", but it saves journalist from having to have any knowledge of or do any research in the field they are covering, they can simply find the nearest person with a degree or job in a superficially relevant field, and get a quote, and go home for the day. If they are particularly ambitious, they'll get two conflicting quotes from different experts, to show "balance".
PCM2 said: I see. And so, in your opinion, not-poor reporting would presumably involve the reporter spending the next six years getting an advanced degree in psychiatry and then stating his own opinion?
Extraordinary claims require extraordinary evidence.
I can't speak for DragonWriter, but I'd like to see evidence for a wild claim like that. Perhaps a reference to a peer-reviewed study. I don't have much respect for sociologists or gender studies professors. I can't think off-hand how to perform a well-controlled experiment that would determine if what this guy said is true, so I bet he couldn't either.
I've put into it design documentation, instructions for accessing our other services (e.g. Subversion repositories), troubleshooting tips, sequence diagrams of various race conditions, you name it. I try to periodically dump everything in my notes directory into the wiki. The effort of cleaning it up means I'll understand it later, having it on the wiki server means it's backed up regularly, and as a bonus, other people see it and don't need to ask me as many questions, so I can spend more time developing. And it gives people a way to still get answers when I'm off bicycling through Africa.
But collaboration technology like MediaWiki or bugzilla only works when people use it. There are always some people who won't play with others. If I put information on the wiki, they'll come bug me for it anyway. If I tell them it's on the wiki, they still won't read it. If I give them information verbally and specifically ask them to put it on the wiki, they won't do it. And then they wonder why I ignore their emails...
You can:
[1] - Someone did bring up the point that you can do stuff like tell the hardware (in your non-privileged device driver) to do a DMA transfer to memory you don't own. So the memory protection is weak without applying it to DMA transfers, which apparently some hardware does.
I've never understood the point of systems that plop a monolithic kernel on top of a microkernel. Sure, if there's a bug in the monolith, the microkernel doesn't crash. But who cares? There was only one thing running on top of it, and it isn't running properly anymore.
The point of a microkernel design is to compartmentalize things like filesystem drivers and the TCP layer, as the article said. How does this approach accomplish that?
> I see that micorkernel systems are terribly slow and not significantly more stable than monolithic systems.
> I don't have any experience with QNX, so I can't debate the performance of that OS.
Might I ask what microkernels you are familiar with? If they are all research systems, then the lack of stability you observed doesn't surprise me. Research systems don't get the many eyes of a system like Linux. If their VM system is faulty, they'll still fail. But using a microkernel design is a fool-proof way of preventing the NFS driver from corrupting memory throughout the kernel, so there are obviously ways in which it improves stability. A system which had both of these aspects would be more stable than one that has only one or neither.
Not even that, actually.
In C, it tells the compiler that the read or write to memory can't be reordered. If you do a read, it has to get it from memory right then, rather than reusing one from before that it might have stuck in a register. It doesn't tell the CPU anything about synchronizing its cache or executing the instruction in order, however. You've gotta have both.
In Java, it actually depends on the version of the language. Compare the third edition of the language specification (Java 1.5, I believe) with the second edition (the first is the same, so Java 1.1-1.4). The second edition appears to say what you said, but I don't buy it. Look at this article by a bunch of Java synchronization experts on double-checked locking. In particular, this sentence:
This change might have made it into the third edition. The second and first read like it provides this guarantee, but if these guys say not, then I'm not going to be depending on that without reading all of the chapter on thread interactions (not just the one section on volatile), reading everything they say, and doing some experiments. If that means my software runs 0.5% slower because I have more synchronization overhead than I need, then so be it.
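For what it's worth, here's a minimal sketch of the conservative approach I'm describing: skip double-checked locking and volatile entirely and just take the lock on every access. The LazyHolder name and the use of a plain Object as the lazily created value are placeholders for illustration.

```java
// Hypothetical example: lazy initialization that relies only on synchronized,
// not on any particular reading of what volatile guarantees.
class LazyHolder {
    private static Object instance; // guarded by LazyHolder.class

    // Every caller takes the class lock, so the write to `instance` by one
    // thread is always visible to the next thread that calls get().
    static synchronized Object get() {
        if (instance == null) {
            instance = new Object();
        }
        return instance;
    }
}
```

It pays for a lock acquisition on every call, which is the extra synchronization overhead I'm willing to accept.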