DragonWriter · Slashdot Mirror

Re:Relaying my comments from the blog on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 12:10 · Score: 1

Quorum systems clearly fail the: "No set of failures less than total network failure is allowed to cause the system to respond incorrectly" criterion since one only needs to destroy quorum to stop the system from working.

That's not at all true. Not responding (which is what most quorum systems do when a quorum is lost) is not the same as responding incorrectly.

It is a failure of availability, not partition tolerance.
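
To make the distinction concrete, here is a minimal sketch of a majority-quorum read over in-memory replicas. The names `Replica`, `quorum_read`, and `QuorumUnavailable` are made up for illustration and don't come from any real system. When too few replicas are reachable, the read refuses to answer instead of returning a value that might be stale.

```python
from dataclasses import dataclass
from typing import List


class QuorumUnavailable(Exception):
    """Raised when too few replicas are reachable to answer safely."""


@dataclass
class Replica:
    version: int            # higher version means a more recent write
    value: str
    reachable: bool = True


def quorum_read(replicas: List[Replica]) -> str:
    """Return the newest value among a reachable majority, or refuse to answer."""
    needed = len(replicas) // 2 + 1
    live = [r for r in replicas if r.reachable]
    if len(live) < needed:
        # Losing quorum costs availability: no answer, but never a wrong one.
        raise QuorumUnavailable(f"only {len(live)} of {needed} required replicas reachable")
    return max(live, key=lambda r: r.version).value


replicas = [Replica(2, "new"), Replica(2, "new"), Replica(1, "old")]
print(quorum_read(replicas))        # a majority is reachable, so it answers

replicas[0].reachable = False
replicas[1].reachable = False
try:
    quorum_read(replicas)           # only the stale replica is left
except QuorumUnavailable as exc:
    print("no answer:", exc)
```

The second call never returns "old"; it simply fails, which is the availability loss described above.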

Re:NoSQL is also about arbitrary schemas on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 11:04 · Score: 1

This is true if you use the Agency List Model for hierarchical data.

You probably mean "Adjacency List" here.

Re:NoSQL is also about arbitrary schemas on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 10:55 · Score: 1

And on top of that, just try writing a query for hierarchical data! You'll have sub-selects for each level of hierarchy.

Well, only if you are using a database that doesn't support the recursive form of Common Table Expressions from the SQL standard (from, IIRC, SQL-99.) As far as I know, DB2, SQL Server, Firebird, and PostgreSQL support that, and Oracle has a custom syntax for hierarchical queries (I think Oracle 11 may support recursive CTEs with standard syntax, as well.)

Using recursive CTEs, it's possible to do queries against hierarchies (and other graphs) directly in SQL, including queries that handle arbitrary depth rather than known-depth structures of the type you describe.
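
A minimal sketch of what that looks like, run through Python's `sqlite3` module only because it is easy to try out (SQLite isn't one of the databases named above, but it accepts the standard `WITH RECURSIVE` form). The `node` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO node VALUES
        (1, NULL, 'root'),
        (2, 1,    'a'),
        (3, 1,    'b'),
        (4, 2,    'a1'),
        (5, 4,    'a1x');
""")

# One query walks the subtree under a named node, however deep it goes;
# no sub-select per level is needed.
subtree_sql = """
    WITH RECURSIVE subtree(id, name) AS (
        SELECT id, name FROM node WHERE name = ?
        UNION ALL
        SELECT n.id, n.name
        FROM node AS n
        JOIN subtree AS s ON n.parent_id = s.id
    )
    SELECT name FROM subtree ORDER BY id
"""
names = [row[0] for row in conn.execute(subtree_sql, ("a",))]
print(names)   # ['a', 'a1', 'a1x']
```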

Re:Just in case anybody else doesn't know... on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 10:40 · Score: 1

Also worth noting: just because it has ACID doesn't mean that it's SQL, and just because it's SQL, doesn't mean that it is ACID.

Unless I recall incorrectly, to the extent an "SQL" system doesn't offer ACID guarantees, it is also divergent from the SQL standard.

Lots of things that use "SQL", of course, fall short of the standard in the details, and some things that use "SQL" (even as part of their names) aren't so hot on the fundamentals, either.

Re:Relaying my comments from the blog on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 10:27 · Score: 1

And don't forget the CAP theorem. You can't get Consistency, Atomicity and Partition Tolerance at the same time. RDBMS typically 'solve' it by dropping the requirement for the partition tolerance. Usually by using quorum sensing schemas, etc.

First, CAP is Consistency (which is different than the Consistency in ACID, incidentally; it's consistency across nodes, not consistency with integrity constraints), Availability, and Partition Tolerance.

Second, quorum-based systems sacrifice availability, not partition tolerance; they are typical of distributed, strongly-consistent, databases. Sacrificing partition tolerance essentially means that the system cannot be implemented on top of a network that can lose messages (this is typical of non-distributed RDBMSs.)

Re:Interesting thesis on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 09:51 · Score: 1

Which, to anyone who has seriously thought about how to implement atomic transactions in a nosql environment, should not exactly come as a shock. It's the obvious solution to the problem, and I'm sure if you dig into it you'll find hundreds of implementations that work just like that.

The few distributed "NoSQL" implementations (like Scalaris) that offer strong consistency do this; but it's about consistency, not atomicity, and most distributed NoSQL implementations don't offer strong consistency guarantees, so they don't do this. (Note that this is a choice, not a failure: it has been proven -- the CAP theorem -- that for a system on a network which can partition, you can guarantee only one of consistency and availability, and there are definitely use cases for either guarantee.)

Re:Pfah. on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 09:12 · Score: 1

Big Table does offer ACID transactions (which is what the article's really about)...they just scale very poorly. I'm not sure how well they possibly can. If you have three different clients connecting to 3 different data centers scattered around the world, trying to transfer all the money from the same bank account into a Swiss numbered account...that needs ACID enforcement, and you will have a nasty performance hit. Two of those transactions absolutely must fail.

Two of them must fail, but that doesn't necessarily mean you have to have poor scalability. Scalable distributed systems with ACID guarantees have been demonstrated (TFA specifically addresses one, but others have been demonstrated before, e.g., Scalaris.)
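
As a toy illustration of the "two must fail" rule only, here is a sketch using an optimistic version check on a single in-memory account. It says nothing about how the system in TFA or Scalaris actually coordinates across data centers, and the `Account` class and its methods are hypothetical.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self.version = 0
        self._lock = threading.Lock()

    def snapshot(self):
        """Read the balance together with the version it was read at."""
        with self._lock:
            return self.balance, self.version

    def commit_withdrawal(self, amount, seen_version):
        """Commit only if nothing else has committed since the snapshot."""
        with self._lock:
            if self.version != seen_version or amount > self.balance:
                return False
            self.balance -= amount
            self.version += 1
            return True


account = Account(balance=1000)

# All three clients read the same state before any of them commits,
# and each tries to move the entire balance out.
snapshots = [account.snapshot() for _ in range(3)]
results = [account.commit_withdrawal(bal, ver) for bal, ver in snapshots]
print(results, account.balance)   # [True, False, False] 0
```

Exactly one transfer commits. The other two are rejected because the version they read is no longer current.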

Re:Pfah. on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 09:01 · Score: 5, Informative

Doesn't work so well if you've got a graph structure or a tree. If in a family tree, you want to find all 5'th descendants or all descendants of some guy, SQL won't make you happy.

A decade-plus ago, that would have been true.

Standard SQL from SQL-99 on will, in fact, do this quite easily via recursive Common Table Expressions. Now, some SQL-based DBMSs don't support enough of the standard to use this, but current versions of, I believe, DB2, Firebird, PostgreSQL, and SQL Server all implement standard CTEs well enough to do those examples in SQL fairly directly, and Oracle has its own proprietary syntax (CONNECT BY) that works for the examples that you pose, though it's less general than SQL-99 recursive CTEs.
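
For the family-tree case specifically, a sketch along these lines handles both "all descendants" and "descendants at generation N" by carrying a generation counter through the recursion. It runs on SQLite through Python's `sqlite3` purely for convenience. The `person` table, the names in it, and the `descendants` helper are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    INSERT INTO person VALUES
        (1, NULL, 'Adam'),
        (2, 1,    'Beth'),
        (3, 1,    'Carl'),
        (4, 2,    'Dana'),
        (5, 4,    'Evan'),
        (6, 3,    'Fred');
""")

DESCENDANTS_SQL = """
    WITH RECURSIVE descendant(id, name, generation) AS (
        SELECT id, name, 0 FROM person WHERE name = :ancestor
        UNION ALL
        SELECT p.id, p.name, d.generation + 1
        FROM person AS p
        JOIN descendant AS d ON p.parent_id = d.id
    )
    SELECT name, generation FROM descendant
    WHERE generation > 0
      AND (:generation IS NULL OR generation = :generation)
    ORDER BY generation, name
"""


def descendants(ancestor, generation=None):
    """All descendants, or only those exactly `generation` levels down."""
    params = {"ancestor": ancestor, "generation": generation}
    return conn.execute(DESCENDANTS_SQL, params).fetchall()


print(descendants("Adam"))
# [('Beth', 1), ('Carl', 1), ('Dana', 2), ('Fred', 2), ('Evan', 3)]
print(descendants("Adam", generation=2))
# [('Dana', 2), ('Fred', 2)]
```

Asking for the fifth generation is the same call with `generation=5`; the query text doesn't change with depth.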

Re:I hate SQL and Databases in General... on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 07:00 · Score: 1

Has relational algebra changed (no, it's complete)? Why would the basics of SQL change then?

Because SQL isn't a particularly faithful implementation of relational algebra?

Re:I hate SQL and Databases in General... on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 06:36 · Score: 1

BTW: The math in set theory hasn't changed since the 1960's, it doesn't "get old" and need replacing.

It's worth noting that, in addition to the arguments from proponents of non-relational databases, SQL also gets criticism from proponents of actually doing set theory right (e.g., Date and Darwen.)

Really, SQL and the databases using it are shaped as much by optimization of disk-based storage using popular computing architectures of the time at which it took shape as any mathematical model of data.

As computing architectures and performance attributes (not speed, but relative costs of different access patterns) of storage media change, underlying database implementations and the languages that best leverage them may change, even when you want to be generally guided by set theory.

Re:Pfah. on Yale Researchers Prove That ACID Is Scalable · 2010-09-01 05:29 · Score: 4, Insightful

NoSQL never was necessary. Traditional SQL database - not just terascale, but even simple ones like MySQL - regularly deal with data volumes at Google and Walmart that make the sites that built these databases in desperation look positively tiny.

Database size was never the main driving force behind the new move toward NoSQL databases. Support for distributed architectures is. In part, this is about handling lots of queries rather than handling lots of data; it also -- particularly if you are Google -- deals with latency when the consumers of data are widely distributed geographically.

And note that one of the companies that is heavily involved in building, using, and supplying non-SQL distributed databases is Google, who, as you so well point out, is very much aware of both the capabilities and limits of scaling with current relational DBs.

This new research may offer new prospects for better databases in the future -- but TFA indicates that the new design has a limitation which seems common in distributed, strongly-consistent systems: "It turns out that the deterministic scheme performs horribly in disk-based environments".

In fact, given that it proposes strong consistency and distribution, and relies on in-memory operation for performance, it sounds a lot like existing distributed, strongly-consistent systems based around the Paxos algorithm, like Scalaris. And it seems likely to face the same criticism from those who think that durability requires disk-based persistence, and who object to replacing storage on disks (which, one should keep in mind, can also fail) with storage in-memory simultaneously on a sufficient number of servers (which, yes, could all simultaneously fail, but durability is never absolute; it's at best a matter of the degree to which data is protected against probable simultaneous combinations of failures.)

So -- reading only the blog post that is TFA announcing the paper and not the paper itself yet -- I don't get the impression that this is necessarily a giant leap forward, though more work on distributed, strongly-consistent databases is certainly a good thing.

Re:AT&T is more right than you can imagine on AT&T Says Net Rules Must Allow 'Paid Prioritization' · 2010-09-01 05:04 · Score: 1

The only issue we've ever seen is something I'm not even sure a network neutrality law would stop - Comcast forging packets to screw over BitTorrent.

Insofar as BitTorrent (while it may sometimes be used for illegal purposes) is not, per se, illegal, and one of the core net neutrality principles that has been in each of the FCC's articulations of net neutrality is that ISPs must not prevent users from consuming or providing any legal service and must not prevent users from attaching any legal device to the internet, this would certainly be within the coverage of any concrete regulation that actually addressed the net neutrality principles meaningfully.

Instead of adding new regulation, why not loosen up that one and see what real competition does for the internet instead of the government-enfornced monopolies we have today?

We don't have government-enforced ISP monopolies today. Some ISPs benefit from current or former monopolies (government enforced or otherwise) in the local phone service and/or cable television markets, but for the most part they do not have any government-granted monopoly on ISP service. However, without active government intervention (e.g., to secure necessary property rights to build cable networks), it's practically impossible for anyone to compete with the established players on an equal footing.

Re:Prioritization can work... on AT&T Says Net Rules Must Allow 'Paid Prioritization' · 2010-09-01 04:54 · Score: 1

The usual Slashdot response is that there is no way prioritization is compatible with net neutrality, but we only have to look at the post office to see that it can be done. You have the choice to send by standard mail, or to pay more to speed up delivery. I'll grant that it's not a perfect analogy, but there are models that would work.

It's not only not a perfect analogy, it's a fatally flawed analogy. Paying the USPS for different grades of service for particular pieces of mail is not analogous to a non-neutral net.

What would be, very loosely, analogous to a non-neutral net is this: the individual postal contractors who actually own the fleets of trucks that deliver mail in the integrated network that forms the USPS providing different service levels for particular letters or parcels based on (e.g.) whether or not the addressee of the mail had paid them an extra premium fee on top of what USPS was paying them, or whether or not the addressee was one of their competitors in the shipping business. But really, any USPS-based analogy to the Internet regarding neutrality is problematic, because the USPS is not an interlinked set of independent networks like the internet -- while there are different delivery firms that play a role in delivery via the USPS, they are all contractors working for the USPS.

Re:An elegant solution to a non-problem on GMail Introduces Priority Inbox · 2010-08-31 08:31 · Score: 1

I'll suppose you failed to consider the final paragraph

I'll suppose I did not.

but even if you merely disagree with the notion that this technology makes the problem worse,

Yes, I disagree that stopping the problems caused by excessive nonpriority, nonabusive email making it harder to identify the email that warrants immediate attention makes that problem worse.

You seem to conceptualize the problem as the mere existence of email that doesn't warrant immediate attention. To the extent that it is a problem, it's a much smaller problem for the user than the problem caused by the equal prominence of that email in a UI with the email that does warrant immediate attention.

consider this:
Allowing someone to email you is a choice.

In a sense, this is true, though it's a choice a lot of people don't want to make on a positive basis. I am probably much more prone to using and managing manually-defined filters than most people, and even I don't want to add authorized senders to a whitelist, nor do I want to punish any but the most egregious offenders with a blacklist. (I've been using internet email since 1990, and heavily since about 1994, and, aside from spam, blocked exactly one person, ever.)

You may wish to blacklist (or fail to whitelist, whatever) everyone who doesn't conform to narrow personal rules that limit your inbox to high-priority items. And, if so, that's a great solution for you. But I don't think most of the people using email prefer that solution to the issue of assuring that the most important email is the most prominent.

Re:An elegant solution to a non-problem on GMail Introduces Priority Inbox · 2010-08-31 05:52 · Score: 2, Insightful

It is also an absolute non-problem.

I suspect that Google has a much better handle on its users' needs than you do in this area. Your proposed alternative is to get all senders in the world to change their behavior to fit the receiver's preferences. Google's new optional tool gives receivers using GMail a reasonable first-cut view of message priority, based on the receiver's treatment of past messages, without senders changing behavior. Google's tool, it seems, is more likely to work in the real world.

Re:So let me get this straight on GMail Introduces Priority Inbox · 2010-08-31 05:46 · Score: 1

Wrong, on all three counts.

Their theories are more like #1:

1. Lots of people (not everyone) have too many emails in their inbox to tell the important ones at a glance.
2. Manually creating and updating rules, while useful for lots of categorization applications, is a clumsy and time-consuming way of getting a good first cut of what is likely to be important for many users (not "Rules are too complicated to use").
3. Priority inbox will be useful for many users because it creates prioritization rules based on the user's reading and replying behavior, which seems likely to provide a good first cut of what is important to the user with minimal user involvement.

But in the long run, its not very useful for any user with a hint of intelligence,

Given that I imagine it's been used inside of Google before being released as a public product, I doubt that.

and like other people are already stating - the inner workings will be dissected enough to where people will filter messages to get a higher rank.

Since prioritization for each user is based on what that user reads and responds to, in order to game the system the sender would essentially need to have access to your GMail usage history; if that's the case, I'd say that worrying about that information being used to manipulate Priority Inbox would be the least of your concerns.

Re:Intriguing, but... on GMail Introduces Priority Inbox · 2010-08-31 05:40 · Score: 1

This is intriguing, but it just seems to add yet another layer. Is it really needed? By leveraging Filters and Labels, you can automatically categorize email to whatever you want.

By using filters and labels you can manually create rules to categorize email.

Priority inbox doesn't require you to manually create rules; instead, it infers the likely priority of mail based on your reading and replying habits.

They both have their uses.
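
A toy sketch of the difference, with a loud caveat: this is not Google's algorithm, which isn't described here, and the scoring weights, field names, and addresses are invented purely for illustration. The manual rule is written by hand; the inferred score is derived from how past mail was treated.

```python
from collections import defaultdict


def manual_filter(message):
    """Manual approach: the user spells out the rule explicitly."""
    return "work" if message["sender"].endswith("@example.com") else None


def learn_sender_scores(history):
    """Inferred approach: score each sender from past reading and replying.

    The weights (1.0 for a reply, 0.5 for a read) are arbitrary assumptions.
    """
    totals = defaultdict(lambda: [0, 0.0])      # sender -> [count, points]
    for msg in history:
        entry = totals[msg["sender"]]
        entry[0] += 1
        entry[1] += 1.0 * msg["replied"] + 0.5 * msg["read"]
    return {sender: points / count for sender, (count, points) in totals.items()}


history = [
    {"sender": "boss@example.com", "read": True,  "replied": True},
    {"sender": "boss@example.com", "read": True,  "replied": False},
    {"sender": "news@example.org", "read": False, "replied": False},
    {"sender": "news@example.org", "read": True,  "replied": False},
]
scores = learn_sender_scores(history)
print(scores)   # {'boss@example.com': 1.0, 'news@example.org': 0.25}
print(manual_filter({"sender": "boss@example.com"}))   # work
```

The user never wrote a rule saying the first sender matters more; the ranking falls out of the history.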

Re:Of course they do... on Oxford Dictionary Considers Going Online Only · 2010-08-31 04:39 · Score: 1

They release every 20 years, not 10 years.

The project was started in 1857. The first version (under the name _A New English Dictionary on Historical Principles_) was published in installments between 1884 and 1928; a supplement and reprinting under the title _Oxford English Dictionary_ took place in 1933; and supplements to this first edition OED were published between 1972 and 1986. The Second Edition OED was published in 1989. I'm not sure how you infer either a regular release cycle of 10 years or 20 years or, frankly, any other regular release cycle from that history.

Re:Of course they do... on Oxford Dictionary Considers Going Online Only · 2010-08-31 04:32 · Score: 1

Probably 95% of the definitions haven't changed recently, so using last decade's edition is hardly the sin that using, say, a ten-year-old IT book would be.

In the first section they revised for 3E from the current 2E, over 25% of the headwords are new in 3E, and the total word count has doubled.

While an OED would be awesome, even if I shelled out for one I wouldn't pay again in ten years. How many people who need one now will have the money, space, and incentive to get a new one in a few years?

Almost everyone who would actually buy the standard multivolume hardcopy OED (which is mostly academic and institutional -- and often academic institutional -- purchasers) rather than just say that it would be awesome in theory to have.

Because the people that have a need for a complete lexicon that warrants the price of the OED in the first place tend to have a need which justifies paying the price to have the most current complete lexicon.

Re:Don't they already have a tool for this? on GMail Introduces Priority Inbox · 2010-08-31 04:21 · Score: 3, Insightful

I thought that's what filters were for.

No, filters are for categorizing mail by the criteria you have thought through and told Gmail about.

Priority Inbox is an option that, when you use it, tells Google you want it to do best-guess prioritization automatically, without you telling it any more than "do your thing".

Priority Inbox will probably be most useful for people who don't want the bother of defining filters, though people who do have explicit filtering rules that are used to categorize mail may also find it useful for prioritizing the stuff that's left in the inbox.

Re:Of course they do... on Oxford Dictionary Considers Going Online Only · 2010-08-30 10:28 · Score: 1

They don't seem to get many "hits" when divided by the number of entries/articles.

By that standard, Google doesn't get many hits either.

But why would anyone ever use that standard?

Re:That's too bad. on Oxford Dictionary Considers Going Online Only · 2010-08-30 10:14 · Score: 4, Informative

I'm going to miss the deluxe boxed editions that are over 12 pounds of dead tree plus a little drawer complete with magnifying glass. I'm not kidding, I once saw one a book shop that had a little compartment that held a magnifier.

The Compact Edition (the two-volume version of the First Edition, or the single-volume version of the Second Edition, which used even smaller print) that comes with a magnifier is not a deluxe edition. It is an inexpensive (compared to the regular, multivolume, normal-print set), portable (again, compared to the regular, multivolume, normal-print set) reproduction of the regular set.

Re:A tidy sum in sales of the printed version... on Oxford Dictionary Considers Going Online Only · 2010-08-30 10:09 · Score: 3, Informative

Exactly, and its all for the same work. This next edition when it comes out in 2020 or whenever can still pretty much use 99.999% of definitions from 1989, the definitions of words don't change too much in academia, after all the OED isn't going to track the movement of slang that is in use for a year or two then fades out of the vernacular.

As a reality check on this, the first installment that was revised -- which deliberately started with a portion of the dictionary expected to need less revision than some other portions -- has 1,045 main entries, 286 of which were added in the revision (63 of those were included in previous supplements, so "only" 223, or 21.3% were completely new), and ~400,000 words of text (compared to ~200,000 words of text in the corresponding sections of the existing edition.)

So, no, the 3rd Edition is not going to be, from the facts in evidence at this point, just a minor update to the second edition.
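
For anyone who wants to check the arithmetic on those revision figures, a quick sketch (the variable names are mine; the numbers are the ones quoted above):

```python
main_entries = 1045             # main entries in the first revised installment
added_in_revision = 286         # entries added in the revision
already_in_supplements = 63     # of those, previously included in supplements

completely_new = added_in_revision - already_in_supplements
print(completely_new)                                       # 223
print(round(100 * completely_new / main_entries, 1))        # 21.3
print(round(100 * added_in_revision / main_entries, 1))     # 27.4

# Text volume: ~400,000 words now versus ~200,000 in the existing edition.
print(400_000 / 200_000)                                    # 2.0
```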

Further, as to your comment about whether or not the OED will endeavour to track transitory slang, to quote from the preface to the Second Edition: "The aim of this Dictionary is to present in alphabetical series the words that have formed the English vocabulary from the time of the earliest records down to the present day, with all the relevant facts concerning their form, sense-history, pronunciation, and etymology. It embraces not only the standard language of literature and conversation, whether current at the moment, or obsolete, or archaic, but also the main technical vocabulary, and a large measure of dialectal usage and slang."

Re:Consumer financial sense??? on Oxford Dictionary Considers Going Online Only · 2010-08-30 09:54 · Score: 2, Insightful

So instead of paying $1,165 for something you can touch and have access to whenever you want (and possibly resell) Oxford thinks consumers would rather pay $8,850 ($295/year * 30 years (rough average time between releases)) and get something that they cannot access whenever they want (servers go down, power outages, etc.) instead?

Yes, they do, and they are probably right, since their online subscriptions already vastly outnumber the full-size, full-content hardcopy sales.

Of course, you forget the benefits that online access has over the takes-a-whole-bookshelf edition: you can access it anywhere you have internet access, rather than anywhere you have the whole bookshelf with you, and you get the updates between hardcopy releases as the drafts are ready, rather than having to wait through the multi-decade cycle of hardcopy releases.

Considering that the whole reason to spend the large amount of money to get either the bookshelf version or the online version of the OED is that a complete lexicon of the English language is important to the user, the online version makes a lot of sense to the people that are in the market for the OED in the first place.

Also, considering that a lot of the online use is institutional, not individual, which has different pricing and often includes permission to download the entire database to local servers rather than accessing it from Oxford's servers (and, also, that most of the bookshelf-version hardcopy sales are to institutional purchasers), retiring the bookshelf-sized hardcopy version in favor of online access makes a lot of sense.

Re:It's a nice framework on Rails 3.0 Released · 2010-08-30 09:37 · Score: 1

It's a JIT-compiler which translates Ruby to Java bytecode. It's no surprise that this performs better than the C version, which is a bytecode interpreter -- the Java bytecode will be JIT-compiled again by Java into native code.

The only current benchmarks I've seen (which can't be trusted any further than most benchmarks, but what else are you going to use) show the current, bytecode interpreting, C-based Ruby (Ruby 1.9.2p0) outperforming JRuby 1.5.1 everywhere where the two aren't approximately equal.

Earlier versions of JRuby outperformed, across the board (except when using the obscure features that JRuby doesn't implement at all, like continuations which, at the time, were pretty poorly implemented in the C-based implementation) Ruby 1.8.x, which didn't compile to bytecode and run on a VM, but was a plain old AST-walking interpreter. Ruby 1.8.x was notably slow, even among dynamic "scripting" languages. Ruby 1.9.2 is far from fast, but seems to be roughly in the same ballpark as current Perl and Python implementations.
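
For readers unfamiliar with the distinction, here is a toy sketch of the two execution styles for a tiny arithmetic language. It does not model the internals of any Ruby implementation; the tuple-based tree, the opcodes, and the function names are all invented for illustration.

```python
# Expression tree as nested tuples: ("num", 3), ("add", left, right), ("mul", left, right)

def eval_ast(node):
    """AST-walking: recurse over the tree every time the expression runs."""
    kind = node[0]
    if kind == "num":
        return node[1]
    left, right = eval_ast(node[1]), eval_ast(node[2])
    return left + right if kind == "add" else left * right


def compile_ast(node, code=None):
    """Flatten the tree once into a linear list of stack-machine instructions."""
    code = [] if code is None else code
    if node[0] == "num":
        code.append(("PUSH", node[1]))
    else:
        compile_ast(node[1], code)
        compile_ast(node[2], code)
        code.append(("ADD",) if node[0] == "add" else ("MUL",))
    return code


def run_bytecode(code):
    """A tiny VM loop: no recursion, just a stack and a dispatch on the opcode."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "ADD" else left * right)
    return stack.pop()


tree = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))   # 2 + 3 * 4
bytecode = compile_ast(tree)
print(bytecode)
print(eval_ast(tree), run_bytecode(bytecode))   # 14 14
```

Both give the same answer. The difference is that the bytecode form is produced once and then executed by a flat loop, which is typically the cheaper thing to repeat (and the easier thing to JIT-compile further).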
