Master of Transhuman · Slashdot Mirror

I Hadn't Looked At This Study Before Today on Windows vs. Linux Study Author Replies · 2005-11-28 06:08 · Score: 1

and I still haven't read it (and I won't, for various reasons, including lack of time and frankly, lack of interest related to the reasons below.)

This point here, http://interviews.slashdot.org/comments.pl?sid=168949&cid=14084692/, however, makes it fairly clear that there were problems with this study. To what degree these were mandated by Microsoft, or added by someone on the research team with a bias one way or the other, or by someone on the team who just didn't know better, or whatever, is unclear. I won't make any accusations here at all.

One thing I would ask is: if the SUSE system had to be upgraded to the point that the RPM manager broke, why weren't backups done beforehand to be able to restore the system to its original configuration, to be able to back out the changes? Seems to me any competent sys admin - particularly one with enough experience to know that upgrading the compiler and/or libraries is risky - would have made sure he could recover the system if something broke.

This indicates to me either that the Linux sys admins weren't as competent as their years would indicate - and having five copies of one year's experience doesn't make you an expert, as the saying goes - or that there were other constraints on their performance NOT mentioned in the study - which would indicate bias (or incompetence or simple error) in the study design.

I think the real problem with this study is the idea of having a reproducible scenario to follow. In the real world, Linux vs Windows entails major differences in IT policy, administration policy, software, admin techniques, etc., etc. To even try to compare these on the basis of a single scenario is to compare apples and oranges. Also such a study does nothing to analyze the overall issues of vendor lock-in, security, quality of software, and many other issues.

It's easy to compare reliability and stability - how often do you reboot the machine? How often does the system crash? How often do you have security penetrations? It is NOT so easy to compare overall system functioning in a live environment. In that sense, this study HAD to be either biased or unable to come to any definite conclusions almost by definition. I am pleased to see the author acknowledging that the sample was too small to make any definitive conclusions, but I question his suggestion that the methodology has value.

This is essentially the problem with TCO studies in general. As a lot of people have said, TCO is very particular to what your overall policies and procedures are and these are specific to a given company. If you're a "Windows shop" and have no clue about anything else, it's going to cost you more IN THE SHORT TERM to switch to Linux than if you come from a UNIX shop. That's obvious. The REAL question is: what is it going to cost you overall OVER TIME to STAY a Windows shop than switch to Linux? Most TCO studies don't even attempt to touch that question. But the problem here is that a Windows shop is going to be totally different from a Linux shop, even if the "same" administration functions have to be done on the same hardware for the same applications.

There's just too much generality being brought down to too much specificity and too much extrapolation from the results to place any trust in these studies. And this author doesn't seem to show any more understanding of that than other authors - not surprising, since he's in the business of producing these studies.

I think it might be better to rely on more anecdotal studies of mixed Windows-Linux shops, such as we've seen occasionally here from sys admins working in them, that indicate the common experience of sys admins working on both sides, or the results of companies who HAVE mass-converted from Windows to Linux and who have then measured their costs and savings.

And in those studies, Linux beats Windows every time.

Again, remember what I always say in interpret

Re:GPL resistance? on Nessus 3.0 discussed · 2005-11-27 10:41 · Score: 1

"His failure to compete is because he has to do two things, develop the code AND support it, while his competitors only have to do one thing. Support it."

How is this different from any other closed source company? They have to develop and support, too, and their competitors in the SUPPORT business only have to support.

Entire classes of VARs exist that do just that.

And saying "support" means customization of the code, as some people here have said, is just a red herring. It doesn't. It merely means you know the software well enough to use it properly and guide clients in its proper use.

Not to say that customization of the code isn't a valuable service that only the developer of a closed source product can do. But then, the developer of an open source service can do it, too, and indeed this actually is one of the means the developer can enhance his product - by adding code extensions requested by support clients. The fact that other support competitors can do it, too, isn't significantly different. In fact, since those enhancements are also under the GPL, the original developer can take them into his product just as easily. So in reality, any of his support competitors who modify the code are in fact contributing to his product (assuming he has some access to that code, which obviously is not always the case. But that's an issue for the competitor's clients, too, because now they have a fork in essence.)

We'll see what the results are. If Nessus is that valuable, as you say (and I believe it is), and if in fact, as someone here said, the problem was that enhancements were actually rejected by Tenable, then I expect it WILL be forked and he'll end up with MORE competitors, not fewer.

And that will prove my point.

Of course, it will also prove that the OSS model works - because consumers will now have a choice between Tenable's version and the new fork. And that's the proper result of competition.

Re:Support on Nessus 3.0 discussed · 2005-11-27 01:01 · Score: 1

Agreed. And my prediction exactly. Tenable has cut its own throat.

They've blamed the wrong thing for their failure to date - which guarantees greater failure in the future. Classic bad management.

Microsoft Hatchet Job Using The Globe on Peter J. Quinn Investigated for Travel Omissions · 2005-11-27 00:54 · Score: 2, Insightful

Nothing more.

The Globe is owned by the New York Times, which is Sulzberger being used by Bush and cronies to sell the Iraq War. Now we have the Globe being used by Microsoft to attack the Open Document Format decision in Massachusetts.

Once a sellout, always a sellout.

The Article Is Crap on A Look at Windows Server Outselling Linux · 2005-11-27 00:51 · Score: 1

Uninformed and mostly speculation about events that haven't happened yet - except the obvious one that Microsoft servers cost more than Linux servers, therefore their revenue is higher.

Duh!

Morons.

Windows Live? Gimme a fuckin' break!

Re:GPL resistance? on Nessus 3.0 discussed · 2005-11-27 00:45 · Score: 1

I said nothing about "expecting to roll forward without any contributions". I said it is not required for everybody who is a user to be a contributor, nor is it required that a community develop to BE an OSS project.

And where did I ever mention moving to closed source as better for the species than OSS?

Are you sure you're responding to the right post? If not, get a clue.

Re:GPL resistance? on Nessus 3.0 discussed · 2005-11-27 00:42 · Score: 1

Not necessarily. His competitors in the SUPPORT business don't need to do anything. They can just wait for another set of OSS developers to fork the project, build in the speed improvement (you think they can't figure out how to do that from the existing code - or by reverse-engineering the new binary?), and then the support competitors can go right back to competing with him again on a level playing field.

The worst that can happen to his support competitors is that they lose market share by having to wait for the fork and speed increases to be developed. I doubt they'll have to wait long...

So again, he's blamed the wrong thing for his failure to compete. The OSS code is not the problem - it's his execution of his support business model that is the problem. And by blaming the wrong thing for his failure, he guarantees that he will continue to fail.

Worse, he's cut his own throat. Now instead of having people utilizing HIS product - and thus maintaining some contact with and even control over the competition - he has forced others to either fork his product or come up with even better products which he will have NO control over - and which his support competitors can use to even greater advantage.

Dumb, very dumb.

Re:End of the day, you don't eat good intentions on Nessus 3.0 discussed · 2005-11-27 00:36 · Score: 1

"If it is far easier to make money with the proprietary business model than the open source business model then that means the open source business model SUCKS .....at least for making money."

First of all, as I said, the OSS model is NOT a BUSINESS model, it is a DEVELOPMENT model. Therefore your entire argument is irrelevant.

Secondly, you can produce a business model around the OSS development model to make money with. Red Hat and numerous others do. If Tenable is trying to develop a support income from Nessus, THAT is the business model. Whether it is successful or not depends on the details of that model and the execution of that model.

Ergo, if Tenable can't, their business model is wrong - or more likely nonexistent - or poorly executed - and it's irrelevant to the OSS development model. Therefore going closed source is not going to help them - which is exactly what I predict.

Which means your second paragraph is wrong, too - the decision maker just made another bad decision that will likely end up costing him his business in due time - because he's blamed the wrong thing for his failure and that's guaranteed to continue to make him a failure.

Re:One more point on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-27 00:27 · Score: 1

It has been a good discussion.

I think the only place I can refer you to with regard to Date's examples is the one in his "Database in Depth", Chapter Three, where he provides the following example of an erroneous query as a result of NULLs:

You have a database consisting of two relations:

Relation S: Supplier Number: S1 City: London
Relation P: Part Number: P1 City: NULL

Now you do the query as follows:

SELECT S.SNO, P.PNO
FROM S, P
WHERE (S.CITY <> P.CITY) OR
(P.CITY <> 'Paris')

For the only data in the database, the SQL result is: UNKNOWN OR UNKNOWN, which reduces to UNKNOWN, so nothing is retrieved.

But part P1 DOES have a corresponding city - xyz - in the REAL world, and it is either Paris or it isn't. If it is, the WHERE clause evaluates to 'London' <> 'Paris' OR 'Paris' <> 'Paris' - since the first term is TRUE, the expression would evaluate as TRUE.

If the P1 city is not Paris, then the expression would evaluate to 'London' <> xyz OR xyz <> 'Paris' - the second term is true, so the expression would evaluate as TRUE.

Thus, the boolean expression is ALWAYS TRUE in the REAL world, and the query should return the S1-P1 pair, REGARDLESS of what REAL value the null stands for (or is missing). The result - nothing returned - that is correct according to 3VL is different from the result that's correct - everything returned - in the real world.
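For anyone who wants to try this, here is a minimal sketch of the example, assuming Python's built-in sqlite3 module as a stand-in DBMS (Date's book isn't tied to any product). The helper name `run_query` and the substitute city values are mine, purely for illustration.

```python
import sqlite3

# Date's query, with the <> comparisons spelled out.
QUERY = """
    SELECT S.SNO, P.PNO
    FROM S, P
    WHERE S.CITY <> P.CITY
       OR P.CITY <> 'Paris'
"""

def run_query(part_city):
    """Build the one-supplier, one-part database and run the query.

    part_city is the city stored for part P1; None is stored as SQL NULL.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE S (SNO TEXT, CITY TEXT)")
    con.execute("CREATE TABLE P (PNO TEXT, CITY TEXT)")
    con.execute("INSERT INTO S VALUES ('S1', 'London')")
    con.execute("INSERT INTO P VALUES ('P1', ?)", (part_city,))
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows

null_result = run_query(None)       # city stored as NULL: nothing comes back
paris_result = run_query("Paris")   # the real city happens to be Paris
other_result = run_query("Oslo")    # the real city is anything else
```

With the NULL in place the result is empty, while every concrete city I substitute brings back the S1-P1 pair, which is the mismatch Date is pointing at.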

He follows this up with the (more contrived but more obvious) example:

SELECT P.PNO
FROM P
WHERE P.CITY = P.CITY

The REAL world answer is the set of part numbers in the database. Using NULLs, however, you get nothing again.
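The same effect can be sketched with sqlite3 again (my choice of DBMS, not Date's). I've added a second part, P2, with a known city, which is not in the book's example, just to show that only the NULL row goes missing.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE P (PNO TEXT, CITY TEXT)")
# P1 is the part from the example; P2 is an extra row added for contrast.
con.executemany("INSERT INTO P VALUES (?, ?)", [("P1", None), ("P2", "Paris")])

# Every part number actually stored.
all_parts = [pno for (pno,) in con.execute("SELECT PNO FROM P ORDER BY PNO")]

# The "obviously true" condition: NULL = NULL is UNKNOWN, so P1 is filtered out.
self_equal = [
    pno
    for (pno,) in con.execute(
        "SELECT P.PNO FROM P WHERE P.CITY = P.CITY ORDER BY P.PNO"
    )
]
```

Here `all_parts` holds both part numbers, but `self_equal` silently loses P1.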

His point apparently is that it doesn't matter what NULL is being used for - the query should respond as it is in the real world.

His conclusion is that you can't trust a database with NULLs because you can't know what answers to what queries are correct in the real world.

Using nonexistent tuples to represent the above missing data, you end up with the same result - nothing is returned, because Part P1 has no city and thus isn't in the database at all. I don't know the answer to that except the one that Darwen proposes - Part P1 is in a Relation U with the header "City Unknown", not in Relation P at all. In that case, the only way to get the real world result Date wants is to have a join with Relation U which always returns TRUE and returns a value of "Unknown" - which means the city DOES exist but is unknown (if a city didn't exist, there would be no tuple in Relation P OR Relation U). And the value returned for that city is precisely the string "Unknown". That at least is a real value that can be represented in output.
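Here is a rough sketch of how I picture that Relation U arrangement, again using sqlite3. The table layout, the UNION, and the literal 'Unknown' are my own reading of the idea, not anything Darwen or Date spell out in this form.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE S (SNO TEXT NOT NULL, CITY TEXT NOT NULL);
    -- P holds only parts whose city is known.
    CREATE TABLE P (PNO TEXT NOT NULL, CITY TEXT NOT NULL);
    -- U holds parts that do have a city, but we don't know which one.
    CREATE TABLE U (PNO TEXT NOT NULL);
    INSERT INTO S VALUES ('S1', 'London');
    INSERT INTO U VALUES ('P1');
""")

# The first branch is Date's query over known cities; the second branch pairs
# every supplier with every "city unknown" part, since the condition holds
# for any real city, and reports the city as the plain string 'Unknown'.
pairs = con.execute("""
    SELECT S.SNO, P.PNO, P.CITY AS PART_CITY
    FROM S, P
    WHERE S.CITY <> P.CITY OR P.CITY <> 'Paris'
    UNION
    SELECT S.SNO, U.PNO, 'Unknown'
    FROM S, U
    ORDER BY 1, 2
""").fetchall()
```

No column is nullable here, and the S1-P1 pair still comes back, tagged with 'Unknown'. Note the second branch is only safe because this particular condition is true for every possible city; a different WHERE clause would need its own reasoning.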

It's a mess, all right. However, the important point is that Date believes that it is important to get the relational model correct, and THEN worry about the implementation. And he believes that the implementation could be much more sophisticated than present DBMSs are capable of - the prime example being the TransRelational Model implementation by Required Technologies. So the issues of performance required by having all these extra relations for "unknown" and the like would be minimized by a better physical model.

I don't know. Until a better DBMS is built, it's all irrelevant, I guess, since we're stuck with NULLs. All we can do is minimize their use and hope we don't have to use them too much to minimize performance issues with strict normalization.

As usual, the IT situation is depressing at best.

Thanks for the conversation, it has helped me at least consider the issues more clearly.

Re:One more point on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-26 12:09 · Score: 1

There may be no semantic difference between a missing value in a join and a NULL in the general case we're discussing, but it's possible that Date is referring to the overall issues of missing values AND the effect of NULLs in producing three-valued logic - which he demonstrates, in one case in the book, must necessarily produce an erroneous result - which presumably an entirely missing tuple wouldn't produce in the same situation.

In other words, if a thing isn't there, it isn't there. Whereas if "something" (a NULL) IS there, what exactly does it mean? And even if you know it means "unknown" (or "unavailable" which is basically the same thing in this case), in SOME queries, this might produce an erroneous result, according to Date's specific (if, as he admits, contrived) example.

So if you allow NULLs, you end up with a situation where the semantic result in MOST cases is identical to a missing tuple, but in SOME types of queries, you get erroneous results that a missing tuple method of handling missing data wouldn't provide. IF that's true, then it's a good argument for not using NULLs.
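To make the three-valued logic concrete, here is a small sketch in plain Python, using None to stand for UNKNOWN. The function names `ne3`, `or3` and `where_clause` are made up for this illustration; the truth rules are the usual SQL ones for comparison and OR.

```python
def ne3(a, b):
    """Three-valued 'a <> b': UNKNOWN (None) if either operand is missing."""
    if a is None or b is None:
        return None
    return a != b

def or3(x, y):
    """Three-valued OR: TRUE if either side is TRUE, else UNKNOWN if either is UNKNOWN."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def where_clause(s_city, p_city):
    """The WHERE condition from Date's example: S.CITY <> P.CITY OR P.CITY <> 'Paris'."""
    return or3(ne3(s_city, p_city), ne3(p_city, "Paris"))

# With the part's city missing, the whole condition is UNKNOWN...
with_null = where_clause("London", None)

# ...yet for any concrete city it is TRUE.
with_values = [where_clause("London", city) for city in ("Paris", "London", "Oslo")]
```

A row is only returned when the condition is TRUE, so the UNKNOWN case drops the row even though no real-world city could make the condition false.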

I haven't gone through that chapter in detail yet, so I can't say if that's the main issue or if that is what he's saying. Date left out a lot of stuff in discussing this due to space constraints in this small book.

He said that ONE argument against nulls was the erroneous result problem, and THEN in relation to using NULLs to handle missing data, he refers the reader to chapter seven, where he discusses the empty tuple and refers to another example illustrating its use. So I assume the two are related, but perhaps not directly. He also indicates there are other reasons he doesn't go into about both NULLs and missing data. So I'll need to look elsewhere for a more definitive discussion.

Re:GPL resistance? on Nessus 3.0 discussed · 2005-11-26 11:37 · Score: 2, Insightful

While I agree that OSS should be a two-way street, it doesn't require EVERYBODY using an OSS product to contribute to the project.

The idea that all users should be developers is nonsense. "Contributors", perhaps - "Here's a feature we'd like you to provide" - but even there, some people may use a product and be perfectly happy with what it does and not need anything else.

You can't say they can't use it just because they don't contribute to the project. That's just making a contract law substitution for a monopoly - "You can't use this unless we benefit directly." How is that different from the RIAA and MPAA wanting to license every possible meaning of fair use to produce revenue?

It's normal that humans do this - no human can possibly allow any other human to somehow profit from the first one's actions. It's just not human nature. But it's not rational and it doesn't work to the benefit of the species as a whole, and thus it doesn't work to the benefit of most individuals, due to the economic effects.

As for people developing services around the product that compete with the developer's own services, this is, as I pointed out, irrelevant to the OSS model. It's the BUSINESS model that matters here, not the development model. So he closes the source? So what? Just because he speeds up Nessus by a factor of five, does he think no one else will? If somebody forks version 2 and speeds it up by 5, his competitors can use that version to continue to compete with him. It's totally irrelevant whether the source is closed or not in that regard.

The OSS model did NOT "cost him" - his business model - or lack of one - is what cost him.

Re:Why NULL's exist on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-26 11:29 · Score: 1

Right, understood. Zip codes would always be in the address table, however that table is designed. Unless there is some possibility that a zip code would not be available because it didn't exist for that location (I'm talking in the US only here, not internationally) - I suppose that's possible.

Sort of like the problem with international addresses - there's such variation that designing an address database that cleanly handles any kind of address is an interesting exercise.

Re:open source != open source project on Nessus 3.0 discussed · 2005-11-26 11:10 · Score: 1

Excellent point.

As I say in posts elsewhere, the fact that supposedly nobody contributed to Nessus probably has a REASON behind it, and in any event is irrelevant to the decision to close the source. They're simply trying to say that it won't make any difference by closing the source since it was all theirs anyway - and that is not necessarily true.

The REAL reason is they can't figure out how to compete against people using their product without closing the source.

Re:I got some interesting results on my PC on Nessus 3.0 discussed · 2005-11-26 11:04 · Score: 2, Funny

Switch to Linux - I assume that was the last output Nessus put up on the screen before the PC left...

Re:End of the day, you don't eat good intentions on Nessus 3.0 discussed · 2005-11-26 11:02 · Score: 1

"but in the real world, there's cheques to sign."

Let's not forget also that in the real world, there are people who figure out how to get checks signed - and people who don't. That's true for proprietary software companies, too.

If you can't figure out how to make money from open source, the decision is usually to go closed source.

Says nothing about open source as a business model, really (especially since open source ISN'T a business model, it's a development model). Says lots about the decision maker.

Re:Seems simple enough... on Nessus 3.0 discussed · 2005-11-26 10:55 · Score: 1

I don't see how the next version of the GPL can "close" that "hole". And if it does, we're likely to see more proliferation of licenses than we have to date.

The idea that providing support for an OSS project independently of the project is against the OSS concept is just nonsense. The GPL is intended to ensure access to source code and prevent that source code from being appropriated by proprietary companies and closed. Nothing more. It says nothing and should say nothing about how money is made around OSS. While I assume Stallman would like to see everything "free as in beer", the GPL has recognized from the start that it isn't likely to happen and has never required a restriction on making money from OSS.

I CAN see using trademark law (or the equivalent license terms under a new GPL license - which would be better) to prevent someone from taking an OSS project and using the NAME to brand your own (possibly incompatible or even totally different) version. After all, that's what Linus is doing with the Linux trademark (and others are doing with other OSS projects.) The purpose of trademark law is to prevent consumers from being misled about a product and to prevent fraudulent representation of a product - both laudable goals in the marketplace and appropriate in the OSS model.

The problem will be how to specify in a license the conditions that establish that including an OSS product in your product and then using the name of that product is IN FACT being used to misrepresent your product. I'd say that's a tough one to construct. That's where legal proceedings or arbitration come in, usually - establishing the facts of a situation.

We don't want to err on the side of preventing people from incorporating an OSS product in their product - one advantage of OSS is to produce useful products that can be built on to produce more useful products. We don't want to throw the baby out with the bathwater because some people prefer to take an easy way out in building their business model around an OSS product, to the detriment of the original OSS product or its developer.

Re:Hold your horses on Nessus 3.0 discussed · 2005-11-26 10:42 · Score: 1

At the moment, I'm not saying that's not a good thing. It's good that the new version of Nessus will still be free (albeit with restrictions.) And of course there's nothing wrong with charging for support - that's not even an issue here.

I'm just saying the guy doesn't accept the OSS model anymore.

That's fine, but his reasons aren't reasons - they're either irrelevant or simply a refutation of the OSS model per se.

Re:GPL resistance? on Nessus 3.0 discussed · 2005-11-26 10:35 · Score: 3, Interesting

"there has been almost zero code contributed to Nessus, and that by providing its source, they were helping a lot of companies that could be classified as their competitors"

First, the two points are independent.

And the first one is almost irrelevant. Who cares if nobody contributes to your OSS project? That's irrelevant to anything. Naturally, due to the nature of the concept of OSS, it would be BETTER if a community of developers appears and supports the project - that's the advantage of OSS over proprietary. But it's not a requirement per se. In fact, however, it usually indicates that there is a REASON for this - which might be how the project is run, the technical difficulty of the project, the niche market for the project, or any number of things - some of which might be solvable, some may not.

The second point is just a refutation of the concept of OSS: instead of trying to make money from support or other business models using OSS, just dump the concept and go back to being proprietary. It's NOT A REASON, it's a CHOICE!

And again, it goes back to the what and how of the project. Does Linus complain that Sun uses Linux while producing OpenSolaris - arguably a "competitor"? Granted, Linus doesn't view himself as a "competitor" in business against Sun - he's simply a developer who wants to advance the state of the art in OS building.

The problem is, the Nessus guy does view himself as a competitor in a closed market. He wants to use Nessus to produce other security software and sell it. He views everybody else who uses Nessus to produce other security software to sell as "competitors". Well, they are - if that's your business model.

It's an issue of perception, however, not necessarily reality. It's also an issue of whether you feel you can BE competitive on a level playing field - obviously this guy doesn't.

That doesn't make his choice the right one - it's just his choice. I think it will cost him in the future.

Open source doesn't mean you don't have competitors. Every project stands or falls on its merits in the marketplace of ideas. That's why we have something like a thousand Linux distros - most of which are utterly irrelevant to most users and utterly irrelevant to the position of Linux in the marketplace of users.

And open source as a SOURCE of business models is not different. The question is whether you can develop a business model that allows you to make money - or even get "rich" (whatever "rich" means to you), if you're smart enough - and that's really not relevant to open source as a development model.

Some people deride open source as a bunch of geeks working for free while somebody else gets rich off their efforts. While this may in fact happen on occasion, it isn't a direct consequence of the OSS development model.
The only place where it might be an issue is in developing something that can be seized on by a company like Microsoft which ALREADY has a monopoly position due to its closed source model and its business practices and then turned against the OSS developer. The GPL was intended to prevent this by disallowing the incorporation of OSS software into a proprietary product and closing off access to the source.

But the GPL says nothing about somebody taking an OSS product, incorporating it into a different (preferably better) OSS product, and thus obsoleting the original OSS product. The OSS COMMUNITY says that you SHOULD return value to the original OSS product. But that doesn't always happen, nor should it always happen.

If you develop an OSS product, and try to make a business out of it, you should be smart enough to assume that other people will take your product and try to develop a business around it as well - and conduct yourself accordingly. If you believe in the OSS model, you can find ways to continue to develop using that model and still compete effectively.

The Nessus guy just doesn't believe in the OSS model, it's that simple.

Re:Why NULL's exist on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-26 10:09 · Score: 1

Usually, in the case of something like branch offices, you break that off into a separate table. For instance, at City College, we have several possible addresses for staff and students, and several possible email addresses. Both of these get their own table linked by the foreign key to the primary person table. While putting all the addresses in one tuple might be just as unambiguous and provide better performance, I think it's cleaner to separate them out, even though relational theory actually allows for such.
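As a sketch of what I mean by breaking addresses out, here is a toy schema in sqlite3. The table and column names and the sample rows are invented for illustration; they are not the actual City College schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- One row per address a person actually has; no row means no address,
    -- so no column ever needs to hold a NULL.
    CREATE TABLE address (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        kind      TEXT NOT NULL,
        street    TEXT NOT NULL,
        PRIMARY KEY (person_id, kind)
    );
    INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO address VALUES (1, 'home', '12 Elm St'), (1, 'office', '50 Main St');
""")

def address_kinds(person_id):
    """Return the kinds of address on file for one person, sorted by kind."""
    rows = con.execute(
        "SELECT kind FROM address WHERE person_id = ? ORDER BY kind",
        (person_id,),
    )
    return [kind for (kind,) in rows]

alice_kinds = address_kinds(1)   # two addresses on file
bob_kinds = address_kinds(2)     # none on file, and nothing stored as NULL
```

A person with several addresses gets several rows, and a person with none simply has no rows in `address`.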

As for zip code autolookup impacting responsiveness, I'd say that's irrelevant (or may be) because if you don't have the zip code, you're not going to be querying that record anyway (if a query depends on the zip code), and responsiveness is strictly an implementation issue anyway. Depending on how you do it, the impact could be really minimal.

It also argues for the practice of separating out attributes based on their probability of being known in advance as well. In other words, never put address in the primary tuple (physically) if it is subject to not being known. Only put attributes in the primary tuple that have to be known and are likely to be always known (name, for instance - if you don't have somebody's name, you really don't have anything at all).

I would agree that using NULLs for "optional" fields is not a good idea, either. I think Date discusses all these issues in his "Database in Depth" by referring to "empty tuples" - every subset of a tuple is a tuple, and it may be empty. That is, it may have a heading, but no values. And the best way to represent this is to break it out into its own relation which has no tuples. He shows this in the case of a supplier with a status and one without - the one without simply doesn't appear in the database.

In fact, a relation can even have an empty heading - he shows that this is equivalent to the concept zero in math and apparently is very important in relational algebra. He refers to two relations as Table_Dum and Table_Dee - relations of degree zero, of which only two exist. I'm still not sure I follow all this, but it indicates there are depths to this that I haven't plumbed yet.
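To get a feel for the two degree-zero relations, here is a small sketch that models a relation as a Python set of tuples. That modelling, and the helper names `product` and `project_onto_nothing`, are my own simplification (headings are ignored entirely), not Date's notation.

```python
# A relation of degree zero has no attributes, so its only possible tuple
# is the empty tuple. That leaves exactly two such relations.
TABLE_DUM = frozenset()        # no tuples at all
TABLE_DEE = frozenset({()})    # exactly one tuple, the empty one

def product(r1, r2):
    """Cartesian product of two relations with no attributes in common."""
    return frozenset(t1 + t2 for t1 in r1 for t2 in r2)

def project_onto_nothing(r):
    """Project a relation onto zero attributes."""
    return frozenset(() for _ in r)

suppliers = frozenset({("S1", "London"), ("S2", "Paris")})

with_dee = product(suppliers, TABLE_DEE)   # suppliers come back unchanged
with_dum = product(suppliers, TABLE_DUM)   # everything is wiped out
```

In this toy model TABLE_DEE acts as an identity for the product and TABLE_DUM as an annihilator, and projecting any relation onto nothing gives TABLE_DEE if it has tuples and TABLE_DUM if it doesn't, which reads a lot like "yes" and "no".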

I think Date's view of the absence of join values is summed up in the quote he uses when discussing 6NF, from Wittgenstein: "Whereof one cannot speak, thereon one must be silent." As Date puts it, "If you use nulls, you're effectively making the database state explicitly that there's something you don't know. But if you don't know something, it's much better to say nothing." So instead of showing a value with null, you don't show the value at all. As you say, it's effectively the same as using a NULL - but Date argues it's better since it doesn't introduce the logical ambiguities of NULLs in other areas.

Whereas you might not know WHY the value is not there, putting a NULL there doesn't help you solve that particular question. In my view, that goes back to the overall system design - if you have the right design as to capturing and vetting the data before it gets to the database, then you can find out from THAT why you don't have a value there. (And of course all THAT information can be captured in a database - just not the same database.)

Of course, all this depends on having a database that can handle the semantics of all this - and since most of them can't, we're back to using NULLs as a poor substitute.

I need to find time to finish reading "Database in Depth" - it's an excellent overview of these issues, although it's a bit terse due to the small size of the book (and I can't afford the $105 they charge for Date's huge college textbook on databases right now.)

Re:What do you even say to that? on MS Has Free Software Removed From U.N. Paper · 2005-11-25 21:56 · Score: 1

No, that was me.

Ballmer just wants to kill Google (and Linux, and OSS, and Sun, and IBM, and...probably Bill if Bill doesn't watch his back...)

I mean, "technological freedom is a political manifesto?"

As Jayne said in "Serenity", "Where's that git from?"

Also to paraphrase Jayne, "Microsoft is starting to seriously damage my calm!"

Re:Why NULL's exist on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-25 21:41 · Score: 1

I think it's acceptable to have multiple addresses in a record, at least if they're not all intended to be the same address TYPE (such as "home address").

I'm not sure I'd go for the idea that tables are just logically correlated attributes. In the entity concept, of course they are. If they aren't, I don't see the "logical correlation".

The zip code issue can be easily solved by an auto-lookup BEFORE having to put a NULL in the field, I'd say. There may be other attributes harder to come by, of course, so the principle might stand, but I'd still prefer to see this sort of thing handled by a properly designed data collection and vetting system that front-ends the database.

I'd agree that breaking out a column into its own table just because the data is missing is an extreme way to handle it - but my solution is not to let the data into the database at all if required data is missing.

Not every piece of data in a system needs to be in a database per se, however it is stored. You might store tuples with missing data in Oracle or wherever, as long as you don't treat it as part of the "real" database. Then NULLs don't matter. The issue is whether NULLs should be allowed in tables that are actually intended to be used for querying. If you have a customer with a missing zip code, leave that customer tuple in a "vetting database", rather than the "query database" - until you've retrieved the missing data. Makes for a nice clean solution - now you can query your heart out for any "missing" and "unknown" stats you want without worrying about mixing it up with "real" data...
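A minimal sketch of the vetting/query split, using sqlite3 with two tables standing in for the two databases. The names `vetting_customer`, `customer`, `submit` and `promote`, and the sample customers, are all hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Staging area: incomplete records are allowed to sit here.
    CREATE TABLE vetting_customer (name TEXT, zip TEXT);
    -- The "real" table used for querying: required data must be present.
    CREATE TABLE customer (name TEXT NOT NULL, zip TEXT NOT NULL);
""")

COMPLETE = "name IS NOT NULL AND zip IS NOT NULL"

def submit(name, zip_code):
    """Every incoming record lands in the vetting table first."""
    con.execute("INSERT INTO vetting_customer VALUES (?, ?)", (name, zip_code))

def promote():
    """Move fully vetted records into the query table; leave the rest behind."""
    con.execute(
        f"INSERT INTO customer SELECT name, zip FROM vetting_customer WHERE {COMPLETE}"
    )
    con.execute(f"DELETE FROM vetting_customer WHERE {COMPLETE}")
    con.commit()

submit("Acme", "94112")
submit("Globex", None)      # zip still missing
promote()

queryable = con.execute("SELECT name FROM customer").fetchall()
pending = con.execute("SELECT name FROM vetting_customer").fetchall()

# Trying to sneak a NULL straight into the query table is refused.
try:
    con.execute("INSERT INTO customer VALUES ('Initech', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The complete record ends up queryable, the incomplete one stays in vetting until somebody finds the zip code, and the NOT NULL constraints keep the query table clean.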

Also, I suppose there's nothing preventing tuples with missing attributes from being in a "virtual table" while still residing in whatever passes for a physical table. The queries just ignore them entirely, not at the NULL level but at the table level.

Re:Why NULL's exist on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-24 22:00 · Score: 1

I agree with you on the wage example. An employee really shouldn't have both a pay rate and a salary, so such a model is not correct on the face of it (weird business rules for some weird industry not considered here). NULLs don't help when the model is wrong; they make things worse. This is what I mean by reasoning backward - until the model represents reality perfectly, other issues - such as performance or even feasibility - shouldn't be considered, as they will introduce errors in the design. The reason TRM is considered so valuable by Date and Pascal is that it supposedly allows dealing with performance and feasibility issues without compromising the relational correctness of a model (or at least compromising it less than current DBMS systems do).

As I mention in my other response, I think the zip code issue is not so much "atomicity" as it is whether an attribute is a REQUIRED attribute or not. If it is required, it should NEVER be "unknown" - therefore NULLs are not necessary. (If a fact is not "available", that information should not be in the database at all - it should be on somebody's "To Do" list to find that fact.)
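Even existing SQL systems can enforce the "required means never unknown" rule at the schema level. A minimal sketch using Python's built-in sqlite3 module; the customer table and the try_insert helper are hypothetical, just for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both attributes are declared required; the CHECK also rejects an empty string.
conn.execute(
    "CREATE TABLE customer ("
    " name TEXT NOT NULL,"
    " zip  TEXT NOT NULL CHECK (length(zip) > 0))"
)

def try_insert(name, zip_code):
    """Return True if the row was accepted, False if the schema rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO customer VALUES (?, ?)", (name, zip_code))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Ann", "02139")  # complete tuple
rejected = try_insert("Bob", None)     # zip unknown, so it never gets in
```

The rejected tuple is exactly the one that belongs on somebody's "To Do" list rather than in the table.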

If it is NOT required, it shouldn't be in the model at all, since by definition you're not doing anything with that attribute but collecting it for vague and unclear reasons - which leads to the sort of data corruption we see. This is a business issue as much as it is a relational model issue.

Databases are supposed to be collections of consistent verifiably logical propositions about entities and their relationships. Anything that weakens that weakens the value of the database. This is why I suspect NULLs can be dispensed with. I just need to nail down the "why" so it can deal with ANY situation that might arise where it SEEMS that NULLs are a solution. Otherwise people will just keep coming up with "Well, what about THIS situation?" - requiring re-analyzing the thing from scratch to demonstrate that their situation really isn't special.

This is where Date and Pascal get frustrated - they can explain the issues until they're blue in the face, but the rest of the industry just says "Well, it won't work BECAUSE...it DOESN'T work with existing DBMS systems." To which Date and Pascal can only respond, "Duh! What did you expect from DBMS systems that DON'T implement the model correctly?"

Might be interesting if the open source Rel language project ever gets anywhere in properly implementing the model. Or if the TRM company ever gets off the ground.

Re:I might add on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-24 21:37 · Score: 1

I agree with both points of your addendum. "Not applicable" is the wrong way to think about a NULL by definition. And they do introduce ambiguity.

I'm not sure about your other points, but that's because I'm not yet familiar enough with the relational model. I don't argue for any particular implementation for that reason, yet.

I think the classic case of a missing zip code is fairly easy to examine. Obviously the issue is context - when the attribute is empty because the data has not been entered, it is both "unavailable" AND "unknown". Obviously any query against that attribute must ignore that tuple because it is not known whether it applies to the query proposition. This is the key point: a query is an attempt to state a logical proposition about the data set.

Marking it only as "unavailable" doesn't solve the problem; marking it as "unknown" does but still renders the tuple useless for queries related to that attribute. Putting an empty string in there may or may not be interpretable as either "unavailable" or "unknown" - that depends on the DBMS implementation and the business rules. The only real solution is either to fill in the missing attribute - or ignore that attribute - which is a business rule decision that is not relevant to the relational model per se.

What the relational model says is that you simply can't use a tuple marked as "unavailable" or "unknown" in a query that is stating a proposition about the set - unless that proposition is the simple one "is this attribute unknown?" NULLs are only useful to tell you which tuples belong to the set of unknowns - they can't tell you anything useful about any other query proposition.
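This is easy to see in SQL as it exists today. A small sketch with Python's built-in sqlite3 module (the table and the sample rows are invented for the example): the tuple with the NULL zip satisfies neither a proposition nor its negation, and only shows up when you ask "is it unknown?"

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, zip TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ann", "02139"), ("Bob", "90210"), ("Cy", None)],  # Cy's zip is unknown
)

def names(where):
    """Return the customer names matching a WHERE clause, sorted by name."""
    rows = conn.execute(f"SELECT name FROM customer WHERE {where} ORDER BY name")
    return [r[0] for r in rows]

in_zip = names("zip = '02139'")       # proposition: lives in 02139
not_in_zip = names("zip <> '02139'")  # its negation
unknown = names("zip IS NULL")        # the only question the NULL can answer
```

Cy appears in neither of the first two results, so the two "opposite" queries together do not cover the table.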

The question I guess devolves to whether we need to mark a tuple with an unknown attribute as "unknown" and keep it in the set - which in some sense violates the intention of the database to be a collection of facts about entities and relationships - or move that tuple out of the database altogether until it DOES represent a fact about the entities and relationships.

In the old days of flat files, you had batch updates. Each batch update was preceded by a vetting run that dumped out records with missing or incorrect data. In that sense, the only thing that was supposed to make it into the end result files was good data. No "unknowns" were allowed (in a good system design) because it was too hard to get rid of them later. If something was "unknown", somebody was assigned to MAKE it "known." Now, with databases and direct data acquisition, data gets dumped into the database without adequate vetting and with an overreliance on NULLs to solve the issue. It may be time to rethink that approach, particularly since it goes against the relational model concept that a database is a CONSISTENT set of verifiable logical propositions about entities and relationships (not necessarily true propositions in reality - but preferably so, otherwise why have the database at all?) NULLs violate that consistency requirement big time.

The business rule answer should be: if you don't know a fact about a required attribute, that tuple should not be in the set at all until you do know. And if the attribute is NOT required, it shouldn't be in the model at all. A zip code, for instance, should be REQUIRED - if you're bothering with an address at all. If you don't know it, you'd best find it out rather than dumping it into the database and trying to resolve it later by trying to select between tuples about which you know this attribute and ones where you don't.

This approach completely avoids the issue. Granted, it means strict application of business rules - and this is the rub, since nobody wants to do that. People want to have their cake and eat it, too - dump everything into the database, including unknown things, and then have the ability to extract correct and useful logical propositions about the entities and relationships involved.

Doesn't work. Typical human behavior - do the wrong thing, and hope for the right results.

Can I create an "anti-moron shield" with this? on Wireless Sensor Networks for Killing Mosquitoes · 2005-11-24 01:33 · Score: 1

I'll settle for spraying morons with bug spray, even.

I would like something to get the gnats out of my room, actually. Maybe it would help if I emptied the garbage (daily)?

Re:But what do you fill optional fields with? on MySQL to Counter Oracle's Purchase of InnoDB · 2005-11-24 01:27 · Score: 1

If your relational design is correct, you don't HAVE a maiden name for a male. Such a field makes no sense.

You're reasoning backwards from existing DBMS systems, not from relational theory.
