jadavis · Slashdot Mirror

Re:Harness the TIDE not waves on New Wave Power Research Rising Off Oregon Coast · 2007-12-09 05:36 · Score: 1

Either soln. provides simple, cheap power which is renewed

Really? It's cheap? What's the cost per kilowatt-hour?

Re:ORM still broken? on Ruby on Rails 2.0 is Done · 2007-12-08 14:00 · Score: 1

In one of the apps I wrote, a user has many email addresses and an email address has many users

CREATE TABLE user (username TEXT PRIMARY KEY, firstname TEXT, ...);
CREATE TABLE user_email (username TEXT REFERENCES user(username), email EMAIL, PRIMARY KEY (username, email));

(I assume there's a type "email" that is the domain of all valid email addresses, but that's a separate issue).

So what's wrong with that design? It seems to be in 5NF to me, according to your stated business rules. I don't know which two designs you're comparing, but the above 5NF design seems like the logical (and obvious) one to me.
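A quick way to check that design against the stated business rules is to load it into a throwaway database. This is only a sketch: it uses Python's sqlite3 module, drops the elided columns, substitutes plain TEXT for the assumed EMAIL type, and invents the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE user (username TEXT PRIMARY KEY, firstname TEXT);
    CREATE TABLE user_email (
        username TEXT REFERENCES user(username),
        email    TEXT,  -- stand-in for the assumed EMAIL domain
        PRIMARY KEY (username, email)
    );
""")
conn.executemany("INSERT INTO user VALUES (?, ?)",
                 [("alice", "Alice"), ("bob", "Bob")])
conn.executemany("INSERT INTO user_email VALUES (?, ?)", [
    ("alice", "team@example.com"),
    ("bob", "team@example.com"),    # same address, second user
    ("alice", "alice@example.com"), # same user, second address
])

def rejected(sql, params):
    """Return True if the DBMS refuses the statement."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

# The composite key refuses a repeated (username, email) pair ...
duplicate_pair_rejected = rejected(
    "INSERT INTO user_email VALUES (?, ?)", ("alice", "team@example.com"))
# ... and the foreign key refuses an address for a user that does not exist.
unknown_user_rejected = rejected(
    "INSERT INTO user_email VALUES (?, ?)", ("carol", "carol@example.com"))

sharers = [row[0] for row in conn.execute(
    "SELECT username FROM user_email WHERE email = ? ORDER BY username",
    ("team@example.com",))]
```

A UNIQUE constraint on email alone would reject the second team@example.com row, which contradicts the rule that an email address has many users.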

You say that a "correct" design must have a unique email column? Why?

Normalization is a formal process you undertake after formalizing your business rules. Some of those business rules are called "functional dependencies", and those are the primary business rules you care about during the normalization process. You never normalize in the abstract; without knowing the business rules, it's impossible.

And let me repeat: normalization is NOT a continuum. The highest normal form a relation satisfies is always one of: 1NF, 2NF, 3NF, BCNF, 4NF, 5NF, or 6NF. If it is in one of the later normal forms, it is also in all of the earlier ones (e.g. a 3NF relation is also in 1NF). So which is it?

Chances are, if your data is in 3NF, it's also in 5NF. In fact, chances are that many simple databases are in 5NF, because for simple databases, the 5NF is the obvious solution.

The formal process of normalization is useful because when the database is complex, the good design might not be so obvious, and the list of business rules might be too long to just use intuition and rules of thumb.

Re:ORM still broken? on Ruby on Rails 2.0 is Done · 2007-12-08 09:38 · Score: 1

My stance isn't as extreme as his ("database is just a big hash")

That's not extreme; in fact, that's the first thing that occurs to most people. And the first DBMSs were something relatively close to that, or they were hierarchical.

But people had problems with that kind of database. It made very few guarantees about data consistency, so that meant that it was very hard to rely on the results of any question you might want to ask. The consistency was left up to the application, which only sees a small subset of the data at any one time, so it was rare that any real consistency was enforced.

Furthermore, there were no operations that mapped directly to logical operations, so nobody had the benefit of formal logic when trying to make complex inferences about the data.

So, they invented relational databases.

Very often they're systems for displaying, storing and retrieving small to moderate amount of information (unless you're working on a really really big multi-million system).

It has nothing to do with the size of the data set. Even small datasets are usually much too large for humans to inspect manually.

It has much more to do with how important your data is, and with what kinds of objective questions you will need to answer from that data reliably and automatically.

I've developed a way to reduce the memory usage by 30%

Interesting, I'll take a look.

Re:ORM still broken? on Ruby on Rails 2.0 is Done · 2007-12-08 09:09 · Score: 1

I would word this more simply:

ActiveRecord uses the following equation: "save = commit". But "save" is fundamentally different from "commit". "Save" takes whatever the application state is and copies it to permanent storage, so that the application state can be restored later.

"Commit" means that you must reconcile whatever data you've collected with all of the data that you've previously collected, so that the data is consistent and has a formal meaning, from which you can later ask meaningful questions and get reliable answers.

Note that "save" is very context-sensitive. What version of the application are you restoring the state to? What assumptions surround the state information that may no longer be true?

Also note that applications can only see a small subset of the data, so cannot reconcile the current state with other (perhaps conflicting) information that has already been collected.
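One way to picture the "commit" side of that distinction, as a sketch only: the seat_booking table, its columns, and the two sessions are hypothetical, and SQLite (through Python's sqlite3 module) stands in for whatever DBMS sits behind the ORM. Each session's state is internally valid, but only the database can see that the second one conflicts with data already collected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE seat_booking (
        seat     TEXT PRIMARY KEY,   -- business rule: a seat is booked at most once
        customer TEXT NOT NULL
    )
""")

def commit_booking(seat, customer):
    """Try to reconcile one new fact with everything already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO seat_booking VALUES (?, ?)", (seat, customer))
        return True
    except sqlite3.IntegrityError:
        return False

# Two application sessions, each unaware of the other's state.
first = commit_booking("12A", "alice")
second = commit_booking("12A", "bob")  # conflicts with data this session never saw

rows = conn.execute("SELECT seat, customer FROM seat_booking").fetchall()
```

A plain "save" of the second session's state would have overwritten or duplicated the booking; the commit refuses it because it cannot be reconciled with the rule.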

I largely agree with your complaints; I think they all come from this fundamentally wrong equation made by ActiveRecord (and ORMs in general).

However, you say that:
"I am also not talking about extreme normalization. I am talking about basic normalization... e.g; up to say the Third Normal Form."

Show us a design that's in 3NF, and then show a different design based on the same business rules in 5NF. It's hard to think of an example, because most 3NF designs are 5NF. So what is "extreme normalization" then? It makes no sense at all.

Re:ORM still broken? on Ruby on Rails 2.0 is Done · 2007-12-08 08:53 · Score: 1

If, by normalizing stuff to the extreme, makes the application 6 times harder to write

Well, it certainly is clear that your "data theory knowledge is limited". What does "normalizing to the extreme" mean? There are 5 (or 6, if dealing with temporal data) widely recognized normal forms, each with a very formal definition.

Most well-designed databases are already in 5NF. If a relation is in 3NF, it's probably also in 5NF, unless you are dealing with multi-valued or cyclic dependencies, something that occurs rarely in practice.

And if it's not in 3NF, and you saw the design side by side with a 3NF database, you would probably agree that the 3NF was a better design. And it would probably be in 5NF, too.

So normalization just formalizes the concept of good design, so that when designs get very complicated, you can still assure yourself that you have a reasonably good design based on your business rules (constraints). The formal model for normalization is not even really needed for simple designs, because 3NF is obviously the good design.
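To make the side-by-side comparison concrete, here is a small sketch run through Python's sqlite3 module. The tables employee_flat, department, and employee, their columns, and the sample rows are all hypothetical; the assumed business rule is that each department sits in exactly one city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 3NF: dept_city depends on dept, which is not a key.
    CREATE TABLE employee_flat (emp_id INT PRIMARY KEY, dept TEXT, dept_city TEXT);

    -- 3NF: the "department is in city" fact is stored exactly once.
    CREATE TABLE department (dept TEXT PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE employee (emp_id INT PRIMARY KEY,
                           dept TEXT REFERENCES department(dept));
""")
conn.executemany("INSERT INTO employee_flat VALUES (?, ?, ?)",
                 [(1, "sales", "Portland"), (2, "sales", "Portland")])
conn.execute("INSERT INTO department VALUES ('sales', 'Portland')")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "sales"), (2, "sales")])

# The department moves, but the update only reaches one employee row.
conn.execute("UPDATE employee_flat SET dept_city = 'Salem' WHERE emp_id = 1")
cities_flat = conn.execute("""
    SELECT DISTINCT dept_city FROM employee_flat
    WHERE dept = 'sales' ORDER BY dept_city
""").fetchall()

# In the 3NF design there is only one row to change, so rows cannot disagree.
conn.execute("UPDATE department SET city = 'Salem' WHERE dept = 'sales'")
cities_3nf = conn.execute("""
    SELECT DISTINCT d.city
    FROM employee e JOIN department d ON d.dept = e.dept
    WHERE e.dept = 'sales'
""").fetchall()
```

The flat table now claims the sales department is in two cities at once, while the 3NF design cannot express that contradiction.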

The professor, however, took normalization to the extreme: he even introduced a 'numbers' table. So a 'phone_number' table has a one-to-many relationship with the 'numbers' table. (Or something like that; I can't remember the question completely.)

Huh? That makes no sense at all. Perhaps you should get a new professor, or read a good book. Any reasonable representation of the real world would see that "phone numbers" are their own domain (set of valid values). If it had particular meaning for your application, you might see a phone number as country_code + area_code + number, that is, 3 separate domains. For instance, maybe you want to know the country that the number applies to, and the general area the number applies to. But that's only if you actually want to maintain those mappings in order to extract the meaning from the input data.

There is a very fundamental difference between normalization and turning the data you have into consistent information that has meaning to computers (and therefore from which automated inferences can be made). A trivial example to illustrate the difference is:

CREATE TABLE stuff (id INT PRIMARY KEY, t TEXT);

That table is perfectly normalized (it's in 5NF). But the data has not been imbued with much meaning. You can't infer new, useful facts that you didn't already know. Or, at least you can't make those inferences in an automated way using formal logic. You could always look back at the code that inserted the data (which version of the code was that again?), and look at the data yourself, and manually infer enough about the data to answer your question. I prefer the automated way, personally.

It's up to you to decide what has meaning in your application. If the meaning of a phone number to you is "something you dial", you might not care about the country code. If you want your application to automatically dial phone numbers for you to conduct a customer survey, and need to sample from many regions, perhaps you care very much about the area code and country code.

Once you have identified those things that have meaning, you can distill them into more formal definitions, and construct your database in such a way that you can ask meaningful questions and get reliable results.
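As a rough illustration of the phone number case, the sketch below puts the stuff table next to a hypothetical phone table that treats country code, area code, and number as separate domains. The phone table, its column names, and the sample values are assumptions made up for the example, and SQLite via Python's sqlite3 module is used only for convenience.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stuff (id INT PRIMARY KEY, t TEXT);
    CREATE TABLE phone (
        country_code TEXT NOT NULL,
        area_code    TEXT NOT NULL,
        number       TEXT NOT NULL,
        PRIMARY KEY (country_code, area_code, number)
    );
""")
conn.executemany("INSERT INTO stuff VALUES (?, ?)", [
    (1, "+1 (503) 555-0100"), (2, "1-503-555-0101"), (3, "call 212 555 0102"),
])
conn.executemany("INSERT INTO phone VALUES (?, ?, ?)", [
    ("1", "503", "5550100"), ("1", "503", "5550101"), ("1", "212", "5550102"),
])

# With separate domains, "how many numbers per area code?" is a plain query.
per_area = conn.execute("""
    SELECT area_code, COUNT(*) FROM phone
    GROUP BY area_code ORDER BY area_code
""").fetchall()

# Against the free-text column the DBMS can only compare whole strings, so
# every row is its own group and the application has to parse t itself.
per_text = conn.execute("SELECT t, COUNT(*) FROM stuff GROUP BY t").fetchall()
```

Both tables are normalized; only the second one lets the database answer the regional sampling question on its own.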

Oh, and if you're designing applications that generate data that will be important to your company for a long time, don't be proud of your ignorance of data management.

Re:ORM still broken? on Ruby on Rails 2.0 is Done · 2007-12-08 08:17 · Score: 2, Interesting

Yup, 5th normal form is a tad much, but there is no reason to don't go 3rd.

"A tad much"? Either you're being facetious, or you have a profound misunderstanding of normalization.

In many cases, a relation in 3NF is in 5NF. In fact, you have to be somewhat creative to think of a practical example of something in 3NF but not in 5NF.

"Normalization" is highly misunderstood. There is no such thing as "extreme normalization". If you see an example of something in 2NF, and then see a 3NF design side by side, you would probably think to yourself "yeah, the 3NF design looks better to me".

Re:I noticed the lack of theory in the ToC on Head First SQL · 2007-11-25 06:42 · Score: 1

BTW, one of my latest approaches is a methodology I call SODA (Service Oriented Database Architecture), which suggests that the object model of the application and the relational database should be designed as a loosely coupled system,

That is effectively moving the entire "model" part of MVC into the database. That's not a bad approach, and there are a lot of reasons I like it. I think you still need to offer a way to write arbitrary read-only SQL queries in the application; that's just too valuable to give up.

What you gain from SODA can usually also be gained by defining tables and their constraints well, and just leaving the model in the application.

Re:I noticed the lack of theory in the ToC on Head First SQL · 2007-11-21 18:00 · Score: 1

The goal ought to be to optimize time and expenses across the entire software lifecycle rather than cutting down on the most important places where time gets spent (on the design).

Agreed 100%.

I'd like to add that the data the application is generating is often very valuable to the business. The costs of a bad database design are not apparent until you actually try to read the data and decipher some kind of meaning from the data as a whole. Often, the database design is so bad that the information is simply never extracted, because the data is too meaningless.

There's a natural tug-of-war between those who want to get the release out the door, and those who want to be able to make use of the data it generates later on. It's much easier to simply insert whatever data the application is given than to detect that it is wrong, throw an error, and allow the user to correct it. But that's exactly what should be done with data in order to report on it effectively later.
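A minimal sketch of the "detect, throw an error, let the user correct it" path, assuming a hypothetical survey_response table with an age column and using SQLite through Python's sqlite3 module. The constraint and its bounds are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE survey_response (
        respondent TEXT PRIMARY KEY,
        age        INTEGER NOT NULL CHECK (age BETWEEN 0 AND 150)
    )
""")

def record(respondent, age):
    """Insert a response, or return the error text so the user can fix the input."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO survey_response VALUES (?, ?)",
                         (respondent, age))
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

ok = record("r1", 34)
bad = record("r2", -5)  # refused at write time instead of polluting later reports
stored = conn.execute("SELECT respondent, age FROM survey_response").fetchall()
```

The application can show the returned error to the user right away, which is far cheaper than finding a negative age in a report a year later.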

Re:The usual post on Japan to Start Fingerprinting Foreign Travelers · 2007-11-21 04:39 · Score: 1

Isn't that what Google does?

Yes.

Centralization does have advantages for some types of things, like finding information. But it's bad for things like the economy, privacy, and freedom.

And you're right: when they try to gain centralized power, they will bundle it with convenience.

Re:The usual post on Japan to Start Fingerprinting Foreign Travelers · 2007-11-18 18:50 · Score: 1

In all seriousness, it's not the fingerprints themselves that worry privacy advocates.

The worry comes from creating a solid link between your fingerprint and the rest of the data that makes up your identity, and also from the centralization. Imagine if everyone had access to the central database.

Re:Science! on MIT Students Show How the Inca Leapt Canyons · 2007-11-17 19:02 · Score: 1

Don't give the Europeans credit for Gunpowder. Poor choice for the example.

It's not about who invented the gunpowder.

Europeans in the Americas had huge advantages because they made use of the inventions and discoveries made by a large number of people over a long period of time, whereas American natives were much more isolated. Even within the Americas, discoveries didn't move around as much, because the Americas are mostly North/South and discoveries move more easily along similar latitudes (because of climate).

Re:solution on First Use of RIPA to Demand Encryption Keys · 2007-11-15 05:14 · Score: 1

Because private companies are the pinnacle of competence and government is the pit of deepest stupidity.

Hah.... it's funny because it's true.

Re:Sun also releasing Xen-based virtualization on Oracle Is Latest To Take On VMware · 2007-11-13 12:28 · Score: 1

Oh, oops, my comment settings made it appear you were replying to someone else. Whoops.

Re:+5, Gets the relational model on Ask Database Guru Brian Aker · 2007-11-13 06:56 · Score: 1

Oh, just that by using instance variables your functions have destructive side-effects

Ok. I am just getting into some functional programming myself (OCaml), so I was curious what exactly you meant.

Re:Sun also releasing Xen-based virtualization on Oracle Is Latest To Take On VMware · 2007-11-13 06:03 · Score: 1

Even if the post was written by a shill for Sun*, it's certainly not "spam". "Spam" is completely non-targeted marketing, while the post in question is about as targeted as marketing gets.

Also, he was pointing to OpenSolaris, which is free of charge. You didn't call posts about Red Hat's products "spam", even though both Red Hat and Sun are open source oriented companies, and all the posts in question are about their open source offerings.

Re:Unbreakable Xen on Oracle Is Latest To Take On VMware · 2007-11-13 05:55 · Score: 1

Oracle has lost a hell of a lot of real money to open source

Not due to Linux, nor virtualization. In fact they have probably gained a lot from Linux, and why not? The less someone spends on their operating systems, the more they can spend on Oracle licenses.

They may have lost a few sales to MySQL and PostgreSQL, but that's no reason to attack Linux or Xen.

Re:tradeoffs between InnoDB and PG storage on Ask Database Guru Brian Aker · 2007-11-13 05:48 · Score: 1

That makes me think that designs that use heavy locking are losing to designs that use row versioning.

Re:Data Truncation on Ask Database Guru Brian Aker · 2007-11-13 05:34 · Score: 1

No, which is why you set SQL_MODE to something sane

It would be interesting to know which of these have more application support:
(a) MySQL configured sanely
(b) PostgreSQL

I just recently encountered a large application with only simple database access, and it worked with MyISAM but not with InnoDB.

Re:+5, Gets the relational model on Ask Database Guru Brian Aker · 2007-11-13 05:29 · Score: 1

when you consider the implications of instance variables

What are the implications you're referring to?

It sure as hell isn't OO, and it doesn't fit 1-1 with their code written in the One True Programming Style, and therefore is hacky and outdated.

I encounter this perception all the time.

tradeoffs between InnoDB and PG storage on Ask Database Guru Brian Aker · 2007-11-12 10:57 · Score: 1

If I understand correctly, InnoDB, Oracle, and PostgreSQL storage models all use multiple row versions, so what are the tradeoffs? How do these tradeoffs explain some of the performance differences, such as concurrent performance and serial performance? Stability of performance versus erratic performance? How do they affect maintenance and performance stability over time?

InnoDB and Oracle both use rollback segments (I may be mistaken here), while PostgreSQL uses non-overwriting storage and reclaims it later. What's your opinion of the two approaches?

Why do databases like SQL Server and DB2 still use heavy locking rather than multiple row versions (I may be mistaken here)? Is that an antiquated design, or does it still have potential?

Where can I find more detail about this information?

Re:Hmmm on The Science Education Myth · 2007-10-27 04:32 · Score: 1

So apply this to labor markets and see what you end up with.

Ok.

In an ideal market, every job is filled (no labor shortage)

Right. If a company really wanted to hire someone, they'd increase the wage until they had the employees. But they don't really want to pay the cost of the employee, so the job is not really "open". I have plenty of jobs "open" for someone willing to work for a penny an hour, but obviously I don't really want those people, because a penny an hour is less than their market value. What you are describing is that labor is scarce (and by definition, every good or service that is part of the economy is scarce).

"Shortage" has a specific economic definition. "Shortage" has a different definition from "scarcity".

Look at http://en.wikipedia.org/wiki/Economic_shortage:

"In the economic use of "shortage", however, the affordability of a good for the majority of people is not an issue: If people wish to have a certain good but cannot afford to pay the market price, their wish is not counted as part of demand."

and nobody is unemployed (no labor surplus)

There is a labor surplus, but only because of a price floor on wages (aka minimum wage). That means that there's a surplus of people that provide less value than that price floor, and so they are unemployed. But a price floor is not a free market.

Re:Hmmm on The Science Education Myth · 2007-10-26 03:40 · Score: 1

It was obvious to me that the market could not support additional workers. If we needed more Sci/Eng experts, then there should be plenty of, say, $70+k jobs for holders of advanced degrees (esp ones in this super-hyped nanotech field).

This doesn't make any sense at all. You and the summary both make the claim that there are more engineers than demand, but that shows a fundamental misunderstanding of economics.

In a free market, there is generally not a surplus or a shortage of anything. As the supply increases, the price might fall, but that does not mean there's a surplus. We don't "need" anything or anyone; everything is a trade. If we have more people with skill X and the demand curve remains the same, the price to hire someone with skill X will fall.
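A toy worked example of that last point, assuming straight-line curves: employers want a - b*w workers at wage w, and c + d*w workers are willing to work at that wage. The curves and every number below are invented purely for illustration.

```python
def equilibrium(a, b, c, d):
    """Wage and headcount where demand a - b*w meets supply c + d*w."""
    wage = (a - c) / (b + d)
    return wage, a - b * wage

# Same demand curve both times; only the supply curve shifts outward.
before = equilibrium(a=1000, b=5, c=100, d=4)
after = equilibrium(a=1000, b=5, c=280, d=4)   # more people with skill X

# Headcount that suppliers offer at the new wage, for comparison with demand.
supplied_after = 280 + 4 * after[0]
```

With these made-up numbers the wage drops from 100 to 80 while employment rises from 500 to 600, and supply still equals demand at the new wage, so there is neither a surplus nor a shortage.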

If I didn't know better I'd think the "Gathering Storm" thing was a ruse to depress wages.

That's exactly right. Just like when a corporation says we "need" more Mexican immigrants or we "need" more H1B visas, it's all for one reason: they want to get the people cheaper. It's simple economics. Any company can find an engineer if they really want one, but they don't want to pay the high costs.

Re:This smacks of bullshit... on Web Accessibility Gets a Boost In California Court · 2007-10-15 18:32 · Score: 1

It was both private industry, and governmental policy. Governmental policy allowed it, but it was certainly in practice in private businesses. Governmental policy, especially federal policy, eventually forbade it, primarily because it could be shown that the facilities were rarely if ever equal.

Reading Wikipedia on Rosa Parks' story, it appears that the bus was segregated because of racial segregation laws. It also appears that the bus was part of the public transit system.

The Montgomery Bus Boycott was a political and social protest campaign started in 1955 in Montgomery, Alabama, intended to oppose the city's policy of racial segregation on its public transit system. ...
Pressure increased across the country, and on June 4, 1956, the federal district court ruled that Alabama's racial segregation laws for buses were unconstitutional.

-- http://en.wikipedia.org/wiki/Montgomery_Bus_Boycott

"Separate but equal" was exclusively a governmental policy, as far as I can tell. Please show me a reference to the contrary. It can be any credible source that actually uses the words "separate but equal" referring to a private business's policy. Please avoid the obscure; if it's really as common as you say, you should be able to find several mainstream sources easily.

I understand the theory you're trying to present, and it's not a new one, and I don't think I'm missing your point at all.

However, you should at least acknowledge that a private business not doing something is very different from the government imposing laws that enforce discrimination.

Re:This smacks of bullshit... on Web Accessibility Gets a Boost In California Court · 2007-10-15 03:36 · Score: 1

Available buses and shuttles rarely go to every shop in a neighborhood, and even if they do, a Target or other major store may be the only ones open in a neighborhood at late hours, or with certain products.

You missed my point. These things do not apply at all to an online store.

And no, "separate but equal" covered quite a lot more than government service.

I didn't say discrimination did not take place in private establishments. But the "separate but equal" doctrine was indeed a government policy, not a private one. And it was considerably more evil, because the government always has a complete monopoly, whereas in the private sector it's at least possible that people can take their business elsewhere.

Re:This smacks of bullshit... on Web Accessibility Gets a Boost In California Court · 2007-10-15 03:02 · Score: 1

Goodness. Next thing you know, you'll have us being "separate but equal".

There's a very big difference between "separate but equal" and forced accessibility of stores. Do you know what the difference is?

The difference is that "separate but equal" referred to government services.

You're making a comparison to neighborhoods as though Target is the only option. But this is the internet; blind people have many options on the internet.

Comments · 1,994