kpharmer · Slashdot Mirror

Re:Programming lesson 101 on Phillip Greenspun: Java == SUV · 2003-09-22 02:10 · Score: 1

SQL Generator? And it handles subselects, inline views, grouping, multiple outer joins, and functions?

Or - is that too difficult to generate, forcing you to make up for this thru additional application code and reduced application functionality?

If it's anything like any other sql generator, then it's the latter. That's great for trivial applications by amateur or otherwise junior programmers. However, robust and powerful applications typically require SQL that uses the above capabilities. Time to take a SQL 101 class.
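For anyone wondering what that kind of SQL looks like, here's a minimal sketch using Python's built-in sqlite3 module. The tables and data are made up for illustration. One statement combines an inline view, an outer join, grouping, functions, and a subselect:

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'acme'), (2, 'globex'), (3, 'initech');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 500.0);
""")

# Customers whose order total is below the average order amount,
# including customers that have no orders at all.
query = """
    SELECT c.name,
           COALESCE(t.total, 0) AS total              -- function
      FROM customers c
      LEFT OUTER JOIN (SELECT customer_id,            -- inline view
                              SUM(amount) AS total
                         FROM orders
                        GROUP BY customer_id) t       -- grouping
        ON t.customer_id = c.id                       -- outer join
     WHERE COALESCE(t.total, 0) <
           (SELECT AVG(amount) FROM orders)           -- subselect
     ORDER BY c.name
"""
rows = conn.execute(query).fetchall()
for name, total in rows:
    print(name, total)
```

A generator that can't emit something like this leaves you stitching the same answer together in application code.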

Re:WTF!! on Mandrake Linux 9.2, Adware Version · 2003-09-12 11:20 · Score: 1

The great thing about an oversight (like not noticing that adfree mechanism was available on slashdot) is that $20 can easily remedy the problem.

The problem with neurological disorders like Tourette's Syndrome is that many are incurable. Still, if you're looking for help: http://www.tsa-usa.org/

Re:WTF!! on Mandrake Linux 9.2, Adware Version · 2003-09-12 06:36 · Score: 1

Don't personalize it buddy, no one is saying that folks who do watch tv, etc are inadequate, have small penises, etc.

Just in case it keeps you awake at night - when slashdot has a non-advertising alternative (like Mandrake has in RH, SUSE, etc) I'll be gone.

Re:Quick question on Mandrake Linux 9.2, Adware Version · 2003-09-12 04:35 · Score: 1

>>Needless to say, I won't consider use of an >>advertising-supported product.

>Would you consider buying the CDs straight from >Mandrake (or from your local enlightened software >shop), or does the taint of advertising still >cling to all Mandrake products? Cause that would >be a pretty extreme situation.

Depends on how obnoxious the advertising is. I don't mind paying for the product, but also want to be able to install the free version for certain machines: those seldom used, given to friends, etc.

> Incidentally, I agree with you about avoiding
> TV, commercial radio, commercial-heavy
> magazines, etc. I see television maybe once
> every month or two if I'm in a bar or something,
> and every time I'm re-surprised by how crass the
> consumerism is - both the overt ads and the
> embedded ones in programs (plus the pervasive
> 'owning expensive new stuff will make you happy'
> attitude that acts like a sort of generic ad for
> stuff).

Yeah, it's amazing how much your outlook on life can change when you cut that garbage out.

Re:WTF!! on Mandrake Linux 9.2, Adware Version · 2003-09-12 03:44 · Score: 2, Insightful

Just because advertising is ubiquitous doesn't mean that it's mentally healthy to listen to messages telling you / manipulating you into believing that life would be better if you spent money on their widget.

Nor does the ubiquity mean that lies and exaggerations aren't deceptions. They are. It's ugly.

Life is *far* simpler without TV, without commercial radio, and away from billboard-infested roads. Try living that way for a while - you find yourself far less defined by what you own.

Needless to say, I won't consider use of an advertising-supported product.

Re:DB2 ICE sets TPC-H performance standard on Linu on Open Source Database Clusters? · 2003-09-12 03:16 · Score: 1

Right now all of my servers are running db2 8.1.0 & 8.1.2 on aix 5.1. Reliability is great, performance is fine. These are all stand-alone 2-6 CPU boxes without extended memory, interpartition-parallelism, etc - so they have minimal OS dependencies.

I'm setting up our RH7.3 servers next week. I think we'll be fine - by avoiding certain features, and having the flexibility of designing our own application.

Good luck

Re:DB2 ICE sets TPC-H performance standard on Linu on Open Source Database Clusters? · 2003-09-11 14:20 · Score: 2, Informative

Ouch, sounds like you should have gotten an experienced dba to set it up for you. DB2's too complex to go with simple defaults, and clustering is definitely a high-skills endeavor.

As far as insert loads go, we've seen 500 rows / second on five year old hardware without any problems. Although that's far short of what DB2 is capable of, it's fine for a sustained load. Beyond that batch loads hit 15,000 rows per second easily on the same box.

And as far as pricing goes, today you could get DB2 Express for those little dual-cpu boxes for just $500. A really fast four-way will cost you $32,000 - still way shy of $100k. You don't need to hit that kind of pricing unless you're doing inter-partition parallelism. And as I mentioned above - that's just not worth doing unless you've got the right skills to pull it off.

Right tool for the job? on Open Source Database Clusters? · 2003-09-11 14:06 · Score: 2, Interesting

> The right tool for the job people

Right, and a myopic application of the above advice would lead to a dozen different database products in a typical department. They'd all be the right tool for some job - unless you're hoping to reuse skills, reuse backup solutions (TRM for DB2, Veritas for Oracle, etc), have any hope of reliable integration, etc.

So, yeah - get the right tool for the job. But before you write that out you need to take a big step back and get a sense of what your strategic direction is, and what all the implications of such a decision are.

I know a lot of folks converting mysql to other solutions right now - because some junior guy figured it was the best solution. It might have been for the app - but it wasn't for the department. Which is like winning a battle but losing the war.

Re:Postgre sucks! on PostgreSQL Inc. Open Sources Replication Solution · 2003-08-29 08:10 · Score: 1

> I don't see a good reason to put data integrity
> logic (beyond transactions) in the database and I
> do see a good reason (db platform independence) to
> put it in the application.

Been doing this for over half my life now - relational databases have been around since 1980, and haven't really changed that much. Meanwhile I've seen a wide variety of languages come and go. The reality is that the application is far more volatile than the database.

So, you're far more likely to want to add a .NET application to your oracle database currently front-ended with Java than you are to swap the oracle database to sql server. And when you add that second application - you'll almost guarantee inconsistent data.

Stored Procedures aren't very portable, but then again simple check constraints and simple procedures (consisting mostly of sql) are. Easily built, easily ported, and allow n-number of applications front-ends.

That's the way to go if you want to see your data survive technology changes.
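A minimal sketch of the check-constraint point, using Python's built-in sqlite3 module. The accounts table and the withdraw function are hypothetical stand-ins for any front end. The rule lives in the table definition, so every application that connects gets the same answer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The integrity rule is declared once, in the database (hypothetical table).
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance NUMERIC NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

def withdraw(conn, account_id, amount):
    """Stand-in for any front end. Returns True if the database accepted the change."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False

ok = withdraw(conn, 1, 60)        # 100 -> 40, accepted
rejected = withdraw(conn, 1, 60)  # would be -20, refused by the database
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(ok, rejected, balance)
```

A second application written in another language would hit the same constraint without having to reimplement it.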

Re:IANADBA (a dba's opinion) on PostgreSQL Inc. Open Sources Replication Solution · 2003-08-29 02:37 · Score: 3, Interesting

I've got about 17 years experience with RDBMS', covering oracle, db2, postgresql, sql server, etc. So....

Postgresql looks to me like it's better positioned to eat SQL Server's lunch than oracle's. First off, back in the day (3 years ago), when oracle was licensing by 'power unit' - it cost about $100 / CPU / MHz. So - a single CPU license for a 1-GHz machine would set you back $100k! Since then they've had to drop prices - because of the market, and because of DB2 - which is far less expensive.

Anyhow - if you're going to invest a cool million or two in a top-end, enterprise box - a Sun E10K or an IBM Regatta - then you typically don't rig up some cheap AC solution, use surplus wiring, or put a free database on it. You put Oracle or DB2 on it. Sure - Postgresql is a fine database and it'll save you some bucks, but when you're putting your reputation on the line and have built a business case that justifies (say) $2m in hardware, and probably $4m in labor - it's foolish to try to save $1m on oracle by going with postgresql.

Without *any* support for parallelism, without stronger replication, without better 3rd-party support (think of Toad for instance), without thousands of experienced developers & dbas out there, without more robust availability functionality - it simply isn't ready to tackle the biggest projects. Or those projects with extremely high availability requirements. Or relatively large reporting projects (no parallelism). Or just about any projects in a really dedicated single-vendor shop with its act together.

But that's ok - that leaves 30-50% or so of the other database work that it can do just fine right now. That's a huge market. And unlike mysql - if you construct a database application using postgresql and then later want to port it to oracle, db2, or sybase - it's just a normal porting of the application. You construct the architecture in just about the exact same way (for most applications anyhow), so the porting is straight-forward. Not so in the case of mysql - where its severe limitations result in applications doing a ton of the database work - and porting ends up being a complete rewrite.

try reading the article on SuSE CEO's Two-Distro World · 2003-08-21 10:45 · Score: 5, Insightful

quote was taken out of context - SuSE's just saying that corporate IT is focusing on just two distributions.

Don't know about you - but I see very few other distributions out there on corporate boxes...

Re:My experience... on What Is the Future of Business Intelligence? · 2003-04-19 12:52 · Score: 1

You're right - these companies exaggerate the simplicity of the implementation of these products.

However, this is due to the fact that they push products - not processes. The technology isn't the challenge, and the products are seldom the answer - the answer is a highly iterative process, BI & DW best practices, a staff that can handle basic numbers, and a culture that encourages questioning.

I've been doing BI for ten years now and have seen quite a few substantial successes. It's also one of the reasons that companies have avoided the huge inventory problems that occur in most recessions - they have been better able to understand exactly what their sales & supply picture looked like than ever before.

So, yeah - it can be difficult, especially for those who buy all the 'marketecture' crap. But, for those that see management information as a requirement to process management - it's no less essential than the gauges on a car.

Re:The company I worked for already tried this... on What Is the Future of Business Intelligence? · 2003-04-19 12:36 · Score: 1

That's like complaining that car owners waste too much time making sure that the sensors are working right on their cars so that their dashboard gauges work right.

Yeah, it's *completely* ok that that time is spent to ensure that execs have good information on what's going on in the business.

Of course, if they are measuring the wrong thing, or don't understand that one metric can't fully represent how a process is working - then it's more of a matter of being info-literate than it is about having a useless tool.

Review of the Projections on What Is the Future of Business Intelligence? · 2003-04-19 08:55 · Score: 2, Insightful

Unfortunately, few in the slashdot community are familiar with this segment of our industry. Even fewer appear to be encumbered by this lack of knowledge.

The terminology and concepts referred to in these articles are mostly old hat, and anyone who's good and has experience with:
- decision support systems (DSS)
- business intelligence (BI - similar to DSS)
- data warehousing
- ods
- data marts
- reporting
- balanced score-cards
- data mining
- personalization
- SPC
- management science
should be familiar with all of them. Even some folks who've implemented BI components within large ERP & CRM applications should be familiar with them.

None of the projections are revolutionary - and none appear terribly insightful. Let's walk thru them one at a time:

1. In five years 100m people will use visualization tools almost daily: can't speak to the numbers, but I would be surprised if a majority of computer-users aren't using analytic technology daily - without even realizing it. As far as visualization goes, we're starting to enter the 'dancing-dog' phase of visualization - when the technology is over-applied without any thought of the usability or business impacts. So, yeah - we might see quite a lot of use, but I don't think we'll see nearly so much successful use of it.

2. BI will save $200 billion a year: perhaps, but I doubt that enough users are sufficiently info-literate (not computer-literate!) to pull this one off. Still, even with the primitive skills that people have in this area, BI can make efficiency improvements.

3. In 2-3 years quarterly-adjustments will be ditched in favor of real-time ones. The use of real-time analytics is increasing, though slowly. Micro-adjustments in pricing are only slowly being introduced; anything larger will continue to be adjusted on a quarterly basis - since it involves organizational changes - and people can't sustain real-time changes.

4. In 5 years BI & data mining terms will disappear: this is the one projection that I haven't heard before, and it seems the least likely. Both are essential prerequisites to embedding analytics in applications - since they help identify the rules, algorithms, etc to be used real-time. BI is also useful along-side analytical applications - since it allows you to measure what the real-time app is up to.

My own predictions? Analytics are definitely going to become more embedded into applications. More importantly however - people will become more accustomed to, and more comfortable with the basic concepts. And that's the real pre-requisite to making progress here. After all, the challenges to making better decisions based upon quantitative methods aren't technological - they're social. You need people who are info-literate, people who care, and organizations willing to question themselves. *That's* the real challenge!

Inverted Dilbert on What Is the Future of Business Intelligence? · 2003-04-19 07:01 · Score: 1

Gotta love when techies insist that terms are nonsense because they don't recognize them:
-Visualization
-Business Intelligence
-Executive Dashboards
-Balanced Scorecard
These terms have been around for at least five years - and refer to how highly-enriched analytical information is delivered from data warehouses and other analytical applications. Nothing in the article was revolutionary.

Just because you don't hear these terms when knocking out Apache/PHP/Mysql websites doesn't mean they aren't legitimate.

Score another niche for python on Python in a Nutshell · 2003-04-16 06:10 · Score: 1

Too Slow? Depends on your application.

I use it for ETL (Extract/Transform/Load) in data warehouse environments where millions of rows (and sometimes hundreds of millions) have to be transformed daily. These transformations range from simple text substitution to binary searches on large arrays. In this environment python performs adequately. More importantly, however, it allows these applications to be developed quickly, tested quickly and maintained easily.
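To give a flavor of those two kinds of transformation, here's a minimal sketch using only the standard library. The band boundaries, region names, and sample rows are invented for illustration and aren't from any real warehouse:

```python
import bisect

# Hypothetical sorted lookup array: the lower bound of each code band,
# plus the region assigned to that band.
band_starts = [0, 20000, 40000, 60000, 80000]
band_regions = ["band_a", "band_b", "band_c", "band_d", "band_e"]

def region_for(code):
    """Binary search: find the last band whose lower bound is <= code."""
    return band_regions[bisect.bisect_right(band_starts, code) - 1]

def transform(row):
    """Clean up one extracted row: text substitution plus a range lookup."""
    name, code = row
    clean_name = name.strip().replace("&", "and").lower()
    return (clean_name, region_for(int(code)))

extracted = [(" Smith & Sons ", "10001"), ("ACME", "94105")]
loaded = [transform(row) for row in extracted]
print(loaded)
```

With a lookup array of a few million entries the bisect call still only needs a couple dozen comparisons per row, which is a big part of why this stays fast enough.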

Back in the day, these applications would have been written in C. Today a few are, or a few pieces of a larger python framework are, but given the cost of hardware it's simply cheaper to just buy a slightly larger box than spend another few weeks on development.

I've used a wide variety of languages and vendor products for this activity, so far python has proven to be the best solution by far.

Re:OO databases are an evolutionary step...backwar on Object Prevalence: Get Rid of Your Database? · 2003-03-03 04:11 · Score: 2, Informative

Right, the OODBMS is acceptable - as long as you reject the need to query across objects.

Of course, there's nothing *relational* about the need to do this; it doesn't have anything to do with the mis-application of methodologies.

These are simple use-cases. And to reject them is to limit the functionality that the solution will offer. That's fine.

But, almost everyone needs the ability to identify all objects with attribute X. It's called a report, and it provides you with the information needed to manage the process. Without this ability you're driving in the dark without headlights.

Encapsulation Drawbacks on Object Prevalence: Get Rid of Your Database? · 2003-03-03 02:30 · Score: 1

One of the seldom-admitted downsides to all OO data persistence strategies is that private or encapsulated data still needs to be seen by external applications - in order for them to get the info necessary to manage the process.

In simpler terms - your data can stay encapsulated until someone needs to view trends, exceptions, stats across all objects. This is typically called reporting.

So, keep the data perfectly encapsulated if you will. But when the time comes to answer questions like: "I just want to know if everything is running ok" - then you've got to share it. Since time-series analysis is the basic way we think about process management ("is it working as well as it was last month?") that means that you should keep historical objects as well. Of course, every one of these reports might have to interface with millions of objects, rather than simply and directly reading millions of rows. So, don't expect it to be fast at all in this scenario.

Note too that since reporting is typically a highly-iterative activity, it is desirable to make it easily-implemented by report developers, users, or others. Although you can certainly knock out some of your core reports in java, it wouldn't be wise to limit the reporting capabilities to java developers.
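As a rough illustration of the "is it working as well as it was last month?" question, here's a sketch with Python's built-in sqlite3 module. The job_runs table and its numbers are hypothetical. When the history sits in rows, the report is a few lines of SQL that a non-java person could write:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical history table: one row per run of some process.
conn.executescript("""
    CREATE TABLE job_runs (job TEXT, run_month TEXT, seconds REAL);
    INSERT INTO job_runs VALUES
        ('load', '2003-01', 100), ('load', '2003-01', 120),
        ('load', '2003-02', 150), ('load', '2003-02', 170);
""")

# Month-over-month trend across every row, no object traversal needed.
report = conn.execute("""
    SELECT run_month, COUNT(*) AS runs, AVG(seconds) AS avg_seconds
      FROM job_runs
     GROUP BY run_month
     ORDER BY run_month
""").fetchall()

for month, runs, avg_seconds in report:
    print(month, runs, avg_seconds)
```

Getting the same numbers out of fully encapsulated objects means instantiating and asking each one, which is where the speed and accessibility problems come from.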

Agenda on Mitch Kapor's Outlook-Killer · 2002-10-20 11:42 · Score: 1

Yep - Agenda is about as great an application as I've ever seen. And some of this product's functionality is exactly what I want to manage my 20,000+ email message archive.

Easily create categories and have the software immediately determine which email falls into that category - *and to what extent*.

Yep, this would be highly cool. Count me in as an early adopter.

Re:How Ironic on Perl and XML · 2002-08-22 07:24 · Score: 2, Interesting

Jeez,

Calm down and look at what we're talking about here:
- Perl: a language well known for emphasizing ease of development at the cost of ease of maintenance
- XML: a distributed metadata mechanism that emphasizes ease of maintenance at the cost of ease of development.

Doesn't that sound like an odd mix that may occasionally be reasonable, but often shows a poor understanding of priorities and options?

What's next? How about:
- MySQL & WebLogic
- Dos bat files & Oracle
- Zope with isam files
- SAP sitting on top of Dbase IV

Common problems in IT:
- emotional attachments to tools
- blind acceptance of marketing hype
- lack of perspective beyond a single project
Combined, these characteristics probably make these two technologies look ideal.

How Ironic on Perl and XML · 2002-08-22 06:03 · Score: 1

So, Perl is optimized for quick and dirty problems. The kind in which you know in advance you won't be maintaining it, and don't need a solid, easily-maintained language.

And XML is a sophisticated mechanism for describing data for when a simple pipe-delimited file format won't achieve a sufficient buzzword high. It's so powerful that this book doesn't even go into detail on learning XML - the reviewer recommends additional books for really getting to understand XML, should you desire to replace the fast & easy delimited format.

Oh yeah, sounds like a match made in heaven.

Re:Your own calls. on Coding for Multiple Databases in C/C++? · 2002-08-19 05:18 · Score: 1

Yep - if you're supporting many databases you'll want to be the one who codes the SQL for each. The only time I would use generic SQL is if my application was extremely limited in scope, had very little data, and I had no access to the databases.

Although your basic SQL syntax is transportable, you've started with a database that implements only a subset of the syntax (MySQL).

By coding the SQL yourself (via a database API within the application layer) you can use different implementations for Oracle, SQL Server, etc. This means you'll be able to not only use the valid SQL that wasn't supported in MySQL, but you'll also be able to take advantage of various inconsistently implemented extensions. Sure, it's a drag to use non-standard code, but in the end it'll save you from having to write far more code in the application layer.

Additionally, you may find that your application runs fine in MySQL - but that the same logic results in deadlocks, etc in other databases - because you are already taking advantage of extensions to the ANSI SQL spec in MySQL (either implicitly or explicitly).

So:
- update joins
- delete joins
- limited response from selects
- creating sequence numbers
- outer joins
- date functions
- transaction management
- etc
Are all somewhat optional, but are very much worth pursuing in almost any database application.
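Here's a minimal sketch of what "code the SQL yourself for each database" can look like, taking the "limited response from selects" item as the example. It's written in Python for brevity (the same table-of-templates idea carries over to C/C++), and the function and dictionary names are hypothetical. The row-limiting clauses shown are the ones each product documents:

```python
# One hand-written template per database for "give me the first N rows".
# The caller is assumed to pass trusted table/column names, not user input.
LIMIT_TEMPLATES = {
    "mysql":      "SELECT {cols} FROM {table} ORDER BY {order} LIMIT {n}",
    "postgresql": "SELECT {cols} FROM {table} ORDER BY {order} LIMIT {n}",
    "sqlserver":  "SELECT TOP {n} {cols} FROM {table} ORDER BY {order}",
    "db2":        "SELECT {cols} FROM {table} ORDER BY {order} "
                  "FETCH FIRST {n} ROWS ONLY",
    "oracle":     "SELECT * FROM (SELECT {cols} FROM {table} ORDER BY {order}) "
                  "WHERE ROWNUM <= {n}",
}

def first_rows_sql(dialect, table, cols, order, n):
    """Build the dialect-specific statement for the first n rows."""
    if dialect not in LIMIT_TEMPLATES:
        raise ValueError("no SQL written for dialect: " + dialect)
    return LIMIT_TEMPLATES[dialect].format(
        cols=", ".join(cols), table=table, order=order, n=int(n)
    )

for dialect in sorted(LIMIT_TEMPLATES):
    print(dialect, "->", first_rows_sql(dialect, "orders", ["id", "amount"], "id", 10))
```

The application asks for "first rows" once, and each database gets the statement it actually handles well.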

Benchmarks on MySQL A Threat To The Big Database Vendors? · 2002-08-18 16:48 · Score: 1

> The specifications for the TPC benchmarks are
> freely available -- it's fairly easy to write a
> client application that follows one of the
> benchmark specs to test a specific database.
> contrib/pgbench in the PostgreSQL tree, for
> example, implements a "TPC-B-like" benchmark.

Yeah, but it takes a lot of time and equipment to run a full suite of benchmarks - especially if you want to know how it'll run on a 4-, 6-, or 8-way SMP, with different io subsystems, with different schemas, different data characteristics, different tuning parameters, etc, etc.

And you need to run a variety of them - since any application worth its salt will need OLTP-oriented transactions as well as table scans, loads, unloads, etc.

Meanwhile, you get folks claiming that 'X' is faster than Oracle - and all they've done is compared a few queries on a development workstation.

And the irony is that the open source databases are probably the only ones that you can publish benchmarks for without prior vendor approval.

Re:dirty secret of big databases on MySQL A Threat To The Big Database Vendors? · 2002-08-18 05:37 · Score: 1

If you're talking data warehousing, then you probably aren't that concerned about stored procedures, triggers, views, foreign keys, etc - the so-called advanced features that MySQL is missing.

However, you should be using:
- partitioning (for a separate charge now)
- parallelism
- bitmap indexes
- cost-based optimization
- unions (needed when you have partitioning)
- etc, etc

These are very significant features - partitioning for instance, allows you to organize tables into logical segments - so that when you do your table scans (almost all the time in the warehouse) you only scan 10% or less of the data. Further - when you want to roll off old data you don't have to try to delete 200 million rows - you simply drop the partition. Parallelism allows you to divide the work of the query between CPUs - which means that your 8-way SMP can put all 8 CPUs to work on a single query - drastically improving the response time.
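To make the partitioning idea concrete, here's a rough simulation using Python's built-in sqlite3 module. SQLite has no native partitioning, so this is an assumption-laden stand-in: one table per month plays the role of a partition, and a UNION ALL view presents them as one table. All table names and data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
months = ["2002_06", "2002_07", "2002_08"]  # hypothetical monthly segments

# One table per month stands in for one partition.
for m in months:
    conn.execute(f"CREATE TABLE sales_{m} (sale_date TEXT, amount REAL)")
    conn.execute(f"INSERT INTO sales_{m} VALUES (?, ?)",
                 (m.replace("_", "-") + "-15", 10.0))
conn.commit()

def rebuild_view(conn, months):
    """Recreate the UNION ALL view that presents the segments as one table."""
    conn.execute("DROP VIEW IF EXISTS sales")
    union = " UNION ALL ".join(f"SELECT * FROM sales_{m}" for m in months)
    conn.execute(f"CREATE VIEW sales AS {union}")

def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM sales").fetchall()[0][0]

rebuild_view(conn, months)
before = row_count(conn)

# Roll off the oldest month: no mass DELETE, just drop its segment.
oldest = months.pop(0)
conn.execute("DROP VIEW sales")
conn.execute(f"DROP TABLE sales_{oldest}")
rebuild_view(conn, months)
after = row_count(conn)

# A query for one month only needs to read that month's segment.
august = conn.execute("SELECT SUM(amount) FROM sales_2002_08").fetchall()[0][0]
print(before, after, august)
```

In a database with real partitioning the optimizer picks the segment for you and the roll-off is a single drop-partition statement, but the shape of the win is the same.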

If you aren't using these features, then you're wasting your money with Oracle. Unfortunately, without these features, managing a TB of data - and getting good speed without throwing tons of hardware at the problem is extremely difficult.

DB2 and Informix both offer these features as well - and perform just as well at a fraction of Oracle's price, but SQL Server is missing many of them. And for that reason I don't see SQL Server as much of a competitor in the TB+ arena. Nor do I see Postgresql or MySQL there either.

And, unfortunately it took the commercial database vendors years to get these features working properly - parallelism especially. So, don't expect the handful of folks at MySQL to knock out this capability in the near term. Maybe in five+ years.

Re:MySQL A threat, hah, tell me another one... on MySQL A Threat To The Big Database Vendors? · 2002-08-17 17:25 · Score: 2, Informative

Wrong on two counts:

1. It isn't just subselects - the list of *basic* stuff includes views, stored procedures, foreign keys, etc, etc. Note that we're not even getting close to the advanced features like parallelism, bitmap indexes, solid replication, etc, etc.

2. Product & tool selection is driven by a variety of factors - including internal consistency, vendor relationships, staff skill sets, etc. So, while it is true that there are projects simple enough for MySQL (especially embedded databases) - it is also true that most *custom* database applications deserve something better than what we were doing twenty years ago. And at the point in which you have Oracle, MySQL, and then need to install Postgresql - you will be regretting the time to learn a slightly different product, obstacles to reuse, and administrative complexity of having MySQL in the mix.

And lastly - the simplicity of MySQL is largely an illusion - since without transactions, subselects, views, etc you've simply moved the complexity from the database to the application layer. And while sometimes that is fine, the typical result is that simple tasks that could be done in a few lines of code or a few minutes in SQL instead take hours and hundreds of lines of application code.
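A small sketch of that trade-off, using Python's built-in sqlite3 module with a made-up customers/orders schema. The question is "which customers have no orders?", answered once with a subselect and once the way you'd have to do it if the database couldn't:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data, for illustration only.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL);
    INSERT INTO customers VALUES (1, 'acme'), (2, 'globex'), (3, 'initech');
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
""")

# With subselect support: one statement, the database does the work.
sql_result = [row[0] for row in conn.execute("""
    SELECT name
      FROM customers
     WHERE id NOT IN (SELECT customer_id FROM orders)
     ORDER BY name
""")]

# Without it: pull both tables into the application and do the
# set logic, filtering, and sorting by hand.
ordered_ids = set()
for (customer_id,) in conn.execute("SELECT customer_id FROM orders"):
    ordered_ids.add(customer_id)

app_result = []
for customer_id, name in conn.execute("SELECT id, name FROM customers"):
    if customer_id not in ordered_ids:
        app_result.append(name)
app_result.sort()

print(sql_result, app_result)
```

Same answer either way, but the second version ships every row across the wire and grows with each new condition you add.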

Comments · 561