New Releases For MySQL (4.0), Samba (2.2.2)
pHaze writes: "Michael 'Monty' Widenius has just released MySQL 4.0 Alpha. Downloadable here. This has got some really cool new features -- my favourites being built-in InnoDB support and better support for MATCH/AGAINST queries on fulltext indexes (really fast if you're writing a search engine)." And corz writes: "The Samba Team announced the release of Samba 2.2.2 on Saturday. The new features include the winbind daemon, which 'allows UNIX systems that implement the name service switch (nss) to be entered into a Windows NT/2000 domain and use the Domain controller for all user and group enumeration. This allows a Samba server added to a Windows domain to serve file and print services with *NO* local users needed in /etc/passwd and /etc/group - all users and groups are read directly from the Windows domain controller.' Sounds great to me. Jump to the mirrors and grab it now."
benchmark?????
Postgres doesn't suit my needs.
I benchmarked the MATCH/AGAINST functionality. The query ran against a fulltext index covering 3 varchar(250) fields, populated with an average of 100 chars each, with a search string of three words (perl, apache and mysql) on a table with 100,000 records. I get an average response of 0.97 seconds. The first query comes in slow at around 5 seconds, but after it has cached the index (I assume), it's blazingly fast. This is on a P550 (Intel) with 400 megs of RAM and a vanilla single IDE disk. I'm curious about how this fulltext searching compares with SwishE. Anyone got any benchmarks?
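For anyone wanting to repeat this kind of measurement, here is a minimal Python sketch of a timing harness that reports the slow first query separately from the average of the later ones. The table and column names in QUERY (articles, title, summary, body) are made up for illustration, and run_query stands in for whatever database client call you actually use; none of this is taken from the benchmark above.

```python
import time
from statistics import mean

# Hypothetical query: table and column names are assumptions, not from the post.
QUERY = (
    "SELECT id FROM articles "
    "WHERE MATCH (title, summary, body) AGAINST ('perl apache mysql')"
)

def time_query(run_query, repeats=5, clock=time.perf_counter):
    """Return (cold_seconds, warm_average_seconds) for a query callable."""
    # The first call is timed on its own, since caches are presumably empty.
    start = clock()
    run_query()
    cold = clock() - start

    # Later calls are averaged, since they may benefit from caching.
    warm = []
    for _ in range(repeats):
        start = clock()
        run_query()
        warm.append(clock() - start)
    return cold, mean(warm)

# Usage, assuming `cursor` is a DB-API cursor connected to your server:
#   cold, warm = time_query(lambda: cursor.execute(QUERY))
```

Reporting the two numbers separately avoids letting one cold run skew the average.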
KDB is very fast and efficient. It also has the best stored procedure language around (it may look like Perl, but it is nowhere close to it in philosophy).
.1 second.
----
on thursday jan 4, 2001 steve miano, ed bierly, keith mason and i
loaded 2.5 billion trades and quotes on a 50cpu linux cluster.
simple table scans on one billion trades, e.g.
select distinct sym from trade
select max price from trade
take 1 second
multi-dimensional aggregations, e.g.
/ 100 top traded stocks
100 first desc select sum size*price by sym from trade
/ daily high and close
select high:max price, close:last price by sym, date from trade
take 10 to 20 seconds
translating the data from TAQ to kdb took about 5 hours.
(steve had loaded the 200 TAQ cd's onto several disk drives.)
distributing the 100gigabytes over the 100Mbit ethernet took 3 hours.
(this cluster should probably have Gbit ethernet)
loading the database (k db taq.m -P 2080), starting 50 slaves,
connecting, mapping shared indicative tables over nfs, building
parallel partitions, etc. took
----
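For readers who don't speak KSQL, the two aggregations quoted in the message above (top traded stocks by sum of size*price, and daily high and close by sym and date) can be sketched in plain Python. This is only a rough reading of what those queries compute, not how Kdb executes them, and the sample rows are invented for illustration.

```python
from collections import defaultdict

# Invented sample rows, in time order: (sym, date, price, size)
trades = [
    ("IBM", "2001-01-03", 102.0, 100),
    ("IBM", "2001-01-03", 100.0, 200),
    ("MSFT", "2001-01-03", 50.0, 1000),
    ("IBM", "2001-01-04", 101.0, 300),
    ("MSFT", "2001-01-04", 49.0, 500),
]

def top_traded(rows, n=100):
    """Sum size*price by sym, sort descending, keep the first n."""
    totals = defaultdict(float)
    for sym, _date, price, size in rows:
        totals[sym] += size * price
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]

def daily_high_close(rows):
    """Map (sym, date) to (high, close): max price and last price seen."""
    out = {}
    for sym, date, price, _size in rows:
        high, _close = out.get((sym, date), (price, price))
        out[(sym, date)] = (max(high, price), price)
    return out

print(top_traded(trades))
print(daily_high_close(trades)[("IBM", "2001-01-03")])
```

Both functions make a single pass over the rows, which is the same shape of work as the table scans described in the message.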
1. What is Kdb ?
Kdb is an extremely fast RDBMS extended for time-series analysis.
2. Does Kdb support SQL92, ODBC and JDBC ?
Yes.
3. Is Kdb a read-only RDBMS ?
No. Kdb is very fast for OLTP (online transaction processing).
For example, it runs over 50,000 ATM-style transactions per second logged
to disk with full recovery on a single cpu. This was against a database of
over 100,000,000 accounts, tellers and branches. Kdb can do batch updates at
several hundred thousand records per second per cpu.
4. Is Kdb a memory resident RDBMS ?
No. Kdb has minimal memory requirements and is very fast from disk.
For example, it ran the gigabyte TPC-D (an industry standard decision support benchmark)
queries and updates on a 200MHz PC with 64 megabytes of memory, an ultrawide SCSI
controller and four disk drives many times faster than the best published results
at a fraction of the cost.
5. What about time series ?
Kdb handles much more than just SQL92 tables. Online analytical
processing (OLAP) on multi-dimensional arrays is done with our
extended SQL language, KSQL. For example, on the 35 megabyte OLAP APB-1
benchmark queries, Kdb ran 12,000 queries per minute with no precalculation.
6. Since Kdb is so fast, does it require more storage ?
No. Kdb is simple and will often store just the raw data.
For example, in TPC-D, the published results required storage
between 3 and 10 times the raw data. The Kdb factor is a little over one.
Some OLAP tools require (for fast queries) massive precalculations. For example,
in APB-1 some expanded the 35 megabytes of input data to many gigabytes. Kdb
aggregates relations (extended with time series fields) so fast that precalculation
is often obviated. Certainly when the raw data is less than a few gigabytes.
7. Is there a parallel version ?
Yes. Although Kdb can handle much larger databases than other database
products without requiring parallel processing, there is a parallel
version for the largest applications. Kdb scales
----
KDB is the classiest database on the internet.
See http://kx.com
-j
I'm sure if you find an implementation of Win32 that isn't shoddy, PostgreSQL will be speedy and useful on it.
ANSI SQL is cross-platform, but even if you somehow avoid ROLLBACK WORK and somehow acquire an absolutely 100% reliable platform, you have to wrap extra nonstandard statements (LOCK TABLE) around every SELECT and INSERT to get MySQL to behave in a way that's even arguably correct. I thought extensions were supposed to be for performance tuning; why doesn't it do that by default?
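To make the rollback point concrete, here is a small Python sketch of what a transactional engine gives you without any explicit table locking. It uses SQLite through the standard sqlite3 module purely as a stand-in, so it says nothing about how MySQL or PostgreSQL behave; the accounts table and the transfer function are made up for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and data are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; undo every change if the check fails."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.commit()
        return True
    except ValueError:
        # The debit above is discarded, so no half-finished transfer remains.
        conn.rollback()
        return False

print(transfer(conn, "alice", "bob", 500))
print(dict(conn.execute("SELECT name, balance FROM accounts")))
```

Without a rollback, the application would have to undo the first UPDATE itself, and hope nothing fails in between.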