Domain: dbms2.com
Stories and comments across the archive that link to dbms2.com.
Stories · 21
-
How Should a Non-Techie Learn Programming?
CurtMonash writes "Nontechnical people — for example marketers or small business owners — increasingly get the feeling they should know more about technology. And they're right. If you can throw up a small website or do some real number-crunching, chances are those skills will help you feed your family. But how should they get started? I started a thread with the question on DBMS2, and some consistent themes emerged, including: Learn HTML + CSS early on; Learn a bit of SQL, but you needn't make that your focus; Have your first real programming language be one of the modern ones, such as PHP or Python; MySQL is a good vehicle to learn SQL; It's a great idea to start with a project you actually want to accomplish, and that can be done by modifying a starter set of sample code (e.g., a WordPress blog); Microsoft's technology stack is an interesting alternative to some of the other technology ideas. A variety of books and websites were suggested, most notably MIT's Scratch. But, frankly, it would really help to get more suggestions for sites and books that help one get started with HTML/CSS, or with MySQL, or with PHP. And so, techie studs and studdettes, I ask you — how should a non-techie go about learning some basic technological skills?" -
Hospital Turns Away Ambulances When Computers Go Down
CurtMonash writes "The Indianapolis Star reports that Tuesday Morning, Methodist Hospital turned away patients in ambulances, for the first time in its 100-plus history. Why? Because the electronic health records (EHR) system had gone down the prior afternoon — due to a power surge — and the backlog of paperwork was no longer tolerable. If you think about that story, it has a couple of disturbing aspects. Clearly the investment in or design of high availability, surge protection, etc. were sadly lacking. But even leaving that aside — why do problems with paperwork make it necessary to turn away patients? Maybe the latter is OK, since there obviously were other, more smoothly running hospitals to send the patient to. Still, the whole story should be held up as a cautionary tale for hospitals and IT suppliers everywhere." -
Web Analytics Databases Get Even Larger
CurtMonash writes "Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2 1/2 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote in about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff." -
Web Analytics Databases Get Even Larger
CurtMonash writes "Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2 1/2 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote in about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff." -
Web Analytics Databases Get Even Larger
CurtMonash writes "Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2 1/2 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote in about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff." -
Web Analytics Databases Get Even Larger
CurtMonash writes "Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2 1/2 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote in about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff." -
Web Analytics Databases Get Even Larger
CurtMonash writes "Web analytics databases are getting even larger. eBay now has a 6 1/2 petabyte warehouse running on Greenplum — user data — to go with its more established 2 1/2 petabyte Teradata system. Between the two databases, the metrics are enormous — 17 trillion rows, 150 billion new rows per day, millions of queries per day, and so on. Meanwhile, Facebook has 2 1/2 petabytes managed by Hadoop, not running on a conventional DBMS at all, Yahoo has over a petabyte (on a homegrown system), and Fox/MySpace has two different multi-hundred terabyte systems (Greenplum and Aster Data nCluster). eBay and Fox are the two Greenplum customers I wrote in about last August, when they both seemed to be headed to the petabyte range in a hurry. These are basically all web log/clickstream databases, except that network event data is even more voluminous than the pure clickstream stuff." -
MapReduce Goes Commercial, Integrated With SQL
CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups." -
MapReduce Goes Commercial, Integrated With SQL
CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups." -
MapReduce Goes Commercial, Integrated With SQL
CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups." -
MapReduce Goes Commercial, Integrated With SQL
CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups." -
MapReduce Goes Commercial, Integrated With SQL
CurtMonash writes "MapReduce sits at the heart of Google's data processing — and Yahoo's, Facebook's and LinkedIn's as well. But it's been highly controversial, due to an apparent conflict with standard data warehousing common sense. Now two data warehouse DBMS vendors, Greenplum and Aster Data, have announced the integration of MapReduce into their SQL database managers. I think MapReduce could give a major boost to high-end analytics, specifically to applications in three areas: 1) Text tokenization, indexing, and search; 2) Creation of other kinds of data structures (e.g., graphs); and 3) Data mining and machine learning. (Data transformation may belong on that list as well.) All these areas could yield better results if there were better performance, and MapReduce offers the possibility of major processing speed-ups." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
The 1-Petabyte Barrier Is Crumbling
CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS." -
Russian Chatbot Passes Turing Test (Sort of)
CurtMonash writes "According to Ina Fried, a chatbot is making the rounds that successfully emulates an easily-laid woman. As such, it dupes lonely Russian males into divulging personal and financial details at a rate of one every three minutes. All jokes aside — and a lot of them come quickly to mind — that sure sounds like the Turing Test to me. Of course, there are caveats. Reports of scary internet security threats are commonly overblown. There are some pretty obvious ways the chatbot could be designed to lessen its AI challenge by seeking to direct the conversation. And finally, while we are told the bot has fooled a few victims, we don't know its overall success rate at fooling the involuntary Turing "judges.""