Intel Launches Its Own Apache Hadoop Distribution
Nerval's Lobster writes "The Apache Hadoop open-source framework specializes in running data applications on large hardware clusters, making it a particular favorite among firms such as Facebook and IBM with a lot of backend infrastructure (and a whole ton of data) to manage. So it'd be hard to blame Intel for jumping into this particular arena. The chipmaker has produced its own distribution for Apache Hadoop, apparently built 'from the silicon up' to efficiently access and crunch massive datasets. The distribution takes advantage of Intel's work in hardware, backed by the Intel Advanced Encryption Standard New Instructions (Intel AES-NI) in the Intel Xeon processor. Intel also claims that a specialized Hadoop distribution riding on its hardware can analyze data at superior speeds: namely, one terabyte of data can be processed in seven minutes, versus hours for some other systems. The company faces a lot of competition in an arena crowded with other Hadoop players, but that won't stop it from trying to throw its muscle around."
It's Big
How does that compare to something like Spark?
So they've migrated an open solution to a vendor-locked one? Sweet.
To offset political mods, replace Flamebait with Insightful.
...
Who logs in to gdm? Not I, said the duck.
Must run Gentoo.
Not always. It's become such a buzzword that it now gets applied any time the amount or complexity of the data becomes a limit on what you're trying to do.
So in the case of NRT (near real time), it might be a relatively small amount of data. Or it might be that there's enough different formats of data or other complexity that it's a problem.
And it's also discipline-specific ... I've heard of groups complaining about 50 GB being a lot of data ... because they're dealing with tens of thousands of Excel spreadsheets. For those in astronomy, 50 GB is nothing; you have to get up into the multi-TB range before you have to worry ... and that's still small for some disciplines that deal in petabytes of data.
Build it, and they will come^Hplain.
Abstracted architecture must really irritate them, since most end users can't even tell that AMD and ARM processors offer basically equivalent functionality. It looks like high-end corporate customers are all that they have left.
After reading the articles about ISPs clamping down and monitoring customer Internet traffic, the move to the "cloud" for many, and large hardware companies ditching standards to create proprietary systems, it feels like the mainframe days are on their way back. What's next, back to being charged for online minutes? With a minute-based metric tracking system in place, that'll open the door for the government to step in and add an Internet tax to each minute spent online. They'll probably use the FCC as a proxy so that it doesn't look so much like a tax.