If you have a decent budget, call Red Hat or Hortonworks and you'll see that open source vendors can also wine and dine you properly.
What do the other elite forces think? What do the SEALs use?
SEALS are at best in the top 5.
1) French Foreign Legion 2nd REP GCP
2) Guatemalan Kaibiles
3) Mexican GAFE
4) UK SAS
5) US Navy SEALS
Actually, many SAS and SEALS train with the Kaibiles, and after their stint in the US/UK military they end up joining the French Legion to see real action.
A former client of mine was paying SAS $10,000/month to host a shitty dashboard that was updated once per quarter. It didn't even come with a vanity URL. That's the typical SAS market: gold-plated clients with unlimited budgets and almost no actual needs.
We spent an afternoon rewriting this piece of shit as an HTML dump from MATLAB and "deployed" it on the corporate intranet.
When you don't provide added value, you quickly become obsolete.
Farewell, SAS.
Please tell me they named the fantastic "Microsoft Bob" app after him.
If Hadoop is as amazing as you say it is then why aren't more companies enjoying success with it?
Can you provide numbers to back your statement that not many companies are "enjoying success" with it? Or are you content to repeat the same bullshit over and over?
A few interesting facts.
-Cloudera, Hortonworks, MapR, Pivotal are all in Gartner Magic Quadrant for Data Warehouse and Database Management Solutions for Analytics
-Most of the big BI products (MicroStrategy, etc) offer connectors to AWS EMR, HDInsight and various other Hadoop offerings. Do you know why? Because people use them.
-Hortonworks and Cloudera, two big Hadoop vendors, each have $100+ million in revenue.
So spin your bullshit any way you want if it makes you more secure in your little DBA garden, but Hadoop is now a growing technology in the enterprise. Are all Hadoop projects successful? Of course not, but the same goes for SAP, Cognos, Oracle or any other data-related project. Doesn't mean they're not successful products.
OMG.
Why is this still happening? Why won't Uber stop raping their female employees?!?
They should let them write code for the self-driving cars instead. That way, just like in real life, female drivers would cause accidents but it's the male driver that has to swerve or veer to avoid a collision that would be at fault.
The problem those companies face is that they grew so fast that they're struggling with past technical decisions that are difficult to revert (e.g. Twitter and their initial RoR architecture). The wheels keep turning so they end up having to build sophisticated layers on top of their legacy garbage.
We've all been there. Someone (maybe even you) builds a throwaway Excel macro or WordPress-driven monstrosity just to address a temporary need that is not worth spending more than 2h on, and next thing you know it's become a mission-critical component. Now imagine that this piece of shit ends up powering tools used by millions of users; you can't simply stop everything and rewrite from scratch, and you can't stop adding features, for which you need busloads of new programmers. At that point you don't need programmers who will tell you "dude, you should use Node.js or MySQL"; you need programmers who can apply a very narrow and deep understanding of graph computing (or whatever powers your AI-like engine) to constantly changing requirements.
Most things they do on a daily basis at Amazon or Google would get you fired from a normal IT job, just like throwing explosives in a fire is done on a regular basis in oilfields but would cost a NYC firefighter his job.
The Hadoop defenders will no doubt counter with, "but Hadoop wasn't designed to be an RDBMS!", to which I say it doesn't matter. That's what people were trying to make Hadoop into because that's what businesses thought that they needed: a drop-in replacement for SQL and RDBMS that addressed their scalability problems. In the meantime SQL and RDBMS developers have answered the challenge and continued improving their tools, addressing many of the shortcomings that Hadoop was supposed to resolve while Hadoop was still overpromising and underdelivering. The old quip is still true, "SQL is dead. Long live SQL."
That's bullshit and obviously you're a DBA defending his turf. A Hadoop cluster will scale beyond anything an RDBMS can handle, and if the only tool in your toolbox is SQL you can use products like Hive or HAWQ that will process your queries through a specialized JDBC driver and run them across as many nodes as your budget can afford.
For instance you could have petabytes of data in CSV format stored on your HDFS cluster, and you could create a relational model on top of them without rewriting a single byte, then use SQL to interact with this huge data set. It's like mounting external sources in Oracle or PostgreSQL, but at a scale that neither product can process.
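To make the "relational model on top of CSV without rewriting a single byte" idea concrete, here is a toy sketch in plain Python. It is not Hive or HAWQ code; the names `scan`, `total_by`, `SALES_SCHEMA` and the sample rows are made up for illustration. The only point is that the schema gets applied when the data is read, while the raw text stays exactly as it was.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical schema: (column name, converter) pairs, applied only at read time.
Schema = List[Tuple[str, Callable[[str], object]]]


def scan(raw: io.TextIOBase, schema: Schema) -> Iterator[Dict[str, object]]:
    """Yield typed rows from untouched CSV text (schema-on-read)."""
    for fields in csv.reader(raw):
        yield {name: cast(value) for (name, cast), value in zip(schema, fields)}


def total_by(rows, key: str, measure: str) -> Dict[object, int]:
    """Rough equivalent of: SELECT key, SUM(measure) FROM t GROUP BY key."""
    totals: Dict[object, int] = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[measure]
    return totals


# Stand-in for a CSV file that already sits on the cluster (made-up data).
RAW_CSV = "US,widget,3\nFR,widget,5\nUS,gadget,2\n"
SALES_SCHEMA: Schema = [("country", str), ("product", str), ("quantity", int)]

units_by_country = total_by(
    scan(io.StringIO(RAW_CSV), SALES_SCHEMA), "country", "quantity"
)
print(units_by_country)
```

On a real cluster the scan would be split across nodes and the query would be written in SQL, but the principle is the same: the files are never converted or reloaded.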
Do you know what the NSA used to store all that big brother data? Accumulo, which sits on Hadoop. They would have never been able to crunch that volume of data with [insert your RDBMS product here].
Don't diss stuff you don't understand. Nobody is taking your precious database away, there's just an alternative for people with more complex needs.
What has happened instead is that quite a few "tech experts" did not understand what it actually was and had completely unrealistic expectations. Map-reduce is nice when you a) have computing power coming out of your ears and b) have very specific computing tasks.
Spot on. Hadoop is meant to run on a shitload of commodity computers, which is something most organizations don't have - if you can afford a shitload of commodity computers your sysadmins will probably choose to buy a high-end SAN and top-notch blade servers, and virtualize everything.
You can see it immediately when you install a packaged version like Hortonworks; the wizard will put data on all your volumes because it assumes you're running on a bunch of low-end servers with shitty RAID or even JBOD - but if you're in a typical enterprise situation, your server is a virtual machine and all the volumes come from the same virtual disk so there's no point in spreading the data across volumes.
And the specialized computing part is also true. Processing data on a cluster means that you have to "think" your workload in terms of map-reduce (whether you're crunching on Hadoop MR, Tez or Spark) and this does not always translate into a computing environment that is relevant for everyday situations.
Basically, those tools were designed for Google and Yahoo: tons of servers, teams of highly skilled programmers. It's still a valuable technology stack if you have the right use case but more often than not, a typical BI product or an MPP appliance is a better choice.
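For anyone who hasn't had to "think" a workload in map-reduce terms, here is a minimal single-process sketch of the three phases, using the classic word count. This is plain Python, not Hadoop, Tez or Spark API code; the function names `map_phase`, `shuffle` and `reduce_phase` are just labels picked for the illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: turn one input record into (key, value) pairs."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all values by key (a cluster does this over the network)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values of one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    mapped = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["SQL is dead", "long live SQL"])
print(counts)
```

Anything you want to compute has to be squeezed into that map / group-by-key / reduce shape, which is easy for counting and aggregation and a lot less natural for many everyday jobs.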
As far as I know, most people are using Apache Spark for new projects.
Spark is a framework that includes ETL, in-memory computing and a machine learning library - a typical case of wheel reinventing.
Those "most" people you mention probably only use the machine learning part, and on a fairly small data set. In theory, Spark RDD can scale to "petabytes" (so they say) but I've never seen it work on even TB-level volumes of data, while Hadoop scales to practically unlimited volumes (Yahoo used to run Hadoop on some 40,000 nodes).
Spark is awesome but it's not a replacement for Hadoop for distributed computing, it's not as powerful as Sqoop for ETL and it's not as advanced as Flink for streaming. They should just focus on the machine learning library, like Mahout ended up doing.
People who bash Hadoop without understanding, at the very minimum, its moving parts obviously have no experience with it.
Hadoop is not one thing. It's three:
1) a distributed filesystem (HDFS)
2) a job scheduler (YARN)
3) a distributed computing algorithm (MapReduce)
Many tools like HBase or Accumulo *need* HDFS. That's a core component and there's no equivalent in Spark. Anyone saying HDFS is obsolete is a clueless idiot.
Anyways the Spark vs Hadoop narrative is bullshit. A serious Spark setup usually runs on top of a Hadoop cluster, and often you can't get away entirely from MapReduce (or its actual successor, Tez) because Spark runs in-memory and doesn't scale as much; for some workloads you need the read-crunch-save aspect of MapReduce because there's just too much data, and MapReduce is also more resilient as you don't lose as much when a node crashes during a job. Spark is more advanced and has actual analytics capabilities thanks to a powerful ML library (while Hadoop is just distributed computing), but it's not a case of either/or.
For instance a common approach is to use Hadoop jobs to trim down your data (via Pig or another blunt tool) to a point where you can run machine learning algorithms on Spark.
As for Kafka, it's just a fucking message queue. It's fast and very powerful, but comparing it to Hadoop is like saying you should use Linux instead of MySQL.
Whoever considers buying services from those Snowflake morons, run away.
Can you name a company that looks at anyone's interests but their own?
I also have weird problems with outlook.com on non-Windows machines.
before libreoffice was cool
This is 2017 and we are still "before libreoffice was cool"
when you do that you can't afford to become 300-400 pounds...
Somehow I doubt that food is the root cause of your financial situation.
hint #1) there's "high" in your username
hint #2) there's 702 in your username, which is the Las Vegas area code.
I rest my case.
You mean there's more than one? I thought it was just one guy with no life and a lot of conflicting opinions.
To close many tabs at once, I just mouse over the leftmost one and middle-click rapidly until they are all closed.
Don't they change size after a while? You must have ninja skills.
I usually have 40-50 tabs open at once, and there is no way I will do that on a browser not supporting vertical tabs.
Tab hoarding is a serious condition and it appears it has started to impact your daily life. Maybe it's time to get help.
The current menu is breaking the Designer Dogma, which mandates to never have more than 5 options. These menu items have to go.
Your spelling it wrong.
Fair enough. The part that annoys me is not people who don't shop at Walmart (I don't go there much myself), it's people who talk about "rewarding" a retailer with their business. That's just ridiculous.
Ok let's say you buy a pair of jeans. Levi's "Signature" is $17 at Walmart, cheapest Levi's at JC Penney is $46 (I checked). So you can get through almost 3 pairs of Walmart jeans (3 x $17 = $51) before you get to the JC Penney price. And while the Walmart ones are lower quality, they're still jeans, they do the job.
And when it's time to buy more, the price at Walmart will have gone down to $15.75 or something like that while it will be like $53 at JC Penney.
A typical household shopping at Walmart saves something like $1,200 / year on groceries alone. That pays for Netflix, internet and maybe even a mobile phone.
You must hold Walmart stock to peddle that line.
Tough to lead in the technology sector with second tier and lower recruits.
There's a revolving door between Walmart and Amazon. Top tier IT goes from here to there, because they're basically the only retailers with so much volume that they have challenges nobody else has. Granted, the lower rungs on the ladder are mostly visa workers (Walmart is the biggest recruiter of visa workers year after year) because there's no point in paying a premium for people who install MS-Office, but where it matters, it's top talent.
I for one would love to work on IT projects at Walmart. Can you imagine the engineering challenges? We're not talking about digital goods, but real things with a real volume and weight that require space for storage and energy for shipping. Tons and tons of random crap, from batteries to lawn furniture and pepperoni. Fascinating.
I see no reason to reward Walmart with my business.
"Reward" them? Is that how you see yourself, the belle of the ball that all retailers should bend over backward to please just so they can "deserve" your benevolent purchase of low-quality Chinese dishware and apparel?
Buying and selling are two sides of a shared transaction. Neither side is "rewarding" the other. If you don't want to do business with Walmart, go somewhere else to buy your toilet paper for a little more. They're not going to go bankrupt over your boycott or over the strongly worded Facebook posts you make. There are plenty of people who are more than happy to accept a less comfortable shopping experience in exchange for lower prices.