What about your remote intranet access? Can it detect when you're on a mobile network and make sure to force you to download 12MB of compressed JavaScript on every page?
Remote virtual desktops are okay for basic use, but even on high-end infrastructure there's a tiny latency which is quite annoying when coding (unless you type real slow). It's not "in your face" but you can feel it and it makes the experience unpleasant.
They were able to demonstrate their algorithm in simulations
So they don't beat the world's worst traffic, they beat simulations. Unless someone previously mastered the art of making immensely accurate traffic simulations this is useless.
Then I will take your word for it and officially retract my disparaging comment.
For the record I don't have a short attention span, I just severely dislike people who write long paragraphs. They remind me of those people who chain their sentences together when they speak, just to make sure you have no way to escape the discussion. Paragraphs are a courtesy to readers, a subtle way to let them off the hook if they want to skip ahead and see if there's a point in continuing reading.
Just because your mediocre company forces its employees to do it, doesn't make it the correct decision
In my experience, the bigger the organization gets, the more important it is to think in terms of "right" practice, not "best" practice. The correct decision is the one that makes the business successful consistently; and unless you have the psychic ability to see the future, slowing down the business to do things by the book is typically a bad idea, especially if the company is experiencing huge growth.
the true kings of the big data world are DB2 and TeraData.
You had me until you mentioned DB2. I've never heard of a PB-level DB2 instance, I don't even think it's possible. Last time I checked a table couldn't go over 2TB and even BLOBs can't be bigger than 2GB.
Yes. Spark can optionally run on Hadoop, which is not the same thing as being based on Hadoop. So before implying that other people would "know" something if they had worked with Spark, make sure that the thing in question is true.
Just for the sake of discussion, if I were to design a POS today I think I'd consider the new in-memory engine in MongoDB. It's pretty cool; it writes nothing to disk (ever) but it can be part of a replica set where some other members use the normal engine. Each replica set supports up to 50 members, and the client can specify a preferred read node. So I would leave the write master in the backend and all the POS would have the inventory pretty much in real time on their local read node.
Or since they bring up Kafka in the article, that could also be an interesting component. Unlike other queue engines, Kafka keeps track of how far each subscriber has gone in the sequence of events, so it's possible to have clients with wildly different needs connected to the same queue (such as a real-time ticker or a monthly dashboard).
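To make the Kafka point concrete, here is a toy model in plain Python. It is not the Kafka client API; `EventLog`, `append` and `poll` are made-up names for illustration. It keeps one shared append-only log plus a separate read position per subscriber, which is roughly the idea behind consumer offsets.

```python
class EventLog:
    """Toy append-only log, a stand-in for a single Kafka partition.

    Hypothetical illustration only: real Kafka tracks offsets per
    consumer group and partition, and persists them.
    """

    def __init__(self):
        self._events = []    # events are never removed when read
        self._offsets = {}   # subscriber name -> index of next unread event

    def append(self, event):
        self._events.append(event)

    def poll(self, subscriber, max_events=None):
        """Return unread events for this subscriber and advance its offset."""
        start = self._offsets.get(subscriber, 0)
        end = len(self._events)
        if max_events is not None:
            end = min(start + max_events, end)
        batch = self._events[start:end]
        self._offsets[subscriber] = end
        return batch


if __name__ == "__main__":
    log = EventLog()
    for i in range(5):
        log.append(f"sale-{i}")

    # A "real-time ticker" reads one event at a time...
    print(log.poll("ticker", max_events=1))
    # ...while a "monthly dashboard" catches up on everything in one go.
    print(log.poll("dashboard"))
    # The ticker's position is unaffected by the dashboard's read.
    print(log.poll("ticker", max_events=1))
```

Because reading does not consume the events, a slow subscriber and a fast one can share the same log without stepping on each other.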
Well probably none of this applies in your case. I think I just have a thing for POS in general so I'm semi-jealous of your situation.
You're a stupid motherfucker. You have nothing useful to say. You contribute nothing useful to this site or to society [...] (etc)
I was unable to read the rest of your comment because I have a policy of stopping when it becomes obvious that the other person is just throwing a tantrum.
If you disagree with the fact that Wikipedia clearly indicates that Spark is NOT based on Hadoop, support your claim with a link or citation. Otherwise there is no need to get your panties in a bunch, you clearly don't have enough trolling skills to make even a drunk Mike Tyson circa 1997 angry.
If you want to make a graph to support your insults, maybe you should make sure the graph itself is not stupid. Being a self-righteous cunt is not enough, you need to dot your i's and cross your t's. I'm sure you'll do better next time now that you are aware of it.
Hey, you forgot about the Keyboard Warriors! Surely they must be the fiercest of them all.
Kaibiles are required to raise a pet and then kill it. Elite Legionnaires are thrown handcuffed in a small cage with a live chicken and are only allowed out once the chicken is dead. Spetsnaz used to be handed a shovel at the end of the training day and only had a moment to dig a hole and jump in it before officers started shooting at them.
But yes, those guys have nothing on the fierce keyboard warriors, such as PTA moms putting up outraged Facebook pages or male feminazis joining twitter mobs.
"Attention span" is a metric, not a value by itself. It's like writing "mph" or "flavor", there's no implied quantity or quality.
If at least you had put me on the left side one could have argued it was some kind of axis and I was on the short side. But as it stands your chart makes no sense.
I think even a vanilla Postgresql will do 1-2 Petabytes.
The maximum field size for Postgres is 1GB. The maximum table size is 32TB. So let's say you have a 1PB data set, that means you need to shard your data in at least 32 tables of 250 columns.
Let's say you want to run a query vertically; you'll need to join those 32 tables, start the query and go on vacation for a month. That's how 1PB works on Postgres.
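For anyone who wants to check the sharding arithmetic, here is a quick sketch in Python. It assumes the 32TB per-table limit quoted above (which is the figure for the default block size) and decimal units; `min_tables` is just a helper name made up for this example.

```python
TB = 10**12
PB = 10**15

MAX_TABLE_BYTES = 32 * TB   # per-table limit quoted above (assumed default block size)
DATASET_BYTES = 1 * PB      # the hypothetical 1PB data set


def min_tables(dataset_bytes, max_table_bytes):
    """Smallest number of tables needed to hold the data set (ceiling division)."""
    return -(-dataset_bytes // max_table_bytes)


if __name__ == "__main__":
    print(min_tables(DATASET_BYTES, MAX_TABLE_BYTES))
```

1PB divided by 32TB is 31.25, so the minimum comes out at 32 tables, and that is before indexes or any headroom.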
And don't you even dare do some leaf-level manipulations on that volume of data, like a lateral join - unless you enjoy a faint smell of burnt plastic in your data center. Meanwhile, that kind of thing runs smoothly on Hadoop, and if it's too slow you just add nodes.
I'm not saying RDBMS are dead - in my opinion the vast majority of use cases warrant a traditional RDBMS or non-Hadoop NoSQL database. But when it comes to seriously big data, fuggedaboutit.
So they have Alpha 2 but no Final Alpha, and they have a Final Beta but not Beta 2. How hard is it to have a minimum of consistency in a release schedule?
Did they choose this scheme to annoy aspies, or are they just that nonchalant and careless? Oh wait, I've tried Ubuntu so I know the answer.
As a linux user and an anti-fan of systemd, I thought I'd give FreeBSD a try - it's been years since I last gave any BSD a go. [...] So, first thing I try and do gives an unGoogle-able error message. That's enough playing about, I'll try a BSD again in a few more years.
We've all been there... Next on your list should be: installing Slackware on a brand new ultrabook that has no ethernet adapter.
Amen to that. For years every time I've used Python on AIX I commented out a line of code in one of the core Python libraries to make it work better on that cursed O/S. I couldn't do that with PowerShell on Windows.
MacBook Pro VMware Fusion
So you want a MacBook to work in Windows? Are you also a drinker of decaf coffee and alcohol-free beer?
>For instance you could have petabytes of data in CSV format stored on your HDFS cluster
And somewhere in a tiny sub-corner of those petabytes, someone generated the CSV with Excel and the quoting is all messed up.
Almost all the tools default to tab-delimited (Pig, cut, etc) but yes there's usually an Excel saboteur or two in every organization.
If you actually would work with Spark, you would know it is based on Hadoop, just saying.
Even a retard with a low-speed internet access can look this up on Wikipedia and prove you wrong. Are you trolling or just stupid?
Even jokes are one line fixes in Ubuntu :-)
Here's the best one-line fix for Ubuntu:
wget Fedora-Live.iso
Using Ubuntu is like ordering a mocha at Starbucks; if you don't like coffee, just get a hot chocolate and move on.
If you want a Debian, use Debian; if you want a retarded UI, use a Mac. There's no point in going halfsies.
I hope it's something that doesn't sound retarded so I can finally talk to my boss about Linux.
Or do him a solid and suggest using Red Hat like real companies do.
I wanted some popcorn on hand before I started reading the comments.
Are you currently leaving smudges on a touch screen or on a keyboard?
But at least you have the option! :)
^ for those who didn't want to read that wall of text, the guy basically says "it depends"
How dare you bring up common sense in this emotional discussion