Why are you comparing "PE" in a 4-year degree to a fitness club? It's a full-on class where you spend most of your time learning in a lecture hall. As for "hands on" skills, so many people learn these "skills" but only know exactly what they were taught and nothing more. Those "hands on" skills have expiration dates when the technology changes. If you really want to learn something useful, learn the theory behind those skills. While theory and "hands on" skills can both be learned, most places tend to focus on one or the other, and theory is much more important.
I think what he meant is if brain surgery was as bureaucratic as many projects, it would take too long and would be a botched job almost every time. But yes, lots of pre-op stuff to do.
I personally don't see the point of using more than 200 watts at home. Just enough for two light bulbs.
Wait, if you have enough power to the house, you can have ovens, refrigerators, and air conditioning units?! Wow, who would have thought about the new awesome ideas 100 years ago when power was limited?
Unless you're God and can see the future, stop acting like you know that there is absolutely no benefit to improving technology.
80/20 rule would most definitely apply on average. When reading about Netflix' caching servers, something like 10% of the data represents about 70% of the hits.
Netflix' SSD servers only have about 10TB of storage, and they have about a 70% hit rate, while the rust-bucket servers have 100TB of storage and have about an 80%-90% hit rate. Their entire catalog is about 1PB.
P2P would work best for flash-mobs. It would reduce the number of cache servers they need deployed in other ISPs.
Assuming decent buffering, you can start streaming the video live at the beginning, and the P2P can start buffering the later part of the video. Just use the normal servers for starting the buffer, but then use P2P to populate as much buffer as you can. I'm sure the 80/20 rule would apply.
So it makes little sense to have asymmetric fiber service other than for marketing purposes.
Most fiber deployments use GPON, which shares bandwidth. High speed sending optics are more expensive than high speed receiving optics. Most ONTs can receive up to 2.5gb/s, but can only send 1.25gb/s. If using Active Ethernet, then it'll be completely symmetrical, and you'll have 1gb up/down. But in GPON mode, you have 2.5 down and 1.25 up.
Google Fiber uses WDM-GPON, which has 32 lambdas of 1.25gb/1.25gb, so it's all symmetrical, but they were an early adopter and used the draft version.
The other question that comes up: since the fiber is already dedicated, why use GPON? Well, you get higher port densities and less power consumption, by quite a bit. There is a good benefit to having 32 customers per port instead of one customer per port.
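To put a rough number on "shares bandwidth", here is a small back-of-the-envelope sketch. It assumes the round figures quoted above (2.5 down, 1.25 up, 32 customers per port) and the worst case where every customer on the port is saturating the link at the same time with an even split. The function name is just for illustration.

```python
# Worst-case per-customer share on a shared GPON port, using the round
# figures quoted above. Assumption: all customers are busy at once and the
# port capacity is split evenly between them.
DOWN_GBPS = 2.5
UP_GBPS = 1.25
CUSTOMERS_PER_PORT = 32


def per_customer_mbps(port_gbps: float, customers: int) -> float:
    """Even split of a port's capacity, converted from Gb/s to Mb/s."""
    return port_gbps * 1000 / customers


down_share = per_customer_mbps(DOWN_GBPS, CUSTOMERS_PER_PORT)
up_share = per_customer_mbps(UP_GBPS, CUSTOMERS_PER_PORT)
print(f"down: {down_share:.1f} Mb/s, up: {up_share:.1f} Mb/s per customer")
```

In practice not everyone is pulling full rate at once, so the typical share is far higher than this floor.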
The point is that each and every component involved, from hardware through firmware to software, is designed under the premiss that it is okay to drop a packet at any time for any reason, or to duplicate or reorder packets.
That entire sentence is damn near a lie. Those issues can happen, but they shouldn't happen. You almost have to go out of your way to make those situations happen. Dropping a packet should NEVER happen except when going past line rate. Packets should NEVER be duplicated or reordered except in the case of a misconfiguration of a network. Networks are FIFO and they don't just duplicate packets for the fun of it.
As for error rates, many high end network devices can reach bit error rates as low as 1E-18, which puts it at one error every 111 petabytes. I assume you'd have to divide that interval between errors by the number of hops.
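The arithmetic behind that figure can be checked with a few lines. This is a sketch under two assumptions: the quoted number is a bit error rate of 1E-18, and every hop contributes errors independently at that same rate, so for rates this tiny the per-hop rates simply add up.

```python
# Assumption: bit error rate of 1e-18, i.e. one bad bit per 1e18 bits sent.
BER = 1e-18
PIB = 2 ** 50   # binary petabyte (pebibyte)
PB = 10 ** 15   # decimal petabyte


def bytes_between_errors(ber: float, hops: int = 1) -> float:
    """Average bytes transferred per bit error across `hops` identical hops.

    Assumes independent errors per hop; for very small rates the combined
    error rate is approximately ber * hops.
    """
    bits_per_error = 1 / (ber * hops)
    return bits_per_error / 8


one_hop = bytes_between_errors(BER)
print(f"{one_hop / PIB:.0f} PiB or {one_hop / PB:.0f} PB per error over one hop")
print(f"{bytes_between_errors(BER, hops=4) / PIB:.1f} PiB per error over four hops")
```

So the 111 figure matches binary petabytes; in decimal petabytes it comes out to 125.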
I've seen enough system designs where they send data as UDP packets and they require incredibly low packet-loss rates, bordering on never. It can be done, but you're not going to be using D-Link switches. You can purchase L4 switches now with multi-gigabyte buffers. They're meant to handle potentially massive throughput spikes and not drop packets.
I assume this is all intra-datacenter traffic or at least an entirely reserved network.
The whole point of QoS is to not have to add more hardware, but to make better use of your current hardware while not having large amounts of jitter. Mainframes don't need to worry about interactive processes, but many modern-day workloads do. What they want is a good average throughput with a maximum latency.
* the UDP traffic contains multiple data packets (call them "jobs") each of which requires minimal decoding and processing
anything _above_ that rate and the UDP buffers overflow and there is no way to know if the data has been dropped. the data is *not* repeated, and there is no back-communication channel.
How are you planning on handling UDP checksum errors without a backchannel or EC? The physical ethernet layer is lossy, so you're screwed even before the packet hits the NIC.
Lossy?
I just logged into my switch at home and it has 146 days of uptime with 20,154,030,043 frames processed and 0 frame errors. I can even do a 1gb/1gb, for a total of just under 2gb/s at once, iperf, and have 0 packets dropped.
Let the network group worry about QoS. But yes, errors will eventually happen, they're just very rare. But when they do happen, it's probably pathological and you'll get a lot of them. But I wouldn't go so far as to say "the physical ethernet layer is lossy", as a general statement.
What kind of crappy network equipment does your job use that has packet loss at anything less than line rate? He's talking about near 1mbit/sec of UDP. I can get 0% packet-loss around the world for only 1mb/s.
But if done correctly, you can do line rate UDP with 0% loss. Routers can do line rate without loss all the time. He's talking about thousands of packets per second, not the millions to tens of millions a modern NIC can handle.
When you're handling lots of little messages/jobs/tasks that are coming in quickly, passing data between processes is a horrible idea. Between context switching and system calls, you're destroying your performance.
You need to make larger batches.
1) UDP/Job comes in; write it to a single-writer, many-reader queue (large circular queues can be good for this) along with an order number, maybe a 64bit incrementing integer. If the run time per job is quite constant, then you could use several single reader/writer queues and just round robin them. This would reduce potential lock contention, but at the cost that variable workloads could cause a bias towards a single worker.
1.a) You're not receiving packets fast enough to worry about threading the reads from the NIC. If you had to look into making this part faster, like millions of Packets Per Second, the first thing I would find out is whether these packets are coming from multiple data sources and whether jobs need to be processed in order relative to all sources or only relative to their own source. If only their own, then you could have a load balancer round-robin and stay sticky by Source IP.
2) Worker sees jobs in the queue (since this is a speed-sensitive dedicated machine, polling could work, but you may want event based) and grabs N jobs, where those N jobs can be reliably completed in a timely fashion. This may be 1 or may be 100; who knows until you test. Note the order number of your jobs. You don't really need to grab N jobs if using a single reader/writer queue since there is no real contention, but reading in batches is good for high contention queues like multi-readers.
3) Your worker will now loop through each job running each script, hopefully all on the same worker/thread.
4) Write out the completed jobs to a single reader single writer queue. If you don't use a single reader/writer queue and instead have a multi-writer queue, you may want to commit finished jobs in batches to reduce contention.
5) Have another worker poll/event each of the queues for each worker. This worker can make sure the jobs are put back in order. This process I assume to be relatively light, so probably a single worker to handle all of the worker queues, but it could also be threaded. You just need to manage the ordering somehow.
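The steps above can be sketched roughly as follows. This is a minimal illustration, not a tuned implementation: `process`, `NUM_WORKERS` and `BATCH_SIZE` are made-up placeholders, Python's `queue.Queue` uses locks internally (so it only shows the shape of the design, not lock-free performance), and for brevity the workers share one output queue instead of each having its own.

```python
import heapq
import queue
import threading

NUM_WORKERS = 2   # assumption: a small number of workers
BATCH_SIZE = 4    # assumption: the right N is found by testing
STOP = None       # sentinel that tells a worker (or the collector) to finish


def process(payload):
    """Hypothetical stand-in for the per-job script."""
    return payload.upper()


def worker(inbox, outbox):
    """Steps 2-4: grab jobs in batches, run them, publish (order number, result)."""
    done = False
    while not done:
        batch = [inbox.get()]              # block until at least one job arrives
        while len(batch) < BATCH_SIZE:     # then take whatever is already waiting
            try:
                batch.append(inbox.get_nowait())
            except queue.Empty:
                break
        for item in batch:
            if item is STOP:
                done = True
                break
            seq, payload = item
            outbox.put((seq, process(payload)))
    outbox.put(STOP)


def run_pipeline(payloads):
    inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
    outbox = queue.Queue()
    threads = [threading.Thread(target=worker, args=(q, outbox)) for q in inboxes]
    for t in threads:
        t.start()

    # Step 1: tag each job with an incrementing order number and round robin it.
    for seq, payload in enumerate(payloads):
        inboxes[seq % NUM_WORKERS].put((seq, payload))
    for q in inboxes:
        q.put(STOP)

    # Step 5: a single collector puts finished jobs back in order.
    ordered, pending, next_seq, finished = [], [], 0, 0
    while finished < NUM_WORKERS:
        item = outbox.get()
        if item is STOP:
            finished += 1
            continue
        heapq.heappush(pending, item)
        while pending and pending[0][0] == next_seq:
            ordered.append(heapq.heappop(pending)[1])
            next_seq += 1

    for t in threads:
        t.join()
    return ordered


print(run_pipeline(["job-a", "job-b", "job-c"]))
```

The collector holds early results in a small heap until the next expected order number shows up, which is one simple way to "manage the ordering somehow".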
You should have no more than N number of workers per core, where N is probably a small number, like 2. Lots of threads is bad.
I love single reader/writer queues, they can be lock-less.
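To show why a single reader/writer queue can get away without locks, here is a sketch of a fixed-size ring buffer where only the producer ever moves the tail and only the consumer ever moves the head. The class and method names are my own, and this Python version only illustrates the index discipline; a real lock-free version in C, C++ or Java would also need atomic loads and stores with the right memory ordering.

```python
class SpscRing:
    """Fixed-size single-producer/single-consumer ring buffer (illustrative).

    The producer is the only writer of `_tail`; the consumer is the only
    writer of `_head`. Neither side ever modifies the other's index.
    """

    def __init__(self, capacity):
        # A power-of-two capacity lets the indices wrap with a cheap bit mask.
        if capacity < 1 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._slots = [None] * capacity
        self._mask = capacity - 1
        self._head = 0   # next position to read, advanced only by the consumer
        self._tail = 0   # next position to write, advanced only by the producer

    def try_push(self, item):
        """Producer side. Returns False instead of blocking when full."""
        if self._tail - self._head == len(self._slots):
            return False
        self._slots[self._tail & self._mask] = item
        self._tail += 1          # publish only after the slot has been written
        return True

    def try_pop(self):
        """Consumer side. Returns (True, item), or (False, None) when empty."""
        if self._head == self._tail:
            return False, None
        item = self._slots[self._head & self._mask]
        self._head += 1          # release the slot only after it has been read
        return True, item


ring = SpscRing(4)
for n in range(5):
    print(n, ring.try_push(n))   # the fifth push is refused because the ring is full
print(ring.try_pop())
```

With more than one reader or more than one writer, two threads would race on the same index, which is where the contention (and the locks or compare-and-swap loops) comes from.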
Your problem sounds close to what Disruptor handles (Google: disruptor ring buffer)(fun read: http://mechanitis.blogspot.com...). May want to also look into that kind of design. It's an interesting project that runs on Java and .Net, and I think C or something, but I can't remember. Still a good read.
If you got a 2x increase in single threaded performance on a 100k node cluster, you could probably get rid of quite a bit more than 50k nodes because of scaling issues.
Earlier 64-bit AMD CPUs did not have a 64bit atomic compare-and-swap instruction, so Microsoft limited their OSes at those times to 8TB. If only Microsoft supported compiling for your arch. Stupid closed source OS.
I understand where you're going with that, but I don't think minimum wage is meant to support a family. It should be just enough for a single person to fully support themselves, including basic healthcare, eating healthy, and a bit of extra to have fun with friends so they don't get stuck in depression. Essentially enough to be a physically and mentally healthy social individual.
If everyone's equal, then nobody has incentives to do anything
Research shows that paying people too much is worse than paying too little. You get the best performance out of people when they get paid just enough to be content with their life. Where they can have a decent roof over their head, be healthy, afford the bare necessities and a bit extra to have fun with the family a few times a week. That's it. They need to be secure and they need to be happy, but no more.
On average, if someone is getting paid enough to own a flashy car, their productivity will go down.
This only applies to jobs with any amount of creativity. Jobs that are repetitive manual labor show an almost linear increase in productivity with pay, so increased pay works just fine. But we're in the process of automating manual labor and will at some point in the future have replaced all manual labor with automation.
And regulations are what stopped factories from taking orphans off the streets and putting them to work in very dangerous settings that many times resulted in the deaths of the children. Have a skinny chimney? Starve your orphan slave until they can fit, then get them in there and light the fire under them so they quickly push through all that soot that will clog their lungs and lead to a painful early death.
There's also the chance the child may get stuck and starve or burn to death. But whatever. Zomg! regulations are bad!
In a way, regulations are a government enforced set of ethics and morals. Not entirely, but very similar.
He completely separates his business decisions from his own personal decisions. He doesn't let ideology get in the way of making sure his business survives, but in his own personal life, he will fight for those ideologies.
If people don't like it, then they should fix the laws.
Only a few hundred years ago it was religion. Flavor of the century. Assholes will always be assholes, they just use whatever is the current popular tool to accomplish their misdeeds.
He doesn't want to use frameworks, he wants to reinvent the wheel, poorly. This is a mix of sarcasm and opinion.
The 80/20 rule is a great rule.
A 100mb interface that can get 112mb/s? That's impressive.
He said the CPU is mostly idle. He's trying to set up his system to handle lots of tiny tasks and Linux isn't playing well with the regular tools.
If drinking 100 gallons of water per day is bad for you, then drinking any amount of water is bad for you. I love your logic!
When it pays more to be unemployed, just having jobs isn't enough; you need jobs that pay a livable wage.
PHP is the poster-child of why popular isn't always good. At least it's brute-forcing passwords instead of a security vuln.