I will never understand how flowing money into an area is bad.
Pricing people out of the homes and neighborhoods they're established in is disruptive. If your rents go up by 50% -- or you own, but your property taxes double -- that's a nontrivial personal hardship, particularly for folks who don't have wiggle room in their budgets to start with.
I live in East Austin -- a historically poor neighborhood. Last time I got involved in community governance was interesting -- went to a meeting to discuss whether a developer should be given a license to redevelop a recycling plant into a condominium project.
Half the people there -- including the faction I showed up with -- wanted to insist on mixed-use development with storefront space. The other group -- representing historical neighborhood residents -- wanted to ensure that low-income housing was included in the development. It wasn't feasible to accommodate both of us with the available funding; suffice to say that the debate process was informative.
Pulling off an effective MITM assumes that the ends aren't doing effective mutual validation. Now, that's true much, much more often than it should be, but jumping from "most people do X badly" to "Y's effort to implement X is doomed to failure" isn't exactly a reasonable position when X doesn't violate any theoretical constraints (as so many attempted products do -- "X must have a key to decrypt Y, but must not be able to copy Y", etc).
...and you need to keep control of that vehicle for a few weeks to get it into a friendly port for unloading, during which time (1) folks with guns are doing their best to find you, and (2) you have no hostages to use as bargaining chips if they do so.
That's an awfully high-risk venture to get the kind of talent you'd need to hijack control in the first place [stealing private keys used to encrypt/authenticate the control channel, etc] to sign off on.
even worse then. you must be paying for the automation with speed.
"Must"?
If you pay attention, you might notice that the languages in question have strong metaprogramming support (and one of them has native immutable data structures with structural-sharing updates and typical O(log32 n) performance, and transactional memory baked in deep).
The metaprogramming support means that there's lots of room to do compile-time analysis and optimization, and the native transactional memory support means that the cell abstractions aren't doing much that the language's designers haven't already put a lot of effort into making fast.
Sure, there's performance overhead -- but it's overhead that's built to parallelize well. Whereas traditional locking can deadlock, Clojure ensures that there's always useful work going on somewhere -- the worst case you get into is that other threads' work needs to be thrown away on conflict. That's a model that scales a lot better to the highly parallel hardware of a decade from now than the conventional approaches today.
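To make the "work gets thrown away on conflict" point concrete, here's a minimal sketch in Python. It is not Clojure's STM; it's a toy optimistic compare-and-set loop, and the names (Ref, alter, run_demo) are made up for illustration. The short lock only stands in for an atomic compare-and-swap instruction; the actual computation happens outside it, so a failed commit always means some other thread's commit succeeded.

```python
import threading


class Ref:
    """A versioned reference updated optimistically (hypothetical toy, not real STM)."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._lock = threading.Lock()  # stands in for a hardware compare-and-swap
        self.retries = 0

    def read(self):
        """Return a consistent (value, version) pair."""
        with self._lock:
            return self._value, self._version

    def _commit(self, expected_version, new_value):
        """Install new_value only if nobody committed since we read."""
        with self._lock:
            if self._version != expected_version:
                return False
            self._value = new_value
            self._version += 1
            return True

    def alter(self, fn):
        """Apply fn to the current value, retrying whenever another thread wins."""
        while True:
            value, version = self.read()
            new_value = fn(value)  # computed without holding the lock
            if self._commit(version, new_value):
                return new_value
            with self._lock:
                self.retries += 1  # our work was thrown away; someone else progressed


def run_demo(threads=8, increments=1000):
    """Hammer one Ref from several threads and return it for inspection."""
    counter = Ref(0)

    def worker():
        for _ in range(increments):
            counter.alter(lambda v: v + 1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter
```

No thread ever waits on another thread's computation, so there's nothing to deadlock on; the cost shows up as the retries count instead.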
No; they support only their own hardware (Moto X w/ custom firmware).
On the other hand, it's a no-contract subsidized current-gen phone, and it's the first device I've had where manufacturer firmware is actually an improvement on AOSP.
If I could get reliable cell coverage in my home, I'd pay $200-300 for that.
Switch to Republic, and your voice, SMS and MMS all run over your WiFi, and hand off to Sprint's cell network when out-of-range.
Which gives you reliable coverage in your home, and a deep discount from a typical carrier's monthly rates.
-- Satisfied customer. (Well, moderately satisfied -- Sprint's 4G coverage in Austin was iffy until they got a bunch of tower repairs done; that they let it go for such a long time didn't speak well).
Well. I really shouldn't be letting myself get sucked into this, but.
A world in which freedom of contract is unhindered is a world in which Shelley v. Kraemer would have been differently decided. I prefer not to live in that world.
You can't assure when, where and how variables might be changing in some outer reaches of your program while another part of the program is assuming they are momentarily fixed.
Huh? All the reactive programming frameworks I've used (Python's Trellis, various cell-model approaches in Clojure) have transactional memory, so things never appear to be happening in parallel.
Reactive programming without transactional memory is Doing It Wrong.
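A rough sketch of what that buys you, in Python. This is not Trellis's API or any Clojure library's; Sheet and its methods are hypothetical names, and it assumes formulas are declared in dependency order. Writes are buffered inside a transaction, derived cells are recomputed once at commit, and observers only ever see the committed snapshot.

```python
from contextlib import contextmanager


class Sheet:
    """Toy cell model with transactional updates (hypothetical, for illustration)."""

    def __init__(self):
        self._values = {}
        self._formulas = []    # (name, fn) pairs, assumed to be in dependency order
        self._observers = []
        self._pending = None   # buffered writes while a transaction is open

    def define_input(self, name, value):
        self._values[name] = value

    def define_formula(self, name, fn):
        self._formulas.append((name, fn))
        self._values[name] = fn(self._values)

    def observe(self, callback):
        self._observers.append(callback)

    def get(self, name):
        # Reads always see the last committed state, even mid-transaction.
        return self._values[name]

    def set(self, name, value):
        if self._pending is None:
            raise RuntimeError("set() must be called inside a transaction")
        self._pending[name] = value

    @contextmanager
    def transaction(self):
        if self._pending is not None:
            raise RuntimeError("nested transactions are not supported in this sketch")
        self._pending = {}
        try:
            yield self
        except BaseException:
            self._pending = None  # roll back: buffered writes are simply dropped
            raise
        pending, self._pending = self._pending, None
        new_values = dict(self._values)
        new_values.update(pending)
        for name, fn in self._formulas:
            new_values[name] = fn(new_values)
        self._values = new_values  # all changes become visible at once
        snapshot = dict(new_values)
        for callback in self._observers:
            callback(snapshot)
```

Usage would look like `with sheet.transaction(): sheet.set("width", 10); sheet.set("height", 20)`; an "area" formula is recomputed once, after both writes, never with one new input and one stale one.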
A world in which not all promises are enforceable is not a world without enforceable promises.
Neither history nor the world we live in presently has any shortage of examples of power or information imbalances resulting in individuals being unable, on a large scale, to effectively represent their own best interests.
But then -- we're continuing a conversation I expect we both know will be ultimately unproductive. Surely we can find a better use of our time. :)
Before continuing this discussion, allow me one question, which I believe should adequately establish whether we have any reasonable chance of being able to reach an agreement on principles, even should we come to agree on facts.
Do you consider it appropriate for a risk pool to contain individuals with varying levels of inherent risk? Consider, for instance, the common case of a risk pool consisting of the set of employees participating in a large enterprise's health plan. Is it appropriate for individuals who are at an inherently high personal risk (on account of genetic predisposition, disability, medical history, or known or predictable factors) to be subsidized by those who are not, or should the pool be stratified into bands by inherent risk level, so that it serves only its traditional role of spreading the costs of unpredictable (and, hopefully, non-clustering) events?
As the person who's building large-scale automation that uses SSH as its backing transport, anything that requires escalation to something that isn't SSH might as well be hands-and-eyes -- it's something my tools can't touch, and if the tools can't touch it, it might as well not exist.
See again, "too large a scale for issue remediation to depend on human involvement".
My own remain quite simple and effective within SysV, supporting start, stop, restart, reload, and status just fine.
Doing so how? Are they robust against other processes being assigned the PID of something that exited? Are they following LSB exit status conventions? Are they cgroup-aware? And even if you're getting all the details right in your own init scripts, have you gone to the effort of auditing all the vendor-provided ones?
Details matter, and requiring everyone to reinvent that wheel when a canonical implementation could easily be provided means that in a great many cases the details will be wrong. For someone as concerned with correctness as you are, I'm surprised that isn't more troubling.
If there were no scripts anyway, I don't see how any other init system would have changed anything there. But surely there were scripts for system daemons such as sshd?!?
I said, no acceptable scripts.
OS-provided scripts for sshd are almost universally fire-and-forget, thus unacceptable.
Yes, sshd shouldn't ever exit. Yes, it's a serious bloody problem if it does. But if it exits, and stays dead, and no remediation happens? That's a bigger problem, because that means you need hands-and-eyes to fix it.
Every hospital in the United States is required to provide life saving treatment, regardless of whether you have insurance or not. That hasn't changed and it's not the issue here.
What does the requirement to provide life-saving treatment have to do with anything? It helps people who are so broke that they have no assets, but it doesn't help anyone else.
You have a heart attack, you get treated at a hospital which is required to do so; you're insured but not adequately, and you get a bill for $50,000 more than your insurance covers. Welcome to medical bankruptcy.
Now, how exactly are you supposed to shop around, rather than just taking the first-available treatment? Sure, they're required to provide that treatment whether or not you can pay -- but if you can pay, they're going to do everything in their power to be sure that you will.
In my wife's case, it wasn't a heart attack, but brain surgery -- and while she was in the hospital, her employer went out of business. Her insurance policy disappeared with them, and she was personally on the hook for follow-up care, wiping out years of savings.
The fundamental design of SysV is as good as any, it just needs a new major version number update.
I positively cannot agree.
Look at some of your system's SysV init scripts, and then have a look at some of the run scripts at http://smarden.org/runit/runsc....
Is the configuration complexity you're getting from every process having its own, bespoke set of init scripts rather than something that simple -- coupled with responding to standard signals -- really buying you anything?
As for the scripts, when is the last time you had to rewrite every rc script on a system?
About... two years ago? And that was the third time of many.
Though that was because, in all those cases, there *were* no acceptable scripts to begin with. Detaching processes from their parents, thus losing access to their status without either race-prone gymnastics or polling, is not acceptable by any means whatsoever. (Yes, that doesn't handle service-level status, but well, that's what service-level monitoring and runtime-aware management tools are for).
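For what it's worth, here's a minimal sketch (Python, POSIX semantics assumed) of what keeping the process attached gets you. The function name run_supervised is made up for illustration. Because the service stays a direct child, the supervisor gets the exit status from a blocking wait, with no pidfile, no polling, and no chance of mistaking a recycled PID for the service.

```python
import subprocess
import sys


def run_supervised(argv):
    """Start argv as a direct child and block until it exits.

    Returns a dict describing how the child ended. On POSIX, a negative
    returncode from subprocess means the child was killed by that signal.
    """
    child = subprocess.Popen(argv)  # stays our child: no double-fork, no setsid
    returncode = child.wait()       # the kernel wakes us on exit; nothing to poll
    if returncode < 0:
        return {"pid": child.pid, "exited": None, "signal": -returncode}
    return {"pid": child.pid, "exited": returncode, "signal": None}


if __name__ == "__main__":
    # Stand-in for a real daemon run in the foreground.
    print(run_supervised([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

A daemon that insists on backgrounding itself defeats this, which is exactly the complaint above.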
When is the last time a single policy was the most appropriate for every single daemon on the system?
Haven't had that happen yet, but I've had a single policy be most appropriate for 80% of services. And that policy was "restart".
That's likely to continue to be the case in the future. Look at Go -- the idiomatic approach taken to error handling is to panic and exit. That's not a bad thing -- it's a good thing, because you're getting back into a known state, for much the same reasons that it's appropriate to reboot when you have corrupt kernel state.
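A bare-bones version of that "restart" policy, sketched in Python. The function name, the backoff numbers, and the restart cap are all assumptions for the sake of a bounded example; a real supervisor would typically restart indefinitely and would not treat a clean exit as final.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=5, base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Run argv in the foreground, restarting it whenever it exits nonzero.

    Stops on a clean exit (status 0) or once max_restarts restarts have been
    used up. Returns the list of exit statuses observed, oldest first.
    """
    history = []
    delay = base_delay
    while True:
        code = subprocess.call(argv)  # blocks until the child exits
        history.append(code)
        if code == 0 or len(history) > max_restarts:
            return history
        sleep(delay)                          # back off so a crash loop can't spin
        delay = min(delay * 2, max_delay)     # exponential backoff, capped


if __name__ == "__main__":
    # Stand-in for a service that always crashes.
    print(supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_restarts=2))
```

The sleep parameter is injectable only so the backoff schedule can be checked without actually waiting.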
I haven't done HPC work, but my impression is that they tend to be pretty homogeneous.
I'm accustomed to working with much more heterogeneous systems -- an N-node intake system, an NxM-node index cluster, an N-node storage system, a few different datastores behind various frontends... some of these are purely developed-in-house (by a separate dev team, meaning that influence on their code quality means asking nicely), some are commercial software with vendors who release on their own cycle and may or may not build with maintainability in mind, some of them are upstream OSS with a passel of patches.
If you're in a world where you can control the quality of the software, you're in a much happier place than I am.
Back around to systemd -- the restarting functionality isn't really a big deal; you can get that in any modern init system (which, of course, SysV ain't). The really interesting bits in systemd are the same ones that make it dependent on functionality unique to Linux -- tight integration with LXC and the like -- and there's room for legitimate debate about whether keeping all that in-process is the right approach. You may recall that I didn't start this out saying that systemd was great -- I started this thread saying that describing SysV init as "not broken" was wrong on its face, and I stand by that claim.
If you're crashing on memory corruption, you're also serving garbage due to memory errors. Perhaps you should consider going to ECC if it's happening that often. If a DOS attack takes the daemon out, it's got bugs. It's understood that a DOS attack may cause it to not get to requests in a timely manner but it shouldn't actually crash. Bizarre race conditions? That's another word for bug.
Over here in the real world, saying "that's a bug in the code, so it's not my fault that it brought the cluster down" doesn't fly -- if you're ops, your job is to keep the cluster up in the face of badly-written software on individual nodes. Advocate for better design and development practices, absolutely, but that can't mean that we take our services down while we spend a decade rewriting every third-party component.
What happens when the same memory corruption and race conditions send the daemon chasing its tail but not actually terminating on an error? There will be no SIGCHLD or any other signal.
So if we don't solve everything, we can't solve anything?
Ugly hacks for detecting and remediating that kind of bug exist. The slightly-less-awful ones tend to be runtime-aware (if you're running a model where requests have sole use of a thread that's handling them, for instance, it's able to have considerably less splash damage in terminating a long-running request), making them inappropriate for a one-size-fits-all situation.
If you really just need to restart on process exit, why not a while loop in a shell script? If you want to be notified, add a line to the script to fire off an email to the admin group.
Great. So now we have to write bespoke policy (via individually maintained scripts) for every service in the system, and modify each and every one of those scripts when we want to make a policy change.
Oh, wait, that's the status quo. And it's bloody awful.
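The alternative, sketched in Python: policy lives in one table, services only list where they differ from the default, and a policy change is a one-line edit. Everything here is hypothetical (the Policy fields, the service names other than sshd, the override values); it only shows the shape of the idea, not any particular init system's configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Policy:
    """What the supervisor does when a service exits (illustrative fields)."""
    restart: str = "always"        # "always", "on-failure", or "never"
    backoff_seconds: float = 1.0
    notify: bool = False


# One default for the ~80% of services that just want "restart".
DEFAULT = Policy()

# Only the exceptions are spelled out, and only the fields that differ.
OVERRIDES = {
    "batch-import": {"restart": "never"},  # hypothetical one-shot job
    "sshd": {"notify": True},              # losing this one should page someone
}


def policy_for(service, default=DEFAULT, overrides=OVERRIDES):
    """Resolve the effective policy for a service name."""
    return replace(default, **overrides.get(service, {}))
```

Changing the default backoff for every service means editing DEFAULT once, rather than touching each service's script.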
As I said elsewhere, guano occurs so sometimes using a restarter as a stopgap makes sense. But that really should be considered an exceptional case, not normal policy and it should certainly be considered a dirty hack. I don't see it being common enough in good practice to build into pid 1.
It's been part of pid 1 for decades; see /etc/inittab.
Moreover, if it's *not* part of pid 1, it's easy to get into a state where your system isn't amenable to any kind of remediation: You have pid 1 but nothing else running? Sorry, only option is a power cycle.
Well - they seem to tell us about new fantastic improved super batteries.
Eh, that's what you get from reading press releases. :)
New chemistries frequently have some particular thing they do really well, and a set of drawbacks. The problem you get is when you read the articles about "new battery has X% more energy density", or "new battery has X% higher charge/discharge rate", and expect to get both of those things in the same battery (much less a battery that isn't making tradeoffs unamenable to consumer use).
And batteries for consumer electronics are getting better over time, they're just not keeping up with best-chemistry-for-X in every factor X. Which isn't a reasonable thing to expect.
So people who make less money than you are also less than human?
If you can't appreciate why gentrification is a problem, I suggest that you're living in quite a bubble.
So do I.
Or accuracy, apparently. That story was debunked.
"Must"?
That's why "subsidized" is a relevant thing. :)
I switched over from T-Mo. Republic's implementation is considerably better, particularly the handoff support.
I do? I didn't notice myself explicitly managing dependencies when using Trellis or Hoplon.
I care more about equitable outcomes than about freedom of contract. As such, I do not expect our positions to be reconcilable.