They are cheap, almost indestructible, small, low-power and ancient enough to comfortably run any legacy application out there, even under pure DOS. Should one break (which is, in itself, rather unlikely even for heavily used units), full service manuals are available and having lots of them means easy replacements. They have traditional, hardware RS232 and LPT ports, one of each. As long as you need a single machine for a single PLC, X40s should be one of the best tools for the job.
OR if you can't get a job and want to go to the US, contact the people at NYC Resistor, the New York Hackerspace. Just tell them you are interested in those things, you'd like to come and hang around for two weeks around people with similar interests who actually do something with them, and you need a couch to crash on. I'm pretty sure it's going to be a cheaper, more interesting and more educational alternative to a summer camp.
The key, I think, is "properly". I've seen some deployments (none as big as yours, to be honest) that started off similar to that, but as changes went on, security measures went away or were literally worked around because they (obviously) slowed down the development process. The result was a total mess, neither secure nor efficient.
I haven't really had a chance to ever go totally mad with security, but if I somehow get it, I'm sure going to try all of those things listed in the previous post and much, much more.
If you can, split the application in two parts - the front end running on a world-facing web server and a back end on a private network. Use a well-defined, high-level protocol for communication between the two. If you can afford (literally, it's just a matter of throwing more hardware at the problem) some overhead, use a text-based serialization format with a solid, well-tested parser. The simpler, the better. Check every single request at the backend in every possible way - data sanity checking at the door is crucial. Maybe sign the requests - won't do any good if someone breaks into the frontend boxes (because they will get the private key then), but will make it impossible to somehow impersonate those boxes without compromising them first. Sign responses. Generate and deploy new keys and certificates often. Use prime numbers (look up cicada principle) for intervals between key changes to avoid being predictable, if you're truly paranoid. Log everything, offsite. Send the logs over a smart network bridge that will let through logs, just logs and only logs, and only in one direction just to be sure. Make this bridge the one and only thing connected to the log server, other than the power cable, a monitor, a keyboard and a tape drive. Preferably use a similar bridge between the frontend and the backend servers, have it do sanity checking of all passing traffic in addition to the checks at the backend. Have different people implement the checks at the backend and at the bridge, do not let them share code. Preferably, use two different parsers for your serialization format of choice. If you can, put the databases on a third layer behind the backend (so that it's only doing business logic, not data storage). Try to embed some basic security in the database itself, especially data integrity checks. Have it roll the transaction back, tell the backend to bugger off and raise an alarm if it's told to do something that doesn't quite fit with the nature of the data.
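To make the "sign the requests, check everything at the door" part a bit more concrete, here is a minimal sketch. Note the assumptions: it uses a shared-secret HMAC from the Python standard library instead of the public-key signatures mentioned above, and the key, the `user_id` field and both function names are made up for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; it would be rotated often in a real deployment.
SECRET_KEY = b"rotate-me-often"


def sign_request(payload: dict, key: bytes = SECRET_KEY) -> dict:
    """Frontend side: serialize the payload canonically and attach a MAC."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    mac = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "mac": mac}


def verify_request(message: dict, key: bytes = SECRET_KEY) -> dict:
    """Backend side: check the MAC first, then sanity-check the content."""
    body = message.get("body")
    mac = message.get("mac")
    if not isinstance(body, str) or not isinstance(mac, str):
        raise ValueError("malformed envelope")
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking where the mismatch is through timing.
    if not hmac.compare_digest(expected.encode("utf-8"), mac.encode("utf-8")):
        raise ValueError("bad signature")
    payload = json.loads(body)
    # Sanity checks for a hypothetical schema: exactly one positive integer field.
    if not isinstance(payload, dict) or set(payload) != {"user_id"}:
        raise ValueError("unexpected fields")
    user_id = payload["user_id"]
    if isinstance(user_id, bool) or not isinstance(user_id, int) or user_id <= 0:
        raise ValueError("user_id out of range")
    return payload
```

As said above, this does nothing once a frontend box is compromised, since the key lives there. It only stops someone from impersonating the frontend without breaking into it first.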
And so on, and so on, and so on. It's all about assuming that every single part of the system can and will contain security holes, but with so many layers, cross-checks and variations on the security measures (like using two different parser implementations for the same check), the probability of someone finding a usable chain of exploits is absurdly low. Remember, exploits have to be used several at a time to actually break into a system and not just DoS it.
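The "two different parser implementations for the same check" idea could look roughly like the sketch below. The `key=value;key=value` wire format and all function names are assumptions invented for this example, not anything from a real deployment. The point is only that the message is accepted when two independently written parsers agree.

```python
import re


def parse_by_split(text: str) -> dict:
    """Parser A: naive, split-based. Deliberately written without regexes."""
    result = {}
    for field in text.split(";"):
        key, sep, value = field.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"bad field: {field!r}")
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


_PAIR = re.compile(r"([a-z_]+)=([A-Za-z0-9]+)")
_MESSAGE = re.compile(r"[a-z_]+=[A-Za-z0-9]+(?:;[a-z_]+=[A-Za-z0-9]+)*")


def parse_by_regex(text: str) -> dict:
    """Parser B: grammar-based. Rejects anything outside the strict grammar."""
    if _MESSAGE.fullmatch(text) is None:
        raise ValueError("message does not match grammar")
    result = {}
    for key, value in _PAIR.findall(text):
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


def checked_parse(text: str) -> dict:
    """Accept a message only if both parsers succeed and agree on the result."""
    first = parse_by_split(text)
    second = parse_by_regex(text)
    if first != second:
        raise ValueError("parsers disagree")
    return first
```

Here `"user=al=ice"` slips through parser A (it happily treats `al=ice` as a value) but is refused by parser B, so a quirk in one implementation is not enough on its own.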
I wonder if any web applications that properly implement all those things and more even exist, but it wouldn't hurt to try to make one, if you have the funds.
Oh, and one last thing. The most important one, actually. If you pull this off, your application might be so impervious to hacking in the usual sense of the word that it would be simply impractical to do that, not worth the time and effort. And guess what the determined hacker will probably do at that point? Dress as an air conditioning serviceman, show up at the facility, talk some shit into the guard and walk away with your data 15 minutes later, using equipment no more high-tech than a screwdriver. Or, if your guards are not as dumb as that, a *very* determined hacker might even get hired by your air conditioning service, cleaning or electrical work company and do the same the next time they *legitimately* show up at the facility. It's been done. In short, consider other aspects of security, more so if you're actually a valuable target and almost unhackable through the internet.
This question, and the very act of asking itself, is full of fallacies and silent assumptions.
What is this "general public" you speak of, what does it mean to "still matter", how do you evaluate this property?
I'm using Linux because I am used to it, the tools I need, many subtle features included, are there (and they aren't there on Windows or OS X), and in general I can get the job done in a most timely, convenient and pleasant manner on Linux compared to any other environment out there. So, yeah, it's relevant for me.
Wait, what did you expect, that some ephemeral being called "The General Public" will descend upon this thread and lay pure truth upon it, drawn from its unbound knowledge? Sorry, no cake for you. I know some people have this tendency to readily extrapolate "I" into "we", "everyone" and such and happily provide answers to these questions, but you shouldn't listen to their bullshit. I mean, respect your own intelligence and try to see through dumb generalisations. And don't ask meaningless questions that invoke them.
Off the top of my head: LLVM and CUPS. Please get your facts straight before posting overly general statements. Your posts will be much more difficult to discredit as a whole on the basis of a single "all", "none", "anything" or whatever disproven by a single example to the contrary, thanks to elementary mathematical logic.
If it's sensible, this could be useful in some areas, for some vehicles. Looks like the whole gasification assembly is not exactly a work of precision engineering and could be built in somewhat sub-standard conditions. I'd expect that many third-world plantations of easily gasified produce have lots of leftovers and not all of those have sensible uses to date - some might be just dumped somewhere to rot.
On a different note, if I were the CEO of Starbucks, I'd get such a car as a publicity and marketing stunt, and power it with dried left-overs from brewing.
That's an interesting question. I can't give a definitive answer, but I think such a claim could hold some weight.
Note, however, that if you're an author (or more precisely, a sole copyright holder) of an application, you can't - in the "logical impossibility" sense of "can't" - "violate GPL" by doing that, or just about anything else. You're free to distribute software that is under your copyright in any way and shape you like, but anyone redistributing it after downloading your incomplete source tarball would be unable to comply with the GPL if someone further down the chain asked them to provide the full source.
Yes, you have a point, the comparison to mnemonic assembly output of gcc is a good one. I was trying to find an example such as this, but couldn't think of anything at the moment.
My explanation, however, still answers the OP's question - what was distributed was enough to recreate the binary without raising any suspicions, and that's why this could happen.
The problem in this case is that the concepts of "source code" and "object code" are a bit fuzzy with generated code that is GPL-licensed.
Someone wrote the bison grammar files (which are the missing source code in this case) and "compiled" them, by running bison over them. The resulting files were "object code" in the light of GPL, as they're not really intended nor suitable to be read or edited by a human (and the GPL's definition of source code is "the preferred form of the work for making modifications to it"), but at the same time, they were still technically source code, as in something that can be fed to another compiler, together with the actual source code of Emacs to build the executable Emacs binary.
Thus, the final binary can be recreated from those tarballs just fine, because *technically* it's the full Emacs source code all right. Legally, though, it's not, because of the definitions in GPL.
They're more durable - you can bang one against the desk, throw it around the room all day, then plug it in and it should still work (or, at worst, require fixing a broken solder joint or two, SMD capacitors sometimes fall off the PCB after a strong enough jolt), while no HDD in the world is going to survive that. Maybe people got that confused, the word "reliable" means many different things in layman's speech.
According to Nietzsche, God has been dead since at least 1882, which means that the source code is in the public domain even by the standards of Disney. Sorry, no bonus.
[Feeding a troll? Sure, the bigger a piece, the better, more chance of choking!]
Oh my. I'm not even 25, and I feel the urge to call "get off my lawn" in response to your "old school games" list and the configuration you call "old hardware"...
AFAIK, on a desktop with two discrete graphics cards, you should be able to run Windows and Linux as guests at the same time, each using one card. I'm not sure about disk access, you might want to add a discrete PCI-E SATA controller for one of the systems to avoid any screwups caused by Windows doing something nasty, but other than that, this seems to be perfectly viable. A recent Sandy Bridge-based Core i7, with 8GB of memory on a good P67-based motherboard should run such a software stack with native performance of an SB i5 (roughly half the cache and threads of an i7 available most of the time for each guest) with 4GB of memory (if split evenly), which is more than adequate for everyday use.
The thing is that those "gateways" can be smart and only allow certain packet types between certain senders and receivers. It is a kind of a very simple firewall, actually. In a C5, it most likely restricts communications only to those packets that were intended to be used by design, so it should let the airbag controller send a 112 request to the stereo, but not let the stereo deploy airbags spontaneously, even if the controller actually does support triggering over CAN (I have no idea whether it does). I did not poke too much in the "vital" network even through the gateway and I certainly did not try making anything perform some action, only passive queries and some traffic sniffing, so I can't be sure, though. BTW, a CAN gateway also protects from network failure - even if a device gets a short on the bus lines or goes bonkers and floods out all the communication with some crap, or even gets taken over and disrupts it deliberately, the network on the other side of a gateway will still operate properly. Gateways must be prepared for this by design. In a car, this becomes pretty important during a crash - physical damage might short out communication lines and disable whole networks. Thus, we have another good reason to use network separation, or at least signal-level repeaters immune to shorts and noise.
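The filtering rule of such a gateway can be sketched in a few lines. Everything specific here is an assumption for illustration: the network names, the two identifiers and the `forward` function are invented, and I have no idea what the real C5 tables look like.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    arbitration_id: int  # CAN identifier of the frame
    data: bytes          # payload, at most 8 bytes on classic CAN


# Hypothetical identifiers, made up for this example.
AIRBAG_CRASH_NOTIFY = 0x050
AIRBAG_DEPLOY_CMD = 0x051

# (source network, destination network) -> identifiers allowed to cross.
ALLOWED = {
    ("vital", "comfort"): {AIRBAG_CRASH_NOTIFY},
    ("comfort", "vital"): set(),  # nothing flows back into the vital network
}


def forward(frame: Frame, source: str, destination: str) -> bool:
    """Return True if the gateway would pass the frame, False if it drops it."""
    if len(frame.data) > 8:
        return False  # not a valid classic CAN payload
    return frame.arbitration_id in ALLOWED.get((source, destination), set())
```

With this table, the crash notification gets through to the stereo's network, while a deploy command coming from the stereo's side is simply dropped, whatever the airbag controller itself would do with it.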
And they actually share the address space without any network segmentation and routing? You know, CAN has something between a NAT and a network bridge - can't remember the term used by the spec right now - which was designed to allow controlled routing between parallel networks precisely for such things as this. I can't believe they wouldn't use that. For example, new Citroen C5s use such routing to separate vital and non-vital networking while allowing certain devices to communicate cross-network for reasons very similar to those you cited. They will even try to use the Bluetooth-connected handset (which is handled by the stereo, so that the music volume goes down when you get a call and the caller is heard in the in-car speakers) to call emergency services after a crash.
Please excuse the offtopic post, I'd just like to ask the poster above a question out of sheer language-related curiosity. I'm not a native English speaker and this phenomenon has been intriguing me for quite some time now.
That is, why do some English-speaking people Tend to capitalize some semi-Random words in their Sentences for no Good Reason?
Could be configuration syntax. Simple, readable, concise, possible to understand even in very complicated scripts without looking at the manual. I'm using iptables routinely, but I liked pf for its simplicity.
In regard to the placement of the business logic, I think the truth - as always - lies somewhere in the middle and contains a disclaimer to the effect of "not applicable when implemented by morons" and "common sense not included".
Sometimes, the application is all about data and most aspects of the business logic can be summed up as maintaining data integrity. For example, a typical bulletin board updates various counters and lists every time someone posts something - the post counter for the user, the topic, the forum and the category, information about the latest posts for those entities, maybe lists of unread posts for the users (depends on whether those lists are inclusive or exclusive) and loads of other things. Most of those can be moved into triggers, maybe even views with rule-based updates of the actual data, with a great benefit for interoperability. In fact, I did co-author one such bulletin board and the decision to move as much business logic to the database as sensibly possible (and not a thing more) turned out to be correct. Right now, we're implementing some auxiliary functionality in Python (the original code is PHP) and it couldn't be easier. Just about everything is selected using views (which will get a dramatic speedup once I move everything to Pg9, which supports join removal - not that it's slow right now, just noticeable on the server load graphs) and most complex updates are performed using triggers and rules, so the frontend code can be kept simple, and even major updates to the actual business logic and data structure can be contained in the database and completely invisible to the outside world.
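As a rough illustration of the counters-in-triggers approach, here is a self-contained sketch. It assumes SQLite (through Python's `sqlite3` module) purely so that it runs anywhere. The board described above runs on PostgreSQL, and the table and column names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    post_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE topics (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    post_count INTEGER NOT NULL DEFAULT 0,
    last_post_id INTEGER
);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    topic_id INTEGER NOT NULL REFERENCES topics(id),
    body TEXT NOT NULL
);

-- The database keeps the counters consistent; the application only inserts posts.
CREATE TRIGGER posts_after_insert AFTER INSERT ON posts
BEGIN
    UPDATE users SET post_count = post_count + 1 WHERE id = NEW.user_id;
    UPDATE topics
       SET post_count = post_count + 1, last_post_id = NEW.id
     WHERE id = NEW.topic_id;
END;
""")

conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
conn.execute("INSERT INTO topics (id, title) VALUES (1, 'Hello')")
conn.execute("INSERT INTO posts (user_id, topic_id, body) VALUES (1, 1, 'first')")
conn.execute("INSERT INTO posts (user_id, topic_id, body) VALUES (1, 1, 'second')")

user_count = conn.execute("SELECT post_count FROM users WHERE id = 1").fetchone()[0]
topic_row = conn.execute(
    "SELECT post_count, last_post_id FROM topics WHERE id = 1"
).fetchone()
```

Any client, PHP or Python, gets the same bookkeeping for free, because none of it lives in the client code.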
There were, however, things that I kept out of the database on purpose, generally due to the poor state of interoperability of database engines and programming languages (which is kind of strange, considering that relational databases are such a damn old and mature technology), primarily when it comes to errors. The most prominent is access control logic. I could engineer the database so that it just refused to accept a post in a forum where the author had no write access (while retaining the possibility of revoking the access he had and keeping the posts he wrote to date without any data integrity problems, of course), but I couldn't think of any elegant, sensible and clean way of turning complex database-generated errors, such as those thrown by constraints or manually in the pl/SQL code, into appropriate exceptions in the application and keeping everything in line with transaction handling at the same time. I tried and what I came up with was a sorry kludge, so I just gave up and kept those bits outside of the database. Of course, that meant I had to reimplement some of this in Python and now I have to remember to update two codebases, but the code is simple enough that I can put up with this.
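For the curious, this is roughly what the database-side variant looks like, and why it feels like a kludge. The sketch assumes SQLite via Python's `sqlite3` module rather than PostgreSQL. The schema, the `access_denied` message and the `AccessDenied` exception are all invented for illustration and are not the code described above.

```python
import sqlite3


class AccessDenied(Exception):
    """Application-level error raised when the database refuses a post."""


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE write_access (
    user_id INTEGER NOT NULL,
    forum_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, forum_id)
);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    forum_id INTEGER NOT NULL,
    body TEXT NOT NULL
);

-- Checked only on insert, so revoking access later leaves old posts intact.
CREATE TRIGGER posts_check_access BEFORE INSERT ON posts
WHEN NOT EXISTS (
    SELECT 1 FROM write_access
     WHERE user_id = NEW.user_id AND forum_id = NEW.forum_id
)
BEGIN
    SELECT RAISE(ABORT, 'access_denied');
END;
""")


def add_post(user_id: int, forum_id: int, body: str) -> None:
    """Insert a post, translating the database refusal into AccessDenied."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO posts (user_id, forum_id, body) VALUES (?, ?, ?)",
                (user_id, forum_id, body),
            )
    except sqlite3.DatabaseError as exc:
        # The only link between the trigger and the exception is a magic string.
        if "access_denied" in str(exc):
            raise AccessDenied(
                f"user {user_id} may not post in forum {forum_id}"
            ) from exc
        raise
```

It works, but the mapping hangs on matching an error message, which is exactly the kind of thing that does not scale to complex errors.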
Of course, everything's documented, both in text and diagrams, just in case someone would inherit the code down the road. It's been quite helpful right now, too - there are parts that Just Work and the last time I even looked at the code was a few years ago, right now it is kind of new and unknown even to me when changes are to be made.
In short, I don't think this is about some holy rule that is unconditionally true and you have to "get it" or else. IMHO it's about common sense, practical knowledge in neighboring fields (you're a DBA and you don't know how a CPU cache, a memory controller or a physical hard drive works? Well, you're not going to be a good DBA for high-performance systems, regardless of your knowledge of databases) and experience. All of those together, not just one or two. Only then rules become just guidelines to step over when appropriate.
Besides, with Pg9 and its streaming replication and hot standby mode, scalability just shot up through the roof and stopped at about the actual limits of the hardware, where contraptions such as the battery-backed RAM "disk" for WAL store I mentioned earlier come into play to push it even further.
An idiot prepared the server hardware requirements, then. A simple PCI-E card with a few RAM slots, a LiIon battery pack and a faux SATA controller (they are available from a few vendors and cost a few hundred bucks a piece, pretty cheap for such a thing), configured as the WAL store - the database had a write-ahead log, right? - would increase the capacity of a single such server at least tenfold.
The problem wasn't that the database was used in a wrong way. Rather, it was the lack of a systems integration person on the team, someone who grasps all the general aspects of the deployment from the frontend down to the bare metal and can identify such problems and find remedies right when they occur.
There is. Sample at the Nyquist rate, that is, at least twice the highest frequency present in the recorded signal, with a sample size that gives a resolution a tad finer than what the recording equipment is capable of registering - measurement error formulas from the theory of metrology are your friends, and the coefficients come from the instruction manual for the microphone. You do know that an analog microphone doesn't have infinite recording quality, right?
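A back-of-the-envelope version of that calculation is sketched below. It assumes the standard estimate for an ideal quantizer, SNR of about 6.02 * N + 1.76 dB for N bits, as a stand-in for the full metrology treatment. The function names are made up and the microphone figures in the usage note are hypothetical, not from any datasheet.

```python
import math


def min_sample_rate(max_signal_hz: float) -> float:
    """Nyquist rate: twice the highest frequency present in the signal."""
    return 2.0 * max_signal_hz


def min_bits(dynamic_range_db: float) -> int:
    """Smallest bit depth whose ideal quantization SNR covers the dynamic range.

    Uses the ideal-quantizer estimate SNR = 6.02 * N + 1.76 dB.
    """
    return math.ceil((dynamic_range_db - 1.76) / 6.02)
```

For a hypothetical microphone that registers up to 20 kHz with 90 dB of dynamic range, `min_sample_rate(20000)` gives 40000.0 and `min_bits(90)` gives 15. Anything beyond that records detail the microphone never delivered.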
netstat -lpn seems simple enough. I tend to run it every time I change something in a configuration file of a network-enabled service, just to be sure. It would be irresponsible to do otherwise.
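If you want to script the same sanity check from the outside, a minimal sketch could look like this. It is not a replacement for netstat -lpn (it cannot tell you which process owns the socket), and the `is_listening` helper is a name invented for this example.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0
```

Run it against both the loopback address and the public one after changing a config file. A service that was supposed to stay on 127.0.0.1 but also answers on the public address is precisely the surprise you're looking for.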
If you do, then surely you must have made an educated, conscious decision and made yourself aware of all the consequences? It's not like the GPLed code is not labeled as such - more often than not it's almost over-labeled, with a full notice in every file. What's the problem, again? Don't like it - don't use it. How could it be any simpler?
Wouldn't they get more if you turned off your computer and donated the money you'd otherwise have to spend on the electricity bill?
This. Especially the layers.
but I know they can do better than a 386 in 22 fkin years
How do you know? Are you a rocket scientist with a second degree in EE? Dare to show the diplomas?
Could be configuraion syntax. Simple, readable, concise, possible to understand even in very complicated scripts without looking at the manual. I'm using iptables routinely, but I liked pf for its simplicity.
In regard to the placement of the business logic, I think the truth - as always - lies somewhere in the middle and contains a disclaimer to the effect of "not applicable when implemented by morons" and "common sense not included".
Sometimes, the application is all about data and most aspects of the business logic can be summed up as maintaining data integrity. For example, a typical bulletin board updates various counters and lists every time someone posts something - the post counter for the user, the topic, the forum and the category, information about the latest posts for those entities, maybe lists of unread posts for the users (depends on whether those lists are inclusive or exclusive) and loads of other things. Most of those can be moved into triggers, maybe even views with rule-based updates of the actual data, with a great benefit for interoperability. In fact, I did co-author one such bulletin board and the decision to move as much business logic to the database as sensibly possible (and not a thing more) turned out to be correct. Right now, we're implementing some auxiliary functionality in Python (the original code is PHP) and it couldn't be easier. Just about everything is selected using views (which will get a dramatic speedup once I move everything to Pg9, which supports join removal - not that it's slow right now, just noticeable on the server load graphs) and most complex updates are performed using triggers and rules, so the frontend code can be kept simple, and even major updates to the actual business logic and data structure can be contained in the database and completely invisible to the outside world.
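A minimal sketch of the counters-in-triggers idea, in Python. SQLite (from the standard library) is only a stand-in here so that the example runs anywhere; the real thing is on PostgreSQL, and the table, trigger and view names below are all invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT,
                     post_count INTEGER NOT NULL DEFAULT 0);
CREATE TABLE topics (id INTEGER PRIMARY KEY, title TEXT,
                     post_count INTEGER NOT NULL DEFAULT 0,
                     last_post_id INTEGER);
CREATE TABLE posts  (id INTEGER PRIMARY KEY,
                     user_id  INTEGER NOT NULL REFERENCES users(id),
                     topic_id INTEGER NOT NULL REFERENCES topics(id),
                     body TEXT NOT NULL);

-- The database keeps the counters consistent, not the frontend.
CREATE TRIGGER posts_after_insert AFTER INSERT ON posts
BEGIN
    UPDATE users  SET post_count = post_count + 1 WHERE id = NEW.user_id;
    UPDATE topics SET post_count = post_count + 1,
                      last_post_id = NEW.id
                  WHERE id = NEW.topic_id;
END;

-- Frontends in any language just select from the view.
CREATE VIEW topic_summary AS
SELECT t.id, t.title, t.post_count, p.body AS last_post
FROM topics t LEFT JOIN posts p ON p.id = t.last_post_id;
""")

db.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
db.execute("INSERT INTO topics (id, title) VALUES (1, 'Hello')")
db.execute("INSERT INTO posts (user_id, topic_id, body) VALUES (1, 1, 'first')")
db.execute("INSERT INTO posts (user_id, topic_id, body) VALUES (1, 1, 'second')")

summary = db.execute(
    "SELECT title, post_count, last_post FROM topic_summary").fetchone()
user_posts = db.execute(
    "SELECT post_count FROM users WHERE id = 1").fetchone()[0]
```

The frontend only ever does a plain INSERT and a plain SELECT, so a second frontend in another language gets the same bookkeeping for free.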
There were, however, things that I kept out of the database on purpose, generally due to the poor state of interoperability between database engines and programming languages (which is kind of strange, considering that relational databases are such a damn old and mature technology), primarily when it comes to errors. The most prominent is access control logic. I could engineer the database so that it just refused to accept a post in a forum where the author had no write access (while retaining the possibility of revoking the access he had and keeping the posts he wrote to date without any data integrity problems, of course), but I couldn't think of any elegant, sensible and clean way of turning complex database-generated errors, such as those thrown by constraints or manually in the pl/SQL code, into appropriate exceptions in the application and keeping everything in line with transaction handling at the same time. I tried and what I came up with was a sorry kludge, so I just gave up and kept those bits outside of the database. Of course, that meant I had to reimplement some of this in Python and now I have to remember to update two codebases, but the code is simple enough that I can put up with this.
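For illustration only, this is roughly the shape such an error mapping takes in Python. It is a sketch under heavy assumptions: SQLite stands in for the real database, the `no_write_access` tag, the `AccessDenied` class and the `transaction` helper are all made up, and matching on the error message text is exactly the kind of kludge described above. It handles one trivial case and does nothing for complex constraint errors.

```python
import sqlite3
from contextlib import contextmanager


class AccessDenied(Exception):
    """Application-level error for a write the database refused."""


# Hypothetical error tags raised inside the database, mapped to exceptions.
ERROR_MAP = {"no_write_access": AccessDenied}


@contextmanager
def transaction(db):
    """Commit on success; roll back and translate known database errors."""
    try:
        yield db
        db.commit()
    except sqlite3.DatabaseError as exc:
        db.rollback()
        app_exc = ERROR_MAP.get(str(exc))
        if app_exc is None:
            raise  # unknown database error: pass it through unchanged
        raise app_exc(str(exc)) from exc
    except BaseException:
        db.rollback()
        raise


db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE acl   (user_id INTEGER, forum_id INTEGER,
                    can_write INTEGER NOT NULL);
CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER,
                    forum_id INTEGER, body TEXT);

-- The database itself refuses posts from users without write access.
CREATE TRIGGER posts_check_acl BEFORE INSERT ON posts
WHEN NOT EXISTS (SELECT 1 FROM acl
                 WHERE user_id = NEW.user_id
                   AND forum_id = NEW.forum_id
                   AND can_write = 1)
BEGIN
    SELECT RAISE(ABORT, 'no_write_access');
END;

INSERT INTO acl VALUES (1, 1, 1), (2, 1, 0);
""")


def try_post(user_id):
    try:
        with transaction(db):
            db.execute(
                "INSERT INTO posts (user_id, forum_id, body) VALUES (?, 1, 'hi')",
                (user_id,))
        return "ok"
    except AccessDenied:
        return "denied"


results = [try_post(1), try_post(2)]
post_count = db.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
```

Revoking access later only touches the `acl` table, so the posts already written stay where they are.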
Of course, everything's documented, both in text and diagrams, just in case someone inherits the code down the road. It's been quite helpful right now, too - there are parts that Just Work and that I last looked at a few years ago, so the code is kind of new and unknown even to me when changes are to be made.
In short, I don't think this is about some holy rule that is unconditionally true and you have to "get it" or else. IMHO it's about common sense, practical knowledge in neighboring fields (you're a DBA and you don't know how a CPU cache, a memory controller or a physical hard drive works? Well, you're not going to be a good DBA for high-performance systems, regardless of your knowledge of databases) and experience. All of those together, not just one or two. Only then do rules become just guidelines to step over when appropriate.
Besides, with Pg9 and its streaming replication and hot standby mode, scalability just shot up through the roof and stopped at about the actual limits of the hardware, where contraptions such as the battery-backed RAM "disk" for WAL store I mentioned earlier come into play to push it even further.
An idiot prepared the server hardware requirements, then. A simple PCI-E card with a few RAM slots, a Li-ion battery pack and a faux SATA controller (they are available from a few vendors and cost a few hundred bucks apiece, pretty cheap for such a thing), configured as the WAL store - the database had a write-ahead log, right? - would increase the capacity of a single such server at least tenfold.
The problem wasn't that the database was used in a wrong way. Rather, it was the lack of a systems integration person on the team, someone who grasps all the general aspects of the deployment from the frontend down to the bare metal and can identify such problems and find remedies right when they occur.
There is. Sample at twice the highest frequency present in the recorded signal (the Nyquist rate) and with a sample size that gives a resolution a tad finer than what the recording equipment is capable of registering - measurement error formulas from the theory of metrology are your friends, coefficients come from the instruction manual for the microphone. You do know that an analog microphone doesn't have an infinite recording quality, right?
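A back-of-the-envelope version of that calculation in Python. Note the assumptions: instead of proper metrology error formulas it uses the textbook rule that an ideal N-bit quantizer gives about 6.02 * N + 1.76 dB of signal-to-noise ratio for a full-scale sine wave, and the 20 kHz and 96 dB figures are made-up example inputs, not data for any particular microphone.

```python
import math


def sampling_parameters(max_signal_hz: float, dynamic_range_db: float):
    """Return (minimum sample rate in Hz, minimum bits per sample)."""
    # Sampling theorem: at least twice the highest frequency in the signal.
    sample_rate_hz = 2.0 * max_signal_hz
    # Ideal quantizer: SNR ~ 6.02 * bits + 1.76 dB, solved for bits
    # and rounded up to a whole number of bits.
    bits = math.ceil((dynamic_range_db - 1.76) / 6.02)
    return sample_rate_hz, bits


rate, bits = sampling_parameters(20_000.0, 96.0)
```

With those example inputs this comes out at 40 kHz and 16 bits. Anything beyond that only records the equipment's own noise in more detail.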
netstat -lpn seems simple enough. I tend to run it every time I change something in a configuration file of a network-enabled service, just to be sure. It would be irresponsible to do otherwise.
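If you want that check scripted, something along these lines would do. It is a sketch in Python that only parses text: the `SAMPLE` output below is made up for the example (the column layout is the usual Linux net-tools one) and `world_facing` is a hypothetical helper name.

```python
# Made-up sample of `netstat -lpn` output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      812/sshd
tcp        0      0 127.0.0.1:5432          0.0.0.0:*               LISTEN      901/postgres
tcp6       0      0 :::80                   :::*                    LISTEN      1020/nginx
"""


def world_facing(netstat_output):
    """List (proto, port, program) for sockets bound to all interfaces."""
    exposed = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and the UNIX domain socket section.
        if not fields or not fields[0].startswith(("tcp", "udp")):
            continue
        address, port = fields[3].rsplit(":", 1)
        if address in ("0.0.0.0", "::"):
            exposed.append((fields[0], int(port), fields[-1]))
    return exposed


exposed = world_facing(SAMPLE)
```

In the sample, PostgreSQL is bound to 127.0.0.1 only, so it does not show up in the list, while sshd and nginx do.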
If you do, then surely you must have made an educated, conscious decision and made yourself aware of all the consequences? It's not like the GPLed code is not labeled as such - more often than not it's almost over-labeled, with a full notice in every file. What's the problem, again? Don't like it - don't use it. How could it be any simpler?