Alternatively, Google for keywords like "rack mount", storage, scsi and ide, and you'll find stuff like this, this, this and these. Terribly descriptive, I know ;)
Those are all links to rack mount storage (2U to 4U) that present a SCSI interface to a host, and provide managed RAID over 12 to 16 IDE drives. Most include double- or triple-redundant power, dual host controllers, hot standby with automatic rebuild, etc.
So buy yourself a 44U 19" rack, wired for power, with a UPS at the bottom, and properly ventilated. Add two 4U PCs with SCSI adapters and gigabit ethernet, and 9 rack mount storage units, for a total of 144 drives. Four such enclosures should take care of 70TB of storage.
With some careful planning, fibre and clustering, you could have two datacentres a couple of hundred meters apart, each with three enclosures, and a single server presenting a view of one virtual 80TB drive (assuming 200GB IDE drives in the system). Clustering allows for hot failover to the other datacentre if this master server fails. The master servers can control replication between the data centres. Not quite offsite, but a whole lot more fault tolerant.
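As a rough check on those figures, here is a back-of-envelope sketch. It only computes raw capacity, before RAID parity and hot spares. The 120 GB drive size for the four-rack case is my own back-calculation (the post does not state it), and treating the second datacentre as a full mirror of the first is likewise an assumption.

```python
# Back-of-envelope raw capacity for the rack layout described above.
DRIVES_PER_UNIT = 16   # upper end of the "12 to 16 IDE drives" range
UNITS_PER_RACK = 9     # rack mount storage units per 44U rack

def raw_capacity_tb(racks, drive_gb):
    """Raw capacity in decimal terabytes, before RAID overhead."""
    drives = racks * UNITS_PER_RACK * DRIVES_PER_UNIT
    return drives * drive_gb / 1000

drives_per_rack = UNITS_PER_RACK * DRIVES_PER_UNIT

# ASSUMPTION: 120 GB drives for the four-rack figure (not stated in the post).
four_racks_tb = raw_capacity_tb(4, 120)

# Two sites of three racks each, 200 GB drives as stated in the post.
six_racks_tb = raw_capacity_tb(6, 200)
# ASSUMPTION: one site is a full replica of the other, so half is usable.
mirrored_tb = six_racks_tb / 2

print(drives_per_rack, four_racks_tb, six_racks_tb, mirrored_tb)
```

Under these assumptions the four-rack case lands just under 70 TB raw, and the mirrored two-site case leaves some headroom above 80 TB for RAID parity and spares.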
Ah, telecoms :) Is this the industry/application in question, or just hypothetical? There was mention of throughput (indirectly) and availability, but not of latency in the original question. Also there was mention of queuing queries to a back-end database... this doesn't sound like a minimal-latency scenario.
Anyway, the technologies you mention are not likely to be acceptable in such a scenario -- but MOM is quite likely to be appropriate. In fact many cellular services are based on MOM (conceptually, although they may not use commercial MOM products), as you receive requests and send responses based on discrete messages from/to the handset.
Re:OOP Spaghetti and Object Taxonomies, on The Post-OOP Paradigm · Score: 2, Insightful
Correct answer: #4: There is no absolute taxonomy. Every class hierarchy must be designed with due consideration for the application, that is, the manner in which it is to be used. Speculating about possible representations is meaningless without such a context.
You haven't given any detail about the nature of the application. You also appear more concerned with achieving high performance than high availability (which you only mention in the title). If this is such a big application why are you even talking about socket connections?
I must assume that you are developing an enterprise application, given your performance and availability needs. Contemporary systems of this nature fall loosely into one of two categories: web technology based, or not.
If you're basing your application on web technology, get someone with appropriate skills (consultant, contract or permanent staff). There is firewall, routing and load balancing hardware available to deal with redundancy and hot failover; leaving you with a farm of application web servers talking to a high availability database (which you can set up as a cluster system or on redundant hardware like Stratus).
If you're not using web technology, then you should be looking at an alternative enterprise technology, not rolling your own and asking about sockets. DCOM, .NET, Java/RMI, EJB, CORBA and MOM are your primary options. Of those there is an increasing leaning towards MOM (Message Oriented Middleware) in enterprise systems, as it offers scalability and ease of integration that the other technologies don't.
So investigate appropriate middleware, including the fault tolerant options that are offered. IBM's WebSphere MQ for example has failover support, and MSMQ can be run as part of a cluster.
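To make the message-oriented idea concrete, here is a minimal in-process sketch using Python's standard `queue` and `threading` modules. It is only a stand-in: it does not use the WebSphere MQ or MSMQ APIs, and the message fields (`id`, `payload`, `result`) are hypothetical. The point is that the client and the worker share nothing but discrete messages on queues.

```python
import queue
import threading

requests = queue.Queue()    # stand-in for a middleware request queue
responses = queue.Queue()   # stand-in for the reply queue

def worker():
    """Consume request messages until a None sentinel arrives."""
    while True:
        msg = requests.get()
        if msg is None:
            break
        # The "work" here is trivial; a real consumer might query a database.
        responses.put({"id": msg["id"], "result": msg["payload"].upper()})

consumer = threading.Thread(target=worker)
consumer.start()

# The client only ever talks to the queue, never to the worker directly.
for i, text in enumerate(["ping", "status"]):
    requests.put({"id": i, "payload": text})
requests.put(None)          # tell the worker to stop
consumer.join()

results = [responses.get() for _ in range(2)]
print(results)
```

Because the two sides are decoupled by the queue, adding more workers (or moving them to another machine, with real middleware) does not change the client code.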
You also need to ask yourself why such a high load is required. Do you have a huge number of clients? Does each client send/request a large amount of data? How can you restructure the system to reduce the number and/or size of requests/responses, or at least distribute them so that you don't have a single choke-point?
I can't decide whether you're asking this question because you don't really have the experience necessary to design a system of this nature, or because you have enough experience to be comfortable about asking others. Either way, you're probably best off identifying areas of possible technical deficiency, and hiring a domain expert to look at the issues.
I am far more interested in professional behaviour than friendship. Leave your emotions, personal problems, politics, ego and anything else I may not like about you and you may not like about me at home. I don't want to have the opportunity to dislike someone because of their interests, views or behaviour - it makes for trouble in the workplace. And this is a big danger in teambuilding, and why teambuilding often does not work well, especially in the tech community in which you find heroes.
I consider myself Pretty Damn Good. That doesn't mean I'm above asking colleagues for help if I think they have experience with a particular problem and can help me solve it faster than I can alone. It also doesn't mean I look down on people who ask me. Being a professional means behaving in a manner which looks out for the interests of the job as well; so learning and teaching are all part of life.
This is actually quite good advice. There are a lot of systems architects that advocate role playing as a means to test your architecture and design. Not AD&D and family, mind you -- each team member is responsible for one or more components of the system (the definition of component depends on the level at which you are testing the system). Then the "end user" initiates an action by asking the UI component for something, and (s)he must "ask" (send messages to) other components, until you demonstrate complete end to end functionality of that request, including failure conditions.
All designers go through this process to some extent in analysing their design. The role playing approach gets more minds involved, and it's easy to learn the problems to look out for through collaborative experience.
An interesting aside: around 1995 I was running a Cyrix 486 DLC (40MHz) and experimenting with Linux, but primarily running DOS.
A friend came over, and we played Doom for a while. He mentioned that it was somewhat slower than on his computer, and I was a little peeved, pointing out that he DID have a 66MHz DX2 with loads more memory. He admitted that, given the difference in machines, the speed wasn't at all bad.
It was only a quarter hour later that I began to realise that Doom was going a little slower than even I was used to. So I figured I would quit, check the config.sys and autoexec.bat and reboot to see if there was any change.
But on quitting I was surprised to find X Windows! I had completely forgotten that I had been running XDoom. But that's not the good part. I had started XDoom as a way to entertain myself while waiting for the kernel to compile... and it was still busy!
Doom on X running at near-DOS framerates... with a kernel compile in the background. Gotta love it.
I'm sorry, but the "Planet Twylite" GPL on www.gnu.org provides that a person can only use GPL'd software in terms of the GPL and undertakes "to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code" (assuming that person had distributed the software with or without modification in source or binary format).
So let's see... original creator establishes Copyright over software and licenses it under the GPL (making that person the licensor). Another party makes changes and distributes the modified software. That party is required under the license to provide to any "third party" the full source code. The original licensor is, relative to the party that made the changes, a "third party". Hence if the software is published (distributed), the licensor has a right to receive the source code modifications.
So it comes down to a semantic argument over the word "submission". In the Microsoft case the party modifying the code is expected to start the process of transferring the modifications back to them. In the GPL case the original licensor must make the first move to obtain the modifications. In other words, you can't refuse to make the modifications available to the original licensor.
While you are technically correct that the "Earth" GPL does not require submission of code back to the licensor, it requires that any third party can obtain that code so long as you have distributed ANY binary or source.
You may wish to read my statement in the context of the entire post, rather than on its own. Use of pointer arithmetic makes it impossible for a tool to analyse a program and determine when it is behaving incorrectly. It is thus impossible for a compiler to automatically insert protective code that will prevent the program from being coerced into behaving badly.
At best you can annotate regions of memory to prevent overflowing them -- but without stricter typing you cannot determine what the correct contents of those regions are.
A trivial example is the BSTR type, which is a NUL terminated wide character string prefixed with a length. The minimum allocatable memory for this type is the length of the string plus 6 bytes (4 for the length, 2 for the wide NUL). It is therefore entirely possible, even with memory range checking, to assign to the BSTR a pointer that points to the length (rather than the first character) or to the terminating NUL. Both are valid values for the pointer, but make the value of the BSTR invalid.
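The layout can be sketched with Python's `struct` module. This is an illustration of the byte layout only, not the Windows API; the helper names (`make_bstr`, `in_allocation`, `claimed_length`) are made up for the example.

```python
import struct

def make_bstr(text):
    """Build the raw allocation: 4-byte length, UTF-16LE data, wide NUL."""
    data = text.encode("utf-16-le")
    return struct.pack("<I", len(data)) + data + b"\x00\x00"

def in_allocation(block, ptr):
    """All a memory range checker can verify: ptr lies inside the block."""
    return 0 <= ptr < len(block)

def claimed_length(block, ptr):
    """Byte length a BSTR consumer would read from the 4 bytes before ptr."""
    (nbytes,) = struct.unpack_from("<I", block, ptr - 4)
    return nbytes

block = make_bstr("Hi")   # 4 (length) + 4 (two wide chars) + 2 (NUL) bytes
candidates = {"length prefix": 0, "first character": 4, "terminating NUL": 8}

for name, ptr in candidates.items():
    # Every candidate passes the range check, yet only offset 4 is a valid BSTR.
    print(name, ptr, in_allocation(block, ptr))

print(claimed_length(block, 4))   # the real byte length of the string
print(claimed_length(block, 8))   # character data misread as a length
```

All three pointers sit inside the allocation, so a range check accepts them. Only the pointer to the first character gives a sane length; the pointer to the NUL makes the consumer read the string's own characters as its length.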
Other examples include buffers that begin with offset tables -- while it may be reasonable to assume that a compiler could have intimate knowledge of a significant type (on Windows) such as BSTR, it is unreasonable to assume that a compiler can have knowledge of every form of UDT that may use offsets. Unless you can force some sort of annotation of every value, and ensure it is not treated as some other type (which completely destroys the use of pointers), you can never protect your memory properly.
I am in no way arguing that C/C++ is being used incorrectly, or that it could be used correctly, or that it is confusing.
The issues we are contending are based on the understandings of two concepts: what is Java, and what is a VM? These sound like easy questions to answer, but they are not.
Java does not refer to the grammar, nor does it refer to the VM. Java refers to a platform. The grammar, execution machine and a basic set of libraries are all included in the specification, and cannot be meaningfully separated without altering the behaviour of the thing we call Java. This is perhaps most clear when you consider that there is a Java assembler, and that it is possible using that assembler to generate byte code that the VM will not execute, as it violates data or logic safety.
The concept of a virtual machine, on the other hand, is also rather blurred. In its most obvious form it is a layer of emulation that presents a complete environment to whatever runs on top of or within it. But almost all modern languages have some level of runtime environment, without which they cannot function. Any interpreted language must rely on some emulation layer, or be compiled and no longer interpreted.
So when we look at the Java specification and say that it provides for array bounds checks and garbage collection, what is really providing this? It is a combination of the grammar, which facilitates declaring variables in sufficient detail, and the runtime environment, which dynamically checks that the constraints are obeyed.
When compiled to "native code", the scenario is no different. In order to provide array bounds checking and garbage collection, some runtime is required. Whether that runtime is a complete emulation layer, or a set of library routines that are called implicitly, it is still there, moderating and monitoring the behaviour of the software.
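As an analogy only (this is Python, not Java, and `CheckedArray` is a made-up class), the "set of library routines that are called implicitly" can be sketched like this: the program writes an ordinary indexed store, and the runtime routine behind it decides whether the store is allowed.

```python
class CheckedArray:
    """Fixed-size array whose every access goes through a runtime check."""

    def __init__(self, size):
        self._cells = [0] * size

    def _check(self, index):
        # The check lives in the runtime routine, not in the calling code.
        if not 0 <= index < len(self._cells):
            raise IndexError(f"index {index} outside 0..{len(self._cells) - 1}")

    def __setitem__(self, index, value):
        self._check(index)
        self._cells[index] = value

    def __getitem__(self, index):
        self._check(index)
        return self._cells[index]

a = CheckedArray(4)
a[3] = 7                  # looks like a plain store, but is mediated
try:
    a[4] = 9              # one past the end: refused by the runtime
    overflowed = True
except IndexError:
    overflowed = False
print(a[3], overflowed)
```

The calling code never performs the store itself, which is the property that matters: there is no way to skip the check short of bypassing the runtime.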
So, getting back to the original discussion. Java the grammar and Java the base class libraries make requirements of their execution environment. These can only be satisfied by an execution machine (real or virtual) or a "runtime".
Whether or not we choose to call this runtime a virtual machine does not change the fact that it exists and it presents to the code that uses it an environment that is different from the underlying reality. If you really want to stir the pot, throw in the fact that a virtual machine must be implemented in native code on a real machine (or in another virtual environment), or the idea that just-in-time compiling in a virtual machine compiles on-the-fly to native code. Does this mean you are no longer running in a virtual environment? To go really over the top, the concept of protected memory and privileged instructions - as used on x86 architecture to enable safe multitasking in operating systems - is often referred to as a virtual machine, and the "native code" doesn't really have full access to the hardware.
So does all of this matter to security? Well, it can. Any code that you run is constrained by the environment in which it is executing. For "native code" that environment is a real machine in which certain privileges have been restricted by the OS in conjunction with the CPU or other hardware. For code in a virtual machine, the VM defines the limits of that environment.
So, taking a program written in any language and compiling it to run as native code, you get instructions that are constrained only by the permissions laid down by the operating system. So, on an x86 architecture where the OS does not force a non-executable heap and/or stack, this makes it possible to "execute data". Targeting a VM such as the Java VM with the same program, and assuming it were possible to inject arbitrary data into the program, it would still be impossible to execute that data, as the VM is aware of what is code and what is data. Even if you could inject and execute arbitrary instructions on the Java VM, you could not automatically gain access to all data within that VM or all privileges the VM enjoys in its own execution context, as the VM enforces object-level access controls.
This is complete bullshit. The Java VM is exactly what makes the language safe. Even if you use Java assembler you can't overflow an array boundary, because the VM knows what an array is, knows its limits, and executes the instruction to set an element of the array on your behalf. Most of the RuntimeExceptions and all of the Errors in Java are thrown directly from the VM, not from code libraries.
This means that you can take an arbitrary, untrusted Java executable and run it, and it will not be subject to traditional buffer overflow vulnerabilities.
With a non-interpreted language like C, you can only get this level of trust if you compile the binary yourself. Even then, you can only fully trust C if there is no use of pointer arithmetic, and all pointers are to data of a defined size. Without this knowledge it is not possible to generate code that can check (for example) array bounds.
For example, char* c = (char*)malloc(1000); cannot be statically checked by the compiler. It is possible to develop a compiler that knows of every mechanism that can be used to allocate memory, and annotate a variable with the memory allocated to it, but there are numerous problems with this approach.
It is certainly quite possible to take any sufficiently constrained language and compile it to native code such that it is not susceptible to buffer overflows. But this is quite different from a language-aware VM (like that of Java) where even a malicious binary cannot cause a buffer overflow.
As for C programmers having to live with the various problems you describe -- there are many techniques for avoiding these problems. It is possible to use garbage collection in C++, and get away from direct memory management. Using sized data instead of arbitrary mallocs, and creating classes to represent arrays allows you to overload operators to behave in a safe fashion. The simplest of attention to initialising variables instead of making assumptions because you can squeeze out an extra instruction cycle addresses many string problems.
What you are asking for cannot be done. Worse, it is a dangerous route to go down, because it gives an illusion of safety.
From a VM level you cannot know what a program is up to unless that program obeys certain rules. When dealing with x86 architecture (specifically), those rules are not sufficiently verbose to allow for the sort of checking you are after.
While a VM could intercept all stack access and prevent modification to the return address (preventing stack smashing attacks), it cannot tell if a malicious attack has caused values within a valid range in the stack or heap to be altered in a way that is not supposed to happen. Thus a VM approach would suffer all of the deficiencies of StackGuard.
So while you may be able to protect against a classic buffer overflow attack (overwrite the return address on the stack and jump to your own code), there is no guarantee against arbitrary modification of the behaviour of the software by adjusting variables.
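A toy simulation makes the gap visible. The frame layout, field names and values below are entirely hypothetical (a `bytearray` standing in for a stack frame, not real x86 memory): an 8-byte name buffer, a 1-byte `is_admin` flag, a 4-byte canary and a 4-byte return address.

```python
# Simulated frame: [ name: 8 ][ is_admin: 1 ][ canary: 4 ][ return address: 4 ]
NAME, FLAG, CANARY, RET = 0, 8, 9, 13
FRAME_SIZE = 17

def unsafe_copy(frame, data):
    """Like strcpy: copies into the name field without checking its 8-byte size."""
    frame[NAME:NAME + len(data)] = data

frame = bytearray(FRAME_SIZE)
frame[CANARY:CANARY + 4] = b"\xde\xad\xbe\xef"
frame[RET:RET + 4] = (0x08048000).to_bytes(4, "little")
guarded = bytes(frame[CANARY:])          # what a stack guard would verify

# Nine bytes into an eight-byte field: one byte spills into is_admin.
unsafe_copy(frame, b"AAAAAAAA\x01")

guard_passes = bytes(frame[CANARY:]) == guarded
is_admin = frame[FLAG]
print(guard_passes, is_admin)
```

The canary and return address are untouched, so a guard of this kind reports nothing wrong, yet the program's behaviour has been changed by a one-byte overflow into a neighbouring variable.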
The dangerous part is that you are trying to partition security and look at one aspect of it in isolation. This is shortsighted.
Using permissions, a binary running in a user account is less of a threat to overall system security than a binary running as root -- irrespective of whether there are exploitable vulnerabilities in that binary or not.
Tools like the ptrace-derived sandbox further improve this situation -- an arbitrary binary could be denied access to the file and IO functions in the kernel, preventing a malicious intruder from reading or modifying the hard drive. Or the open functions could be filtered by directory. Network access could be restricted, denying the opportunity of using the vulnerability as a springboard to probe behind a firewall.
I don't think this article is a troll. I think it's just sadly misguided. I appreciate much of the sentiment though: for the average computer user (who is not technically inclined) the variety of choices offered by OSS is intimidating, and the perceived (or actual) poor quality (or state of completion) of much of this software is affecting efforts to bring OSS to the masses.
This is, however, not the fault of developers. OSS is doing what it was always intended to do, and doing it better than ever. Developers are encouraged to experiment, build and contribute, in whatever way they like. The fault lies in the presentation of OSS software by distributors and major hosting sites, for example RedHat and SourceForge.
RedHat comes on 4 CDs. A first-time user is given more than ample opportunity to shoot themselves in the foot with options, and to choose to use software that is "sub-standard" by general commercial standards. This makes the software look bad, and that reflects on OSS as a whole.
SourceForge and FreshMeat, in searching and browsing, do not by default filter out pre-release software. Worse (IMHO) they do not have a facility to rate software (as is common on shareware sites). That makes it difficult to choose a stable, functional and quality piece of software for a particular purpose. The filtering mechanisms (other than rating) exist, but are not newbie-proof by default.
The message here is that OSS needs to present a user- or market-friendly outward appearance, instead of defaulting to hard-core developer modes.
To address two particular issues in the article to which I take exception:
Why gTk? Qt is older and more complete, Wx beats Qt in maturity and comes close to matching it in functionality. Wx also supports many more platforms than gTk, making it far more suitable for cross-platform development -- something OSS needs to support if its platforms are to attract commercial attention. The Wx license is also far more friendly to commercial development than gTk (or Qt).
I am making the implicit statement that commercial == proprietary, because this is how most of the world operates, and that isn't going to change any time soon. Sure, there is software that doesn't follow this model. But not a lot of it.
Next, the idea that all editors should support the OpenOffice format. Besides the fact that many of these editors predate OpenOffice, again I have the question: why? What makes the OpenOffice format superior? Is it because it is based on a suckydataencodingfailure called XML? Why not use a mature and powerful DTP standard like TeX?
Years of experience have shown that the golden goal of application interoperability is just not going to happen. Innovation demands going beyond standards and what has been done before. This is the only way that software -- OSS or proprietary -- has been able to progress over time. Linux's attraction compared to traditional Unix platforms comes from its differences.
What you've been asked (volunteered) to do is a risk analysis. This is a whole lot more than being a l33t admin of a high-availability site, which is what many Slashdotters seem to think it is. You've hit onto some of the non-technical risks (your CTO example), but to address this properly you need to concentrate on identifying risks, and how to handle them.
The first thing to realise is that you can't have a preformed contingency plan for everything. What you can do is identify every point of risk, weight it according to likelihood and severity, and develop plans for the "likely worst cases" that you discover. The rest of the risk is a business risk, that is, you insure yourself against it and deal with it if and when it happens.
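A minimal sketch of that weighting, with everything in it hypothetical: the risk names, the 1-5 scales, and the planning threshold are placeholders to be replaced by your own analysis. Exposure is taken as likelihood times severity, which is one common convention rather than the only one.

```python
from dataclasses import dataclass

@dataclass
class Risk:
    name: str
    likelihood: int   # 1 (rare) .. 5 (expected), hypothetical scale
    severity: int     # 1 (nuisance) .. 5 (business-ending), hypothetical scale

    @property
    def exposure(self):
        return self.likelihood * self.severity

# Hypothetical register entries, for illustration only.
register = [
    Risk("Datacentre power loss", 1, 5),
    Risk("Disk failure in primary database", 4, 4),
    Risk("CTO unavailable during an incident", 2, 3),
]

PLAN_THRESHOLD = 10   # ASSUMPTION: exposure at or above this gets a written plan

ranked = sorted(register, key=lambda r: r.exposure, reverse=True)
planned = [r.name for r in ranked if r.exposure >= PLAN_THRESHOLD]
insured = [r.name for r in ranked if r.exposure < PLAN_THRESHOLD]

print(planned)
print(insured)
```

Whatever falls below the threshold is not ignored; it is the business risk that you insure against and handle if and when it happens.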
You should also bear in mind that, from a technical viewpoint, there are no absolute guarantees. Almost all high-availability strategies protect against a single point of failure, but this isn't enough. What if you have multiple failures? How quickly can you detect and respond to a failure? How long can you suffer a complete outage? (This is really important to know, and "we can't" is not an acceptable answer.) Uptime costs money; calculate the point of balance.
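One way to frame the point of balance, as a sketch: convert an availability target into hours of downtime per year, then compare the downtime an upgrade saves against what it costs. The outage cost and upgrade cost below are invented figures, and `worth_buying` is a deliberately crude test.

```python
HOURS_PER_YEAR = 24 * 365

def downtime_hours(availability):
    """Expected hours of outage per year at a given availability (0..1)."""
    return HOURS_PER_YEAR * (1 - availability)

def worth_buying(extra_cost_per_year, hours_saved, outage_cost_per_hour):
    """Crude break-even test: does the avoided outage cost exceed the spend?"""
    return hours_saved * outage_cost_per_hour > extra_cost_per_year

# Hours saved by moving from "two nines" to "three nines".
hours_saved = downtime_hours(0.99) - downtime_hours(0.999)

# HYPOTHETICAL figures: the upgrade costs 50,000 a year.
pays_off_at_1000_per_hour = worth_buying(50_000, hours_saved, 1_000)
pays_off_at_500_per_hour = worth_buying(50_000, hours_saved, 500)

print(round(hours_saved, 2), pays_off_at_1000_per_hour, pays_off_at_500_per_hour)
```

The same upgrade is justified or not depending on what an hour of outage actually costs you, which is why that number has to be known before the technology is chosen.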
Ask Google about "organizational risk" - you'll find a lot of information about auditing risk that can put you on the right path.
Nice idea, but it won't work. You are up against two legal problems: an employee and an employer are separate people; and the employment relationship is a non-trivial contract.
You are right in saying that corporations are juristic persons and can own property. Many corporations own licenses for software, for example. But to allow employees to use those licenses, the license must be assigned to a specific employee.
This is still a legal minefield: some licenses have conditions regarding assignment to other persons, duration of assignment, or even assignment to an instrument (a particular computer). It is generally recognised that an employer has the right to assign property to an employee for use in the course of his/her job, and to reassign that property as dictated by operational requirements. This constitutes fair use on the part of the purchaser.
But fair use and assignment do not give the employer the right to allow unrestricted or simultaneous use of intellectual property that is not owned by the employer (i.e. IP that has been purchased, such as a book, CD, or software).
The same is true for members of corporations -- the relationship between the corporation and a member is still a relationship between different persons having legal status, so a right afforded to a corporation does not necessarily extend to a member of that corporation.
The second problem regards the incorporation or employment. To say the least, a click-through "I agree to be employed" is not sufficient. No court will find that the contract of employment is valid unless the employer is clearly employing, and the employee is being employed and remunerated. While the latter may be true in this case, the former is not satisfied. Similarly the incorporation of a company does not occur on an ad-hoc basis; there are legal procedures that must be followed to introduce members.
Even then, you have to contend with the fact that many people in IT have contracts that prevent them from having other jobs at the same time... not to mention that this practice is questionable under common law.
Any scheme of this nature ultimately comes down to being some sort of (non-sanctioned) library, and therefore illegal under the basic provisions of copyright. Even if you DID find a way to do it, the courts would almost certainly find that you are attempting to circumvent the law, and respond appropriately.
I agree wholeheartedly! In fact, it is existing government intervention that I see as the reason that it is so difficult for software engineering to establish itself as a recognised engineering profession.
In any free market economy, the answer to this problem is industry self-regulation. Trademark the "official" title, and have an industry body regulate its use. MS does this with its various "Microsoft Certified" programs.
The entire industry has a vested interest in upholding and advertising the quality of that title. If the regulatory body doesn't do its job properly, the profession falls into disrepute, and the public no longer trust it.
The benefit of this system is that it can work for ANY profession or title, and do so without denying anyone the right to practice without such title. So if you want to go to a programmer (code-monkey) to design your ERP system, that's your problem; if you want to consult a new-age spiritual healer (who can't use the title Medical Doctor), that's also your problem.
Clearly you know nothing about software engineering, or you would have concentrated on knowledge encompassing project management, best practices and quality assurance... in much the same manner as civil engineers don't lay bricks.
The Software Engineering Body of Knowledge is an international effort to codify the knowledge involved in software engineering, in the same way as other engineering disciplines do, with intent to standardise a SE qualification.
Maybe I've got the wrong idea from the interview, but what was discussed was rules, not metadata.
Business rules are a well known aspect of enterprise software development, especially in light of the many old(er) custom-built systems in which the rules were hard-coded. A business rule is "sales tax is 7%", or "customer pays a 1.5% surcharge if payment is more than 2 days late".
Metadata is a partner and also an opposite to a business rule. Metadata is quite simply "data about data". The fact that the value "7%" is "sales tax" is metadata; but the fact that the current value of the sales tax is 7% is not. The age-old concept of a "data dictionary" is an embodiment of what metadata is.
A rules engine is (rather simply) a powerful extension of the practice of declaring constants for significant literals (which are or could be subject to change); quite often one which allows runtime modification of the value rather than requiring a recompile. Rules engines also tend to provide mechanisms for evaluating compliance with the rule, or performing calculations based on rules.
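The distinction can be sketched in a few lines. `RuleBook` and `invoice_total` are made-up names, and applying the surcharge to the pre-tax amount is my assumption; the example rules themselves are the ones quoted above. The description attached to each rule is the metadata; the value is the rule.

```python
from decimal import Decimal

class RuleBook:
    """Named rule values that can be changed at runtime, with descriptions."""

    def __init__(self):
        self._values = {}
        self._descriptions = {}   # metadata: what each value means

    def define(self, name, value, description):
        self._values[name] = value
        self._descriptions[name] = description

    def set(self, name, value):
        if name not in self._values:
            raise KeyError(f"unknown rule: {name}")
        self._values[name] = value

    def get(self, name):
        return self._values[name]

    def describe(self, name):
        return self._descriptions[name]

rules = RuleBook()
rules.define("sales_tax_rate", Decimal("0.07"), "Sales tax")
rules.define("late_surcharge_rate", Decimal("0.015"), "Late payment surcharge")
rules.define("late_grace_days", 2, "Days before a payment counts as late")

def invoice_total(rules, amount, days_late):
    total = amount * (1 + rules.get("sales_tax_rate"))
    if days_late > rules.get("late_grace_days"):
        # ASSUMPTION: surcharge is charged on the pre-tax amount.
        total += amount * rules.get("late_surcharge_rate")
    return total

on_time = invoice_total(rules, Decimal("100"), 0)
late = invoice_total(rules, Decimal("100"), 3)

rules.set("sales_tax_rate", Decimal("0.08"))   # rule changes, no recompile
after_change = invoice_total(rules, Decimal("100"), 0)
print(on_time, late, after_change)
```

Changing the tax rate touches only the rule's value; the description ("Sales tax") and the code that applies it stay exactly as they were.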
XML falls far short of its goals. See my critique for a far more detailed analysis.
The short of it is that XML contains multiple redundant ways to store data, and implicit within XML is processing of data. This makes it an encoding/format that is prone to implementation errors and security concerns. It took years for the major vendors/creators of XML parsers to achieve interoperability.
A good grammar that meets the goals of XML should have a single, clear structure and be inert (no implicit processing).
The intent of XML is good; the execution is bad. XML could be greatly simplified without losing any of its power other than human readability, which is a goal of questionable virtue as it is.
I think there is a deeper problem being alluded to here, that of loss of intellectual property. Copyright, as is often pointed out, has two sides: the copyright owner gets to exercise control over their asset, but in the end that asset becomes public property.
It has long been law and/or practice in most countries that in order to publish a book (or any copyrightable material) a copy must be lodged with the state archive (in the US, the Library of Congress). In order to make a commercial gain off a work it usually requires publication, which means that most works are available in such libraries.
But the web changes that. Publication becomes a lot more informal, and there is no requirement or even encouragement to archive. How, in such a scenario, can we protect against publicly accessible information disappearing forever? This material has been published and, at some point, the copyright will expire; it should fall into the public domain. But it most likely won't: over time it will be taken away, and never seen again.
Consider the loss we would face if a valuable repository like Slashdot vanished. Deride it all you like - this is nevertheless a meeting place of (amongst others) some very experienced people with insightful comments, leading to a wealth of information gathered on topics that are discussed. It is not at all uncommon to find a Slashdot discussion when searching for technical information.
archive.org is a start in the process of archiving to prevent this sort of loss -- but how can we move to tackle the problem in a proactive manner?
Any book on IT project management that doesn't cover human resources and project-related procedures and policies isn't worth its weight in water. Many IT activities are (or should be) approached as projects, even upgrades or security or policy changes. For those that aren't (such as routine administration), HR management should be no different from management of other employees: define their job and expect performance.
The responsibilities of IT management should be in your job description. Failing that, you shouldn't have got into the position without having at least a passing knowledge of them. Your best source of information is in books on Information Systems, which will cover the necessity of IT to business, and the requirements to deliver IT/IS in a manner that supports business. Never forget that IT/IS is a support function.
Policies and procedures are perhaps the most difficult to tackle. It is important to remember that books only form guidelines - you must tailor the policies for your business. Policies often fail because someone got a list from somewhere and tried to implement it, without understanding the needs of their business.
It is especially important to know what your policies are intended to achieve. Are you trying to make a process faster? More formal and requiring independent approval? More secure? ISO-9000 compliant? Less legally problematic? Start with knowing the goal, then work back to the process.
A quick search on Google and Amazon produced a list of potential titles: "IT Manager's Handbook: Getting Your New Job Done", "The IT Survival Guide", "Foundations of Service Level Management", "IT Policies & Procedures: Tools & Techniques That Work (3rd Edition)", "Information Security Policies, Procedures, and Standards: Guidelines for Effective Information Security Management", "Best Practices in Policies and Procedures", "Establishing a System of Policies and Procedures".
I don't know anything about these books -- they just look like suitable starting points.
I am always amazed by the tendency of managers to think of IT people as "different" when it comes to management. This is simply not true. At a push one may point out that most IT people are highly qualified, and that highly qualified people are more mobile and less likely to accept poor working conditions. But the basic principles of HR management apply to ALL employees.
Alternatively, Google for keywords like "rack mount", storage, scsi and ide, and you'll find stuff like this, this, this and these. Terribly descriptive, I know ;)
Those are all links to rack mount storage (2U to 4U) that present a SCSI interface to a host, and provide managed RAID over 12 to 16 IDE drives. Most include double- or triple-redundant power, dual host controllers, hot standby with automatic rebuild, etc.
So buy yourself a 44U 19" rack, wired for power, with a UPS at the bottom, and properly ventilated. Add in two 4U PCs with SCSI adapters and gigabit ethernet, and nine rack mount storage units, for a total of 144 drives. Four such enclosures should take care of 70TB of storage.
With some careful planning, fibre and clustering, you could have two datacentres a couple of hundred meters apart, each with three enclosures, and a single server presenting a view of one virtual 80TB drive (assuming 200GB IDE drives in the system). Clustering allows for hot failover to the other datacentre if this master server fails. The master servers can control replication between the data centres. Not quite offsite, but a whole lot more fault tolerant.
Ah, telecoms :) Is this the industry/application in question, or just hypothetical? There was mention of throughput (indirectly) and availability, but not of latency in the original question. Also there was mention of queuing queries to a back-end database ... this doesn't sound like a minimal-latency scenario.
Anyway, the technologies you mention are not likely to be acceptable in such a scenario -- but MOM is quite likely to be appropriate. In fact many cellular services are based on MOM (conceptually, although they may not use commercial MOM products), as you receive requests and send responses based on discrete messages from/to the handset.
Correct answer: #4: There is no absolute taxonomy. Every class hierarchy must be designed with due consideration for the application, that is, the manner in which it is to be used. Speculating about possible representations is meaningless without such a context.
There is no silver bullet, but there are many golden hammers.
I'm glad someone said this :)
Unfortunately real-world practice is usually quite different, even when correctly modeled:
final String HAMMER_WORKER_NAME = "worker.hammer";
try {
    Worker w = Worker.getInstance(Messages.getInstance().getByName(HAMMER_WORKER_NAME));
    Hammer h = Hammer.getFromPool();
    Nail n = NailStore.getInstance().createNail();
    h.setNail(n);
    w.setHammer(h);
    Thread t = new Thread(w);
    t.start();
} catch (WorkerException e) {
} catch (HammerException e) {
} catch (NailException e) {
}
Sad but true.
You haven't given any detail about the nature of the application. You also appear more concerned with achieving high performance than high availability (which you only mention in the title). If this is such a big application why are you even talking about socket connections?
I must assume that you are developing an enterprise application, given your performance and availability needs. Contemporary systems of this nature fall loosely into one of two categories: web technology based, or not.
If you're basing your application on web technology, get someone with appropriate skills (consultant, contract or permanent staff). There is firewall, routing and load balancing hardware available to deal with redundancy and hot failover; leaving you with a farm of application web servers talking to a high availability database (which you can set up as a cluster system or on redundant hardware like Stratus).
If you're not using web technology, then you should be looking at an alternative enterprise technology, not rolling your own and asking about sockets. DCOM, .NET, Java/RMI, EJB, CORBA and MOM are your primary options. Of those there is an increasing leaning towards MOM (Message Oriented Middleware) in enterprise systems, as it offers scalability and ease of integration that the other technologies don't.
So investigate appropriate middleware, including the fault tolerant options that are offered. IBM's WebSphere MQ for example has failover support, and MSMQ can be run as part of a cluster.
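To give a feel for what "message oriented" means, here is a minimal in-process sketch in Java. A `BlockingQueue` stands in for the broker, and the names `MomSketch`, `handle` and `run` are made up for illustration; this is not how WebSphere MQ or MSMQ is actually programmed, and it has none of their persistence or failover.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class MomSketch {
    // Hypothetical request handler: turns one request message into one reply.
    static String handle(String request) {
        return "ACK:" + request;
    }

    // Producer and consumer never call each other directly; they only share the queue.
    static List<String> run(List<String> requests) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> replies = new ArrayList<>();

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < requests.size(); i++) {
                    replies.add(handle(queue.take())); // blocks until a message arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (String r : requests) {
                queue.put(r); // the producer moves on as soon as the message is queued
            }
            consumer.join(); // wait for the consumer before reading its replies
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while exchanging messages", e);
        }
        return replies;
    }
}
```

The point of the design is that either side can be replaced, scaled out or restarted without the other knowing, provided the queue (in real life, the middleware) survives.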
You also need to ask yourself why such a high load is required. Do you have a huge number of clients? Does each client send/request a large amount of data? How can you restructure the system to reduce the number and/or size of requests/responses, or at least distribute them so that you don't have a single choke-point?
I can't decide whether you're asking this question because you don't really have the experience necessary to design a system of this nature, or because you have enough experience to be comfortable about asking others. Either way, you're probably best off identifying areas of possible technical deficiency, and hiring a domain expert to look at the issues.
I am far more interested in professional behaviour than friendship. Leave your emotions, personal problems, politics, ego and anything else I may not like about you and you may not like about me at home. I don't want to have the opportunity to dislike someone because of their interests, views or behaviour - it makes for trouble in the workplace. And this is a big danger in teambuilding, and why teambuilding often does not work well, especially in the tech community in which you find heroes.
I consider myself Pretty Damn Good. That doesn't mean I'm above asking colleagues for help if I think they have experience with a particular problem and can help me solve it faster than I can alone. It also doesn't mean I look down on people who ask me. Being a professional means behaving in a manner which looks out for the interests of the job as well; so learning and teaching are all part of life.
This is actually quite good advice. There are a lot of systems architects that advocate role playing as a means to test your architecture and design. Not AD&D and family, mind you -- each team member is responsible for one or more components of the system (the definition of component depends on the level at which you are testing the system). Then the "end user" initiates an action by asking the UI component for something, and (s)he must "ask" (send messages to) other components, until you demonstrate complete end to end functionality of that request, including failure conditions.
All designers go through this process to some extent in analysing their design. The role playing approach gets more minds involved, and it's easy to learn the problems to look out for through collaborative experience.
An interesting aside: around 1995 I was running a Cyrix 486 DLC (40MHz) and experimenting with Linux, but primarily running DOS.
A friend came over, and we played Doom for a while. He mentioned that it was somewhat slower than on his computer, and I was a little peeved, pointing out that he DID have a 66MHz DX2 with loads more memory. He admitted that, given the difference in machines, the speed wasn't at all bad.
It was only a quarter hour later that I began to realise that Doom was going a little slower than even I was used to. So I figured I would quit, check the config.sys and autoexec.bat and reboot to see if there was any change.
But on quitting I was surprised to find X Windows! I had completely forgotten that I had been running XDoom. But that's not the good part. I had started XDoom as a way to entertain myself while waiting for the kernel to compile ... and it was still busy!
Doom on X running at near-DOS framerates ... with a kernel compile in the background. Gotta love it.
I'm sorry, but the "Planet Twylite" GPL on www.gnu.org provides that a person can only use GPL'd software in terms of the GPL and undertakes "to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code" (assuming that person had distributed the software with or without modification in source or binary format).
So let's see ... original creator establishes Copyright over software and licenses it under the GPL (making that person the licensor). Another party makes changes and distributes the modified software. That party is required under the license to provide to any "third party" the full source code. The original licensor is, relative to the party that made the changes, a "third party". Hence if the software is published (distributed), the licensor has a right to receive the source code modifications.
So it comes down to a semantic argument over the word "submission". In the Microsoft case the party modifying the code is expected to start the process of transferring the modifications back to them. In the GPL case the original licensor must make the first move to obtain the modifications. In other words, you can't refuse to make the modifications available to the original licensor.
While you are technically correct that the "Earth" GPL does not require submission of code back to the licensor, it does require that any third party can obtain that code so long as you have distributed ANY binary or source.
Hello? GPL? Are we on the same planet?
You may wish to read my statement in the context of the entire post, rather than on its own. Use of pointer arithmetic makes it impossible for a tool to analyse a program and determine when it is behaving incorrectly. It is thus impossible for a compiler to automatically insert protective code that will prevent the program from being coerced into behaving badly.
At best you can annotate regions of memory to prevent overflowing them -- but without stricter typing you cannot determine what the correct contents of those regions are.
A trivial example is the BSTR type, which is a NUL terminated wide character string prefixed with a length. The minimum allocatable memory for this type is the length of the string plus 6 bytes (4 for the length, 2 for the wide NUL). It is therefore entirely possible, even with memory range checking, to assign to the BSTR a pointer that points to the length (rather than the first character) or to the terminating NUL. Both are valid values for the pointer, but make the value of the BSTR invalid.
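A rough illustration of that point, written in Java purely for convenience: an offset into a `ByteBuffer` stands in for the pointer, the layout is only the one described above (4-byte length, wide characters, wide NUL), and `BstrSketch` with its methods is a made-up name, not any real API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class BstrSketch {
    // Lay out a string as described: 4-byte length, UTF-16 characters, 2-byte NUL.
    static ByteBuffer allocate(String s) {
        byte[] data = s.getBytes(StandardCharsets.UTF_16LE);
        ByteBuffer buf = ByteBuffer.allocate(4 + data.length + 2).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(data.length);  // length prefix, in bytes
        buf.put(data);            // the characters themselves
        buf.putShort((short) 0);  // wide NUL terminator
        return buf;
    }

    // All a memory range checker can tell you: is the "pointer" inside the allocation?
    static boolean inRange(ByteBuffer buf, int pointer) {
        return pointer >= 0 && pointer < buf.capacity();
    }

    // What code using the value does: read the length from the 4 bytes before the pointer.
    static int lengthAt(ByteBuffer buf, int pointer) {
        return buf.getInt(pointer - 4);
    }
}
```

For the string "hi" the correct pointer is offset 4 and the terminating NUL sits at offset 8. Both pass `inRange`, but `lengthAt` only returns the real length (4 bytes) for the first; for the second it reads character data as if it were a length. Telling them apart needs knowledge of the type, not of the memory range.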
Other examples include buffers that begin with offset tables -- while it may be reasonable to assume that a compiler could have intimate knowledge of a significant type (on Windows) such as BSTR, it is unreasonable to assume that a compiler can have knowledge of every form of UDT that may use offsets. Unless you can force some sort of annotation of every value, and ensure it is not treated as some other type (which completely destroys the use of pointers), you can never protect your memory properly.
I am in no way arguing that C/C++ is being used incorrectly, or that it could be used correctly, or that it is confusing.
The issues we are contending are based on the understandings of two concepts: what is Java, and what is a VM? These sound like easy questions to answer, but they are not.
Java does not refer to the grammar, nor does it refer to the VM. Java refers to a platform. The grammar, execution machine and a basic set of libraries are all included in the specification, and cannot be meaningfully separated without altering the behaviour of the thing we call Java. This is perhaps most clear when you consider that there is a Java assembler, and that it is possible using that assembler to generate byte code that the VM will not execute, as it violates data or logic safety.
The concept of a virtual machine, on the other hand, is also rather blurred. In its most obvious form it is a layer of emulation that presents a complete environment to whatever runs on top of or within it. But almost all modern languages have some level of runtime environment, without which they cannot function. Any interpreted language must rely on some emulation layer, or be compiled and no longer interpreted.
So when we look at the Java specification and say that it provides for array bounds checks and garbage collection, what is really providing this? It is a combination of the grammar, which facilitates declaring variables in sufficient detail, and the runtime environment, which dynamically checks that the constraints are obeyed.
When compiled to "native code", the scenario is no different. In order to provide array bounds checking and garbage collection, some runtime is required. Whether that runtime is a complete emulation layer, or a set of library routines that are called implicitly, it is still there, moderating and monitoring the behaviour of the software.
So, getting back to the original discussion. Java the grammar and Java the base class libraries make requirements of their execution environment. These can only be satisfied by an execution machine (real or virtual) or a "runtime".
Whether or not we choose to call this runtime a virtual machine does not change the fact that it exists and it presents to the code that uses it an environment that is different from the underlying reality. If you really want to stir the pot, throw in the fact that a virtual machine must be implemented in native code on a real machine (or in another virtual environment), or the idea that just-in-time compiling in a virtual machine compiles on-the-fly to native code. Does this mean you are no longer running in a virtual environment? To go really over the top, the concept of protected memory and privileged instructions - as used on x86 architecture to enable safe multitasking in operating systems - is often referred to as a virtual machine, and the "native code" doesn't really have full access to the hardware.
So does all of this matter to security? Well, it can. Any code that you run is constrained by the environment in which it is executing. For "native code" that environment is a real machine in which certain privileges have been restricted by the OS in conjunction with the CPU or other hardware. For code in a virtual machine, the VM defines the limits of that environment.
So, taking a program written in any language and compiling it to run as native code, you get instructions that are constrained only by the permissions laid down by the operating system. So, on an x86 architecture where the OS does not force a non-executable heap and/or stack, this makes it possible to "execute data". Targeting a VM such as the Java VM with the same program, and assuming it were possible to inject arbitrary data into the program, it would still be impossible to execute that data, as the VM is aware of what is code and what is data. Even if you could inject and execute arbitrary instructions on the Java VM, you could not automatically gain access to all data within that VM or all privileges the VM enjoys in its own execution context, as the VM enforces object-level access controls.
That's
This is complete bullshit. The Java VM is exactly what makes the language safe. Even if you use Java assembler you can't overflow an array boundary, because the VM knows what an array is, knows its limits, and executes the instruction to set an element of the array on your behalf. Most of the RuntimeExceptions and all of the Errors in Java are thrown directly from the VM, not from code libraries.
This means that you can take an arbitrary, untrusted Java executable and run it, and it will not be subject to traditional buffer overflow vulnerabilities.
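A minimal sketch of what that looks like from the program's side (the class and method names are made up for the example). There is no bounds check anywhere in this source; the exception comes from the VM executing the array store.

```java
final class BoundsSketch {
    // Try to store into a 4-element array at the given index.
    // Returns true if the VM refused the store.
    static boolean overflowRejected(int index) {
        int[] buffer = new int[4];
        try {
            buffer[index] = 42; // no explicit check here; the VM checks the index itself
            return false;       // the store was within bounds
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;        // raised by the VM, not by library code
        }
    }
}
```

Indexes 0 to 3 are accepted; anything else, including negative values, is rejected before any memory outside the array is touched.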
With a non-interpreted language like C, you can only get this level of trust if you compile the binary yourself. Even then, you can only fully trust C if there is no use of pointer arithmetic, and all pointers are to data of a defined size. Without this knowledge it is not possible to generate code that can check (for example) array bounds.
For example, char* c = (char*)malloc(1000); cannot be statically checked by the compiler. It is possible to develop a compiler that knows of every mechanism that can be used to allocate memory, and annotate a variable with the memory allocated to it, but there are numerous problems with this approach.
It is certainly quite possible to take any sufficiently constrained language and compile it to native code such that it is not susceptible to buffer overflows. But this is quite different from a language-aware VM (like that of Java) where even a malicious binary cannot cause a buffer overflow.
As for C programmers having to live with the various problems you describe -- there are many techniques for avoiding these problems. It is possible to use garbage collection in C++, and get away from direct memory management. Using sized data instead of arbitrary mallocs, and creating classes to represent arrays, allows you to overload operators to behave in a safe fashion. Simply paying attention to initialising variables, instead of making assumptions so that you can squeeze out an extra instruction cycle, addresses many string problems.
What you are asking for cannot be done. Worse, it is a dangerous route to go down, because it gives an illusion of safety.
From a VM level you cannot know what a program is up to unless that program obeys certain rules. When dealing with x86 architecture (specifically), those rules are not sufficiently verbose to allow for the sort of checking you are after.
While a VM could intercept all stack access and prevent modification to the return address (preventing stack smashing attacks), it cannot tell if a malicious attack has caused values within a valid range in the stack or heap to be altered in a way that is not supposed to happen. Thus a VM approach would suffer all of the deficiencies of StackGuard.
So while you may be able to protect against a classic buffer overflow attack (overwrite the return address on the stack and jump to your own code), there is no guarantee against arbitrary modification of the behaviour of the software by adjusting variables.
The dangerous part is that you are trying to partition security and look at one aspect of it in isolation. This is shortsighted.
Using permissions, a binary running in a user account is less of a threat to overall system security than a binary running as root -- irrespective of whether there are exploitable vulnerabilities in that binary or not.
Tools like the ptrace-derived sandbox further improve this situation -- an arbitrary binary could be denied access to the file and IO functions in the kernel, preventing a malicious intruder from reading or modifying the hard drive. Or the open functions could be filtered by directory. Network access could be restricted, denying the opportunity of using the vulnerability as a springboard to probe behind a firewall.
There is an interesting Usenix paper relating to these issues. There is a list of sandbox possibilities plus another one here, and you should also check out Medusa. This article also points to several resources on ACLs.
I don't think this article is a troll. I think it's just sadly misguided. I appreciate much of the sentiment though: for the average computer user (who is not technically inclined) the variety of choices offered by OSS is intimidating, and the perceived (or actual) poor quality (or state of completion) of much of this software is affecting efforts to bring OSS to the masses.
This is, however, not the fault of developers. OSS is doing what it was always intended to do, and doing it better than ever. Developers are encouraged to experiment, build and contribute, in whatever way they like. The fault lies in the presentation of OSS software by distributors and major hosting sites, for example RedHat and SourceForge.
RedHat comes on 4 CDs. A first-time user is given more than ample opportunity to shoot themselves in the foot with options, and to choose to use software that is "sub-standard" by general commercial standards. This makes the software look bad, and that reflects on OSS as a whole.
SourceForge and FreshMeat, in searching and browsing, do not by default filter out pre-release software. Worse (IMHO) they do not have a facility to rate software (as is common on shareware sites). That makes it difficult to choose a stable, functional and quality piece of software for a particular purpose. The filtering mechanisms (other than rating) exist, but are not newbie-proof by default.
The message here is that OSS needs to present a user- or market-friendly outward appearance, instead of defaulting to hard-core developer modes.
To address two particular issues in the article to which I take exception:
Why gTk? Qt is older and more complete, Wx beats Qt in maturity and comes close to matching it in functionality. Wx also supports many more platforms than gTk, making it far more suitable for cross-platform development -- something OSS needs to support if its platforms are to attract commercial attention. The Wx license is also far more friendly to commercial development than gTk (or Qt).
I am making the implicit statement that commercial == proprietary, because this is how most of the world operates, and that isn't going to change any time soon. Sure, there is software that doesn't follow this model. But not a lot of it.
Next, the idea that all editors should support the OpenOffice format. Besides the fact that many of these editors predate OpenOffice, again I have the question: why? What makes the OpenOffice format superior? Is it because it is based on a sucky data encoding failure called XML? Why not use a mature and powerful DTP standard like TeX?
Years of experience have shown that the golden goal of application interoperability is just not going to happen. Innovation demands going beyond standards and what has been done before. This is the only way that software -- OSS or proprietary -- has been able to progress over time. Linux's attraction compared to traditional Unix platforms comes from its differences.
What you've been asked (volunteered) to do is a risk analysis. This is a whole lot more than being a l33t admin of a high-availability site, which is what many Slashdotters seem to think it is. You've hit onto some of the non-technical risks (your CTO example), but to address this properly you need to concentrate on identifying risks, and how to handle them.
The first thing to realise is that you can't have a preformed contingency plan for everything. What you can do is identify every point of risk, weight it according to likelihood and severity, and develop plans for the "likely worst cases" that you discover. The rest of the risk is a business risk, that is, you insure yourself against it and deal with it if and when it happens.
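As a sketch of the weighting step, assuming the common convention of scoring each risk as likelihood times severity (the scales, the class names and any figures you feed it are illustrative assumptions, not a standard):

```java
import java.util.ArrayList;
import java.util.List;

final class RiskSketch {
    static final class Risk {
        final String name;
        final double likelihood; // assumed scale: chance of occurring in a year, 0.0 to 1.0
        final int severity;      // assumed scale: 1 (nuisance) to 10 (business-ending)

        Risk(String name, double likelihood, int severity) {
            this.name = name;
            this.likelihood = likelihood;
            this.severity = severity;
        }

        // One simple way to combine the two factors into a single weight.
        double weight() {
            return likelihood * severity;
        }
    }

    // Return the risk names ordered from highest weight to lowest.
    static List<String> rank(List<Risk> risks) {
        List<Risk> sorted = new ArrayList<>(risks);
        sorted.sort((a, b) -> Double.compare(b.weight(), a.weight()));
        List<String> names = new ArrayList<>();
        for (Risk r : sorted) {
            names.add(r.name);
        }
        return names;
    }
}
```

The top of the ranked list is where the contingency plans go; the long tail is what you insure against and handle if and when it happens.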
You should also bear in mind that, from a technical viewpoint, there are no absolute guarantees. Almost all high-availability strategies protect against a single point of failure, but this isn't enough. What if you have multiple failures? How quickly can you detect and respond to a failure? How long can you suffer a complete outage? (This is really important to know, and "we can't" is not an acceptable answer.) Uptime costs money; calculate the point of balance.
Ask Google about "organizational risk" - you'll find a lot of information about auditing risk that can put you on the right path.
Nice idea, but it won't work. You are up against two legal problems: an employee and an employer are separate people; and the employment relationship is a non-trivial contract.
You are right in saying that corporations are juristic persons and can own property. Many corporations own licenses for software, for example. But to allow employees to use those licenses, the license must be assigned to a specific employee.
This is still a legal minefield: some licenses have conditions regarding assignment to other persons, duration of assignment, or even assignment to an instrument (a particular computer). It is generally recognised that an employer has the right to assign property to an employee for use in the course of his/her job, and to reassign that property as dictated by operational requirements. This constitutes fair use on the part of the purchaser.
But fair use and assignment do not give the employer the right to allow unrestricted or simultaneous use of intellectual property that is not owned by the employer (i.e. IP that has been purchased, such as a book, CD, or software).
The same is true for members of corporations -- the relationship between the corporation and a member is still a relationship between different persons having legal status, so a right afforded to a corporation does not necessarily extend to a member of that corporation.
The second problem regards the incorporation or employment. To say the least, a click-through "I agree to be employed" is not sufficient. No court will find that the contract of employment is valid unless the employer is clearly employing, and the employee is being employed and remunerated. While the latter may be true in this case, the former is not satisfied. Similarly the incorporation of a company does not occur on an ad-hoc basis; there are legal procedures that must be followed to introduce members.
Even then, you have to contend with the fact that many people in IT have contracts that prevent them from having other jobs at the same time ... not to mention that this practice is questionable under common law.
Any scheme of this nature ultimately comes down to being some sort of (non-sanctioned) library, and therefore illegal under the basic provisions of copyright. Even if you DID find a way to do it, the courts would almost certainly find that you are attempting to circumvent the law, and respond appropriately.
I agree wholeheartedly! In fact, it is existing government intervention that I see as the reason that it is so difficult for software engineering to establish itself as a recognised engineering profession.
In any free market economy, the answer to this problem is industry self-regulation. Trademark the "official" title, and have an industry body regulate its use. MS does this with its various "Microsoft Certified" programs.
The entire industry has a vested interest in upholding and advertising the quality of that title. If the regulatory body doesn't do its job properly, the profession falls into disrepute, and the public no longer trust it.
The benefit of this system is that it can work for ANY profession or title, and do so without denying anyone the right to practice without such title. So if you want to go to a programmer (code-monkey) to design your ERP system, that's your problem; if you want to consult a new-age spiritual healer (who can't use the title Medical Doctor), that's also your problem.
Clearly you know nothing about software engineering, or you would have concentrated on knowledge encompassing project management, best practices and quality assurance ... in much the same manner as civil engineers don't lay bricks.
The Software Engineering Body of Knowledge is an international effort to codify the knowledge involved in software engineering, in the same way as other engineering disciplines do, with intent to standardise a SE qualification.
Maybe I've got the wrong idea from the interview, but what was discussed was rules, not metadata.
Business rules are a well known aspect of enterprise software development, especially in light of the many old(er) custom-built systems in which the rules were hard-coded. A business rule is "sales tax is 7%", or "customer pays a 1.5% surcharge if payment is more than 2 days late".
Metadata is a partner and also an opposite to a business rule. Metadata is quite simply "data about data". The fact that the value "7%" is "sales tax" is metadata; but the fact that the current value of the sales tax is 7% is not. The age-old concept of a "data dictionary" is an embodiment of what metadata is.
A rules engine is (rather simply) a powerful extension of the practice of declaring constants for significant literals (which are or could be subject to change); quite often one which allows runtime modification of the value rather than requiring a recompile. Rules engines also tend to provide mechanisms for evaluating compliance with the rule, or performing calculations based on rules.
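A very small sketch of that idea, using the sales tax example from above. The class, the rule name `sales.tax.rate` and the methods are invented for illustration; a real rules engine does far more (conditions, compliance checks, persistence).

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

final class RuleSketch {
    // Rule name -> current value. The name is the metadata; the value is the rule.
    private final Map<String, BigDecimal> rules = new HashMap<>();

    // Change a rule at runtime, with no recompile.
    void set(String name, String value) {
        rules.put(name, new BigDecimal(value));
    }

    BigDecimal get(String name) {
        BigDecimal value = rules.get(name);
        if (value == null) {
            throw new IllegalArgumentException("no such rule: " + name);
        }
        return value;
    }

    // A calculation based on a rule: net price plus sales tax at the current rate.
    BigDecimal withSalesTax(BigDecimal net) {
        return net.add(net.multiply(get("sales.tax.rate")));
    }
}
```

Set `sales.tax.rate` to 0.07 and a net price of 100 comes out as 107; change the rule to 0.08 while the program is running and the same call gives 108, which is exactly what a hard-coded constant cannot do.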
XML falls far short of its goals. See my critique for a far more detailed analysis.
The short of it is that XML contains multiple redundant ways to store data, and implicit within XML is processing of data. This makes it an encoding/format that is prone to implementation errors and security concerns. It took years for the major vendors/creators of XML parsers to achieve interoperability.
A good grammar that meets the goals of XML should have a single, clear structure and be inert (no implicit processing).
The intent of XML is good; the execution is bad. XML could be greatly simplified without losing any of its power other than human readability, which is a goal of questionable virtue as it is.
Shameless self-plug, but I have a critique of XML's failure to meet its goals on my home page. You may find it interesting.
I think there is a deeper problem being alluded to here, that of loss of intellectual property. Copyright, as is often pointed out, has two sides: the copyright owner gets to exercise control over their asset, but in the end that asset becomes public property.
It has long been law and/or practice in most countries that in order to publish a book (or any copyrightable material) a copy must be lodged with the state archive (in the US, the Library of Congress). Making a commercial gain off a work usually requires publication, which means that most works are available in such libraries.
But the web changes that. Publication becomes a lot more informal, and there is no requirement or even encouragement to archive. How, in such a scenario, can we protect against publicly accessible information disappearing forever? This material has been published and, at some point, the copyright will expire; it should fall into the public domain. But it most likely won't: over time it will be taken away, and never seen again.
Consider the loss we would face if a valuable repository like Slashdot vanished. Deride it all you like - this is nevertheless a meeting place of (amongst others) some very experienced people with insightful comments, leading to a wealth of information gathered on topics that are discussed. It is not at all uncommon to find a Slashdot discussion when searching for technical information.
archive.org is a start in the process of archiving to prevent this sort of loss -- but how can we move to tackle the problem in a proactive manner?
Any book on IT project management that doesn't cover human resources and project-related procedures and policies isn't worth its weight in water. Many IT activities are (or should be) approached as projects, even upgrades or security or policy changes. For those that aren't (such as routine administration), HR management should be no different from management of other employees: define their job and expect performance.
The responsibilities of IT management should be in your job description. Failing that, you shouldn't have got into the position without at least a passing knowledge of them. Your best source of information is in books on Information Systems, which will cover the necessity of IT to business, and the requirements to deliver IT/IS in a manner that supports business. Never forget that IT/IS is a support function.
Policies and procedures are perhaps the most difficult to tackle. It is important to remember that books only form guidelines - you must tailor the policies for your business. Policies often fail because someone got a list from somewhere and tried to implement it, without understanding the needs of their business.
It is especially important to know what your policies are intended to achieve. Are you trying to make a process faster? More formal and requiring independent approval? More secure? ISO-9000 compliant? Less legally problematic? Start with knowing the goal, then work back to the process.
A quick search on Google and Amazon produced a list of potential titles: "IT Manager's Handbook: Getting Your New Job Done", "The IT Survival Guide", "Foundations of Service Level Management", "IT Policies & Procedures: Tools & Techniques That Work (3rd Edition)", "Information Security Policies, Procedures, and Standards: Guidelines for Effective Information Security Management", "Best Practices in Policies and Procedures", "Establishing a System of Policies and Procedures".
I don't know anything about these books -- they just look like suitable starting points.
I am always amazed by the tendency of managers to think of IT people as "different" when it comes to management. This is simply not true. At a push one may point out that most IT people are highly qualified, and that highly qualified people are more mobile and less likely to accept poor working conditions. But the basic principles of HR management apply to ALL employees.