A NAT router without special configuration has no way of accepting inward connections
My old Intel Express ISDN router does. By default it makes a reverse mapping for all ports to the inside PC that triggered the outgoing link.
My current company did something like this back in 2001 with real-time rating performance, which conceptually is much like what you want to do: receive a lot of items and store them in a database, real-time.
But you did not mention some of the more important details about the problem:
How much processing has to be done per item?
How long can you delay committing them to a database?
Do the clients wait for an answer? Can you cheat and respond immediately?
How many simultaneous clients must you support? 1? 5? 100?
What is the hardware budget?
2,000 items/sec means that you must do bulk updates. You cannot flush to disk 2,000 times per second.
So your program will have to store the items temporarily in a buffer, which gets flushed by a secondary thread when a timer expires or when the buffer gets full. Use a two-buffer approach so you can still receive while committing to the database.
Depending on your application it may be beneficial to keep a cache of the most recent items for all instruments.
You also have to consider the disk setup. If you have to store all the items then any multi-disk setup will do. If you actually only store a few items per instrument and update them, then RAID-5 will kill you because it performs poorly with tiny scattered updates.
Do you have to back up the items? How will you handle backups while your program is running? This affects your choice of flat-file or database implementation.
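To make the two-buffer idea concrete, here is a minimal sketch in Python. It is not what we ran in 2001; the class name, the flush_fn callback (standing in for your bulk insert) and the default limits are all made up for illustration.

```python
import threading


class DoubleBuffer:
    """Collects items in one list while a previous list is being written out."""

    def __init__(self, flush_fn, max_items=1000, max_delay=0.5):
        self._flush_fn = flush_fn      # hypothetical bulk-insert callback
        self._max_items = max_items    # flush when the buffer holds this many items
        self._max_delay = max_delay    # ...or after this many seconds
        self._active = []              # buffer currently receiving items
        self._cond = threading.Condition()
        self._closed = False
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add(self, item):
        with self._cond:
            self._active.append(item)
            if len(self._active) >= self._max_items:
                self._cond.notify()    # buffer full: wake the flusher early

    def close(self):
        """Flush whatever is left and stop the flusher thread."""
        with self._cond:
            self._closed = True
            self._cond.notify()
        self._worker.join()

    def _run(self):
        while True:
            with self._cond:
                if not self._closed and len(self._active) < self._max_items:
                    self._cond.wait(timeout=self._max_delay)
                # Swap buffers under the lock; receivers can continue at once.
                batch, self._active = self._active, []
                closed = self._closed
            if batch:
                self._flush_fn(batch)  # the slow commit happens outside the lock
            if closed:
                return


if __name__ == "__main__":
    batches = []
    buf = DoubleBuffer(batches.append, max_items=100, max_delay=0.05)
    for i in range(250):
        buf.add(i)
    buf.close()
    print(sum(len(b) for b in batches), "items in", len(batches), "batches")
```

Note that the receiving buffer is unbounded in this sketch. If the database stalls for long you will want a limit and a policy for what happens when it is hit.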
That is also my thought. Google is not evil. At least not now. But that is no guarantee that it will remain so. There is no guarantee that Google will not require pay-per-view in 20 years' time. Besides, libraries allow you to read anonymously. I am sure Google has your IP address logged somewhere.
I am willing to pay for this through the normal taxes.
Install a file server that is protected with a UPS. Configure automatic shutdown of it. Put the client PCs behind a surge protector (but not on a UPS). Make them boot from the file server drives.
That way the storage/filesystem is reasonably protected and is not smashed every time there is a brownout. And you don't have to spend the money for a large UPS for the client PCs.
I have encountered some problems caused by the difference in the number of CPUs between the development and test environments and the production environment. The development and test systems usually have 1 or 2 CPUs. I am currently hunting a timing-sensitive bug that only occurs on logical partitions with more than 3 CPUs assigned, and we don't have that kind of hardware ourselves. We are not allowed to log into the production system (even though it is only a customer test setup), so we have to debug via additional logging and detailed instructions on how to run a few tests. This is not exactly ideal.
Network setup can also be a difference that is difficult to emulate in development/test setups. You have to emulate network latency, bandwidth, packet drop etc. NIST Net looks good, but it is a bit heavyweight for everyday use. I usually stick to suspending servers/clients with kill -STOP in order to test latencies, packet drop, and timeouts. In one case where latencies really mattered, I ended up being granted access to development servers in 2 different cities and also using my home computer as a remote server.
Another area which can be difficult to emulate is the data size. It is one thing to generate enough data to match production environments, say 3 million records, but another to generate the data in such a way that the database becomes fragmented in the same way it does in production environments. There is no real way to emulate this properly, except possibly mishandling the database setup to the horror of your DBA :-)
Related to the data size is the difference in data, that being the variation in usernames, the fraction of invalid passwords, fraction of unused accounts, usage patterns, etc. You have to know your customer's environment. Your own customer support people are your friend here, as you can rarely get a complete data dump and traffic logs for the whole week - it is usually sensitive data. The most productive way to handle this is getting a very good grasp of the production environment and the usage patterns, and then spending the time it takes to develop a test tool that emulates it closely.
Check out the local job sites (if you speak the language). Otherwise try going via some of the international companies, preferably the medium and small ones. Look at their company page. Do they have an office in the country you are interested in? If so, it will not hurt to contact their (international) HR department and ask if they know of any open positions in country X.
And will you guarantee that nothing bad will leach out of the fiber into the drinking water for the house?
I cannot find the article on /. but I seem to recall that a few companies are running fiber in the sewage system - probably for the same reason as you mention.
The previous ways are still valid - but the areas where they are appropriate disappear over time. One example is structured analysis/structured design with DFD, pseudocode, etc., where you go through analysis of the current physical system, derive the current logical system, derive the new logical system and finally derive the new physical system. This is still valid in "green field" areas for document processing, but those fields are vanishing.
Another example is JSP (Jackson Structured Programming) which is pretty good for traversing non-recursive data structures and transforming them or generating reports. But today you usually have the data in a database and have a nifty GUI report builder; or you use some form of XSLT. So you rarely use JSP today because the areas where it is appropriate are almost gone.
There are also methods that can be used at multiple levels, e.g. prototyping, which can be used as a strategy, as a method, or as a tool. XP has that philosophy at its core, although I fail to see how high-availability is magically implemented by interacting with users.
The waterfall model is still valid for larger government projects where they in general insist on detailed specifications and signing that contract. Iterative development involving re-analysis is a no-no because that requires re-signing the contract.
One final interesting area is maintenance. It probably accounts for more than 80% of the development resources, yet I have never seen any formal method/strategy/tool for handling maintenance/change requests/bugfixes. Is this because maintenance is unsexy?
In Denmark it is very simple. A salesman is not allowed to contact you in person or via phone at your home, workplace or other non-public places. There are 4 exceptions for historic reasons (books, insurance, newspapers, and a limited subset of car help/sick transport services).
No direct links, but you would be looking for the law "Lov om visse forbrugeraftaler", section 2, paragraphs 6-8.
You may want to look into Progress. It has its own relational database, but can use Oracle, DB/2, MS-SQL, ... Its 4GL is powerful, and you can use its user interface builder to make data entry screens very quickly.
Its printing capabilities are moderate, not good, not bad. I don't know the price though.
Unfortunately it is mostly geared toward business and is a bit pricey. But who knows - maybe they are hungry and willing to cut a deal?
I am quite happy with my Håg Capisco chair with a "saddle seat" - it automatically makes you adjust your position every now and then. You can get versions with an extra tall lift suitable for use with an elevated desk. You can get different casters depending on the floor type.
I have found kill -s SIGSTOP and kill -s SIGCONT on the server process useful for simulating a temporary network congestion / single packet-drop on a TCP connection.
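If you want to script it instead of typing the kill commands by hand, something like the following Python sketch should do. The function name is my own invention; the pid is whatever your server process happens to have.

```python
import os
import signal
import time


def stall_process(pid, seconds):
    """Freeze a process with SIGSTOP, wait, then resume it with SIGCONT.

    While stopped, the process does not service its sockets, so clients
    see replies that are delayed by roughly `seconds`.
    """
    os.kill(pid, signal.SIGSTOP)
    try:
        time.sleep(seconds)
    finally:
        os.kill(pid, signal.SIGCONT)   # always resume, even if interrupted
```

Calling stall_process(server_pid, 2.0) in the middle of a test run is roughly what I do by hand. Keep in mind that the kernel still acknowledges incoming TCP segments while the process is stopped, so this emulates a slow peer rather than a dead wire.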
I have to add a few things on your comments about RPC.
SOAP is a hack to ram things through HTTP
I completely agree. It was born out of the need to tunnel RPC through HTTP due to misguided and zealous firewall administrators, combined with the then-current hype: XML. The result is a bloated protocol.
sunrpc complicated and ugly
It isn't. The interface specification is close to C:
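struct Foo {
    int x;
    string y<>;    /* renamed: two members cannot both be called x */
};
program Boo {
    version something {
        void Bar(Foo f) = 1;
    } = 1;         /* version number (placeholder) */
} = 0x20000001;    /* program number (placeholder from the user-defined range) */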
and after the stubs have been generated by rpcgen you can do stuff like:
However, sunrpc is lacking authentication and its age is showing because it has not been developed for the past few years. But the on-the-wire protocol is lightweight (XDR), the encoding/decoding routines are very fast, and everything is documented. And every Unix system supports it. You can also get it for Windows, Java, etc.
CORBA is complicated
Yes. The OMG (Object Management Group) unnecessarily tied it to a naming service, evangelized UML, and assumed that you had full control of the environment where it is running. It doesn't help that the IDL-to-C++ mapping uses non-standard classes (no STL), although to be fair, the OMG did not have a choice at the time. But CORBA IDL is very nice:
And it can be used in C++ like this:
But the memory ownership in the IDL-to-C++ mapping is complicated. It does map rather nicely to Java though...
Ideally, I would like to see an IDL like CORBA, with stub generation like rpcgen or idl, with an on-the-wire format like XDR. And without the tie-ins to other components. And forget about objects being portable across the network. The possibility of fine-grained access control would also be nice (defaulting to full access for ease of testing).
The ideal would be if you could get your hands on some of the data from telcos - they store grotesque amounts of information: CDRs, invoices, intermediate processing, ... They love data. However, the chances of getting access to it are practically nil due to privacy concerns.
But it should be reasonably easy to generate using a script. Even the process of creating such a script will teach you a lot. Let's take the CDRs: about 50 columns per row (username, account, usage data, CLID, ANI, ...), 3.5 million rows per day, for 1 month. That comes to roughly 50 GB. That should be enough to get you started.
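As a starting point, here is a rough Python sketch of such a generator together with the size arithmetic. Only the five named columns come from the list above; the value ranges, the filler columns and the CSV format are assumptions of mine, not anything a real telco uses.

```python
import csv
import random

ROWS_PER_DAY = 3_500_000
DAYS = 30
COLUMNS = 50


def estimate_bytes(avg_field_bytes):
    """Rough CSV size: each field plus one separator or newline byte."""
    return ROWS_PER_DAY * DAYS * COLUMNS * (avg_field_bytes + 1)


def fake_cdr(rng):
    """One synthetic row; every value range here is made up."""
    row = [
        f"user{rng.randrange(100_000):05d}",    # username
        f"acct{rng.randrange(20_000):05d}",     # account
        str(rng.randrange(1, 3600)),            # usage data: duration in seconds
        str(rng.randrange(10**7, 10**8)),       # CLID
        str(rng.randrange(10**7, 10**8)),       # ANI
    ]
    # Pad up to 50 columns with filler values.
    row += [str(rng.randrange(10**8)) for _ in range(COLUMNS - len(row))]
    return row


def write_cdrs(out, count, seed=0):
    """Write `count` synthetic rows as CSV to the file-like object `out`."""
    rng = random.Random(seed)
    writer = csv.writer(out)
    for _ in range(count):
        writer.writerow(fake_cdr(rng))


if __name__ == "__main__":
    # 105 million rows of 50 fields at 8-9 bytes per field lands near 50 GB.
    print(estimate_bytes(9) / 1e9, "GB")
```

Uniform random values are the easy part. The interesting work is making the distribution of users, accounts and call lengths look like real traffic.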
However running multiple instances of X on a single computer is pretty new.
No it isn't. The first server listens on port 6000, the second server listens on port 6001, and so on. You specify which server to use with the DISPLAY variable (or the -display parameter) x.x.x.x:y.z where y is the server number. Multiple displays have been supported by X for a long time. Multiple input devices have been a bit less supported, but I guess that some of the CAD engineers in the early '90s have used it.
Virtual displays (ctrl-alt-F1...F6 in xfree86) are newer (mid-'90s?)
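To spell out the port rule, here is a small Python sketch that splits a DISPLAY string of the form above and computes the TCP port. The function name is mine, and it only handles the plain host:server.screen form (no IPv6 literals or other exotic variants).

```python
X11_BASE_PORT = 6000


def parse_display(display):
    """Split 'host:server.screen' into (host, server, screen, tcp_port)."""
    host, _, rest = display.rpartition(":")
    server_str, _, screen_str = rest.partition(".")
    server = int(server_str)
    screen = int(screen_str) if screen_str else 0   # the screen part is optional
    return host, server, screen, X11_BASE_PORT + server


if __name__ == "__main__":
    print(parse_display("192.0.2.10:1.0"))
```

With an empty host (plain ":0") clients normally use a local socket rather than TCP, so the port number only matters for remote connections.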
I read C't too whenever I travel. It is a very well-balanced magazine having both articles for the beginner (ok, not completely newbies) and for the advanced. It has very comprehensive product comparisons and tests. The Q&A sections are accurate as far as I can tell.
In addition, when I read the magazine on planes chatty people leave me alone (non-Germans thinking "oh no! a German", while Germans think "oh no! a computer nerd" :-).
Use UTC internally. Simple as that.
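In practice that means converting at the edges only. A minimal Python sketch, assuming you store timestamps as ISO 8601 text (the function names are mine):

```python
from datetime import datetime, timedelta, timezone


def to_storage(dt):
    """Normalize an aware datetime to UTC ISO 8601 text for storage."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: attach a time zone first")
    return dt.astimezone(timezone.utc).isoformat()


def for_display(stored, tz):
    """Convert a stored UTC timestamp to the user's zone, for output only."""
    return datetime.fromisoformat(stored).astimezone(tz)


if __name__ == "__main__":
    cet = timezone(timedelta(hours=1))            # fixed offset, for the example only
    local = datetime(2005, 1, 15, 12, 0, tzinfo=cet)
    stored = to_storage(local)
    print(stored)
    print(for_display(stored, cet))
```

Everything between input and output (comparison, sorting, arithmetic) then happens on UTC values and daylight saving never enters the picture.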
You have obviously never tried a certified test, such as the "advanced progressive matrix" which relies on cognitive skills and not memorization.
Does this mean that there is a chance that we will get a CVS implementation that supports IPv6 out-of-the-box? I am getting tired of patching it.
You can't go wrong with IBM and anyone who they have interest in.
Taligent?
It seems that the list favors practitioners and not those who researched the theories. I am missing Codd, Dijkstra and DeMarco.
have yet to figure out how to get the client's address from accept() without causing a compile warning on at least one platform.
My guess is that it is on HP-UX with aCC? In that case you have to cast over void*:
The appeal will probably take around 5 years. In the meantime ...
I had completely forgotten that case. The ruling was overturned: http://www.hypocrites.com/article1931.html
Because that is a law. Ignorance does not release you from following the law.