actually there is parallelism at all sorts of grain sizes, all the way down to instruction level parallelism and loop unrolling/vectorization.
but the interaction between grain size and the shared memory/message passing paradigms is much weaker than you suggest in theory. of course if you're talking about marshalling messages over ethernet clusters then i have to agree with you.
actually there is another class of machines which is shared-memory non-coherent. this breaks the programming model which you're apparently so fond of, but may provide a good middle ground. the synchronization primitives are implemented with atomic ops serialized by the target memory system. i don't think there has been much programming model work in this area.
going back to the comment about disambiguation, i'm assuming here that we're talking about a shared memory hardware implementation (a precondition for your 'scalable shared memory model' in the first place). and if it's a tightly coupled distributed memory machine, then much of the overhead you're talking about is handled by the message passing hardware (especially if it can be handled by a polling model).
and no, i'm not talking about MPI, which is basically impossible to get any kind of static niceness out of, but rather languages like concurrent smalltalk, oz, erlang, concurrent prologs, derivatives of the pi calculus etc.
it's not a trivial issue, but coming out with a blanket statement that 'shared memory scales and message passing doesn't' is what is naive. large hierarchical cached shared memory machines have had a real problem exploiting concurrency. the MTA scaled very well on paper, but had such poor constant-factor performance that it was difficult to really evaluate in practice.
i think you are missing the parent's point (or what i infer to be the parent's point).
if 'messaging' is a compile-time abstraction rather than a runtime library, then this copy only needs to be made if the caller maintains a reference to the data and may mutate it afterwards. think of it as compile-time cow (copy-on-write). this property of 'linearity' (bad term) is very helpful in composing abstractions, but it doesn't need to cost anything.
it's helpful to disambiguate the processor-visible data sharing model from the programming model. they most often map directly, but they don't necessarily need to.
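a rough sketch of the copy-only-when-retained idea. python has no static ownership analysis, so the `caller_keeps_ref` flag below is a made-up runtime stand-in for what a compiler would work out at each call site; `Mailbox` and its methods are hypothetical names, not from any real messaging system.

```python
import copy
from collections import deque


class Mailbox:
    """toy message queue; the copy decision is made per send."""

    def __init__(self):
        self.queue = deque()
        self.copies_made = 0

    def send(self, payload, caller_keeps_ref):
        # caller_keeps_ref is a hypothetical flag standing in for what a
        # compiler would prove statically about the call site
        if caller_keeps_ref:
            # the sender may mutate the data later, so take a snapshot
            payload = copy.deepcopy(payload)
            self.copies_made += 1
        # otherwise ownership simply moves into the queue, no copy
        self.queue.append(payload)

    def receive(self):
        return self.queue.popleft()


box = Mailbox()
box.send([1, 2, 3], caller_keeps_ref=False)  # temporary: nobody else can see it

data = [4, 5, 6]
box.send(data, caller_keeps_ref=True)  # sender holds on to data...
data.append(7)                         # ...and mutates it after the send
```

only the second send pays for a copy, and the receiver never sees the later `append`. in a language with move semantics the flag disappears and the same decision falls out of the type checker.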
except that only the first page is mirrored, and the next pointers go to the dead site.
if you care, use ib (infiniband). the linux support is still a little funky, but in terms of application performance for the dollar, it's hard to beat. tcp is gorgeous for sharing buffer space in the wide area, but it's a lot of work for a tightly coupled machine.
perhaps the answer lies in the intent to borrow even more.
speeling aside, just small point.
hardware has a far more robust copyright and trade secret infrastructure than software. many of the chips you buy include the cost of all of the licensed IP that was used to construct them.
it's not just performance. would you rather spend 2 weeks screwing around with some giant linux build that contains a huge amount of crap you will never use...or write the device specific paths you are going to have to do anyways?
unless you really need address spaces, threads, a shell, and don't mind stuffing a huge amount of flash to support it all..there just isn't any point.
i think a more precise interpretation would be:
(iteratively) adapt the system to the application (mix)
isn't this providing media interoperability at the wrong layer?
the framing and termination guts of the wireless transceiver aren't all that expensive. there are already perfectly good layer 2 and 3 approaches to the problem of distributing the same content over wireless and wired networks.
I've been through this before, several times.
expect grossly inflated estimates for work items, accompanied by demands for increases in IT staff and budget
expect negotiated work items to be put aside for other more 'critical' needs and delayed indefinitely
expect to need several passes to justify what needs to be done, as IT declares them impossible. the 'security' flag will be thrown at every step
you have entered the realm of company politics
1) get a high-end logic analyzer
2) get or build a socket shim and appropriate decode modules for your cpu fsb (assuming intel)
3) track down the physical addresses for your graphics device and memory. the bios should map these the same for each boot if you are lucky.
4) get some traces and write some analysis software to correlate bus issues with responses. one good metric would be the time spent waiting for memory vs the time between issues
5) look at the driver for the graphics card to figure out the indication of when the graphics command pipe stalls. extend your trace analyzer to track these
6) dig through the intel performance event documentation and write or run monitoring code which logs these over time
this should give you a general indication of whether it's your cpu, memory system, or graphics card that is the bottleneck. it may be none of the above. you may have to dig deeper because interpreting all that data can be difficult.
good luck!
(note that your system may not work at speed with the analyzer hooked up..in that case stop whining, buy reasonably high end parts and forget the whole thing)
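for step 4, the correlation pass could look something like this. it's only a sketch: the `(time_ns, kind, tag)` tuple format and the `summarize_trace` name are assumptions for illustration, and a real analyzer export will need its own parser in front of this.

```python
def summarize_trace(events):
    """pair bus issues with their responses.

    events: iterable of (time_ns, kind, tag) where kind is 'issue' or
    'response' and tag identifies the transaction (assumed format).
    returns (total_wait_ns, mean_gap_between_issues_ns).
    """
    outstanding = {}   # tag -> time the request was issued
    issue_times = []
    total_wait = 0

    for time_ns, kind, tag in sorted(events, key=lambda e: e[0]):
        if kind == "issue":
            outstanding[tag] = time_ns
            issue_times.append(time_ns)
        elif kind == "response" and tag in outstanding:
            # per-transaction latency; overlapping waits are summed,
            # so this is total latency rather than wall-clock stall time
            total_wait += time_ns - outstanding.pop(tag)

    gaps = [b - a for a, b in zip(issue_times, issue_times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return total_wait, mean_gap
```

if the wait per transaction is large compared to the gap between issues, the processor is probably sitting on memory rather than running out of work. issues that never get a response in the trace are just ignored here.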
i wanted to print out a copy of a large RFC because i was working with it heavily. the local kinkos called back and asked if i worked for ISOC, because it had a copyright. i told them that no, i didn't, but the material was free to copy and i could point them at a statement issued by the ISOC to that effect.
no, that wasn't sufficient. i would need to get a signed statement from them explicitly granting that particular shop permission to make physical copies. if they didn't want to limit distribution, why did the document contain a copyright notice?
of course i just found another shop..but as much vitriol as people have for rms here, don't you think that he is right in this regard? that as the infrastructure gears up to support this property-based model, with increasingly complex rules and ambiguous enforcement, the exchange of information in general suffers?
it's not difficult to imagine moving to a place where it costs money and implies significant legal risk just to distribute some information. given the rapacity and collusion in the industry, this could easily become a per-instance issue, effectively making it impossible to distribute information freely, regardless of the desires of the author.
i agree about the funding issue. but at one point, university of washington and hank levy in particular did research into fundamental concepts of system design and performance. it's very sad that this kind of thing is what the grad students are working on these days.
try 'locate', it's been around for a while
I'm surprised this didn't come up in any of the other discussions. If the
application requires stability, then you will need time. lots of it. it
doesn't matter how many developers you have, it's going to take time to
cook. if the powers that be can't give you 2-3x over what the nominal
development cycle would be, then just forget it outright (or maybe
stability isn't all that important?). with enough time you could possibly
employ the write-it-twice strategy (which i've never known to succeed
in practice).
aside from using a better environment (sounds like erlang might be perfect
for you), the only other thing i could suggest that i haven't seen here
would be reviews. long boring group sessions with a projector. 1 on 1
peer reviews. print the thing out and take it to the bar with you and
make sure that it's obviously correct before you even bother having someone
else go over it.
your suggestion about restarting failed components is really looking at
the problem from the wrong angle i think. what about correctness? build
it properly in the first place
really. really.
if you're going to do something, why do so under the conditions that one
hand is tied to your forehead and the other around your ankle?
given a task, i could write an operating system to boot from processor
reset, and the application to do that task, in considerably less than the
time taken to deal with that. get some balls.
training seminars and certification programs are generally barely useful. they may serve as a general introduction, but really are a form of scam. the person 'teaching' mostly has poorly developed materials, and only a slightly better grasp of the material than you (otherwise they would be actually doing those things and making more money). there are of course exceptions.
the norm in the industry is invariably to teach yourself. get the manual (not the training materials), set up a sandbox and just figure it out. all the necessary resources (books, time to study them, test equipment) are directly part of your job and should be treated as such.
if management can't realize that you aren't currently qualified to do your job, and aren't willing to bring in someone else or give you the time to get your hands around it, then i would start looking. and yes, the cultural expectation is that IT isn't a 40 hour job. i'm not defending it, but that's what people have come to expect.
the most valuable quality of someone in your position is the willingness to just dig in and get it done. experience with specific systems, although that's what most places claim they want, isn't anywhere near as useful. after working your way through enough seemingly unsolvable problems, you will start to get a knack.
aren't you just drastically increasing the number of system
calls you have to pay for?
if you have some knowledge about the natural grouping of data,
it would be better to just turn nagle off and do buffering
in user space (collect up enough data and send it all in one
go)
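something along these lines, as a minimal sketch. `CoalescingSender`, `disable_nagle` and the 1400 byte threshold are made up for illustration; the only real api used is `setsockopt` with `TCP_NODELAY` and `sendall`.

```python
import socket


def disable_nagle(sock):
    # TCP_NODELAY turns nagle off, so each sendall() goes out right away
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)


class CoalescingSender:
    """collect small writes in user space, hand them over in one sendall()."""

    def __init__(self, sock, threshold=1400):
        self.sock = sock
        # hypothetical flush size, a bit under one ethernet payload
        self.threshold = threshold
        self.pending = bytearray()

    def write(self, data):
        self.pending += data
        if len(self.pending) >= self.threshold:
            self.flush()

    def flush(self):
        # call this at the natural boundary of a group of data
        if self.pending:
            self.sock.sendall(bytes(self.pending))
            self.pending.clear()
```

the application calls `write()` as often as it likes and `flush()` once per logical group, so you pay for one system call per group instead of one per fragment.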
actually i can recall several fundamentalist rants that directly attacked
'secular humanism' as a force of evil. which i could never understand,
there are so few people who would call themselves that. they don't proselytize.
they believe much of what christians are supposed to believe, they just don't
give money to any church. i would love for a believer to try to explain why
these people are so damn nasty.
the student or the advisor? (grin)
actually, i think that would be the anarchist response, an ideology which
i wholeheartedly approve of.
i think it's worse than you make out. if i were a graduate student at
a university, previously i could publish my results and my source code
however i wanted.
now however, there is a group of university ip people sniffing around
trying to find out what of my work they can patent in the name of the
university.
many professors are complicit in this arrangement, because they are
in a great position to buy the patent outright from the university
for a nominal fee and start a venture of their own...another activity
that the university enthusiastically encourages.
all of which is really to the detriment of the student. i knew a phd student
in his fourth or fifth year whose thesis work was pulled out from under
him. the university patented it, sold it to his advisor who went on indefinite
leave to do a startup with it, and he just had to start over again from
scratch.
i know the giant horde of slashdot libertarians will scream and gnash their
teeth, but business and education are two fundamentally different endeavours.
even better, when you take that 3d graph embedding and project
it down on a 2d screen, you're exactly where you started from.
there is actually some substantial utility in being able
to draw complicated graphs clearly...but 3d in this case
is really a red herring.
Surely with such a high volume of traffic you can use a sniffer and
get port numbers and packet bodies. it's amazing what you can paste
into google and get an immediate answer.
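if you end up with raw captures rather than a decoded view, pulling the ports and body out by hand is not much work. a sketch, assuming the bytes start at the ipv4 header (link-layer framing already stripped) and the segment is tcp; `tcp_ports_and_body` is a made-up helper name.

```python
import struct


def tcp_ports_and_body(ip_packet):
    """return (src_port, dst_port, payload) for an ipv4/tcp packet, else None."""
    if len(ip_packet) < 20 or ip_packet[0] >> 4 != 4:
        return None                      # too short, or not ipv4
    header_len = (ip_packet[0] & 0x0F) * 4
    if ip_packet[9] != 6:                # ip protocol number 6 is tcp
        return None
    total_len = struct.unpack("!H", ip_packet[2:4])[0]
    tcp = ip_packet[header_len:total_len]  # trims any link-layer padding
    if len(tcp) < 20:
        return None
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    data_offset = (tcp[12] >> 4) * 4     # tcp header length incl. options
    return src_port, dst_port, tcp[data_offset:]
```

the port pair usually narrows things down a lot, and any readable strings at the front of the payload are what you would paste into a search.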
i just scanned the spec, and it looks a lot like css...except
that it has built-in support for revocation. which means that
the one weak device that leaks the key could possibly be
disabled in all future releases of content.
i don't know what this is supposed to mean for the poor people
that own a current instance of the weak device, but they
certainly spent a lot of time thinking about how to do it
efficiently.