check your memory usage, you're probably thrashing.
the system has to do work to get free pages available for firefox.
With lots of flash active, you end up with memory
leaks. install noscript & flashblock and only permit stuff to run from sites when you need them.
That will slow down the memory leak substantially.
restarting firefox once in a while will help too.
how many meetings have I been in where someone would say...
"why bother configuring a router as a firewall, just get a Cisco PIX and it's all set for you..."
-- folks who think the device will give you security regardless of how it is used.
We need an IDS, an IPS, a web-filter, a layer 7 filter, in-line, out-of-band, etc...
meanwhile the entire corporate network is flat, wireless is bridged into the copper nets on many sites, and folks are using 'drowssap' to secure half the accounts, and systems are two or three years behind current patch levels.
It doesn't matter what stuff you buy if you don't know what you are doing, and don't follow through on the basics first.
Proprietary IP is a very important asset to VCs and such.
The accountants will definitely need to account for "giving it away" (I prefer the term investing in the open source community... natch;-)
What's wrong is that this should not be a single project. It should not be a single suite. projects work well when they can be easily decomposed into independently useful components. ooo looks like a huge monolith, and nothing is comprehensible to me.
the whole thing needs to be restructured
and done differently.
-- document interaction needs to be a library.
have a libodt or some such. it would handle document
i/o, composition & links, compression, etc...
there should be at least two ways of using the library: in-memory image & stream. The in-memory image would be used by word processors, and the stream method would be used by filters/processors (such as for printing).
it should have no GUI dependencies, be documented as an API, and have multi-language support (especially python).
that's one layer there. Share the layer... with the KDE people, Google people, etc..., etc... get a really good document i/o layer. That's a project all by itself.
Someone can use the library to do a cli-based spell checker, or have a simple 'view' application without starting up the whole ooo.
The key thing is that it is a component that is useful in and of itself, and doesn't need to be integrated in anything.
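A rough sketch of what the two access modes could look like, in python. No libodt exists here; `load_document` and `stream_text` are made-up names for illustration. The only facts relied on are that an ODF file is a zip archive with the body in content.xml, and that paragraphs are text:p elements.

```python
import zipfile
import xml.etree.ElementTree as ET

# ODF packages are zip archives; the document body lives in content.xml.
CONTENT_MEMBER = "content.xml"
PARAGRAPH_TAG = "{urn:oasis:names:tc:opendocument:xmlns:text:1.0}p"


def load_document(source):
    """In-memory mode (hypothetical API): parse the whole body into a tree."""
    with zipfile.ZipFile(source) as package:
        with package.open(CONTENT_MEMBER) as body:
            return ET.parse(body)


def stream_text(source):
    """Stream mode (hypothetical API): yield one paragraph's text at a time."""
    with zipfile.ZipFile(source) as package:
        with package.open(CONTENT_MEMBER) as body:
            for _event, element in ET.iterparse(body, events=("end",)):
                if element.tag == PARAGRAPH_TAG:
                    yield "".join(element.itertext())
                    # Drop the paragraph's contents once it has been emitted.
                    element.clear()
```

A word processor would hold on to the tree from `load_document`; a print filter or a cli spell checker would just loop over `stream_text(path)`. Neither needs a GUI.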
A level up is analysis & transformations on the document. libodtscan is a generic library to find a bunch of things that match criteria, return the list, then update the list to keep it current as changes are made. This generic function would underlie: spell-checking, table of contents/table generation, grammar checking, search & replace.
That layer will operate on in-memory representations of ODT. Again, if exposed as an API, this could be a shared component among all the projects, and get good critical mass easily.
So this libodtscan is a project in and of itself; it can use libodt to do file i/o, but fundamentally operates on the in-memory structure.
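To make the "find matches, return the list, keep it current" idea concrete, here is a minimal sketch. `LiveScan` is an invented name, and a plain list of paragraph strings stands in for the in-memory ODT structure, which is an assumption made for brevity.

```python
class LiveScan:
    """Keep a list of (paragraph_index, word) matches current as text changes."""

    def __init__(self, paragraphs, matches):
        self.paragraphs = list(paragraphs)
        self.matches = matches            # predicate: word -> bool
        self.hits = {}                    # paragraph index -> matching words
        for index in range(len(self.paragraphs)):
            self._rescan(index)

    def _rescan(self, index):
        found = [w for w in self.paragraphs[index].split() if self.matches(w)]
        if found:
            self.hits[index] = found
        else:
            self.hits.pop(index, None)

    def update(self, index, new_text):
        """Apply an edit and rescan only the paragraph that changed."""
        self.paragraphs[index] = new_text
        self._rescan(index)

    def results(self):
        """Return the current matches in document order."""
        return [(i, w) for i in sorted(self.hits) for w in self.hits[i]]
```

A spell checker would pass a predicate like `lambda w: w not in dictionary`; search & replace would pass one that matches the search term. The scanning machinery is the same either way.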
After that you have a whole bunch of users of the transformation layer that implement different functions: grammar checking, sum
functions in a spreadsheet, etc...
The UI should be a layer on top of that.
my old fogey side would like a curses one to start with, and there should be lots of them... one in Qt, one in GTK, etc...
the UIs are yet another layer where one can have competing implementations, and communities can build around them.
For all I know, OOO is already somewhat structured like this, but I cannot tell, it's too impenetrable. Open source projects work best when they are small and focused on minutiae, to polish that stone really perfectly. OOO is like polishing Mont Blanc.
and... We are not eels or sharks. humans cannot detect electric fields. to gather EEGs, people put electrodes on your head and listen for very faint signals... They're faint even on your scalp, and signal strength drops with the square of the distance.
that makes sense, but it sounds like an implementation issue with the GC. It should be fixable. defragmentation and compaction ought to be part of what GC does once in a while, which ought to address this sort of thing.
I like to brag about klocs too... spent five years
with a team replacing a 700 kloc application with
another one that is 22 klocs, and has more functionality. klocs are important... If it's too big, it's very likely wrong.
I was kind of conflicted in the original answer. I agree with what you're saying, memory management is tricky, and Java will use much more of it in general.
The only part that tripped things up was that the point you used to support it was factually dubious. It isn't that C programs return memory when not in use, it's that one tends to be far more careful about allocating memory in C, because the cost is obvious. In Java, it is abstracted away, and assumed to be infinite.
the results are predictable.
Folks need to start up splashfrock.org
where amateur physicians report what a patient has presented as symptoms, and then self-appointed experts and random individuals can write little two-paragraph pieces of advice about how to amputate limbs, or the proper application of leeches.
There is nothing preventing the OS from paging out 300 MB of unused space in a Java program. So there is still no difference.
The case of mmap is interesting, but none of the normal user tools (malloc and free) normally make use of it. To do that so that space can be reclaimed, one would have to have garbage collection in place to re-arrange memory blocks in use and permit munmapping. Possible, but certainly not common, and most people claim the lack of GC as one of the principal advantages of using C in the first place. So I don't think that stands up either.
um... someone does not know about python APIs. What is missing in pyQT, pyGTK? If you're only dealing with linux, then pyKDE? in python 2.5 via ctypes, you can access any C library you want, or using standard, well documented techniques, you can call C from python, or vice versa.
I think the feature set is richer than Java's Swing by a country mile.
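For anyone who hasn't seen ctypes, a minimal sketch of calling a C library function from python. It assumes a POSIX-ish system where the C runtime can be found or is already loaded into the process; the wrapper name `c_strlen` is made up for the example.

```python
import ctypes
import ctypes.util

# Locate and load the C runtime. find_library may return None on some
# systems; on POSIX, CDLL(None) then exposes the symbols already loaded
# into the running process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Illustrative wrapper: length of a byte string as computed by C strlen."""
    return libc.strlen(data)
```

The same three steps (load the library, declare argument and return types, call) apply to any other shared library.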
The other bad thing about Java is that if your program ever needs to use 300 MB, it will *always* use 300 MB forever after.
I am far from being a Java fan, but... if you use 300 MB in a C program, it will *always* use 300 MB forever after too. free just returns storage to the process heap, for eventual re-use by malloc within the same process.
agree that FP is not the problem. but folks are blowing the parallelism issue way out of proportion by talking about multithreading and such. There are many opportunities for parallelising things that are easy wins.
take something like the gimp.
There are lots of bits of image processing that can be trivially parallelized. All you need to do is tile the image out, and stitch the results together at the end. Data transfer... just make every transfer a separate job/process/program. You really don't need an uber multi-threaded multi-gizmo downloader. You want a single threaded tool that follows the download, and sends update messages, as it progresses, and exits when it's done. A separate program handles posting the messages to the client. Each download has its own, likely blocking, read process. It's far simpler, and trivially intuitive to break many things down into simple tasks communicating asynchronously.
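The tile-and-stitch idea can be sketched in a few lines of python. This is not how the gimp does anything; the image is assumed to be a plain list of rows of 8-bit grey values, and all the function names are invented for the example.

```python
def split_rows(image, tile_height):
    """Cut a list-of-rows image into horizontal tiles."""
    return [image[top:top + tile_height]
            for top in range(0, len(image), tile_height)]


def invert_tile(tile):
    """Example per-tile job: invert 8-bit grey values."""
    return [[255 - pixel for pixel in row] for row in tile]


def process_tiled(image, tile_height, job, run_map=map):
    """Apply job to each tile via run_map, then stitch the tiles back together.

    run_map must return results in input order (the builtin map does, and so
    does multiprocessing.Pool.map), so stitching is plain concatenation.
    """
    stitched = []
    for result in run_map(job, split_rows(image, tile_height)):
        stitched.extend(result)
    return stitched
```

With the default `run_map` this runs serially; passing the `map` method of a `multiprocessing.Pool` spreads the tiles over the available cores without touching the per-tile code. This works as-is for per-pixel operations; filters that look at neighbouring pixels would need tiles that overlap at the edges.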
completely agree. FP is really cool and I don't want to take anything away from it, but it is completely orthogonal to programming for parallelism. For parallelism, you can use any language, taking to heart the use of multiple, simple programs in place of the monolithic code hairballs people are used to.
It's not a language thing, all you need are a good approach and a method of doing message passing. If you want the dumbest possible message passing & queueing, just put files in directories. Have used files to do asynchronous message passing at rates of several hundred messages per second, in python...
Each channel on which messages are received is a process. The receiving process decides what tasks need to be done, and "queues" (hard links the message into an input directory) it for downstream actions. This operation is daisy chained as needed. Each "queue" reader is a separate process. We run with a few dozen receivers, a few hundred senders, and half a dozen filters. so a single application with a few hundred tasks, that scales as we add load (additional load is additional senders or receivers.)
each process is just a normal python program, individually quite simple. It's multi-core ready, and dead simple. Testing of smaller programs is easier too. Parallelism is not a problem, it makes life easier, provided you approach it properly.
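A minimal sketch of the hard-link queueing described above. This is not the actual code from our application; `enqueue` and `drain` are invented names, and the directory layout is an assumption.

```python
import os


def enqueue(message_path, queue_dir):
    """Queue a message for a downstream reader by hard-linking it."""
    os.makedirs(queue_dir, exist_ok=True)
    target = os.path.join(queue_dir, os.path.basename(message_path))
    os.link(message_path, target)   # no data is copied, only a new name
    return target


def drain(queue_dir, handle):
    """One pass of a queue reader: handle each entry, then remove it."""
    handled = 0
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        with open(path, "rb") as entry:
            handle(entry.read())
        os.unlink(path)             # drops this queue's link only
        handled += 1
    return handled
```

Each reader process would call `drain` on its own directory in a loop. Since every queue holds its own link to the message, one reader finishing and unlinking does not disturb the others, and the data goes away when the last link does.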
In the 90's in supercomputing, you had no end of research into better compilers to make distributed memory go away. It turned out that, no matter what one tried to do, stuff went a lot faster when folks programmed to the architecture/hardware, explicitly understanding that memory was distributed, than trying to use tools to hide it.
The availability of multiple CPUs is a fundamental aspect of commodity architectures. If you try to let software people pretend it isn't there, you will get slow, crappy code.
people really do need to embrace parallelism, but folks are making it a lot harder than it needs to be. Multi-core just gives us all kinds of flexibility that we never had before. as a straightforward example, the efficiency arguments of monolithic vs. microkernel suddenly shift: running services with IPC now becomes a lot more exciting.
At the application level, people need to stop thinking about single programs. It's only natural because people were trained that way, but for the next generation, it will seem odd. Individual programs will be simpler. Forget multi-threading, don't try to build large single applications. This is right up the linux/unix tool based alley. Lots of communicating sequential processes (apologies to Hoare) with individually simple elements, and each component being easy to test individually.
Build algorithms with loosely coupled, asynchronous communications, and simple components. People get themselves tied into horrible knots because they try to get serialism out of multi-processing, and in doing so, they kill the very thing they are looking for: performance.
No, you don't need a single log file, 200 log files + dsh & grep will do. No, you do not want to serialize writes into queues, just give the queue entries hashed indices, so that inserts don't clash. Etc... It's a bit of a mind flip, but the water is fine.
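One way the hashed-index idea could look, sketched in python. Using a SHA-256 of the payload as the index and a write-then-rename step are assumptions of this example, not a description of any particular system; `entry_name` and `insert` are made-up names.

```python
import hashlib
import os


def entry_name(payload: bytes) -> str:
    """Derive the queue index from the content, not from a shared counter."""
    return hashlib.sha256(payload).hexdigest()


def insert(queue_dir: str, payload: bytes) -> str:
    """Insert without locking: write privately, then rename into place."""
    os.makedirs(queue_dir, exist_ok=True)
    final = os.path.join(queue_dir, entry_name(payload))
    scratch = f"{final}.{os.getpid()}.tmp"
    with open(scratch, "wb") as out:
        out.write(payload)
    os.replace(scratch, final)      # atomic on POSIX within one filesystem
    return final
```

Writers never coordinate: two different messages get different names, so they cannot clash. Identical payloads collapse into one entry, which this sketch assumes is acceptable, and a reader would need to skip names ending in `.tmp`.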
google the author, and dig the person out, then
send him/her a cheque or paypal him/her. Send $5 or something, for the use of the pirate copy. That will get the author some compensation (likely more than a normal royalty would pay per copy), and get you ethical access to the book.
Even better, tell the publisher... you paid the author because the publisher is not distributing the work. Publisher then has a choice:
They can sue you and the author, and try
to make the case that what you're doing is illegal... (how do they prove harm? they don't publish it?) my guess about that option is: Streisand Effect. They widely publicize an arrangement that undermines their business model. Not too smart...
Or publisher can consider it marketing data, and try to figure out how to make nothing go "out of print" (have everything available.)
Depending on the details of the author's deal, the last option might be to cut the publisher out completely, and the author might be able to sell copies from a web site. You might be able to convince him to sell them on Lulu, eBay or something.
People are always coming out of the woodwork to claim supercomputer performance with such and such
a solution. go back and look at GRAPE (which is really cool.)
http://arstechnica.com/news.ars/post/20061212-8408.html
or a lot of other supercomputer clusters. When you want something flexible, you look for "balance":
that means a good relationship between memory capacity, latency & bandwidth, as well as compute power.
in terms of memory capacity, the number people talk about is:
1 byte/flop... that is, 1 Tbyte of memory is about right to keep 1 TFLOP flexibly useful.
this thing has 4 GB of memory for 4 TF... in other words:
1 byte / 1000 flops.
it's going to be hard to use in a general purpose way.
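The arithmetic behind that ratio, spelled out as a small python sketch. The function name is made up, and decimal units (1 GB = 10^9 bytes, 1 TFLOP = 10^12 flops per second) are assumed.

```python
def bytes_per_flop(memory_bytes, flops_per_second):
    """Memory balance: bytes of memory per flop/s of peak compute."""
    return memory_bytes / flops_per_second


balanced = bytes_per_flop(1e12, 1e12)   # rule of thumb: 1 Tbyte for 1 TFLOP
this_box = bytes_per_flop(4e9, 4e12)    # 4 GB of memory for 4 TF

# How far this machine falls short of the rule of thumb.
shortfall = balanced / this_box
```

So the machine carries about a thousandth of the memory that the 1 byte/flop rule of thumb would call for.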
This also brings the issue of how dose one get experience.
I take my experience once a day, every day.
If I skip a day, I might take two, but that can be dangerous. When bad things happen, I take, like ten. After a while I get habituated.
Thanks... I was fascinated at apple having Andrew File System on by default. That is a really cool, theoretically campus-area network file system.
I tried it about ten years ago, and it was a bear
to set up back then and not very stable on linux (it was a UNIX thing back then.)
Anybody used it recently?
Plus, the NSA would probably shit a brick if the Pres had a Blackberry since every BES packet flows through a foreign country.
ahem, going through Canada would mean NSA would be allowed to intercept it. NSA isn't allowed to intercept domestic communications. They usually ask the Brits to spy on US soil
http://news.bbc.co.uk/2/hi/uk_news/politics/3488548.stm
This gives them a short cut. If 20% of Americans know in their gut that Obama is Muslim (and all Muslims are fire breathing terrorist dragons), they need to keep an eye... err... ear on him.
I upgraded my myth server last week and my laptop this morning. It works very well for me. nothing crashes. everything works better than before. got wobbly windows (KDE 4.1) which seemed pointless to me, but I actually really like it. looking forward to 9.04 having the cube and cylinder effects.
my 945G based laptop runs fine under kubuntu 8.10 using KDE 4.1 with fully wobbly windows. I tried glxgears and got 736.569 FPS, which doesn't wow me, but it seems to work OK.
a worn gear in a mechanical voting machine could do the same. A human could 'mis-count'...
the solution is always the same: multiple counts should be routine. Ideally, another computer system could OCR the printed receipts... now, sure, the bad people can modify two systems, but it's starting to get complicated.
this month's cheque bounced. they got raided. nobody else is going to skip a payment...
http://metpx.sf.net/
Why not simply make voting a public auction? I'm voting for Obama. There. Done.
there, fixed that for you...