r6144 · Slashdot Mirror

The compiler can't do all micro-optimizations on Programming As If Performance Mattered · 2004-05-05 21:22 · Score: 2, Informative

Some good coding habits help the compiler do its job better, and also result in clearer (or at least no uglier) code.

Example 1: in C, if you use "int" for a variable "x" that should have a type of "unsigned", "x/4" will not be just a simple shift; instead, three or four instructions are involved, because signed division has to round toward zero for negative values. Indeed, it would be very hard for the compiler to infer that "x" is always non-negative and optimize for you, except in the simplest cases.
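
A minimal C sketch of Example 1. The function names are made up for this illustration, and the hand-written variant assumes that right-shifting a negative int is an arithmetic shift, which C leaves implementation-defined; it only shows roughly what extra work the signed case needs, not what any particular compiler emits.

```c
/* Signed division rounds toward zero, so a bare arithmetic shift is
   not enough: -7 / 4 is -1, while -7 >> 2 is typically -2. */
int div4_signed(int x) { return x / 4; }

/* Unsigned division by a power of two is exactly a logical shift. */
unsigned div4_unsigned(unsigned x) { return x / 4; }

/* The signed case written out by hand: negative inputs get a bias
   of 3 before the shift so that the result rounds toward zero.
   Assumption: >> on a negative int is an arithmetic shift. */
int div4_signed_by_hand(int x) {
    int bias = (x < 0) ? 3 : 0;
    return (x + bias) >> 2;
}
```

Comparing the assembly for div4_signed and div4_unsigned (for example with "gcc -O2 -S") should show the extra instructions in the signed version.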

Example 2: in floating-point math, "divide by 10" is not exactly the same as "multiply by 0.1", thus many compilers (gcc 3.4 without "-ffast-math", icc8 by default, and probably the Java VM) won't optimize the former into the latter, even in the many cases where it won't matter. This results in code that is 10-40 times slower on the P4.
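
A small C sketch of why the two forms in Example 2 are not interchangeable. The function names are hypothetical, and the comparison assumes plain IEEE-754 double arithmetic (for example x86-64 with SSE, compiled without "-ffast-math").

```c
/* IEEE-754 division: the result is the correctly rounded quotient. */
double div_by_10(double x) { return x / 10.0; }

/* 0.1 has no exact binary representation, so this multiplies by a
   slightly different constant and may round differently. */
double mul_by_tenth(double x) { return x * 0.1; }

/* Counts the integers in 1..n for which the two forms disagree. */
int count_mismatches(int n) {
    int count = 0;
    for (int i = 1; i <= n; i++) {
        if (div_by_10((double)i) != mul_by_tenth((double)i))
            count++;
    }
    return count;
}
```

With x = 3.0 the two functions return different doubles, which is exactly why a strict compiler may not swap one for the other on its own.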

Example 3: in Haskell, since lazy evaluation has much more overhead than eager evaluation, compilers always try to optimize the former into the latter. However, in many cases it is impossible for the compiler to do that, since it can't decide if using eager evaluation will prevent the evaluation from terminating.

In short, it is good to rely on the compiler for optimizations that are known to be done well (such as register allocation), but what the compiler can do is very limited, since (1) it can't know your intent if you have not expressed it, so (for example) it has to make sure that every floating-point operation conforms to very stringent error bounds, often at the cost of significant speed, even if you don't really care about that; and (2) some code-optimization problems take an exorbitant amount of time to solve, or might even be theoretically infeasible in general. Therefore, when writing code that is going to take significant CPU time, it is good to have some habits that help the compiler, as long as the code isn't uglified too much.

Why can't they do it the old/natural way? on MSNBC Looks At Patent Abusers' Victims · 2004-05-03 03:25 · Score: 2, Insightful

Before patents were available, people still invented new stuff; they just tried to keep things secret. Then someone decided that patents were better than making people keep things secret, so we have patents now. Things have changed so much that I think we should reconsider the problem: is having patents better than letting people keep things secret? Patents may be a good idea, but they are very hard to implement correctly --- it is hard to decide what is patentable, hard to specify a good term for patents in each field (having the same term for fields developing at wildly different paces is way sub-optimal), hard to decide what is obvious, and hard to search for prior art and existing patents.

In the medical scanner scenario you have mentioned, the company might as well sell the machine after making the buyer sign a pen-and-paper agreement --- no reverse-engineering, no disclosure, etc., just like today's NDAs. Granted, there is a bit of trouble involved, but it should be feasible for most things covered by "reasonable" patents today, such as a medical scanner or some new medicine, which cost quite a bit already. If the company still finds it too much trouble, it may choose to do nothing and hope no one will reverse-engineer its product too quickly (like the companies that don't want to spend money on applying for patents); otherwise I think it is a fair price for the company to pay in order to hoard its knowledge --- if your stuff is good enough, people will still buy it even with the trouble of signing papers. Anyway, the situation is quite similar to that with patents, except that everyone knows what they are allowed to do, and no one will get stabbed in the back.

You still need self-discipline... on New & Revolutionary Debugging Techniques? · 2004-05-02 17:11 · Score: 2, Insightful

C (and especially C++) are sufficiently good languages in the hands of those who know how to program cleanly (for example, they know why returning a pointer to an automatic variable is bad in C, and why you need to define copy constructors, or make the destructor virtual, for certain classes in C++) --- just look at the many well-written projects in C; you rarely hear the core developers screaming that the language is painful to use. A good compiler helps by giving warnings about certain constructs, but some of the more subtle kinds of mistakes are very hard for a compiler to detect.

In high-level languages, you usually don't have memory-allocation or buffer-overflow problems, but quite often there are other traps. In Perl, numerous gotchas are mentioned in the manual. In Python, inexperienced developers often make shallow copies of lists when deep copies are needed. In Lisp, beginners often accidentally modify quoted lists in program sources, and they may write macros that capture variables. In Haskell, hastily-written programs may leak memory because of incorrect handling of laziness. I can't quickly think of an OCaml example, but at least it is easy to get hard-to-find typing errors during compile time if you are not careful... As for Java, I bet lots of beginners write applets that lock up randomly because they are not well aware of AWT/Swing threading issues.

All these, like memory problems in C/C++, are avoidable if the gotchas of the language are well taught and learnt --- and indeed they are mentioned in most books about the language. However, if people happen to forget one of these, it will lead to very hard-to-find bugs. So in this respect, you need self-discipline when programming with present-day languages, even high-level ones.

A problem with functional languages is that they are quite hard to learn (which also makes them interesting if you like computer science). One has to read quite a number of CS papers to use Haskell well (otherwise one will see cryptic type errors when trying to do anything advanced, or after doing anything wrong). C is much easier in this respect, and even C++/Perl aren't that hard --- they are just complex.

What about watchpoints? on New & Revolutionary Debugging Techniques? · 2004-05-02 16:41 · Score: 1

I do not know about the best commercial tools of the 1980s. However, I didn't see many debuggers supporting watchpoints (stop when some data gets changed) before the late 90s, and you didn't mention them explicitly, so I hope someone can tell me when watchpoints became commercially available.

Code cleanups on New & Revolutionary Debugging Techniques? · 2004-05-02 16:23 · Score: 1

When writing experimental programs during research (that is, the actual algorithm is not planned beforehand, but adjusted in a trial-and-error way), I often need to clean up the code after a lot of algorithmic changes, for example by removing code and data structures that I suspect no longer have any effect. I want to make sure that the results are exactly the same after the change, so I log the execution path in detail (needed for ordinary debugging anyway --- there is too much data for an interactive debugger), and compare the 10+MB log files from before and after the change using "cmp", or by eye if there are floating-point issues.

I think this can be seen as a kind of relative debugging. It can be used during algorithmic optimization as well. Of course, a version-control program such as Subversion is important when using such techniques.
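
The log-and-compare idea can be sketched in C roughly as follows. The record format, the function names and the file names in the usage note are assumptions made for this illustration; in practice "cmp" itself does the comparison.

```c
#include <stdio.h>

/* Append one trace record. "%.17g" prints a double with enough
   digits to round-trip, so identical values give identical text. */
void trace_step(FILE *log, int iteration, double value) {
    fprintf(log, "iter=%d value=%.17g\n", iteration, value);
}

/* Byte-for-byte comparison in the spirit of "cmp".
   Returns 1 if the files are identical, 0 if they differ,
   and -1 if either file cannot be opened. */
int logs_identical(const char *path_a, const char *path_b) {
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;
    if (a == NULL || b == NULL) {
        result = -1;
    } else {
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {
                result = 0;
                break;
            }
        } while (ca != EOF);
    }
    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return result;
}
```

One would call trace_step at every interesting point of the algorithm, run the program before and after the cleanup into two hypothetical files such as trace_before.log and trace_after.log, and then check them with logs_identical or simply "cmp trace_before.log trace_after.log".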

Cursor issue on Blender 2.33 Re-enables Game Engine · 2004-05-01 04:31 · Score: 2, Insightful

As another poster has pointed out, this is possibly caused by bad interactions between new pretty X cursors, video drivers (i845 for me) and Blender. Just try adding "Xcursor.core: true" to ~/.Xresources, reload it using xrdb (or restart X), and see if it gets better (the mouse cursor would return to the good-old black-and-white style though).

Something is better kept secret on JPEG Patent Could Impact The Gimp · 2004-04-24 15:19 · Score: 1

In many cases (for example, software) selfish inventors do have a choice between patenting something and keeping it secret for their own profit, so patents really do not give much more incentive to inventors --- inventors could still make money before patents existed --- and the rights of other people can be sacrificed, since the inventor will still keep things secret if that is better for him, and when he does patent them, the patents may give him more profit, which often means the public has fewer rights. Therefore, I can't see how abolishing patents in such fields can be bad --- if the inventor of LZW keeps it secret and uses it in closed-source software which he sells, the situation is no different from the old DOS days, when there were tons of closed-source compression programs with undocumented algorithms. Who cares --- there are only so many ways to do such a thing, and good methods will be discovered by other people in time.

s/JPEGs/Dynamic pages on Monitor Linux Performance With The Tools At Hand · 2004-04-20 00:52 · Score: 1

You just tested the pipe size rather than the box's 1337-ness.

OT: Haskell on Apocalypse 12 From Larry Wall · 2004-04-17 20:29 · Score: 1

Well, if you want an innovative language (rather than a pragmatic one), Haskell should definitely not be left out. Granted, it may be a little hard to use if you just want to write real-life programs (quite a lot of computer-science stuff is involved just for mutable state), but its interestingness beats OCaml and even Lisp if you are getting tired of 20 slightly different imperative languages. The language is also well supported in the free software community.

Of course, I'm not disparaging Larry's work here. Perl is a good and pragmatic language, and I'm glad to find it getting rid of the historical ugly parts.

Always drivers... on Linux 2.6.5 is Released · 2004-04-04 03:28 · Score: 3, Interesting

Well, I'm running 2.6.x on two of my machines now, and they are running mostly perfectly (user-mode-linux doesn't work well for me yet, as of 2.6.3). Anyway, I did have a (very old) machine on which 2.4 kernels failed to detect the network card correctly even after tons of isapnp tweaking, so I had to downgrade the kernel to 2.2 after upgrading RH7.0 to RH7.3.

Such things depend mostly on luck, since obviously it is the drivers that are problematic, and some hardware is owned by few kernel hackers, so hard-to-fix bugs in its drivers can take a long time to fix, while it is reasonable for Linus et al. to start flagging the kernel as "stable" if it works on 50~75% of the machines.

It seems that there are more hardware companies than excellent kernel hackers for many operating systems (maybe even Windows), so driver quality will always be a problem on any OS for a long time to come...

No, it is because no one optimizes the code on Coding The Future Linux Desktop [updated] · 2004-03-17 04:46 · Score: 2

When writing in C/C++ with appropriate libraries (such as glib/Qt), the straightforward solution is usually reasonably fast at the machine code level (of course the fastest algorithm may still be non-straightforward, but this is the same for most imperative languages, and functional languages seem to be at a disadvantage here); in my experience it is usually within 50% of the optimum speed, since most C programmers know what the machine will actually be doing. However, when coding in a high-level language, since low-level details get glossed over, many programs written in the straightforward way are much, much slower than the optimum (just look at the numerous Java tutorials that make some perfectly innocent-looking code 10x faster after some optimization work --- and often the original code is better in style). Benchmarks often show that Java is as fast as C, but this holds only for properly optimized code. In many projects, there is constant development, and optimization is usually done as an afterthought when the program has become too slow for a significant percentage of the users, which means that most of the code doesn't get much opportunity to be optimized. With C-alikes, the best-looking code produced by competent programmers (who are quite abundant) can still perform reasonably without such optimization (assuming the algorithm is good), while good-looking Java/Python code may well be far from optimal in performance.

I still think that much UI code does need to be programmed in a performance-conscious way. Many users have rather slow machines, and we constantly hear cries of "THIS IS SO SLOW!", almost as often as from the number-crunching scientists. Of course, for in-house or rapid-prototyping work which is used only by a few people on dedicated machines, it may not be worth the effort, but for programs ready to be used by millions, making them 10% faster can be as valuable as making LAPACK 10% faster.

Also, C isn't that bad. Of course you need to keep track of all this memory, and I don't deny that Java/Python do make me more comfortable, but I have to say that typing "g_malloc()" etc. doesn't take much thought and allows my brain to relax a bit while my hands do the mechanical job (by contrast, it does drain brain cells to write Java code that doesn't allocate three times as much memory as actually needed, or OCaml code that is as functional as possible. Debugging mixed-language code is much harder --- and what if your favorite language binding doesn't work perfectly?). Debugging memory problems can be a bit hairy, but with modern tools it is much more tolerable, and anyway subtle algorithmic bugs can also be very hard to track down. As for threading problems, well, if you want to do threading and don't want to go purely functional, you pretty much have to cope with them no matter what language you use.

Disclaimer: I think I know about more computer languages than most, and I do use high-level languages in many tasks and enjoy their benefits. I just think the anti-C crowd has gone a bit too far.

We may need more complete site directories on In Google We Trust · 2004-03-14 03:19 · Score: 5, Interesting

Indeed, searching (whether on the Web or in IEEE journals and similar academic things) is useful when you just want to have a basic idea about something popular, but it is easy to miss things this way, probably because others use a different wording or a different spelling, or simply because the actual authors are not the ones naming their ideas. For example, you will probably not get Newton's original super-groundbreaking article on Newton's laws just by searching for "Newton's laws" on any search engine, except through trees of citations :) When doing academic research, if we want completeness (for example to look for some new ideas) we can at least browse the contents of all recent issues of the journals of interest, but there is no such thing on the web. Google Directory is an opportunity to make things more complete for those who really need the completeness, but it is currently woefully incomplete.

Currently many interesting sites, such as wikipedia, everything2 and groklaw, are spread by word of mouth (mostly on slashdot :) Surely many people have taken the pains to collect sets of links that are hopefully quite complete at the time of writing (which is much harder than simple googling), but such pages usually show up only in obscure places on google. Maybe the community can invent some way to make an easy-to-use distributed link-list service where everyone can easily share the results of their searching efforts.

Matlab/Octave are useful but not cure-for-all on Mono Poises to Take Over the Linux Desktop · 2004-03-12 05:31 · Score: 1

I have considered all your suggestions, but none of them helps in the cases where I did use C.

Matlab/Octave are of course used extensively, usually during the "exploration" phase. However, many algorithms aren't as simple as a couple of SVDs (indeed, if I could find such a simple-and-beautiful algorithm in a new problem field, assuming one exists, I would be instantly famous --- such things are usually beyond my ability), and Matlab-like apps become slow when manual loops are involved, so they are often only useful for showing that an algorithm "is not terribly broken and might have promise", after which a C implementation is written to actually evaluate its performance. Also, such algorithms are usually quite heuristic, so they need tweaking regardless of what they are written in, and certainly won't look good on paper until some good results are obtained and I have some time to clean the thing up.

Someone said that I can use a hybrid approach, using high-level languages for non-speed-critical stuff. Well, this is a valid methodology in many user-oriented applications that contain a lot of GUI-related or other non-performance-critical code. In my research programs, we usually have a pretty flat profile, with only 50% or less of the code replaceable with slower, higher-level languages, and such parts are usually the easiest parts (argument parsing, data reading/writing, etc.) that aren't modified often and don't take a significant percentage of the time to write, even in C. The core algorithm can be several hundred lines long (it is the kind with a bit of heuristics and some other messy stuff, not something solvable by several SVDs which can be handled well by Matlab-like stuff), and gprof says that most of the lines take significant time to execute --- many functions take over 5% of the time in the profile (the top one is at 15~20%), and most take over 1%, so there isn't much room to slow that down.

Another problem is that factoring out the speed-critical part to C adds some complexity, which can be significant considering the whole program isn't big anyhow. I have written such glue code, and although this isn't difficult, it does take some time and care. Sometimes it isn't at all clear what to factor out and how --- if this is done carelessly one ends up with a super-messy mixture of Your-favorite-language and C, and a lot of time is spent in passing large amounts of data around. What's worse, when something goes wrong in the glue code, it is usually harder to debug.

Not just C geeks... on Mono Poises to Take Over the Linux Desktop · 2004-03-12 01:49 · Score: 1

Last week I wrote a 2-D DWT program in C. It has a buffer which is a "double **buf[ORDER][ORDER][2]" (should be a pretty natural solution I think), so I end up with pretty constructs like "buf[0][0][0][0][0] = 0.0;"
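
A sketch of how such a buffer might be declared and filled. The value of ORDER, the helper names and the plane dimensions are assumptions for illustration only, not taken from the actual DWT program.

```c
#include <stdlib.h>

#define ORDER 2  /* hypothetical; the real value is not given above */

/* An ORDER x ORDER x 2 array whose elements are each a "double **",
   i.e. a pointer to an array of row pointers (one 2-D plane). */
double **buf[ORDER][ORDER][2];

/* Allocate one zero-filled rows x cols plane; NULL on failure. */
double **alloc_plane(size_t rows, size_t cols) {
    double **plane = malloc(rows * sizeof *plane);
    if (plane == NULL)
        return NULL;
    for (size_t r = 0; r < rows; r++) {
        plane[r] = calloc(cols, sizeof **plane);
        if (plane[r] == NULL) {
            /* Undo the rows allocated so far. */
            while (r > 0)
                free(plane[--r]);
            free(plane);
            return NULL;
        }
    }
    return plane;
}

/* Free a plane obtained from alloc_plane. */
void free_plane(double **plane, size_t rows) {
    if (plane == NULL)
        return;
    for (size_t r = 0; r < rows; r++)
        free(plane[r]);
    free(plane);
}
```

After "buf[0][0][0] = alloc_plane(rows, cols);" the first three subscripts pick the plane and the last two pick the row and column, which is where the five-subscript expression comes from.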

In another research program I wrote, pointers-to-arrays are used so much that anyone who can understand that (no, the program isn't hard-to-read) can teach that part of C, which is IMHO the most difficult part in my C class.

No, from an EE student on Mono Poises to Take Over the Linux Desktop · 2004-03-12 01:42 · Score: 2, Informative

Almost every time I tried to program something useful in a high-level language, I ended up having to do some signal processing or other computation-intensive stuff in it, and I regretted not using C, since it really helps if the program finishes in 30s instead of 2min. Actually, the languages I used were even the "fast" ones --- OCaml (compiled to assembly), Scheme (Bigloo, compiled to C), Lisp (CMUCL, with good type tagging), Java... I would not want to think about what would have happened had I written the thing in some even slower language such as Perl/Ruby! Trust me, the features of high-level languages such as Scheme are of some use in my programs, but writing in straight C isn't that painful either, and certainly endurable if that makes my program 2-4x faster.

Maybe you mean the applications used by sysadmins or business managers --- such applications are usually I/O or database-bound, so it is good to use a high-level language; but not everyone writes code for such purposes.

People here still prefer Midnight Commander... on A Look at the Upcoming GNOME 2.6 · 2004-03-08 17:18 · Score: 1

The reviewer said that he drag-n-dropped for 30 minutes to reorganize his huge home directory? Seems really tiring to me...

After all these years I still prefer Midnight Commander (two-window) style interfaces. Every useful feature of modern file managers is available --- you can change directories by typing TAB-completed "cd" commands, (most often) by Ctrl-S, which resembles incremental search or type-ahead-find, or by arrowing and selecting, or by using the history (Meta-P, Meta-N). Selections are made individually with the Insert key, or with wildcards. Moving files and directories is done with one function key. It is also possible to type commands which can refer to the selected files. By the way, it takes up little memory --- you can open as many xterms containing it as you are comfortable with (I usually use about three or four; of course sometimes I run out of workspaces, but then "screen" comes to the rescue) on a 32MB machine.

MC is something loved by command-line freaks like me, but it isn't exactly hard to use. My mom (who is hardly a geek) uses Windows Commander in WinXP (which is quite similar to MC), and so do most of her peers, all without any form of advocacy or special training.

In short, if you don't like Nautilus or other Windows-Explorer-like interfaces, give MC a try. It can almost be called an innovation, except that it actually has a rather long history.

On a side note, another thing that I, as someone who uses linux for Real Work (TM), can't live without is Links, a text-mode browser. It is great for writing java apps when there are so many libraries with API documents to read, since "screen" in an xterm, when used correctly, still feels better than a tabbed galeon window, and is definitely less resource hungry.

I don't deny there are quite a few sore spots in MC and Links (e.g. sometimes MC says "you are already running a command", and then I have to do C-o, C-c, C-o, M-P), but they are like crashing bugs in MS Word --- you hate the bug but you still can't live without the application. Anyway, the bugs are not crashing bugs that eat files, so it is quite possible to live with them. Also, it doesn't look as good (or run as snappily) in anything other than a vanilla Xterm with the VGA font and white text on a black background, nor does it have good i18n support (which is related to the fonts problem), but I hope someone (or myself, if I have time) will get to fix that.

It helps just a little on Ease Into Subversion From CVS · 2004-03-08 00:46 · Score: 4, Interesting

I have used Subversion in quite a few (small, mostly one-man) research projects during the last six months. Before then I used RCS/CVS. Subversion does make me somewhat more comfortable, and I have little to complain about it, which means I probably won't ever look back.

However, if there were no free software like Subversion, I would rather make do with CVS than use non-free stuff, even if someone else paid for it. For example, CVS does not have atomic commits, so I used tags instead (ironic since CVS does tagging quite slowly, but still acceptable for one-man projects). Other weak points of CVS can also be worked around. It isn't pretty, but not THAT painful either. Actually, before I discovered RCS, I just did version control manually by saving a tarball after each day's work, which is tedious but still bearable.

Of course, for large projects, version control is much more important.

Having open-source drivers helps the hardware... on Intel to Increase Linux Support, Release Centrino Drivers · 2004-02-20 02:32 · Score: 3, Informative

When I bought this new computer in Nov 2003, my options were basically (1) all intel + integrated graphics or nvidia (2) all AMD + nvidia graphics card --- since older ATI cards with full open-source drivers are hard to obtain here, I will not consider them. I chose intel+i845G because it is well supported under linux without all those closed-source driver hassles, although it is quite a bit more expensive than an AMD solution, and the 3D performance advantage of a low-end Geforce4 versus i845G (whose performance is about the state of the art five years ago according to my experience) would be somewhat useful to me. Now, seeing all those people having trouble with nvidia drivers (even though they are probably the best closed-source drivers around), especially those tinkering with new kernels (I am one), I think I have made the right decision.

Therefore, I think the availability of open-source drivers should help hardware sales quite a bit, in that people like me are willing to accept a somewhat worse price-to-performance ratio for an open-source (and therefore well-supported) driver. Considering that more and more people are trying to install linux on their desktops, and most distributions are unlikely to include proprietary drivers anytime soon, closed-source drivers will be a significant minus for people planning to install linux on the system.

Don't underestimate the value of having the drivers open-sourced, Intel...

The brute-force way on 4 Years Later, The Mozilla Tide Has Turned · 2004-02-11 18:38 · Score: 1

1. Start your program.
2. Make your program sleep. For things like mozilla you can just hide/minimize the window, and a SIGSTOP will always work.
3. Run some memory hog until the system is constantly swapping. Your program should be mostly swapped out.
4. Stop the memory hog, use your program for a little while, then look at the RSS (resident set size) of your process.

With enough swap (which most people can have if they want), the only thing that matters is the working set size, which seems to be impossible to measure directly on Linux without kernel modifications.
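
For the last step, the RSS can be read with "ps" or "top", or programmatically. Here is a minimal C sketch that parses the "VmRSS:" line of a /proc/<pid>/status-style file on Linux; the function name is made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Returns the VmRSS value in kB from a /proc/<pid>/status-style
   file, or -1 if the file cannot be read or has no VmRSS line. */
long read_rss_kb(const char *status_path) {
    FILE *f = fopen(status_path, "r");
    char line[256];
    long rss_kb = -1;

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            /* The number follows the label after some whitespace. */
            if (sscanf(line + 6, "%ld", &rss_kb) != 1)
                rss_kb = -1;
            break;
        }
    }
    fclose(f);
    return rss_kb;
}
```

Calling it with "/proc/self/status" reports the calling process; for another process, substitute its pid in the path.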

Well, "throws Exception" on How C# Was Made · 2004-02-07 15:15 · Score: 1

If you don't want to handle the checked exception mess, just use "throws Exception" in every method. Of course, you can use more specific classes if the method isn't supposed to throw anything else, according to the interface.

In this way I can avoid handling most exceptions when I don't really know how. Certain APIs constrain the exceptions you may throw (for example EJBException only), in which case I have to do some chaining. Catching exceptions without doing anything, or with only a stacktrace, seriously hampers debugging, so I never do that.

I think C#'s approach, where every method implicitly "throws Exception", is practical in most cases. Catching exceptions when I do not really want to is both cumbersome and dirty, and declaring "throws Exception" in every method looks a bit funny.

Look out for drive failure on Meet Linux Kernel 2.6.2, 'Feisty Dunnart' · 2004-02-04 04:09 · Score: 1

Check /var/log/messages for DMA errors.

When some blocks of my hard drive failed, it would retry for a long time. During such time anything attempting to access the drive locks up, the drive makes a half-second noise every 5 seconds (the HD LED is always on), and a lot of DMA errors show up in the kernel logs.

I can't explain the enlarging fd number though.

Re:ACL? on Meet Linux Kernel 2.6.2, 'Feisty Dunnart' · 2004-02-04 02:31 · Score: 4, Informative

It has been supported in the vanilla kernel for quite some time now, on ext3 (IIRC xfs is supported too).

Note that you need to add the mount option "acl" for the ext3 filesystem. It is documented in the latest tune2fs manpage. Then you can use "setfacl" (the version in RH9 is usable) to set the ACL like this:

a@foo$ touch goose.c
a@foo$ chmod go-r goose.c
a@foo$ setfacl -m 'b:r--' goose.c

The user named "b" can now read goose.c.

Anything broken? Otherwise why upgrade? on Meet Linux Kernel 2.6.2, 'Feisty Dunnart' · 2004-02-04 00:55 · Score: 4, Insightful

I don't think many people will find upgrading to a stable release of the kernel interesting. For those who upgrade often: was anything broken for you (including security holes of course, but there doesn't seem to be anything serious recently)? If not, why do you upgrade to a stable release without significant new features?

Personally I upgraded from 2.6.0-test11 to 2.6.1-rc3 in order to fix the famous local security exploit. User-mode linux still doesn't work well, but since the 2.6.0-test3 version of the virtual machine mostly works on 2.6.1 hosts (newer umls don't work), I decided to ignore the problem for now. Unfortunately the SMTP server of my mail provider has trouble contacting lists.sourceforge.net, so I can't even submit a bug report :(

It's useful independently... think Gtk# on Mono 0.30 Released · 2004-02-03 14:08 · Score: 3, Insightful

It is already quite comfortable to write Gtk/Gnome programs with Mono. Great for those finding Gtk programs in C too verbose or manual free'ing too cumbersome.

The only thing holding me back is the debugger which did not work well last time I tried (just usable, frequent lockups). Seems that it has been fixed, I'll give it a try...

Make a simpler format for humans on MusicXML DTD Hits 1.0; Browser Support Next? · 2004-01-27 20:49 · Score: 1

We can have it both ways, if we use an XML format for tools and a more concise format (such as moo2midi) for manual editing, and write a good converter between them. Of course, making the conversion work bidirectionally means some difficulty in designing the text format.

Comments · 410