Slashdot Mirror


Is Profiling Useless in Today's World?

rngadam writes "gprof doesn't work in Linux multithreaded programs without a workaround that doesn't work that well. It seems that if you want to use profiling, you have to look for alternatives or agree with Red Hat's Ulrich Drepper that "gprof is useless in today's world"... Is profiling useless? How do you profile your programs? Is the lack of good profiling tools under Linux leading us into a world of bloated applications and killing Linux adoption by embedded developers? Or will the adoption of a LinuxThreads replacement solve our problems?"

35 of 221 comments

  1. Profiling Again? by stirfry714 · · Score: 4, Funny

    Why can't my code be judged by the content of its characters, and not by the color of its extension?

    Down with profiling! :)

  2. Profiling is Useful by Anonymous Coward · · Score: 5, Insightful

    Maybe gprof, as an implementation, might not be useful. But profiling, especially under Java, can make a world of difference to an application.

    Saying "profiling isn't useful" is similar to saying "having information isn't useful".

    That's just dumb.

    1. Re:Profiling is Useful by Anonymous Coward · · Score: 3, Funny

      Most, if not all, /. readers have never written a line of code more involved than 10 print "hello". They spend their time trying new Enlightenment and GNOME themes and rebooting into Windows 98 to play games and post to Slashdot (since they can't figure out how to configure pppd).

    2. Re:Profiling is Useful by anonymous_wombat · · Score: 5, Informative
      In single threaded programs, just one type of profiling needs to be done, the kind that standard profiling tools measure. In multi-threaded programs, the relative execution times of the various threads may be more important. The first thing to do is to figure out which threads are using most of the resources. After this is done, and any optimizations made, the old-style profiling and optimizing of slow methods is just as important as ever. If your program is spending 80% of its time sorting, then optimize your sorting code.

      Of course, for many applications, multi-threading achieves the vast majority of the speed increase, and profiling will only be of marginal utility. The profiler is just one tool of many, and is not a silver bullet.
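      A minimal sketch of the first step described above (finding out which threads use the CPU), assuming Python 3.7+ where time.thread_time() reports per-thread CPU time. The workloads busy() and idle() and the helper names are hypothetical, chosen only for illustration.

```python
import threading
import time


def busy():
    """CPU-bound stand-in for a thread that does real work."""
    total = 0
    for i in range(500_000):
        total += i * i
    return total


def idle():
    """Stand-in for a thread that mostly waits (sleeping burns no CPU)."""
    time.sleep(0.1)


def measure(name, work, results):
    # time.thread_time() counts CPU time of the *calling* thread only,
    # so it has to be read from inside the thread being measured.
    start = time.thread_time()
    work()
    results[name] = time.thread_time() - start


def cpu_by_thread(tasks):
    """Run each task in its own thread; return CPU seconds per task name."""
    results = {}
    threads = [
        threading.Thread(target=measure, args=(name, work, results))
        for name, work in tasks.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    usage = cpu_by_thread({"busy": busy, "idle": idle})
    # Heaviest thread first: that is the one worth profiling in detail.
    for name, seconds in sorted(usage.items(), key=lambda kv: -kv[1]):
        print(f"{name:>5}: {seconds:.4f} s CPU")
```

      Once the heaviest thread is known, ordinary function-level profiling can be pointed at just that thread's work.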

  3. OProfile by mmontour · · Score: 5, Informative

    Take a look at OProfile. It's quite a nice tool, although it's not a direct replacement for gprof. From their 'About' page:

    OProfile is a system-wide profiler for Linux x86 systems, capable of profiling all running code at low overhead. OProfile is released under the GNU GPL.

    It consists of a kernel module and a daemon for collecting sample data, and several post-profiling tools for turning data into information.

    OProfile leverages the hardware performance counters of the CPU to enable profiling of a wide variety of interesting statistics, which can also be used for basic time-spent profiling. All code is profiled: hardware and software interrupt handlers, kernel modules, the kernel, shared libraries, and applications (the only exception being the oprofile interrupt handler itself).

  4. Hell, yes it's useful by PissingInTheWind · · Score: 3, Insightful
    Maybe the problem with today's profilers is that the compiler implementors spend too much time making a compiler that is going to try to optimize everything by itself, which then might not even get the best code in that case.

    What could be more useful is if the compiler implementor would spend as much time on the profiler as on the compiler: you would then be able to easily see faulty parts in your software and be able to determine what needs to be optimized.

    Good profilers would mean efficient code. Don't think profilers are useless because most implementations of them suck.

    --

    A message from the system administrator: 'I've upped my priority. Now up yours.'
    1. Re:Hell, yes it's useful by maxwell demon · · Score: 4, Insightful
      While improved profilers would surely be useful, don't think optimizing compilers are useless.
      • Hand-optimized code tends to be less clear and less readable. Also, it makes it easy for new bugs to creep in.
      • Hand-optimized code would be machine-specific. While it would work on other machines, it would be dog slow there. So you'd basically be back to per-architecture versions of your program.
      • Some optimizations cannot be done by the programmer, because they occur at levels below the language. For example, the POWER architecture has a multiply+add instruction. Most common programming languages don't have a multiply+add command. So how would you optimize the use of that instruction?
      • Hand-optimization at the level the compiler does it could even hinder hand-optimization in the area where it is most effective and the compiler cannot do it at all: algorithmic optimization. To do that efficiently, you need highly structured code so you can exchange algorithms easily. However, micro-optimizations of the sort the compiler does tend to destroy such structures.
      However, with the compilers getting more sophisticated in optimization, profilers get even more important: While you may be able to add some "profiling instructions" for your own use, profiler-driven optimization in the compiler cannot use such a replacement.
      --
      The Tao of math: The numbers you can count are not the real numbers.
  5. Profiling will always be useful by Wesley Everest · · Score: 5, Informative
    I work as a game developer, and we have to make sure that everything that is done for each frame takes less than 33ms. So we're always profiling our code to cram more functionality into a limited amount of time.

    But even if you aren't doing something that is speed intensive like games, you always have tradeoffs when you choose your data structures and algorithms. Generally you first code up the easiest algorithm that you think will use an acceptable amount of memory and CPU time. Then, later, if something is too slow, you have to identify where the problem is. It could be that you chose an O(N^2) algorithm not realizing that N might be 1,000 instead of the max of 100 you were counting on, forcing you to switch to an O(NlogN) algorithm that is more complex.
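    To make the N = 100 versus N = 1,000 scenario concrete, here is a small sketch. The task (duplicate detection) and all function names are assumptions picked for illustration; the post does not say which algorithm was involved.

```python
def has_duplicates_quadratic(items):
    """Easiest version: compare every pair, O(N^2) comparisons."""
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_sorted(items):
    """Sort first, then compare neighbours: O(N log N) overall."""
    ordered = sorted(items)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


def pair_count(n):
    """Worst-case number of comparisons made by the quadratic version."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Ten times more input means roughly a hundred times more pairs.
    for n in (100, 1000):
        print(f"N={n}: {pair_count(n)} pairs in the worst case")
    data = list(range(1000))  # no duplicates: worst case for both versions
    print(has_duplicates_quadratic(data), has_duplicates_sorted(data))
```

    Both functions give the same answers; only the growth rate differs, which is why the problem stays invisible until N grows past what was planned for.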

    Now, if it is a small application, you might have enough familiarity with the code to be able to guess where the problem is -- then you fix it and see if it is still slow. If that works, then you're set and profiling isn't necessary. But if the fix doesn't speed it up enough, then you're stuck. You have to profile it somehow.

    You might try simple tricks like changing the code to loop on a suspected bit of code 100 times and see how much longer it takes. Or maybe throw in some printf's that spit out the current time at different points. Or maybe create your own profiling code that you manually call in functions you want to time. Or, you might use an actual profiler without modifications to the code. But lacking a profiler doesn't mean you can't or won't profile your code.
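    The "run it 100 times" trick and the "print the time at different points" trick might look like this in Python. This is only a sketch: suspect() is a hypothetical stand-in for whatever code is under suspicion.

```python
import time


def time_repeated(func, repeats=100):
    """Call func `repeats` times and return the total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return time.perf_counter() - start


def checkpoint(label, since):
    """Poor man's printf timing: print the seconds elapsed since `since`."""
    print(f"{label}: {time.perf_counter() - since:.4f} s")


def suspect():
    """Hypothetical piece of code we think might be slow."""
    return sorted(range(1000, 0, -1))


if __name__ == "__main__":
    t0 = time.perf_counter()
    elapsed = time_repeated(suspect, repeats=100)
    checkpoint("after 100 runs of suspect()", t0)
    print(f"average per run: {elapsed / 100 * 1e3:.3f} ms")
```

    Repeating the call smooths out timer resolution and noise, at the cost of measuring warm-cache behaviour rather than a single cold run.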

    And even with CPU speed doubling every couple of years or so, that doesn't mean speed is no longer an issue. You can easily choose the wrong algorithm and have something take 1000s of times longer to run than the proper algorithm.

  6. I used gprof by Zo0ok · · Score: 3, Informative

    I used gprof quite a lot during my Master's thesis work this spring. gprof tells you which functions consume the most CPU time, and those functions can then be optimised. Usually very small parts of the code consume most of the CPU time.

    This program was parallelised at the network level - all clients were single-threaded. If someone has multithreaded for performance (to utilize more than one CPU) I suppose gprof will still work well on a single-CPU machine with just one thread.

    For programs that consume lots of CPU time for well-defined computations it should not be hard to profile a single-threaded version (a single-threaded version is needed for debugging anyway).

    More complex applications (for example a web browser) I imagine are more dependent on multi-threading, and should pose a larger problem.

    gprof is probably not dead - if you need it you can adapt the program...

  7. Programmers, not tools by sane? · · Score: 4, Insightful
    The problem is not that certain tools have issues; but rather that today's programmers have no interest in creating efficient code.

    Those of us that started programming in 1k and sub-megahertz can really feel the time taken by badly coded applications. We know that forgetting what is happening on the silicon can kill how well our code will run.

    However, those who started coding after ~1987 don't really have a gut feeling for it. To them the latest processor will make up for their bad coding. To a certain extent they are right. Today's advances STILL keep up with Moore's law, still make up for their lack of skill. However, when one looks at what is actually performed with all that power, one tends to question why we are paying so much, for so little.

    Can you actually say that MS WordXP is much better than the non-WYSIWYG wordprocessor of yesteryear (itself a blast from the past) ?

    We don't need profilers, we need coders who have that tacit knowledge of what really counts, where they should put real effort.

    Unfortunately that doesn't come in a software box.

    1. Re:Programmers, not tools by Malc · · Score: 3, Insightful

      You talk such twaddle. Why waste your time trying to write efficient code from the start? It's much better to write easily understandable, easily maintained, quickly written and minimal bug code. Unless you have a real need, such as with an embedded system. If the code doesn't perform well enough, then come back and optimise it later. It's a matter of where you want your efficiencies: memory and CPU utilisation, or development process. The latter is cheaper for a business, and so long as the product works on hardware that the users have, then the former is a waste of time and money.

      I used WordPerfect 5.0 (or whatever it was) on a dual 360K 5.25" floppy disk drive machine. Plain blue text screen only. I have to say, I *much* prefer Word XP. If given the choice, I would not go back to those crappy DOS days.

      By all means, be sentimental and reminisce about the old days. But things have changed - accept it.

  8. Here's how I profile my code.... by shayne321 · · Score: 3, Funny
    User: This program is slow
    Me: Really? Which part?
    User: When I click the "report" icon
    Me: Oh (tinkers with report code). Try it now.
    User: It's still slow
    Me: (shakes BOFH excuse 8-ball) Hrmm, must be interference from sunspots, try it again tomorrow

    :)

    --
    Today I didn't even have to use my AK; I got to say it was a good day -- Icecube
  9. Re:I don't know... by Wesley Everest · · Score: 3, Informative
    The flaw in your argument is that only a small portion of the code takes most of the time. If you spend a lot of time on upfront design instead of profiling, much of your effort will be wasted. 90% of the time you spend making your code fast should be spent on the 10% of the code that takes 90% of the CPU time. If you spread that out, you'll do a lot of unnecessary work speeding up code that rarely runs and have less time to optimize the code that is running most of the time.
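    The 90/10 argument can be put in numbers with Amdahl's law. The sketch below assumes, purely for illustration, that a given amount of effort makes whichever part you work on twice as fast; the 2x figure is not from the post.

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: whole-program speedup when `fraction` of the run time
    is made `local_speedup` times faster and the rest is left alone."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


if __name__ == "__main__":
    # Doubling the speed of the hot code (90% of run time): about 1.82x overall.
    hot = overall_speedup(0.90, 2.0)
    # Doubling the speed of everything else (10% of run time): about 1.05x.
    cold = overall_speedup(0.10, 2.0)
    print(f"optimize the hot 90%: {hot:.2f}x overall")
    print(f"optimize the cold 10%: {cold:.2f}x overall")
```

    Even an infinitely fast version of the cold 10% could not beat 1/0.9, roughly 1.11x, which is the point about effort spent on code that rarely runs.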

    You could argue that with good up front design, you'll know in advance what 10% of the code to focus on, but I don't think that works that well in practice. At best, you're making educated guesses about where bottlenecks will appear, and you'll be wrong some of the time -- requiring profiling at that point.

    And lacking tools doesn't mean you can't or won't profile -- it just means you'll have to do more work to profile the code.

  10. Not useless by pthisis · · Score: 5, Insightful

    Profiling in general certainly isn't useless. I'll usually write new code primarily in a high-level, high-productivity language (e.g. Python), and if it's too slow I'll profile it and rewrite applicable parts in C. Some projects require a lower level (C) approach from the start, though those are pretty rare. Without profiling you'll spend a lot of time optimizing code that isn't a bottleneck.

    Remember the words of Knuth: "Premature optimization is the root of all evil." Without profiling, you don't know what optimization is really needed and what isn't.
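    As a rough sketch of the "profile first, then rewrite the hot parts" workflow, Python's standard cProfile and pstats modules can show where the time goes. The functions cheap_setup(), slow_sum_of_squares() and main() are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats


def cheap_setup():
    """Hypothetical code that looks important but costs almost nothing."""
    return list(range(10))


def slow_sum_of_squares(n):
    """Hypothetical hot spot: a pure-Python loop, a candidate for C."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    cheap_setup()
    return slow_sum_of_squares(200_000)


def profile_report(func, top=5):
    """Run func under cProfile and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(top)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_report(main))
```

    Only the functions near the top of the report are worth rewriting in a lower-level language; the rest can stay in the high-productivity one.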

    That said...
    BEGIN RANT
    I've used gprof successfully with plenty of recent code. It works perfectly fine in non-threaded code, which _should_ be the majority (99%+) of code out there. Yes, that includes big network servers (the last one I wrote just recently passed the 6 billion requests served mark without blinking). Threads are a really nasty programming rathole that should be applied in a limited way; they take much of the time and effort spent developing protected memory OSes and toss it out the window. They also tend to encourage highly synchronized executions instead of decoupled execution, which often makes things both slower and more bug-prone (locking issues are _tough_ to get right when they become more than 1-level) and slower to implement than a well-designed multiprocess solution with an appropriate I/O paradigm. Just because two popular platforms (Windows and Java) make good non-threaded programming difficult doesn't mean you should cave in.
    END RANT

    --
    rage, rage against the dying of the light
    1. Re:Not useless by pthisis · · Score: 3, Interesting

      I'd think you appreciate the quote on threads attributed to Alan Cox on Larry McVoy's page.

      (The quote in question is:
      "A computer is a state machine. Threads are for people who can't program state machines." Alan Cox)

      Except I'd assert that threads are far harder to program correctly than state machines. Easier concept at first, and easy to come up with a design for the 90% solution, but the devil's in the details and threads have a ton of details. Not to say that state machines don't, but they seem to cause fewer problems in practice.

      Sumner

      --
      rage, rage against the dying of the light
  11. VTune and Quantify by Codex The Sloth · · Score: 4, Informative

    If you want tree profiling (i.e. information about function and child performance) then Rational Quantify is a reasonable alternative to the crap profiler that comes with MSDev.

    If you want a flat profiler or need to analyze the cost of specific low level operations then you MUST get Intel VTune.

    --
    I am not a number! I am a man! And don't you ... oh wait, I'm #93427. Ha ha! In your face #93428!
  12. Re:I don't know... by pthisis · · Score: 5, Insightful

    You could argue that with good up front design, you'll know in advance what 10% of the code to focus on, but I don't think that works that well in practice. At best, you're making educated guesses about where bottlenecks will appear

    And a lot of smart people, from Knuth and Kernighan to Linus and Guido, will freely admit that predicting what to optimize is nearly impossible. Even people at that level of programming prowess are often surprised by where the bottlenecks appear (and where they don't appear). You certainly want to design for flexible optimization from the start, but you'll often discover that the stupid O(n) scan you put in is good enough for now and that you'd better optimize the I/O system before you think about replacing it with a tree or hash table or whatever.

    Sumner

    --
    rage, rage against the dying of the light
  13. ACE has the answer by Ricdude · · Score: 3, Informative
    There is a simple profiling capability in the ACE toolkit, the ACE_Profile_Timer. Easy to wrap in a class with basic Start, Stop, and Elapsed methods. If you can guess what function or two the bulk of your program's time is being spent in, this can help pinpoint the worst offenders within that section of code. If not, create several timers, and time each function in your main loop, and print the information after the loop is finished. Drill down into subfunctions as needed. See where the milliseconds tick away. You might be surprised.
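    ACE_Profile_Timer itself is a C++ class; the following is only a Python analogue of the wrapper idea (start, stop, elapsed) and of timing each function in the main loop. It does not use or reproduce the ACE API, and the step functions are hypothetical.

```python
import time


class ProfileTimer:
    """Accumulating stopwatch with start(), stop() and elapsed()."""

    def __init__(self):
        self._started_at = None
        self._total = 0.0

    def start(self):
        self._started_at = time.perf_counter()

    def stop(self):
        if self._started_at is None:
            raise RuntimeError("stop() called before start()")
        self._total += time.perf_counter() - self._started_at
        self._started_at = None

    def elapsed(self):
        """Total seconds accumulated over all start/stop pairs."""
        return self._total


def slow_step():
    time.sleep(0.01)  # stand-in for an expensive function


def fast_step():
    pass  # stand-in for a cheap function


def time_main_loop(steps, iterations):
    """Give each step its own timer, run the loop, return seconds per step."""
    timers = {name: ProfileTimer() for name in steps}
    for _ in range(iterations):
        for name, func in steps.items():
            timers[name].start()
            func()
            timers[name].stop()
    return {name: timer.elapsed() for name, timer in timers.items()}


if __name__ == "__main__":
    totals = time_main_loop({"slow": slow_step, "fast": fast_step}, iterations=5)
    # Print only after the loop has finished, worst offender first.
    for name, seconds in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {seconds * 1e3:.2f} ms")
```

    To drill down, wrap the subfunctions of the worst offender in their own timers and repeat.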

    And remember, in the immortal words of Michael Abrash, "Assume Nothing. Measure the improvements. If you don't measure, you're just guessing."

    --
    How's my programming? Call 1-800-DEV-NULL
  14. Re:There is no question that profiling is necessar by pthisis · · Score: 5, Insightful

    But, the bottom line is that if you don't profile your code (and unit test it, and integration test it, and...), you are not writing good code.

    That's hardly true. Certainly you shouldn't waste time optimizing code until you know where the bottlenecks are. But in a lot of cases--I'd even venture to say most cases--code gets written and is fast enough. In such cases, profiling is a waste of time. Profiling is only indicated if there's a legitimate performance problem.

    To a lesser extent, the same is true of unit testing and integration testing. If you're writing some code to convert one image to a GIF and you run it successfully to get the GIF, there's no reason to unit test. Even if the code has horrible bugs on some inputs, the job is done. One-off code isn't (unfortunately) uncommon. Prototype code is also very common and often you don't need to do extensive testing on it, either. Any code where the total cost of code failure is lower than the cost of QA probably doesn't need to be QA'd (which is not to say that you should spend an amount on QA equal to the failure cost; if spending $1000 on QA reduces the chance of failure by 99.999% and spending $1000000 reduces the chance of failure by 99.9999%, the $1000 expenditure suffices in all but the most demanding applications)

    Sumner

    --
    rage, rage against the dying of the light
  15. Quantify! by ptomblin · · Score: 3, Insightful

    I've solved some important real-world problems using Quantify and Purify, especially when dealing with a huge system with a lot of developers' fingers in the pie. One of the programs was handling 100,000+ transactions a day, and Quantify helped shave enough off so we didn't have to force all of our customers to upgrade their hardware.

    Faced with a similar problem in Linux, I'd probably port the program to Solaris, Quantify it there, and hope the results are similar under Linux.

    --
    The next Cmdr Taco duplicate will be ready soon, but subscribers can beat the rush and see it early!
  16. Re:So threads are evil -- now what? by pthisis · · Score: 5, Insightful

    Okay, so let's say threads are evil.

    Okay.

    But processes as provided by current operating systems are too expensive to use.

    No, they aren't. Have you measured fork() speeds under Linux vs. pthread_create() speeds? Sure, Windows and Solaris blow at process creation (and Windows doesn't have a reasonable fork() alternative--it conflates fork() and exec() into CreateProcess*()), but that doesn't make all OSes brain-dead.

    If I have a network server (e.g. a httpd) that has to create a process for each network request, it will never scale.

    Right. And if you create a new thread for each network request, you'll never scale--give it a try some time. Good servers that use a thread/process for every connection do so with pre-fork()'d/pre-pthread_create()'d/whatever pools. Apache, for instance, uses multiple processes (but no multithreading, except in some builds of 2.x) but pre-forks a pool of them. This is really basic stuff, even an introductory threading book will talk about pooling and other server designs.
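    A toy version of the pre-forked pool idea, sketched in Python on a POSIX system: the parent binds one listening socket, forks the workers up front, and every worker blocks in accept() on the shared socket. This is not how Apache is implemented; the Unix-domain socket, the helper names and the SIGKILL shutdown are all simplifying assumptions.

```python
import os
import signal
import socket
import tempfile


def worker_loop(listener):
    """Child process: accept connections forever and answer each one."""
    while True:
        conn, _ = listener.accept()
        with conn:
            data = conn.recv(1024)
            # Reply with our pid so the caller can see which worker answered.
            conn.sendall(b"%d:%s" % (os.getpid(), data.upper()))


def start_prefork_pool(num_workers):
    """Bind one listening socket, then fork all workers before any request."""
    path = os.path.join(tempfile.mkdtemp(), "pool.sock")
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(16)
    pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child: inherits the listening socket and serves until killed.
            try:
                worker_loop(listener)
            finally:
                os._exit(0)
        pids.append(pid)
    return path, listener, pids


def request(path, payload):
    """Client side: one short-lived connection per request."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.settimeout(5.0)
        s.connect(path)
        s.sendall(payload)
        return s.recv(1024)


def stop_pool(path, listener, pids):
    """Tear the pool down. SIGKILL keeps the sketch short; a real server
    would ask its workers to finish their current request first."""
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    listener.close()
    os.unlink(path)
    os.rmdir(os.path.dirname(path))


if __name__ == "__main__":
    path, listener, pids = start_prefork_pool(2)
    try:
        for _ in range(4):
            print(request(path, b"hello"))
    finally:
        stop_pool(path, listener, pids)
```

    No fork() happens on the request path: the cost of creating processes is paid once at startup, which is the whole point of pooling.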

    Really scalable/fast implementations don't even do that. They use just one process (or one per CPU) and multiplex the I/O with something like select, poll, queued realtime signals (Linux), I/O completion ports (NT), /dev/poll (Solaris), /dev/epoll, signal-per-fd, kqueues (FreeBSD), etc. (select and poll don't scale well to 10s of thousands of connections when most are idle, but some of the others are highly scalable). See e.g. Dan Kegel's c10k page for specifics.
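    A minimal sketch of the single-process, multiplexed style, using Python's standard selectors module (DefaultSelector picks epoll, kqueue, poll or select depending on the platform). socketpair() stands in for real network connections so the example is self-contained; the function names are hypothetical.

```python
import selectors
import socket


def serve_ready(sel):
    """One pass of the event loop: answer every socket that has data waiting."""
    handled = 0
    for key, _events in sel.select(timeout=1.0):
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            conn.sendall(data.upper())
            handled += 1
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
    return handled


def demo(num_clients=3):
    """Register several connections, let only some of them speak."""
    sel = selectors.DefaultSelector()
    clients, servers = [], []
    for _ in range(num_clients):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        clients.append(client)
        servers.append(server_side)
    # The middle connection stays idle and costs the loop nothing.
    clients[0].sendall(b"ping")
    clients[-1].sendall(b"pong")
    handled = serve_ready(sel)
    replies = [clients[0].recv(1024), clients[-1].recv(1024)]
    sel.close()
    for sock in clients + servers:
        sock.close()
    return handled, replies


if __name__ == "__main__":
    print(demo())
```

    One thread of control serves every connection; the kernel reports which sockets are ready, so idle connections need neither a thread nor a process.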

    Obviously, the OS needs to change, and give us something (maybe a hybrid between processes and threads) that more closely meets applications' needs

    http://www-124.ibm.com/pthreads/ proposes an M:N threading model and offers an implementation, but it still has the shared memory problems of threads. multiprocessing may not be sexy but it's really a lot cleaner for most problems and can be more efficient in a lot of domains.

    Sumner

    --
    rage, rage against the dying of the light
  17. write event driven programs; threads for CPU work by Splork · · Score: 4, Informative

    minimize the use of threads whenever possible. write your code in an event driven fashion as your friendly AC suggested. the poll() system call [superior to select(), though select() works well within its fixed size filedescriptor array limits] makes this possible.

    the basic mentality to switch from threads to event programming is this: anytime you're using a thread solely so that it can sit around and block on high latency events (network or disk I/O) most of its lifetime, it should not be a thread.

    it's acceptable to have worker threads/processes that you hand computational tasks to and they trigger an event in your event loop when they hand a result back, but don't use threads of execution to manage your state. you'll pull your hair out and still have a nonfunctional program.

  18. working code, not pipe dreams by Splork · · Score: 3, Interesting

    i'll always choose a program that exists and works with a good user interface over one that is never released because the author(s) thought it could be faster.

    listen to your profiler. everything else lies.

  19. Re:gprof far from useless by tuxlove · · Score: 4, Insightful

    But in practice, multithreaded programs are almost always interactive, and thus are primarily limited by user response times,

    I would disagree with this wholeheartedly. What about databases like Oracle, MS SQL Server, and so on? They're internally multithreaded, and most definitely not "interactive" after you initiate a SQL query.

    I believe apache 2.0 is threaded. HTTP by nature is not interactive. And so on. There are many other examples, left as an exercise to the reader.

    While it is true that threads are very useful for interactive programs, in fact critical, their use does not stop there by a longshot. Any program which needs to do two things at once without fear of blocking on a system call is a candidate for threads. Threads are also useful for distributing compute cycles over multiple processors within a single process, allowing it to gain the benefit of concurrency.

    The project I'm currently working on is a custom database application, and without threads it would be useless. And there are no users talking to it directly, that's for sure.

    reducing the amount of input required from the user will always pay off better than any optimizations.

    I find this perplexing. Nobody cares about optimizing a user dialog. Reducing user input or optimization of user input code would serve little purpose in most multithreaded applications I'm aware of. Generally, interactive multithreaded programs use threads so they can interact with users while simultaneously performing some other task that shouldn't be stalled by waiting for user input. For example, a network monitor might have three threads: one for watching network traffic, one for resolving IP addresses to hostnames, and one for taking user input. It doesn't matter how long the user input thread sits around waiting for the user to type/click something. There are two other threads working away in the meantime, watching traffic and displaying it for the user, oblivious to whether or not the user is doing anything. In such a case as this, profiling the watcher/resolver threads might be very useful indeed, since they need to be more or less realtime.

    This gprof problem is a serious issue, and minimizing it by saying that threaded programs generally wouldn't benefit from profiling is naive.

  20. Re:There is no question that profiling is necessar by pthisis · · Score: 3, Interesting

    However, "fast enough" is a really bad metric to use. Yes, utility "X" is fast enough. But oh, I didn't realize it was going to be used in conjunction with utility "Y" and "Z". Now, everything is really slow. Hey, can you say Microsoft?

    Hey, I need this report on my desk every morning. It takes 3 hours to run. Let's kick it off every night at midnight.

    Fast enough, even though a well-coded, well-designed implementation might take seconds to run. And mission critical. No point wasting programmer time speeding it up when we can do another project with big upside instead.

    This sort of thing is not uncommon at all.

    Sumner

    --
    rage, rage against the dying of the light
  21. Re:'pstack' on Solaris by WolfWithoutAClause · · Score: 3, Informative
    Yes, the company I work for used this technique to build up gprof style call trace information on a huge embedded, persistent, realtime, multitasking, concurrent system we built (yes it is/was horrible ;-).

    Anyway, we ran the equivalent of pstack at frequent intervals (like once per millisecond) and then collected the addresses of all functions in the call tree present each time we polled the system. Got a humongous file. Then postprocessed the file to record which functions called what other functions, and how often and looked up the addresses in the symbol table to give usable names.

    It turns out that polling the system like that usually gives all the important information you could want- it tends to show not the most called functions but the heaviest users of the processor because they are much more likely to be running when the pstack happens- the number of times they will appear is proportional to the total time they run for, statistically. And the technique is minimally invasive and doesn't require recompilation of the code under test.
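    The same statistical idea can be sketched in a few lines of Python: a background thread periodically grabs the target thread's call stack and counts which functions appear on it. This relies on sys._current_frames(), which is CPython-specific, and it is only an analogue of the pstack approach described here, not the poster's tool. hot(), cold() and workload() are hypothetical.

```python
import collections
import sys
import threading
import time


def sample_stacks(thread_id, interval, counts, stop_event):
    """Poll one thread's stack until told to stop, counting each function
    once per sample in which it appears anywhere on the stack."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(thread_id)
        seen = set()
        while frame is not None:
            seen.add(frame.f_code.co_name)
            frame = frame.f_back
        counts.update(seen)
        time.sleep(interval)


def profile_by_sampling(func, interval=0.001):
    """Run func in the current thread while a sampler thread watches it."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), interval, counts, stop_event),
    )
    sampler.start()
    try:
        func()
    finally:
        stop_event.set()
        sampler.join()
    return counts


def spin(seconds):
    """Burn CPU for roughly the given wall-clock time."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def hot():
    spin(0.2)


def cold():
    spin(0.02)


def workload():
    hot()
    cold()


if __name__ == "__main__":
    counts = profile_by_sampling(workload)
    for name in ("workload", "hot", "cold"):
        print(f"{name}: seen in {counts[name]} samples")
```

    The code under test is not modified or recompiled; the sample counts are only statistically proportional to time spent, so short-lived functions may not show up at all.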

    Then we printed the summary out in a huge printout, each function sorted by percentage of ticks spent in it, and then spent a week or two staring at it. It showed some amazing things, such as the system spending an order of magnitude longer in certain functions than originally designed, that kind of thing.

    It is really quite a useful technique.

    --

    -WolfWithoutAClause

    "Gravity is only a theory, not a fact!"
  22. Re:So threads are evil -- now what? by pthisis · · Score: 3, Interesting

    That said, thanks for the information, it has certainly helped to clear some things up.
    No problem.

    I guess the key point I want people to remember (if I only clear one thing up...) is that a decision about whether to use threads or processes should be based on whether they want all (or mostly) shared memory, in which case threads are in order, or some protected memory (and possibly some shared) in which case processes are the way to go.

    Windows has hoodwinked people into thinking threads are fast and processes are slow (and that processes have to start new executables), when that's really not the interesting detail and isn't really very true under well-designed operating systems. And you lose a lot by giving up protected memory (even only giving it up wrt other threads in your memory space).
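    The shared-versus-protected distinction can be seen directly with a small POSIX-only Python sketch: a thread's write to a data structure is visible to its creator, while a forked child only changes its own copy. The helper names are hypothetical.

```python
import os
import threading


def bump(state):
    state["value"] += 1


def bump_in_thread(state):
    """Threads share the address space: the caller sees the change."""
    worker = threading.Thread(target=bump, args=(state,))
    worker.start()
    worker.join()


def bump_in_child_process(state):
    """A forked child gets its own (copy-on-write) memory: the change stays
    in the child. Returns the value the child saw, passed via exit status."""
    pid = os.fork()
    if pid == 0:
        bump(state)
        os._exit(state["value"])
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    state = {"value": 0}
    bump_in_thread(state)
    print("after thread:", state["value"])
    child_saw = bump_in_child_process(state)
    print("child saw:", child_saw, "parent still has:", state["value"])
```

    With processes, anything that should be shared has to be shared explicitly (pipes, sockets, shared memory segments), which is exactly the protection being argued for.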

    Sumner

    --
    rage, rage against the dying of the light
  23. Re:There is no question that profiling is necessar by pthisis · · Score: 3

    Um, but...I think there's a confusion of context occurring. The situation you describe happens when you're writing little chunks of one-off code to perform one task and be done with. Usually it'll be used once, or is part of a stopgap "until there's a real solution."

    With testing, that's generally right. If something's going to run often, it can potentially fail a lot of times and so even a small cost of failure will be compounded to the point where QA is worthwhile.

    With performance, that's often not true. There are a lot of jobs that don't need anything approaching "good" performance. Batch reports are one extremely common example (as is other batch processing): I need a web usage report every morning on my desk/in my inbox, so the quick-and-dirty multipass solution that takes 3 hours to run can be scheduled at midnight, and the programmer can then do another project with big ROI instead of spending time writing a faster solution that takes only seconds to run. Many applications fall into that domain, many of them absolutely mission critical and responsible for millions in revenue but also not worth spending time optimizing when it could be better spent testing, adding features, or working on another project entirely.

    And many (I'd say most) interactive applications are fast enough from the get-go and never need optimization. Sure, there are some apps that either do a lot of computation (mp3 players, games, compilers, etc), or are run many times at once (web servers), or are too slow when first run for unknown reasons. But a lot of programs are fine from the start and profiling them is a waste.

    Sumner

    --
    rage, rage against the dying of the light
  24. Re:I don't know... by fermion · · Score: 3, Insightful
    The flaw in your argument is that only a small portion of the code takes most of the time. If you spend a lot of time on upfront design instead of profiling, much of your effort will be wasted.

    Wrong. You design your code as a compromise between factors such as speed, maintenance, reusability, readability, and, most importantly, the resources you are allowed to expend.

    If speed is a critical factor, then you might try to do some predictive profiling using existing principles to make sure the code is fast. Otherwise, you write the best damn code you can, which generally means using good practices to ensure that you don't waste time, and then profile it. Profiling will work best if the code is written in such a way (read: a lot of reusable functions) that allows simple optimization.

    BTW, the biggest wrinkle in this is that programmers' time has become more valuable than clock cycles. We will now waste some clock cycles to save programmers' time, which is why profiling is not nearly as important as it used to be.

    If the code is not written well, and has to be rewritten when the profiler says it sucks, then you wasted your time.

    --
    "She's a scientist and a lesbian. She's not going to let it slide." Orphan Black
  25. OProfile + Prospect by irix · · Score: 4, Informative

    And for getting even more useful information out, try Prospect. It works with OProfile - there was a talk about it at this year's Ottawa Linux Symposium, which you can find in the conference proceedings (gzipped PDF).

    --

    Do you even know anything about perl? -- AC Replying to Tom Christiansen post.
  26. Re:So threads are evil -- now what? by pthisis · · Score: 3, Insightful

    Umm, fork() is the one that's braindead. Who the hell dreamed up a system where creating a new process would copy the entire state of an existing one only to have it wiped out when the other process did an exec()? fork() requires all sorts of nasty stuff (like copy-on write in the VM) that is ditched if the OS follows a process/thread model.

    Uh, COW isn't ditched in a process/thread model. Shared libraries would suck without it. Demand paging of executables wouldn't work without it. It's a fundamentally good thing used by Unix, MacOS X, Windows, and almost all other modern OSes which support protected memory. Definitely not "nasty stuff", and by itself it eliminates 99% of the fork() overhead vs. threads.

    You really want to be able to create a new process with the same state as the existing one, and fork/exec allows that. There's system() if you want an entirely new executable (which might call fork()/exec() or might call spawn(), vfork()/exec(), or whatever...). I don't feel like arguing over whether a spawn()/CreateProcess*()-style syscall is good, but not having a fork()-style syscall is simply braindead. There are things you can do with fork()/exec() that you can't do with spawn() or CreateProcess*(); the reverse isn't true.
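
    A minimal sketch of the fork()/exec() pattern described above, written in Python for brevity (its os module wraps the same syscalls). The point it illustrates is that arbitrary code runs in the child between the two calls; here that code only redirects stdout into a pipe. The function name run_with_stdout_captured is made up for this example.

```python
import os
import sys


def run_with_stdout_captured(argv):
    """Fork, rewire the child's stdout to a pipe, then exec argv.

    Returns (captured_output_bytes, exit_status).
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: still running our own code, so the process state can be
        # adjusted before the new program replaces this one.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)   # stdout now points at the pipe
            os.close(write_fd)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)          # only reached if exec failed
    # Parent: read everything the child writes, then reap it.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)
    return b"".join(chunks), os.WEXITSTATUS(status)


if __name__ == "__main__":
    out, code = run_with_stdout_captured(
        [sys.executable, "-c", "print('hello from the child')"])
    print(out, code)
```

    The same slot between fork() and exec() is where a program would change directory, drop privileges, or close descriptors; the redirect is just the smallest case.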

    Sumner

    --
    rage, rage against the dying of the light
  27. what's the problem? by g4dget · · Score: 4, Interesting
    You say that there is a problem with profiling multithreaded code with gprof. But the issue you point to seems to apply to both single and multithreaded code: Linux gprof doesn't seem to count time spent in system code.

    Now, compute intensive code tends not to spend a lot of time in system calls, so it isn't clear that it matters whether a profiler counts time spent in system calls. I kind of prefer if it doesn't because it doesn't clutter up the profile with I/O delays (which are usually unavoidable).

    If you want to find out where your code is spending time in system calls, you can use "strace -c".

    There are also gcov-like tools that can be used for profiling via code insertion (as opposed to statistical profiling like gprof), although I'm not sure whether PC hardware has the necessary timer support.
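
    To make the distinction concrete, here is a rough sketch of profiling via code insertion, in Python rather than C: every call is counted and timed exactly instead of being sampled. The PROFILE registry and the instrumented decorator are names invented for this example, and time.perf_counter is assumed to be a good enough timer.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical registry: function name -> [call count, total seconds].
PROFILE = defaultdict(lambda: [0, 0.0])


def instrumented(func):
    """Wrap func so that every call is counted and timed."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if func raised.
            entry = PROFILE[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper


@instrumented
def checksum(data):
    """Stand-in workload for the example."""
    return sum(data) % 65521


if __name__ == "__main__":
    for _ in range(3):
        checksum(range(100_000))
    for name, (calls, seconds) in PROFILE.items():
        print(f"{name}: {calls} calls, {seconds:.6f} s total")
```

    The trade-off against a statistical profiler is that call counts are exact, but the wrapper adds overhead to every call, which distorts timings for very small functions.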

    Overall, the answer is: yes, profiling still matters for programs that push the limits of the machine. But fewer programs do. I think most people would be a lot better off not programming in C or C++ at all and not worrying about performance. Too much worry about "efficiency" often results in code that is not only buggy but also quite inefficient: tricks that are fine for optimizing a few inner loops wreak havoc with performance when applied throughout a program. Too much tuning of low-level stuff also causes people to miss opportunities for better data structures and program logic. This is actually an endemic problem in the industry that affects almost all big C/C++ software systems. Desktop software, major servers, and even major parts of the kernel should simply not be written in C/C++ anymore.

    The thing with profiling and optimization is to know when to stop, and few people know that. So, maybe the best thing to say is: "no, profiling doesn't matter anymore". That will keep most people out of trouble, and the few that still need to profile will figure it out themselves.

  28. Re:There is no question that profiling is necessar by pthisis · · Score: 3, Interesting

    If you don't take a cursory run with a profiler on it, you'll never know the real cost of speeding it up.

    Right. It's obviously a cost/benefit tradeoff. If you start the report at midnight and need it at 8:00 in the morning, then if it takes 15 minutes to run you probably don't even want to think about profiling. If it takes 7 hours, it's still fast enough for now but you may want to concern yourself with whether it'll always be fast enough. What's the cutoff? 1 hour? 4 hours? Depends on how crucial the report is and what other projects are on your plate at the moment.

    Obviously "performance problem" is tough to quantify in general, but I still contend that you should normally only profile if there is a potential performance problem (or if you have idle resources, etc). Otherwise, go do some QA. Work on a new project. Clean up the nasty hack you wrote late at night to get it going. Write some documentation. Whatever.

    Sumner

    --
    rage, rage against the dying of the light
  29. No, he's right by Anonymous Brave Guy · · Score: 4, Insightful
    Why waste your time trying to write efficient code from the start? It's much better to write easily understandable, easily maintained, quickly written and minimal bug code.

    Why are these mutually exclusive? There's efficient and there's optimised, and one is a much easier subset of the other.

    He's not claiming that everyone should hand-optimise from the word go. He's saying programmers should have a basic knowledge of their craft. It doesn't take much extra effort to use an efficient sorting algorithm or store data in a fast look-up structure, rather than writing a naff, hand-crafted shuffle sort and using arrays for everything whether they're appropriate or not. And yet, through ignorance or plain laziness, most programmers in most languages take the latter approach. (If you've never seen any of the source code for big name applications/OSes, trust me, it's scary.)
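
    As a small illustration of the "fast look-up structure" point, here is a sketch in Python with made-up function names. Both versions are equally short and equally readable; the only difference is whether each membership test scans a list or hits a hash set.

```python
import timeit


def common_ids_list(wanted, available):
    """Scans the whole list per lookup: O(len(wanted) * len(available))."""
    return [x for x in wanted if x in available]


def common_ids_set(wanted, available):
    """Builds a hash set once; each lookup is then O(1) on average."""
    available_set = set(available)
    return [x for x in wanted if x in available_set]


if __name__ == "__main__":
    wanted = list(range(0, 4000, 2))
    available = list(range(0, 4000, 3))
    assert common_ids_list(wanted, available) == common_ids_set(wanted, available)
    for func in (common_ids_list, common_ids_set):
        seconds = timeit.timeit(lambda: func(wanted, available), number=5)
        print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```

    Exact timings depend on the machine, so none are quoted here, but the gap grows with input size because the list version does quadratic work.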

    Similarly, it is just careless to pass large structures by value unnecessarily in a language that has reference semantics. You have to know the basics of what is efficient use of your tools of choice if you want to write good code, and the old Moore's Law excuse is just a cover for laziness and failure to do the requisite amount of homework.

    Note that, very importantly, none of these things requires more than a small effort. They certainly don't compromise maintainability, bug count or any other relevant metrics, and a competent programmer (if you can find one) will take these things in his stride, and still be faster than the others.

    I used WordPerfect 5.0 (or whatever it was) on a dual 360K 5.25" floppy disk drive machine. Plain blue text screen only. I have to say, I *much* prefer Word XP.

    Interesting... We have just acquired a new P4/2.2GHz with 512MB RAM and running WinXP as a development machine at work. You know what? It's way, way slower than the 1.4GHz P4 running 2000 we already had. And that in turn is way slower than the 1GHz P3 running NT4. This is not subjective, it is based on obvious, objective measures. For example, my new machine (the fastest of the above) sometimes takes 3-4 minutes to act on an OK'd dialog in Control Panel. The NT4 box reacts instantly when you configure the equivalent options. Something is wrong at this point, and I'm betting it's a combination of code bloat and feature creep.

    --
    If you disagree, post your argument. (-1, Overrated) isn't your personal censorship tool for views you don't like.
  30. Re:This will cost me karma... by Znork · · Score: 3, Insightful

    "Some problems are conceptually parallel; it's almost always easiest to write a procedure in a way that mirrors the way it's conceptualized."

    In that case... fork and use IPC. It's not substantially more expensive and you won't have to ensure your parallel code is thread safe.
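
    A rough sketch of what "fork and use IPC" can look like, in Python for brevity, with a pipe standing in for the IPC channel. The function name forked_sum_of_squares and the workload are invented for the example. Each process works on its own copy of the data, so no locking is involved.

```python
import os


def forked_sum_of_squares(numbers):
    """Split the work in two: the child handles the second half and
    sends its partial result back to the parent over a pipe."""
    numbers = list(numbers)
    mid = len(numbers) // 2
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: has a private (copy-on-write) view of `numbers`.
        status = 1
        try:
            os.close(read_fd)
            partial = sum(n * n for n in numbers[mid:])
            os.write(write_fd, str(partial).encode())
            status = 0
        finally:
            os._exit(status)
    # Parent: compute the first half while the child runs.
    os.close(write_fd)
    mine = sum(n * n for n in numbers[:mid])
    data = b""
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        data += chunk
    os.close(read_fd)
    os.waitpid(pid, 0)
    return mine + int(data)


if __name__ == "__main__":
    print(forked_sum_of_squares(range(1_000_000)))
```

    The cost that threads avoid is serializing results across the pipe; for small results like this one it is negligible, but it can matter when the processes need to exchange large amounts of data.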