Ask Slashdot: Why Is It So Hard To Make An Accurate Progress Bar?
hyperorbiter writes "How come after 25 years in the tech industry, someone hasn't worked out how to make accurate progress bars? This migration I'm doing has sat on 'less than a minute' for over 30 minutes. I'm not an engineer; is it really that hard?"
Comment loading ...
even on a small scale
Yes it is "that hard".
I did it all for the penguins!
Yes it is. And to be fair, it's a lot more accurate than Nostradamus ever was.
Things are asynchronous. You wait for things from disk, RAM, user input, over the network, etc. How long it will take is non-deterministic. So a task composed of a bunch of these little pieces will be non-deterministic too.
Because doing so, in most cases, would make the process take a bit more time. The first step would be to assess how much time it should take, and that would generally add to the total time.
Just take a simple recursive file copy, for instance. In order to make an accurate progress bar, you'd first have to recursively go through all the directories and sum the sizes, which would take a bit of time, and you'd be duplicating work.
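A rough sketch of the pre-scan pass described above, in Python. The function names `total_bytes` and `fraction_done` are made up for illustration; the point is only that the size total needs a full walk of the tree before a single byte is copied.

```python
import os

def total_bytes(root):
    """Walk the tree under root and sum file sizes (the pre-scan pass)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; leave it out of the estimate.
                pass
    return total

def fraction_done(copied_bytes, total):
    """Fraction of the work finished, clamped to [0, 1]."""
    if total <= 0:
        return 1.0
    return min(copied_bytes / total, 1.0)
```

The walk touches every directory entry once for the estimate, and the copy then touches them all again, which is the duplicated work the comment is talking about.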
Why is it so hard to prove that NP = P?
One reason is the progress bar starts out as just a generic tool to show that your loading hasn't frozen. At first it is parsed correctly with the elements to be loaded, but as scope increases and more things load, it can get sketchy later on.
Another reason is it is difficult to estimate time left. If you look at some old FTP programs, they'd estimate the rest of the download's time based on how fast the previous part had gone. Future lag, fragmented files, etc. aren't taken into consideration.
There's a bunch more reasons, but namely the progress bar's main purpose is to show you that the whole system isn't locked up, which they've been doing well for the past 30 years or so.
God spoke to me
Also see http://www.popularmechanics.com/technology/how-to/tips/why-the-progress-bar-is-lying-to-you
For, e.g., HTTP downloads, it's easy to make a progress bar. The server sends a Content-Length and every byte received is accounted for. Other problems, like disk I/O, are harder, but not intractable.
I'm willing to bet it's just crappy software.
Copying files. Sure, get a list of the files to be moved, get the size, as files go across, start the % progress meter. What if the network starts slowing down as you start to copy? New files are added. You used a rough calc to get a vague idea as it was 10x faster, but when you start copying, there are a lot of files bigger than you thought. Network's fast, but the end machine you're copying to is having problems keeping up. You start hitting cache, it was fast (and skewed the result) till then, now it's crawling. Installations. All the fun of copying files, but you're updating existing files too, file system may be fragmented, some of the .ini files may need extra work as you get to them. Drivers to install may take longer than expected. Once installed, you have to generate/compile/download extras, and that's more rough guesses.
As long as the hourglass/cursor/spinner is spinning, and the % is going up now and then, that's probably the best you can ask for. The trend for guesstimating time remaining seems to be diminishing, as surely the main thing most people want to know is 'is this still working or has it hung?' For anything else, use logcat, catch stderr, or click 'more details' to find out what it's actually doing.
It COULD be more accurate perhaps, but you'd spend so long working it all out in advance, for 9/10 things, it'd have been quicker to just do it.
Waiting for an amusing sig.
It's very hard to predict how long something will take, particularly in relation to other things, if what you're writing is going to be on any number of platforms with different processors, storage, memory and network situations.
You can be reasonably accurate with it, far more than my favorite 99% in 1 second, the last 1% in one hour scenario. There are cleverer and cleverer ways of making it ever more precise, but those methods usually involve spending time on getting it right, and not many people do it.
The AMD Catalyst installer progress bars are my favorite comedy example. It's jarring that such a high-profile product can have such a hacked solution in place.
I think that I can say it seems like some have gotten better, like large file copy dialogs no longer seem like a suggestion
But I have to agree that more complex operations like OS installations seem to be out to lunch. I can remember years ago joking about "Microsoft Minutes" during what seemed like Windows Marathon installations. I think with the array of processors, HD speeds, SSDs, etc., perhaps we should dump the x minutes remaining and rather look at a percentage completion factor, although this too is not perfect because we all like things to install quickly so we can get on with using the software.
It has come to pass that I sincerely prefer the hourglass (or spinning discus or beachball, or whatever) to seeing the various permutations of horror inflicted on the progress bar.
From serial progress bars that use the same bar, to progress bars that empty again (though the empty-on-uninstall is just brain-twinging, not actually wrong) to progress bars that change function halfway through, I find that I cannot stand the abuses of user interface design that some idiots perpetrate.
For crying out loud, why cannot a simple progress bar actually display some indication of !@#$ing progress for once?!
There's probably a patent for a "method or apparatus for an accurate display of progress", nobody wants to mess with that (but seriously most of my inaccurate progress bars deal with unpredictable things like I/O, or non-uniform sets like loading textures and meshes and animations all together, so who knows how much time it will actually take to process the same amount of data?)
--
Stay tuned for some shock and awe coming right up after these messages!
too many calls to too many black box parts that you have no control or even no telemetry of.
i'm just guessing.
Unfortunately, one does not know the exact running time "a priori" - it varies widely with different hardware configurations, network congestion, hard drive speeds and is therefore often easier to measure than to predict.
The progress bar has always been a "best effort" guess as to the amount of time remaining, I think they have gotten a lot more accurate over the years - but perfection is a long way off I suspect.
There's just too many factors involved. Are you installing something? Well the read/write speed of your hard drive can impact that. Even if you KNEW what the read/write speed was averaging there's no guarantee that it will maintain that speed; if the drive was severely fragmented the head could be jumping all over the place to get the necessary data and write it out. That's just a small sample, I'm sure you'll get more examples.
It's not hard to make an accurate progress bar for file transfers within a filesystem. The problem is that it doesn't really matter for small transfers, and creating an accurate estimate for large transfers of many files would cause a significant delay on its own, as each file would have to be scanned.
Because everyone has a different set of variables.
Making it any more accurate (say by prescanning all files ahead of time) would dramatically increase the amount of time the operation would take.
Comment loading 12%
My team lead often wonders why my own "progress bar" seems occasionally to freeze.
You can make an accurate estimation for a given system. However, everyone else's PC and network environment will be significantly different from that system. Even things that you think might be predictable can take unpredictably long, given the "right" setup.
Of course, sometimes it's just due to bad programming.
Also, oblig xkcd: http://xkcd.com/612/
The computer is able to measure its data throughput, read/write times, etc. Whether programmers actually do this measurement I don't know. But if the computer knows how fast it is reading or writing a disk or transferring over the LAN then there is no reason why it shouldn't be able to make those calculations. Even if the environment changes from task swapping or adding overhead or whatever then the measurement, being dynamic, can be recalculated on the fly and the 'finish time' updated accordingly.
Mandatory Car Analogy: I know that if my speedometer indicates 60 miles/hour, that in one minute I will have travelled one mile. That's predicting the future son!
Progress bars are all about using past history to predict future performance. The problem is that past history doesn't always say anything about what will happen in the future.
If you only use very recent history then you can usually better predict the very near future but it also makes the progress prediction and remaining time prediction very unstable and jump all over.
You're a human, so use your own intuition to predict progress, based in part on what the program tells you and in part on your knowledge of the work involved and the work yet to be done.
It is sometimes hard to make a good progress bar for certain procedures. In such cases, please don't even try. No one wants to see those bars that sit at 0% for three minutes and then jump to 100%. Leave the progress bar out completely or use one of those "infinite" bars that just have a block sliding from left to right, or the Win8 spinning pearls animation.
You know someone is going to take your suggestion literally as a tutorial on how to implement a progress bar - later they'll come back with some mystical crash always happening at 0%.
Patience is a virtue, but haste is my life.
See http://scribblethink.org/Work/kcsest.pdf and http://scribblethink.org/Work/Softestim/softestim.html
(No, I'm not being serious. The topic just reminded me of when I once jokingly justified a poorly estimated ETA on a "simple" development project by referencing the above paper.)
My favorite terrible progress bar was Internet Explorer, back in its early days of essentially being a renamed version of NCSA Mosaic. When attempting to load a site that wasn't available, the progress bar would slowly creep towards complete, despite the server being completely unresponsive. Then after a long while the browser would give up and stop the progress bar. Why on earth would the progress bar move if the server is completely unresponsive? Who programmed this? I would hope that they, like the inventor of Clippy, suffered a terrible death by fire.
It is quite trivial actually.
This one is always perfectly accurate:
zenity --progress --text="Testing..." --title="Test" --auto-close --pulsate &
PID=$!
# Do whatever...
kill "$PID"
Excuse me, but please get off my Pennisetum Clandestinum, eh!
Consider this: Once you've put progress on a bar, you can't take it off. Suppose you start a process that should take 20 minutes, and do the first 5 minutes, progress is now at 25%. But then, partway through, something unexpected happens and you realize the process is actually going to take 40 minutes. You can't take the progress "back" now, that would disorient the user. So you have to rescale the remainder of the bar.
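One way to picture the "rescale the remainder" idea is the Python sketch below. The class name `MonotonicBar` and its methods are invented for this example. Whatever is already on the bar stays there, and the unfilled part is re-mapped onto the newly estimated remaining work.

```python
class MonotonicBar:
    """Progress fraction that never moves backwards when the estimate grows."""

    def __init__(self, estimated_total):
        self.total = estimated_total
        self.shown = 0.0        # fraction currently displayed
        self.base_work = 0.0    # work done at the last re-estimate
        self.base_shown = 0.0   # fraction shown at the last re-estimate

    def update(self, work_done):
        """Map work done since the last re-estimate onto the unfilled part."""
        remaining_work = self.total - self.base_work
        if remaining_work <= 0:
            frac = 1.0
        else:
            frac = (work_done - self.base_work) / remaining_work
        frac = max(0.0, min(frac, 1.0))
        candidate = self.base_shown + (1.0 - self.base_shown) * frac
        self.shown = max(self.shown, candidate)  # never take progress back
        return self.shown

    def reestimate(self, work_done, new_total):
        """Freeze what is on the bar; the rest now stands for the new remainder."""
        self.update(work_done)
        self.base_work = work_done
        self.base_shown = self.shown
        self.total = new_total
```

With the numbers from the comment: a 20 minute estimate shows 25% after 5 minutes. Re-estimating to 40 minutes leaves the bar at 25%, and the remaining 75% of the bar is spread over the remaining 35 minutes, so the bar simply moves slower from then on.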
...that we meet every time.
To meaningfully show progress, you have to do work and plan. This is an anathema in software development these days.
If you are going to have a progress bar that simultaneously shows steady progress through the amount of time required and actually represents real progress, you need to identify similar-duration chunks of actual progress. Without that effort, you stand the chance of the issue seen by the submitter.
Even given that work, sometimes the software executes in an environment with different performance characteristics (e.g. SSD instead of the HDD you tested on, 1G of RAM instead of the 8G on the development system). I suggest that it isn't always easy to represent everything with one bar.
That is still NO EXCUSE for failing to find a better solution. You could have a checklist that checks off TODO items, a series of individual progress bars, or ANYTHING indicating what resources are being used without the end user resorting to opening some diagnostic software.
Progress bars do not make sequences of actions complete any faster. In fact, they make them slower.
That being said, take for example an installer that must perform the following steps during an upgrade:
0. Figure out how many files need to be replaced.
1. Replace 30 files of varying sizes.
2. Add 10 files.
3. Update a half million rows in a table with a million rows setting a column to a computed value based on some predicates.
4. Run a third party installation mechanism (MSM?) for a supporting library, etc.
Modern computers are time-sharing systems. Each process that involves computation is at the mercy of the scheduler in the kernel to give it the cycles it needs to complete. That means that even if you measure the time it takes to complete some process, it's not going to be the same a second time, because the installation process doesn't get undivided attention.
Steps 0 - 2 - you're at the mercy of the IO buses, hard disk, antivirus software interfering, etc.
Step 3 - What shape are the database statistics in? How efficiently can you apply the predicates? What does the distribution of the data look like? You can't tell this ahead of time...
Step 4 - Does this third party installer provide you some sort of metrics as it runs?
These are the sorts of problems to be overcome to do an accurate progress bar. In short, they aren't worth overcoming.
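For what it's worth, the usual cheap compromise for a multi-step job like the five steps listed above is to give each phase a guessed weight and report a weighted sum. The Python sketch below assumes weights that are entirely made up; on a real machine they would be wrong in exactly the ways the comment describes.

```python
def overall_progress(phases, current_index, current_fraction):
    """phases: list of (name, weight); returns overall fraction in [0, 1].

    current_index is the phase now running, current_fraction is how far
    that phase has got (0.0 to 1.0) by whatever measure it can report.
    """
    total = sum(weight for _name, weight in phases)
    done = sum(weight for _name, weight in phases[:current_index])
    current = phases[current_index][1] * current_fraction
    return (done + current) / total

# Hypothetical weights for the installer steps 0-4; pure guesses.
INSTALL_PHASES = [
    ("scan files", 1),
    ("replace 30 files", 3),
    ("add 10 files", 1),
    ("update table rows", 4),
    ("third-party installer", 1),
]
```

Halfway through replacing files this reports (1 + 1.5) / 10 = 25%. A step like the third-party installer that gives no metrics can only report 0 or 1 for its own fraction, so the bar sits still for the whole of that phase.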
But the data it represents. In your case, I assume, it's the estimated time to completion.
Computing something like this is difficult indeed. It's like planning a car route. You can have an estimate of the time it will take, but the arrival time will depend on the traffic conditions on each segment of your route at the moment you travel it.
For over 50 years rocket launch countdowns have not run in a linear fashion, sometimes even being set backwards.
For complex operations, doing accurate progress may take a lot of development time you can't afford. Mostly everyone has deadlines...
I apologize for the lack of a signature.
OS/2 got it right. I once saw an installer fail and the progress bar went backwards as it cleaned up the bad installation.
...would be a PB combined with some way for the app and OS to tell me why things have slowed down, in plain language. This would not be impossible to do.
-- This sentence is false.
I'm convinced that it's just a random number generator. Half the time it gets to 50% in an hour, jumps to 75% in ten seconds, then finishes. Other times it grinds away at a job for the full estimated time. Which changes constantly.
It's just a matter of rounding up/down to the right scale. I usually use a standard approximation equal to 100% floored: While the given set of operations is within the range of 0-99.9999999% completed, the progress indicator is rounded down to "It's not done yet!" so the bar stays empty. Once all the operations are finished, the bar fills up completely and displays "Now it's done! Go treat yourself to a cookie!" Sounds pretty damn accurate, don't you think?
You can work out where you are (% completed) or how fast you are going (rate at which the progress bar is growing), but not both at the same time.
It's simple quantum mechanics.
I am anarch of all I survey.
That's all very true in the world of single tasking. Remember the days of DOS? When a file transfer said it would take 10 minutes, it took 10 minutes, dammit!
But once you enter the world of multitasking, your program has no idea what slice of the CPU pie it's going to get in the future. And surprise, in every modern OS, those file transfer time estimates tend to be significantly off.
There's no -1 for "I don't get it."
Unknown Unknowns
http://www.youtube.com/watch?v=NUuzxjwXVXE
I think the only reasonable response here is two-fold:
1. Obviously, the problem with perfect progress bars is that one cannot predict the future behavior of nondeterministic systems -- and the ever-increasing abstraction of software is only making this harder. If you're complaining about poor estimates of OS-level (or worse, network-level) operations, there's so much going on that there's simply no hope without a psychic to provide supervised training.
2. In fact, I write (non-graphical) progress meters all the time that do a damned good job. However, I am usually just estimating the amount of time to completion of a computation that is broken into chunks of similar difficulty on a system whose behavior is pretty consistent (in-memory operations on a virtualized machine). Most often (total number of steps - current step count)/(current step count) * (time passed so far) is completely adequate. Sometimes it makes more sense to use an exponentially weighted moving average version of (time passed)/(current step count), and all it takes is a little forethought (or observation of inadequate estimates) to identify when this extra step is needed. Since I don't expect to predict the future very accurately, what's left ain't rocket science but generally works pretty well.
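A minimal Python sketch of the two estimators mentioned in point 2, assuming steps of roughly similar difficulty. The names `eta_simple` and `EwmaEta` and the default `alpha` are illustrative choices, not anything from the comment.

```python
def eta_simple(total_steps, current_step, elapsed):
    """(total - current) / current * elapsed, i.e. assume the average pace holds."""
    if current_step <= 0:
        return None  # nothing measured yet, no estimate
    return (total_steps - current_step) / current_step * elapsed

class EwmaEta:
    """ETA from an exponentially weighted moving average of time per step."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha          # weight given to the newest step
        self.avg_step_time = None

    def record_step(self, step_seconds):
        if self.avg_step_time is None:
            self.avg_step_time = step_seconds
        else:
            self.avg_step_time = (self.alpha * step_seconds
                                  + (1 - self.alpha) * self.avg_step_time)
        return self.avg_step_time

    def eta(self, steps_remaining):
        if self.avg_step_time is None:
            return None
        return steps_remaining * self.avg_step_time
```

The simple version weighs all history equally, so it is stable but slow to notice a change in pace. A larger `alpha` reacts faster and jumps around more.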
What you're describing isn't a progress bar. It's a time remaining indicator. Progress bars tell you how far along a process is on its path to completion. It was never meant to tell you how much time remains, and using it as such will predictably lead to end user confusion and frustration.
Even when a progress bar goes from 0 to 99% in an instant, then takes forever to go from 99% to 100%, it is doing what it is intended to do: tell you how far along the process is. That's all it is supposed to do. It should NEVER be used to tell you how much time remains. You can sometimes infer timing from progress if the progress bar progresses smoothly, but that's a side effect.
As for the time remaining indicators: those will ALWAYS be inaccurate (usually severely so) because most factors that contribute to how much longer a process must run to completion are unpredictable. It's like trying to predict when someone is going to turn off the hall light tonight based on when they turned it off last night. It might be accurate sometimes, but that's just from pure chance.
We live in a world of multi-tasking, or task-swapping hardware. It would seem to me that we've not figured out how to simply stop or buffer other tasks so we could give you something a bit more like an actual progress bar and not just a best guess.
Well, that, or that's a feature that they're planning to put in next version of the OS.
Awk! Pieces of eight. Pieces of eight. Pieces of seven... ERROR: General Protection Fault. [Paroty Error.]
N=1
They have their progress bars sorted perfectly. Great game too!
I'm not signing anything
Flame mode: ON
The Windows XP Windows Explorer and IE8 progress bars were perfect. They took recent data transfer rates, did a simple division of bytes remaining / bytes/sec and reported the expected remaining time at the most recent transfer rates. Yes, transfer rates fluctuate over time. That is the nature of things.
The Windows 7 Windows Explorer file copy progress bar, by contrast, is absolutely worthless. No doubt the engineers screamed at being forced to implement it that way, under pressure from the Microsoft marketing dept, who undoubtedly heard undue amounts of whining by people like you who couldn't understand a simple formula involving one division. Now when I go to copy files in Windows 7, it says 2 hours remaining, and stays stuck at that readout for five minutes, before it finally comes up with a more reasonable estimate like 12 minutes.
The only way to make the progress bar more informative is for it to provide error bands, based on both short-term transfer rate history and longer-term transfer rate history. That would be pretty slick, actually. But to expect the general public to understand the math used in high school Chem I is just too much.
but computers can't predict the future.
The problem is that most software engineers are too dense to implement this properly.
It probably will take up too much processing just to calculate a potentially more accurate estimate. Which will still be wrong some of the time at least.
at the lack of progress in this technology. ( Yeah, I know, I even had mod points, I'm sorry ).
When it comes to the back-end of the "progress bar", it's essentially a widget that shows a percentage from 0 - 100%.
So the question actually is: does the thing the progress bar is showing the "progress" of output a "percentage done" or not?
Sometimes the "percentage done" can be estimated or calculated, but sometimes it can't. Some programs (like 'dd') don't give any progress output at all.
What's annoying is when programmers show a "progress bar" that changes but doesn't actually show any progress -- so whatever they show is fake. That's irritating, because when that happens it means the user is essentially being lied to, and defeats the purpose of the progress bar. I'd personally rather see a message of "please wait while (blah)" than see a fake progress bar.
The latest place I saw a fake progress bar was in the Java setup GUI in Untangle (which is based on Debian). Every single area a progress bar is shown in this GUI shows looping "fake" progress. Ugh.
If you want to know "how much longer", ask Nostradamus, but if you want to know more about what is taking place during the 30 minutes while the progress bar is at 1%, tell your programmer to add some more detail.
I don't know why Windows' progress bars have such chaotic behavior, but I've generally had better luck with ones which run on other operating systems or at least aren't written by Microsoft. :-)
i'm a computer programmer. it's easy to make an accurate progress bar. take the total, take the current, divide. done. i don't know why windows progress bars and time estimated are so messed up. they're clearly doing something totally wrong. if not many things. as usual.
Clearly you're not a very good one.
I remember in the 80s when Apple had this great idea. It would calculate how much space was needed before doing the copy and prevent you from wasting your time. Before then you'd only find out it was not possible when the system choked. The way around this problem is first recognizing it is a problem as the author has done. The next step is to write the code in such a way there are multiple split second tasks that can be checked off. Then these tasks are timed on various machines. Hopefully crowd sourced. As the machine in question compares how long it is taking to finish each of the tasks it determines which of the probability curves best fits and chooses to extrapolate from there. If crowd sourcing is not available then it would attempt to use the 20 or so time logs the developers included with the software.
We need some standards to make this catch on.
You have to implement a BAR, showing the progress of M operations, executed in N seconds. :)
Your BAR has 100 units. Your time is.....the computer time.
The computer time is INCORRECT. Believe me, it really is.
Now, you don't have M, and you don't have N. So, the question is, how to divide M by N, when you don't know either of them!!!
Short answer, by measuring the calculated Mt, for a fixed amount of period T. Every t seconds.
So, in the beginning you will have pretty funny results, but as the calculations are going, so will your BAR.
A few problems here, the fixed "t" amount of time is fixed for you, but not for the computer, and there is simply NO way of having always the exact "t" period of time, all the time. Except if you have a real time OS. Even then it is not an easy task.
The second problem is that the calculations made so far are not keeping the progress of the calculations. So, just for example, a better solution is to keep all the Mt and Tt pairs, and to make one nice, good, pretty complicated math calculation, called interpolation, for the purpose of finding out the exact LAW your bar is progressing on.
So, easy task, huh? And i have not even started talking math.........Again, trust me, it is not an easy task
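To make the "keep all the Mt and Tt pairs" idea concrete, here is a small Python sketch. It swaps in an ordinary least-squares straight-line fit, which is just one simple choice and not necessarily the "LAW" the comment has in mind, and it assumes the total amount of work is known, which the comment points out is often not the case. The function names are invented.

```python
def fit_rate(samples):
    """Least-squares line through (elapsed_seconds, units_done) pairs.

    Returns (slope, intercept), or None if there is too little data.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _m in samples) / n
    mean_m = sum(m for _t, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _m in samples)
    if var_t == 0:
        return None  # all samples taken at the same instant
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    slope = cov / var_t
    intercept = mean_m - slope * mean_t
    return slope, intercept

def predict_finish_time(samples, total_units):
    """Elapsed time at which the fitted line reaches total_units, or None."""
    fit = fit_rate(samples)
    if fit is None or fit[0] <= 0:
        return None  # no measurable forward progress
    slope, intercept = fit
    return (total_units - intercept) / slope
```

A straight line only helps if the work really proceeds at a steady rate. If it doesn't, a fancier curve fits the past better and still says nothing reliable about the future.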
A program you could code in 2 days would take a week with actual, proper error checking. Add another week if you want a precise progress bar, since that's a whole different program to write.
Programmers work on a schedule, and therefore we have to cut corners. Let's say you need to collect some information from a form, then process it, send it to a web service, then report back to the user.
You could do this:
-> Pass all fields to the server (set progress to 20%)
-> Check they aren't blank (set progress to 30%)
-> Call the webservice, pass the data (set progress to 60%)
-> Check the return against a case with error cases: 1) Everything ok 2) Service not responding 3) Anything else (set progress to 90%)
-> Report to the user. (set progress to 100%)
There, it works. But you are cutting corners.
You should also check the fields against regular expressions to validate the data, and you should implement the complete API with all return codes, which could be hundreds, and each would require a different response from the program, to that you have to add all possible responses outside the API (TCP errors, etc.). Also, if the problem was data, you should try to find out what particular field caused the problem, and show the user what the particular problem is. If you want accurate progress, you need to make measurements of each atomic process you are doing, and then check for a lot more conditions, and either increment progress or report special circumstances (hey, IO is stuck, we don't know how much we'll have to wait). Doing that also means implementing threads, in order for the IO operations to be non-blocking so you can report back to the user. By this point, we've already transformed the complexity of the program from something a single guy can do in half a day's work, to a complex piece of code that will take at least a week to complete, and it'll be more complex to maintain later. And we're still cutting corners here, there's a lot more we could be doing to be more accurate.
So, you can implement a hundred error-probing statements, or you can just throw it out there, hope for the best, and tell the user "it went ok" or "everything's lost".
It isn't up to you, it's up to management, up to the customer, and determined by your budget.
So, no, it's not impossible, and it's certainly not "hard", as in, it's solvable with well known methods. But it's time consuming. So you'll have to deal with it, programmers usually measure with a micrometer, mark with chalk, cut with an axe. Not because they want to, but because it's the only way to deliver on time, within budget, and most of the time, it's the reasonable thing to do.
WTF am I doing replying to an AC at 5 A.M on a Friday night?
The progress bar knows where it is at all times. It knows this because it knows where it isn't. By subtracting where it is from where it isn't, or from where it isn't from where it is, whichever is greater, it obtains a difference or deviation. The progress subsystem uses deviation to generate corrective commands to drive the progress bar from a position where it is to a position where it isn't, and arriving at a position where it wasn't, it now is. Consequently, the position where it is is now the position that it wasn't, and it follows that the position that it was is now the position that it isn't.
In the event that the position that it is in is not the position that it wasn't, the system has acquired a variation, the variation being the difference between where the progress bar is and where it wasn't. If variation is considered to be a significant factor, it too may be corrected by the GEA. However, the progress bar must also know where it was. The progress bar computer scenario works as follows: because the variation has modified some of the information the progress bar has obtained, it is not sure just where it is, however it is sure where it isn't, within reason, and it knows where it was. It now subtracts where it should be from where it wasn't, or vice versa. And by differentiating this from the algebraic sum of where it shouldn't be and where it was, it is able to obtain the deviation and its variation, which is called "error".
Why is it so hard to understand that?
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
It's simple, you load every task up into Microsoft Project, plot out a few hundred Gantt charts, and you'll know the precise date when everything will be complete.
What kind of toy programs are you writing son?
"take the total" indeed.. what do you think we are progressing towards?
I've made quite a few progress bars in my time over the last 24 or so years of programming, and I've tackled this problem in a variety of ways.
There are two fundamental problems. (1) How much is there to do? and (2) how long does each of those things take to do?
(1) can't always be known. Sometimes it's impossible to even make an educated guess. That's what those "barberpole" bars are all about. "We really have no clue how long this is going to take."
(2) you'll usually at least have a guesstimate on.
From there it's just very easy math, figuring out how much time has elapsed and what percentage of work has been done. Figuring out when to change from barberpole to a reasonably useful bar depends on when you figure you've done enough of the work to start getting accurate guesses on (2).
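The barberpole-then-bar switch can be sketched in a few lines of Python. The function name and the 5% threshold are arbitrary assumptions for illustration; the right threshold depends on the job.

```python
def display_state(work_done, total_work, elapsed, min_fraction=0.05):
    """Return ("indeterminate", None) or ("determinate", seconds_remaining).

    Stay on the barberpole while the total is unknown or too little work
    has been done for the work/time ratio to mean anything.
    """
    if total_work is None or total_work <= 0:
        return ("indeterminate", None)
    fraction = min(work_done / total_work, 1.0)
    if fraction < min_fraction or elapsed <= 0:
        return ("indeterminate", None)
    # Elapsed time scaled by the ratio of work left to work done.
    remaining = elapsed * (1.0 - fraction) / fraction
    return ("determinate", remaining)
```

Halfway through after 10 seconds this gives 10 seconds remaining. At 1% done it keeps showing the barberpole rather than extrapolating from almost nothing.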
I think the problem that's causing people the frustration is when (1) isn't anywhere near expected. Maybe there's 15 subfolders in that folder, and you've already got 14 of them copied. The last one shouldn't take too long, right? Unless it's the folder with the raw video in it. And in that case you may be stuck at "about a minute remaining" for the next hour.
The only way to prevent that is to do a more thorough investigation of the amount of work to do. (if possible... and sometimes it's just plain not possible) When you drag and drop a folder on a Mac to start a copy, you'll get a "preparing to copy...". And that's what it's doing. It's building a complete list of Things To Do. Once the copy gets started, it will almost immediately give you an accurate estimate. But the penalty is you had to wait for it to start. Sometimes I don't want to wait for it to count how many 4k files are in that huge node tree and just get started and I will use ditto instead of Finder. It may get done with the copy before Finder has even started.
Windows seems to take the other approach for file copies, and that's what's earned it that notorious "about a minute remaining" for an hour reputation. But the file copies start immediately, no time is wasted. So in the end, it copies faster.
So, pick your poison. Do you want it done as fast as possible, or keep you better informed? You can't have all of both. You can have one, the other, or a compromise.
As for the bars themselves... a script I wrote at work that gets a lot of use is my disk cloner. It does a file copy using ditto, which provides NO estimates, but is very fast. I could simply wait for it to get going, and then simply do the work/time estimate. But I found that wasn't accurate. It takes a lot longer to copy 100 MB worth of 40 KB files than to copy one 100 MB file. So what I ended up doing is displaying two progresses. One goes on an estimate of how long it's taken from start to now, divided by amount copied so far. The other considers only the amount done in the last ten seconds. The two numbers tend to bounce around a bit. It's not a bar, both display time to complete (15 minutes left) as well as an estimated time of completion (4:25 pm).
Sometimes they are pretty close. Sometimes not. When it runs into a folder of tiny files, the windowed estimate gets longer, and when it hits big files, it gets shorter. It bounces around quite a bit. The overall estimate is a lot more stable. It's been my observation that the windowed estimate is more accurate at the start of the copy, and the overall estimate is more accurate in the middle. Near the end, the windowed time is more accurate. So, the person running the script can place their expectations wherever they want to.
I considered displaying an average of the two estimates instead. It would simplify things for the users. That may be the best way to go. When a customer calls and wants to know when it will be done, we usually say it will be between the low estimate and the high estimate, whichever is which. (either could be the current low estimate)
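For anyone who wants to try the two-estimate idea, here is a minimal sketch of how it could look. This is not the actual cloner script; the class name `TwoEstimates`, its methods, and the way the ten-second window is trimmed are all assumptions made for illustration.

```python
from collections import deque


class TwoEstimates:
    """Tracks an overall and a windowed time-remaining estimate (hypothetical helper)."""

    def __init__(self, total_bytes, window_seconds=10.0):
        self.total_bytes = total_bytes
        self.window_seconds = window_seconds
        self.start = None       # first (timestamp, bytes_done) sample ever seen
        self.samples = deque()  # recent (timestamp, bytes_done) samples, oldest first

    def update(self, now, bytes_done):
        """Record progress at time `now` (seconds); return (overall_eta, windowed_eta)."""
        if self.start is None:
            self.start = (now, bytes_done)
        self.samples.append((now, bytes_done))
        # Drop samples that fell out of the window, keeping the newest one
        # that is still at least `window_seconds` old as the window's origin.
        while len(self.samples) > 2 and now - self.samples[1][0] >= self.window_seconds:
            self.samples.popleft()
        return (self._eta(self.start, now, bytes_done),
                self._eta(self.samples[0], now, bytes_done))

    def _eta(self, origin, now, bytes_done):
        elapsed = now - origin[0]
        copied = bytes_done - origin[1]
        if elapsed <= 0 or copied <= 0:
            return None  # not enough data to estimate yet
        # remaining bytes divided by (copied / elapsed)
        return (self.total_bytes - bytes_done) * elapsed / copied
```

Calling `update()` once a second or so with the running byte count gives both numbers. Showing the pair as a low/high range, as described above, is then just `min()` and `max()` of the two.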
The final problem is when there are several discre
I work for the Department of Redundancy Department.
It is hard to make an accurate progress bar because it shouldn't be a bar at all - it should be a graph.
Consider the humble download: bytewise, it might be 97 percent complete, but at the last moment, the bps rate has fallen. With a progress bar indicating a percentage and an estimated time, it might say 97% complete, 3 seconds to go. If the progress indicator was a graph, you could tell that the bps rate has fallen, and that the 3 seconds to go estimate (probably based on a linear extrapolation of progress to date) does not apply.
I have never seen it done though. Partly, because I have never done it.
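A rough idea of what the graph version could look like in a terminal: a one-line sparkline of recent throughput samples. This is only a sketch; `sparkline` and `BLOCKS` are made-up names, and scaling against the peak sample is just one possible choice.

```python
BLOCKS = " ▁▂▃▄▅▆▇█"  # nine levels, from empty to full height


def sparkline(rates):
    """Render non-negative throughput samples (e.g. bytes/sec) as one line of text."""
    peak = max(rates, default=0)
    if peak <= 0:
        return " " * len(rates)
    top = len(BLOCKS) - 1
    # Scale every sample against the peak so the tallest bar is a full block
    return "".join(BLOCKS[round(r / peak * top)] for r in rates)
```

Printed next to the percentage, a line that suddenly drops toward empty cells would show that the bps rate has fallen, even while the ETA still claims 3 seconds.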
It's just often a tedious and time consuming task, that is relegated for some savings, time-to-market or plain laziness.
The engineers made it go 5min, 4min 3 min, 2min, 1min, 55 seconds, 50 seconds...
Less than 2 minutes is shown as a 1 minute, so for nearly a minute it shows 1 min, then goes to 55 seconds.
It's GUI design incompetence.
There's coding incompetence here too, if you copy 2 big files to flash, say 1GB+1GB. Then click cancel while it's copying, and it will continue to fake copy and only stop BETWEEN the two files. So you can be waiting for ages for a simple cancel. Instead of fixing it, the designers changed the message to 'cancelling'!
It smacks of incompetence inside Microsoft. A simple file copy and it's done with such incompetence.
The best styrofoam cup in the world, you won't want to use it just once! When was the last time you bought something because of the quality of its progress bars?
Good leaders run toward problems, bad leaders hide from them.
As you observe, progress bar does not mean time estimate.
Apple has applied for a patent for an accurate progress bar when web browsing.
We have a winner.
That said you can do something like this.
0. Show progress of search (x% of search database completed) .net? as you watch the hard drive grind, then stop for long periods of time, only to start grinding and stopping again and again)
1. x file completed of y total
2. x number of files completed of y OR x GB of files copied of y GB
3. Not easy, though by doing some simple queries first you could say 'Updating database, X possible candidates', meaning that if X is large this might take a while.
4. Yea, pretty much screwed here, since it's likely you're installing something from Microsoft who is seemingly the worst at calculating how long something will take (WTF RU Doing
... when we solve the halting problem. I'm not entirely joking. The main problem with progress bars is that, quite often, it is not possible to accurately estimate how much time is needed to complete a problem (i.e. for the program to halt).
Loban Amaan Rahman ==> Anagram of ==> Aha! An Abnormal Man!
You want an accurate progress bar? Hire Nate Silver to design it.
The public opinion of the Progress Bar would be considerably more favorable if programmers would simply treat 100% as if it were 75%.
In other words, do all the stuff you have to do, measuring progress and whatnot, but when you're actually at 80%, report yourself at 60%. Likewise, when you're at 95%, say you're at 70%.
Then, only when you really are completely finished, you jump from 75% to 100% in under a second.
Complaints gone.
-David
A common mistake I have seen is implementing the approximate time remaining as 'work to do'/'current estimate of rate of work' - or even something a bit more fancy involving moving averages etc. - without considering that work to do and rate of work are usually not single numbers.
File transfers often calculate their approximate time remaining in bytes remaining/bytes per second - when in practice the work to be done is bytes remaining, rounded up to the nearest multiple of sector size for each file + per file overhead. And the speed of transfer might be 80MB/sec but the per file overhead is 10ms. (Or it could be even more complicated when you involve network transfers or when there is parallelism involved.)
Say you are doing a large transfer consisting of many files, and the early files are large, so MB/sec is the dominant limitation. Based on bytes remaining / bytes per second, the progress bar gets an estimated total time remaining of 5 minutes. Then you hit the thousands upon thousands of tiny files. Until the progress bar throws away its previous estimate of rate of work based on the large files, the estimated time remaining doesn't change much, and so the whole thing appears to stall when there was never any chance of it completing in 5 minutes. If instead the progress bar originally estimates the time remaining as total size remaining / transfer rate + number of files remaining * per file overhead, then the original estimate is much higher, and any deviations in reporting are reduced to the environmental factors that moving averages etc. were quite rightly meant to handle.
Basically a poor system description results in a poor simulation results in a poor prediction - no matter how much fancy averaging and statistics you try to apply over the top.
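To put numbers on the difference, here is a small sketch of the two models. The function names are made up, and the 4096-byte sector size is an assumption; the 80MB/sec and 10ms figures are the ones used above.

```python
def naive_eta(bytes_left, bytes_per_sec):
    """Seconds remaining if only raw throughput is modelled."""
    return bytes_left / bytes_per_sec


def eta_with_overhead(file_sizes, bytes_per_sec, per_file_overhead, sector_size=4096):
    """Seconds remaining for the files still to copy, including per-file cost.

    file_sizes: sizes in bytes of the files not yet copied.
    per_file_overhead: fixed cost in seconds paid once per file.
    sector_size: assumed allocation unit; each file is rounded up to it.
    """
    # Round every file up to a whole number of sectors
    padded = sum(-(-size // sector_size) * sector_size for size in file_sizes)
    return padded / bytes_per_sec + len(file_sizes) * per_file_overhead
```

For 10,000 files of 1 KB each at 80MB/sec with 10ms overhead, `naive_eta(10_240_000, 80e6)` comes out well under a second, while `eta_with_overhead([1024] * 10_000, 80e6, 0.010)` is a bit over 100 seconds. That gap is the "stall".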
The #1 things I focused on in college programming classes were AI helpfulness on input validation and accurate progress bars. I basically mixed a hybrid of retroactive total time vs data processed mixed with anything that would throw that off while benchmarking like stops and gaps in data while looking forward to the total count and removing any initial startup time from the equation as well. It's actually REALLY, REALLY accurate to do it that way and took about 3 minutes and maybe 5 lines of code.
You know what I'd like to see more than a working progress bar? A "Cancel" button that actually stops the f*%! process! .
I don't want to finish the sub-process I'm currently doing (which has probably stalled)... just FREAKING STOP.
If you (programmer) want to close connections, or save the changes to the disk, do it in the background. Making me sit there for another 10 minutes while you're "cancelling..." is not helpful. I will force close your program. Failing that I will hard-reset the computer. Seriously.
Unlike porn, which yada yada rimshot hey-ooh!
'nuff said.
Don't complain about syntax, grammar, or spelling. There is no hell like input on android.
Too many variables and external influences out of your control. Come back when you have solved that one.
If you're writing progress bars, I feel bad for you son.
I've got 99... wait, 45... wait... 85 problems and being trolled ain't one.
http://cheezburger.com/47473153
I've got better things to do tonight than die.
Because generally speaking the estimated time to completion is based on knowing two things: how much "stuff" you've got to do and how long the average time to do a unit of "stuff" takes.
When the process starts out you don't have enough data to actually come up with a sensible average, so the time will bounce around; as it progresses it should settle down a bit, assuming that each unit is actually approximately the same in complexity, which is an assumption that isn't always true. If you run into a unit which takes an inordinate amount of time compared to other units, your estimate goes out of the window. Compounding this is that you probably don't update your ETA until after each unit has been processed.
Also, the more time you spend worrying about your ETA, the more time you spend not actually doing "work", so you potentially increase the amount of time it takes to complete the real task - you might not notice this on small batches, but it can become more noticeable on larger ones.
Yeah, I had a sig once; I got bored of it.
No, wait. It seems to have stalled.
Have gnu, will travel.
You think *your* examples are bad??! Well, brace yourselves for *this* horrendously inexcusable progress bar failure!! It has been *FIVE YEARS* since this perfectly reasonable and absolutely vital functional operation was requested: https://trac.transmissionbt.com/ticket/1000 ... and *FIVE YEARS!* later, not a single release candidate has been forthcoming from the slacker devs of the Transmission project!
It is an outrage! How much longer must the Entitled End Users of the world suffer at the idle hands of callous and indifferent devs??!!
Pardon me, I must go and lie down before I dehydrate myself from the righteous mouth-frothing of the oppressed.
Twice as crazy as I would be if I was half as crazy as I am.
Progress bars do not make sequences of actions complete any faster. In fact, they make them slower.
Exactly, most people don't get this. This is why I use mv or rsync (without --progress) instead of a file manager at times, else it reads the whole thing in to determine size first. Some commands like tar allow you to see progress but without a total; this way you get pseudo-progress without the initial wasted read. It will output per file or per 1MB written, etc.
The G
The answer is simple: the developer responsible for the progress bar did not actually know the progress when she programmed the progress bar.
During development, getting the work done has a much higher priority than the progress bar. After development, fixing functional bugs has a much higher priority. After bug fixing, launching has a much higher priority. After all, when there are priority 2 or 3 bugs, who will care about priority 4 bugs?
BTW, the progress bar I liked most reached 30000% and the job was still not complete.
"How come after 25 years in the tech industry, someone hasn't worked out how to make accurate progress bars? This migration I'm doing has sat on 'less than a minute' for over 30 minutes. I'm not an engineer; is it really that hard?"
Yes, because all progress bars are inherently a prediction of things that will happen in the future. If there is any error condition, unusually large blob of data or weirdly structured hard drive to read from, varying bandwidth bottleneck, fritzy peripheral not responding as expected, etc., etc. times a million, then the unusual event will make the prior prediction incorrect and look silly in retrospect. As long as there is any "if-then" clause or error handling in the branches in the system, then the unexpected can happen and make the prediction (progress bar) invalid.
It's analogous to weather prediction. It can't be perfect, it's an extrapolation, but people will always complain about it.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Ever start something you've done a million times (say brushing your teeth) that you know is only going to take 5 minutes, then get interrupted partway through (say your faucet handle breaks off), and suddenly, 5 minutes is an hour?
This is effectively what happens to the computer. The developer knows that under normal conditions something will only take a set amount of time, but this was on his machine running his programs. Users' computers are vastly different in terms of hardware and software, and as such, there are many different things that can go "wrong."
Say you're installing a game, but it takes too long, so you watch a movie (on your computer). For some reason both your movie player and the installer need to use a file on your computer sporadically. Due to this file's importance only one program can use it at a time, so when one program is using it the other has to wait. If the movie player has priority the installer will need to wait for however long the movie player locks the file, and vice versa.
Due to the seemingly random time spent waiting on the movie player, the installer's progress bar will appear to behave "badly," when in reality, if you removed all the time the installer waited, the progress bar would be moving along nicely.
There are many finite resources in a computer that programs might need to wait for while another program is using them. Since a modern computer has a lot of programs running at one time there are many opportunities for this to occur.
I see logic like this all the time, but people seem to forget that they drive to work everyday just fine -- and that's a linear process. You can't predict what the other drivers will do, there may be an accident, you may be late... But you're going to drive the same distance every day.
People are getting hung up on measurements of time, and progress bars, like life, are based on distance. Whether it's number of operations, or number of miles, the principle is the same. Progress bars that move backwards are stupid: it means the programmer is attempting to measure the wrong thing.
#fuckbeta #iamslashdot #dicemustdie
Exactly the reason there's a tool called TeraCopy which does this the way it should be done. I'm not sure how things are on linux or macs, but windows progress bar has remained terrible all the while and TeraCopy is what I recommend around.
You actually have to flip the variables around.
progress = current / total
They take a really really really long time to debug. So any extra bells and whistles will be really really expensive to write. I suspect that the same applies for other long running processes.
he started with 99 problems and interrupt 0x00 is one.
Back when I made a kit for building bootable ISOs that worked on both x86 and Sparc, the first stage initrd code loaded initramfs from another file. It included a progress bar with 128 steps (64 columns of '=' characters with '-' at the end for one step) that was tied exactly to the true progress, because the same loop was copying data of a known size and outputting the bar. In that case it is easy.
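The original initrd code isn't shown here, so the following is only a sketch of how a 128-step, 64-column bar like that might be rendered. `render_bar` is a hypothetical name, and `total` is assumed to be a known, positive size.

```python
def render_bar(done, total, columns=64):
    """Return the bar text for `done` out of `total` units.

    Each column holds two steps: '=' is a full column, and a trailing '-'
    marks the single odd step, giving 2 * columns steps overall.
    """
    steps = done * columns * 2 // total  # integer step count, 0 .. 2 * columns
    full, half = divmod(steps, 2)
    return "=" * full + "-" * half
```

Because the same loop that copies the data can call `render_bar(bytes_copied, total_bytes)` after every write, the bar is tied to the real byte count and cannot drift from it.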
now we need to go OSS in diesel cars
You actually have to flip the variables around.
progress = current / total
But this doesn't address the issue of why bars are difficult.
Saying "take the current" or "take the total" might be easy if you're just talking about moving a fixed number of files around. But even then, do you want it to be the file count (easy) or the total size (not quite as easy to interpolate between unless your OS gives you progress on each file's copy progress). Now introduce any interference to either of the devices (source, destination). Oh, you want to look at a webpage on the same computer while the files are copying... well, the browser is now caching to disk, that's going to take up some of the IO (not to mention how interesting multithreaded processing can be).
And all that for a simple file copy progress bar.
Now, let's imagine that it's an installer that has to copy files from both an optical media source, the internet, and to a hard-drive... and then execute some cpu bound tasks.
I think we should just make them all throbbers/spinners/whatever you want to call them, and ignore the people who complain. /rantoff :D
There are 10 types of people in the world. Those that understand this sig, and those that beat up people who do.
It's remarkably easy to do so, as evidenced here:
http://stackoverflow.com/questions/6392516/how-do-i-keep-a-list-of-only-the-last-n-objects/6392609#6392609
You can only estimate what's ahead of you. And you can't always predict how long the subtasks will take. This is affected by many items which are out of your control, e.g. network throughput, amount of cache, amount of main memory (swap destroys progress predictions!).
Of course it can be done better than Microsoft's famous last 1% which took 99% of the time, but even this phenomenon can be explained in many ways: One that I experienced quite often is a disk cache flush at the end of an install process, which can really take ages.
The point of the progress bar is NOT really to measure progress. It's to tell the user if the software is working (and they should continue waiting) or if it silently failed and they should end-task the broken application. Until either users start exercising infinite patience or all developers start writing defect-free software, this will be required.
Progress as actual progress (ie. predicting what you are going to do with your internet connection while you are waiting for a download to complete) is not possible, but not required to convert a "crash" into an annoyance. That said, on my current project the first step of creating the progress bar is to estimate how long the process will take but this is mainly because some tasks can take days and users get antsy if the progress bar updates less frequently than every 1-2 minutes.
And I'm sure I'm not the only one to have said this by now, but adding a very accurate progress bar is a lot of f'ing work, with all the multi-threading and actual decisions on where 1% progress actually is when you're running through code. Should we just use the current line of code divided by the total lines? What do you do if the progress bar can actually slow down the processing in the first place? Would the user rather wait longer and see a progress bar or just get it over with? Programming is an art form in many ways. There is no formula to this stuff.
A usable progress bar is based on basic statistics. Given that we know where we are and when we're done, it's just a matter of estimating when we get there. Now, it can be proven that all factors affecting the progress can be considered static after enough samples have been collected, so it's just a matter of applying statistics on the last n otherwise identical chunks of progress. It is of course important to do it right, i.e. to throw away samples that are too far from the median/norm, but you won't know this until you're some samples further along. But if you do it right, and circumstances don't change, your progress and calculated ETA will be spot on.
Way back in the old shareware days I created many progress bars from scratch, complete with ETA, and they were rarely off by more than a second or two for long jobs (several hours), and spot on for small jobs.
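The shareware code itself isn't shown, so this is just one way the "last n chunks, drop the outliers" approach could be sketched. The class name `MedianEta`, the window of 20 chunks, and the rule of discarding anything more than 3x away from the median are all assumptions.

```python
from collections import deque
from statistics import median


class MedianEta:
    """ETA from the last n chunk durations, ignoring outliers (hypothetical helper)."""

    def __init__(self, total_chunks, window=20, tolerance=3.0):
        self.total_chunks = total_chunks
        self.done = 0
        self.tolerance = tolerance
        self.durations = deque(maxlen=window)  # seconds per chunk, last n only

    def record(self, seconds):
        """Note that one more chunk finished and took `seconds` (non-negative)."""
        self.done += 1
        self.durations.append(seconds)

    def eta(self):
        """Seconds remaining, or None before the first chunk completes."""
        if not self.durations:
            return None
        mid = median(self.durations)
        # Throw away samples too far from the median, then average the rest
        kept = [d for d in self.durations
                if mid / self.tolerance <= d <= mid * self.tolerance]
        typical = sum(kept) / len(kept)
        return typical * (self.total_chunks - self.done)
```

With four chunks at 1 second and one freak chunk at 50 seconds, the estimate for the remaining chunks stays at 1 second each instead of jumping to almost 11.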
"For every complex problem, there is a solution that is simple, neat, and wrong." -- H.L. Mencken (1880-1956) --
The progress bar will be 100% accurate when the call centre *you* call answers your question fully and completely the first time, when they answer the call on the first ring every time (no waiting, no one is having lunch or coffee, ever), and any shipments are all on time, accurate and correct 100% of the time. With precision, what will the weather be where you live for the next 100 days? Oh, and while I'm at it, what will be the hot stocks on the stock market for the next week, and when can I expect the next major earthquake, how strong will it be exactly, and where will it be? After you solve all of that, (everything works like clockwork), then the meter will be exactly perfect. Would you rather have them lie to you and show you that its finished when its not? When 'the other end' stalls (for any of 1,000,000 reasons), you complain about the bar on this end?
A progress bar is like a light in your car. A program or mechanism lights it up, but the light by itself cannot tell you why it is on or off. It just animates and runs, expecting that you understand what is happening.
Why are progress bars bad? Coders use generic widgets, add some values, place it on some part of the screen, and game over. Or they write code expecting that it will never fail (hardware, bandwidth, file errors). And when something happens, the small progress bar cannot say a thing, since nobody programmed it for that.
Some old MSDOS, UNIX and LINUX installers went with easy progress bars that were accurate:
">Installing 100 files now." ... ...
50% Done. Installing.
">File 86 has an error. Retry or Abort?"
or
"Install complete 100% 100 of 100 files."
For that you need a program that sends info to the progress bar *when things work* and *when things don't work*. That means more code, more testing and more time. So they just put the widget in and forget it, and we will always have a puny progress bar.
Funny bit: attempts to use ASCII to simulate a progress bar end with "Filter: Use less junk characters". So even Slashdot considers progress bars junk! LOL!
This migration I'm doing has sat on 'less than a minute' for over 30 minutes.
Are you or your system in close proximity to a black hole?
It must have been something you assimilated. . . .
Is why.
Hmm. I wonder if I wrote an app that was nothing BUT progress bar, if people would go for it.
Some developers have already come to the conclusion that installation is a prime advertising timeslot. So even if anyone was inclined to write a progress bar, it'll still end up ad-laden and annoying.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
...most programmers seem to be far more interested in more impressive things like making sure whatever you are doing is in the wrong language and you cannot find any way of returning to the English language that you have set on your computer - (not that I get really pissed off with that)
I love stacking my barbecues in the shed at the end of summer - you can't beat a bit of grill on grill action.
they just spent countless hours writing all the logic, why waste countless more hours writing an extremely accurate progress bar just so you could know if you have enough time to grab another mountain dew before your bonzi buddy warez is done installing. but progress bars that go backwards have NO excuse, they're just fuckin with you.
It's not a simple problem, I've tried to fix these a couple of times with mixed success.
For a simple file copy, the standard approach (count the total files, increment by 1 after each file is copied) suffers from the obvious problem: small files update the progress bar quickly, while progress is slow for large files.
So I calculated the total bytes instead for the progress bar, and instead of copying 1 file at a time, I copied 100 bytes at a time, and it was very successful to a point.... unfortunately although accurate it was exceptionally slow, unusably slow, so the standard method was good enough to provide feedback that it was working.
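A middle ground between per-file and 100-byte updates is to copy in much larger chunks and report bytes after each one. The sketch below is not the code described above; `copy_stream`, its `on_progress` callback, and the 1 MiB default chunk size are assumptions chosen for illustration.

```python
def copy_stream(src, dst, total_bytes, on_progress, chunk_size=1024 * 1024):
    """Copy from file object `src` to `dst`, reporting byte progress per chunk.

    on_progress(copied, total_bytes) is called once after every chunk, so a
    bigger chunk_size means fewer progress updates and less per-call overhead.
    """
    copied = 0
    while chunk := src.read(chunk_size):
        dst.write(chunk)
        copied += len(chunk)
        on_progress(copied, total_bytes)
    return copied
```

The bar still moves by bytes rather than by file count, but the update cost is paid once per chunk instead of once per 100 bytes. Whether that is fast enough will depend on the storage involved.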
A second attempt actually worked quite well, I wanted an estimate of how long a series of single row updates to an sql database would take.
Simply counting the total records and updating progress after each one impacted the performance too much.
So the solution I came up with was to count how many records were updated in each second over 5 seconds. I then took the average records updated per second and used this to calculate an estimate. Of course no progress was shown for updates that completed in under 5 seconds, and no progress showed until after 5 seconds.
However after that it was pretty good. The average over 5 seconds allowed for slight changes in performance, and if the updates took longer near the end, the progress had been adjusted slowly over the whole process, so it never stopped and was reasonably accurate near the end.
However the biggest success was, as the main reason I wanted to know how long the update would take was so that at 12pm, I could tell if the update was going to be 10mins or 60mins, if it was going to take 60mins it might as well be lunchtime!!
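The 5-second averaging could be sketched roughly like this. It is a reconstruction under assumptions, not the original code: `RateSampler` and `tick` are invented names, and the caller is assumed to invoke `tick()` once per second from a timer.

```python
from collections import deque


class RateSampler:
    """Estimates time remaining from records-per-second over the last few seconds."""

    def __init__(self, total_records, seconds=5):
        self.total = total_records
        self.done = 0
        # Records completed in each of the most recent `seconds` one-second slots
        self.per_second = deque(maxlen=seconds)

    def tick(self, records_this_second):
        """Call once per second with how many records were updated in that second."""
        self.done += records_this_second
        self.per_second.append(records_this_second)

    def eta(self):
        """Seconds remaining, or None until a full window of samples exists."""
        if len(self.per_second) < self.per_second.maxlen:
            return None
        rate = sum(self.per_second) / len(self.per_second)
        if rate == 0:
            return None  # stalled: no honest estimate available
        return (self.total - self.done) / rate
```

The update loop itself only has to bump a counter; all the estimating happens once a second, which keeps the cost off the per-record path.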
This is why, when I can, I always display a count alongside a progress bar vs. a percentage. For example:
123/1523
as opposed to:
8%
I think the percentage implies a certain monotonic advancement, whereas a count doesn't easily map onto our units of time.
Just my $0.02
Scheduling failure is such a deeply rooted problem in software engineering.
Just read The Mythical Man-Month.
It's generally impossible to accurately predict write and read speeds on a drive, only average speeds. This is even more so when the reading or writing is done over a network. Beyond that, updating a progress bar too often takes away processing power from what is actually installing or being done on the system. A single progress bar often represents many tasks being done in the background. If your first tasks are, say, installing files to a hard drive, you can semi-accurately gauge progress and a completion time, but if at the end you are also doing some CPU-intensive calculations like post-install optimizations, it's hard to tell how long they will take until they have already been done, short of slowing down the install by benchmarking the processor before the install begins.
http://interserver.net/
I think this is not an "accurate progress bar problem" in as much as it is a "communication problem". Programmers (in general) don't know how to communicate with users. A lot of times, we go with a tried-and-true method (like the progress bar) that falls down on its face in a situation like you propose. The progress bar has its place and is very useful (even if it is not progressing at a uniform rate in time). It's the communication to the user that sucks.
Depending on the process, yes, it can be difficult or impossible to implement an accurate progress bar. It requires an accurate estimation of how long a process will take. Often this is a pie-in-the-sky guess, for many reasons, including but not limited to:
-The sheer complexity of modern software, which is written on top of dizzyingly deep layers of abstraction, means that higher-level code that does things like display progress bars often has no knowledge of underlying processes like memory, disk, and network I/O operations on lower levels like bits, bytes, blocks, and packets. The program will simply tell the OS to "take data from here, and put it there", and then wait for the system to complete the task. Even when a mechanism is in place to have the underlying code report on the progress of the process as it goes, it adds complexity by requiring programmers to handle things like concurrency which may otherwise be non-issues, and doing so often can negatively impact the speed of the process.
-Many relevant factors are simply entirely unpredictable. Just a few examples include network conditions, other processes consuming fluctuating amounts of system resources, hardware and software buffers and caches as they fill/flush/thrash/etc, systems that throttle performance based on power usage and system temperature, VMs which have all of their guest OS's complexity on top of the host OS's complexity, and known and unknown bugs and peculiarities in the seemingly endless number of libraries, interfaces and drivers involved in accomplishing any given process.
-If a marginally accurate estimation is possible for a particular step of a whole process, there can often be several steps involved, with very different performance characteristics. If your whole process has three steps, and step 1 takes 5 minutes, doesn't mean that step 3 can't take 5 years, and you may have no way of knowing that until after you begin step 3.
-The nature of the data you're handling can have a large impact on the speed of the process. For example copying a collection of files will usually go faster when it hits large, contiguous files, but slower when it hits a larger number of smaller-sized or more fragmented files. Or sending data over a compressed network connection, which will go faster when the data is more structured, repetitive, and thus compressible, vs less-compressible like pre-compressed or chaotic data.
In spite of all of these problems, a better guess could be made in most cases, but the result is almost never worth the required effort and added complexity. Given this very low reward-to-cost ratio, spending a bunch of time tweaking your progress bars usually takes a back seat to fixing critical bugs and implementing exciting features, for reasons like preference, practicality, and profit.
It would negatively affect performance to be more accurate in many cases.
In a program I made it took a playlist and copied over all of the files to a specific folder.
These files were varying sizes and from varying folders and could even be from different hard drives.
For the progress bar I just used (files copied)/(total files).
You could have songs that average 5MB but have some sort of megamix mp3 that is 300 MB, and maybe some sound effects that are only a couple KB. The progress bar isn't going to be very steady with inconsistent file sizes.
If I wanted to be more accurate I could have looked at the size of each file, but that would take extra time because the program would have to read the filesize of each file. It would slow the transfer over a lot and it just isn't worth showing a more accurate progress bar for that.
Even if it did that, hard drive seek time varies depending on where the file is, files can be fragmented to varying degrees, you can use your hard drive for other things while it is running, and there are a crapload of other variables.
The Official Site of 1337 Pwnage
I love them progress bars that go smoothly up to 100% and then sit there for few minutes before anything happens!
Show progress of the work, not the time. You never know what other things the machine is doing, and is better to give no estimate than a bad one.
And don't suffice with a bar, also provide information on what the process is doing.
-- The Internet is a too slow way of doing things, you'd never do without it.
I think we should just make them all throbbers/spinners/whatever you want to call them, and ignore the people who complain. /rantoff :D
My solution is even less work: do nothing, and ignore the people who complain ;-)
Apple just got a patent on a solution (for a specific case): Progress indicator for loading dynamically-sized contents.
we developed trading software for a customer that implemented an extremely simple algorithm to decide between buying/selling stocks. Each calculation took less than a millisecond.
Of course this wasn't very impressive. Something that calculates so fast can't be worth anything.
So about half of the project's development time was spent on creating a realistic (that is: unpredictable, jerky) progress bar that created the illusion of a complex analysis taking 30 seconds or so. :-)
To measure progress you need to be able to identify single steps. However, that does not suffice. It only allows you to find out whether there is one more step to do or you are finished. In such a situation most programs show a bar bouncing back and forth, a turning wheel or something similar. If you also know that there are N steps to do, you can count steps, calculate how much of the task is actually done, and show a percentage.
To show users how long it actually will take you have to measure time for every step and estimate that the rest of the steps take the same amount of time. However, it is seldom the case that an operation takes exactly as much time as the previous operation. Even if you copy images, which are all nearly the same size. In the beginning the cache will speed up transfers, so you might get lower transfer times; then the cache is filled and more time is required. A background process, for example your e-mail program, fetches new mail. Suddenly the transfer capacity shrinks for a short time, delaying the image transfer.
Still, copy estimates are in most cases pretty good. But that is a simple task. Other cases, like installing something or processing something along a data-driven workflow, might not consist only of equally shaped steps; they might include different tasks, which are comprised of different steps, which may require different amounts of resources. To make it worse, the number of steps in sub-task B might be determined by the result of sub-task A. However, there is no good way to calculate an estimate for B from the number of steps of A.
As you can see, it is not that easy to calculate the total size or time required for a task. And most software developers do not go into the topic to find out. One approach would be to calculate I/O relationships (which fails when there are databases involved and you do not know how much data there can be in at the beginning), like X input records cause Y output records, complexities, like this operation is O etc. Also you could use past progress patterns to estimate future progress patterns (which would not help for progress bars on installers).
Once it has this info, then it needs to determine what other processes will be running and what resources they are likely to be using.
From this, the software can do some complex statistical analysis and determine that it will take 146 seconds, with a standard deviation of 8 seconds, and start displaying a fuzzy, yet highly accurate progress bar, but the user has sadly already terminated the process, wondering why the computer has been unresponsive for the last 10 minutes, whirring and clicking alarmingly...
Software 2.0 needs to have a progress bar on the progress bar estimation progress.
This problem was solved a long time ago... I still wonder why people don't know how to code a proper progress bar...
HOWTO:
I've left out the easy stuff but this provides for a 99% correct est. Please note that the computer must be from 1 year in the future for this to exceed 87% correctness without causing time dilation within progress meter by blocks remaining^66 msecs. Check Mfg date at start and adjust time est accordingly.
The above is copy-written and may not be used....
Prediction is very difficult, especially about the future.
Niels Bohr
Danish physicist (1885 - 1962)
This migration I'm doing has sat on 'less than a minute' for over 30 minutes...
Ah. Evidently, you are on Microsoft Time. Consider migrating to a different techzone.
They are possible. The problem is that the junior developers that create them generally don't have much of a clue.
Forget progress bars, give the user a game to play while they're waiting. You can implement Tetris in a couple of K, so that shouldn't bloat your installer too badly (and of course the game itself need not be installed).
Make it get harder as the installer progresses, so that it starts to become too difficult just as the job completes anyway.
Progress means : movement towards an end. A progress bar is basically informing you when a process should stop.
Yet, it is impossible to have an algorithm that can predict when an arbitrary process will halt [q.v. Turing, 1936]. Granted if we know something about the process, we can know certain bounds to the termination of the process, but this is far from simple.
The original post states that "less than a minute" is being shown but makes no mention of the status of the progress bar. This is the fundamental mistake, as the progress bar position and the time remaining are almost independent items. Yes, they should be linked as closely as possible, but as stated in lots of other comments, things change over time and therefore cannot be predicted.
I would suggest that the following implementation of a progress bar is more sensible.
Firstly, work out your operations and give them a weighting (a unit of weighting can be called an atomic operation). With file upload/download this is straightforward, but for other operations you may just resort to giving them all a weighting of 1.
Second you can have 4 types of progress
1. The number of operations complete out of all the operations you need to perform (represented as a progress bar)
2. The number of atomic operations complete out of all the atomic operations (represented as a progress bar) NOTE. If you have weighted operations which don't report progress in themselves then your progress bar will jump, e.g. if you have given a 50% weighting to one task but can't tell how far through that task is.
3. The time remaining based on the average rate since the start of the process: atomic operations remaining, divided by atomic operations completed, times the duration so far. (This is inaccurate, as for a long process which slows at the end you will get the 1 minute remaining scenario)
4. The time remaining based on the number of atomic operations performed in the last period e.g. 1 minute, extrapolated into the time taken to perform the rest. (This gives a more accurate representation of current progress)
Personally I think showing 1. and 4. gives the most feedback to the user and although the time may go up (as we see with download times) the progress bar is still accurate.
Ok it's not entirely perfect but worked fine for a database population program. When I say fine I mean I could tell if it was going to complete before I wanted to leave for the day.
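Progress type 4 above can be sketched roughly like this in Python. This is not the code from the database population program, just an illustration under assumptions: WindowedEta, advance and eta_seconds are invented names, and "atomic operations" are plain integer units.

```python
import time
from collections import deque


class WindowedEta:
    """Time remaining extrapolated from the rate over a recent window."""

    def __init__(self, total_units, window_seconds=60.0, clock=time.monotonic):
        self.total_units = total_units
        self.window_seconds = window_seconds
        self.done_units = 0
        self._clock = clock
        self._samples = deque([(clock(), 0)])  # (timestamp, units done so far)

    def advance(self, units=1):
        """Record finished atomic operations and forget samples outside the window."""
        self.done_units += units
        now = self._clock()
        self._samples.append((now, self.done_units))
        # Keep the newest sample that is already a full window old as the baseline.
        while len(self._samples) > 2 and now - self._samples[1][0] >= self.window_seconds:
            self._samples.popleft()

    def fraction(self):
        """Type 2 progress: atomic operations done out of all atomic operations."""
        return self.done_units / self.total_units

    def eta_seconds(self):
        """Type 4 estimate, or None while no rate can be measured."""
        t0, u0 = self._samples[0]
        t1, u1 = self._samples[-1]
        if t1 <= t0 or u1 <= u0:
            return None
        rate = (u1 - u0) / (t1 - t0)
        return (self.total_units - self.done_units) / rate
```

If the job does 10 units in the first 10 seconds and the next 10 units take another 90 seconds, the average since the start would promise 400 seconds, while the windowed rate says 720, which is closer to what the slowed-down process is actually doing.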
All I read was "blah blah blah I'm a stupid pencil necked programmer who promises but can't deliver". Shape up or ship out, Poindexter.
Some bars are more progressive than others
Having worked on several projects where we had to implement graphical representations of time left to run on processes which will take an unknown amount of time I feel qualified to answer this.
It's because you don't know what you don't know (but you know that).
Conversely, once you do know what you didn't know, well, then you do know. Of course, what you know is generally different to what you thought you knew when you didn't know, so you have to adjust for the difference between the unknown and the known - an amount which is unknown (until you know it).
So now you know.
PS this is why you should never ask a question you don't know the answer to. And when you do know the answer to the question there's no point in asking.
What you're asking for is a bar that shows you how far you've progressed through something. If you were in a race this would be easy, but your computer is not.
Your computer does not care how far it has progressed through something. It only knows it is at step x of y, and even then it has no idea how long step x will take on your hardware. Sure, it could make an estimate, but that would be asking the programmer to know everything about every system out there. Even then it will occasionally run into something called blocking functions. When it does, it has to wait until the function completes before your program can do anything, and thus you end up with a jumpy progress bar.
To drive this in I'll examine the loading of a game level for an mmorpg. To load a level your computer needs the following:
In video memory (this loading time can vary greatly depending on your graphics card)
the level layout polygons, any object polygons that may be seen in the level, the locations for each of those objects, the texture maps for everything that may be seen.
The computer needs to know (which, depending on dependencies, can vary greatly in time needed.)
the current status of your character, the statuses of characters around you, and any user interactions that can be performed.
On top of that the computer needs to establish any network connections that are needed. (And due to the nature of the internet, not even the best computer scientist can figure out how long these can take to be established.)
Now, assuming that everything works in a fixed time will allow the programmer to make a fairly reliable progress bar, but only for the test system he is running. If your processor is slower, your network connection is faster, and you have an average video card, well, all those calculations are now screwed up. Heck, even having an equivalent AMD card when the test PC used an NVIDIA card can screw with those timings.
This is to say nothing about hard drive caching, whether you are using a solid state drive or even a RAM drive, swap file usage, whether any other processes are running, and whether all software dependencies are properly installed. (e.g. the program may need to add in a few dependencies to ensure it will work.)
So all the programmer can do is give you his or her best guess, if they are even lucky enough to even know if any progress has been achieved. Even that is a lot of work, if you had any idea how much additional code they had to add just to get your progress bar to work at all, you might not be so crabby about the fact that it isn't all that useful.
I always try to hide any progress bar/animation/verbosity if possible so that the computer can focus on the actual task at hand rather than using CPU cycles to tell me how it is doing. This works especially well for large file transfers if you move the animated file off screen. I often reboot into a Linux live distro if I need to perform data transfers from a client's computer; they go a hell of a lot faster without the graphical user interface overhead, and even then I often don't enable verbosity so the copy procedure takes even less time.
It's simply not something coders worry about. They should but in general they don't. They create the bar then arbitrarily tie in random points to it.
Consider this: Once you've put progress on a bar, you can't take it off. Suppose you start a process that should take 20 minutes, and do the first 5 minutes, progress is now at 25%. But then, partway through, something unexpected happens and you realize the process is actually going to take 40 minutes. You can't take the progress "back" now, that would disorient the user. So you have to rescale the remainder of the bar.
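One way to sketch that rescaling in Python. The names MonotonicBar and revise_total are invented for the example, and the work units are whatever the process counts (minutes, in the 20 vs 40 minute case above). The idea is to freeze the bar where it is and spread the rest of the bar over the newly estimated remaining work.

```python
class MonotonicBar:
    """Displayed fraction never goes backwards when the total estimate changes."""

    def __init__(self, estimated_total):
        self.estimated_total = estimated_total
        self.done = 0.0
        self._shown_at_rescale = 0.0  # bar position when the estimate last changed
        self._done_at_rescale = 0.0   # work done at that moment

    def advance(self, amount):
        """Record finished work, capped at the current estimate."""
        self.done = min(self.done + amount, self.estimated_total)

    def revise_total(self, new_total):
        """Freeze the current bar position and rescale the remainder."""
        self._shown_at_rescale = self.shown()
        self._done_at_rescale = self.done
        self.estimated_total = max(new_total, self.done)

    def shown(self):
        """Fraction to draw, between 0.0 and 1.0."""
        remaining_work = self.estimated_total - self._done_at_rescale
        if remaining_work <= 0:
            return 1.0
        since = (self.done - self._done_at_rescale) / remaining_work
        return self._shown_at_rescale + (1.0 - self._shown_at_rescale) * since
```

With the numbers from the comment: after 5 of 20 minutes the bar shows 25%; revising the total to 40 leaves it at 25%, and the remaining 75% of the bar now covers 35 minutes instead of 15, so it simply moves more slowly from there on.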
This is why you should not put different "steps" in one progress bar, the software would just be lying to users. Use a 0..100% range for each step.
For each step, always show a useful number along with a progress bar such as "10/1003" or "(digested 10 of 1003 buckets of lava)".
ONLY if it's perfectly possible to predict remaining time, show that estimate. Factor in at step 1 that it has more work to do at step 2.
It cannot factor in unexpected failures of course, so state the time as a minimum ("at least 90 minutes remaining") or as a best-case, pessimistic running estimate. Don't even bother calculating to the second/minute if it's over 10 seconds/minutes.
Hivemind harvest in progress..
Back in the 80's and 90's - when DOS was still widely used - software didn't get written the same way as today. Languages like C, Pascal and Assembly didn't have Callbacks[1], as they are known today, and oftentimes the program variables themselves were globally addressable by any part of the program. This way, you could run a Busy Loop[2] doing a function call to, let's say, a disk read which would update a global variable with the number of bytes read. When the function returned, the global would be updated and you could re-draw the progress bar with the updated variable. If disk reads stalled or slowed for some reason, the function would not return as quickly, and the progress bar would not update as quickly, which in turn would actually show in the progress bar.
In modern languages, these Busy loops should be non-existent. (they were incredible resource hogs) You have Callback routines which usually hand off the long-running portions to another Thread of execution. If they don't, you can usually tell because the UI becomes "frozen" and won't update because the Thread running the main part of the program is also responsible for re-drawing the UI and it's off doing some long-running IO or something. The problem with handing execution off to another Thread is that now it becomes very difficult to code an accurate representation of actual progress. Dialog boxes, variables, thread-safe execution, UI updates -- it all suddenly becomes really difficult to do correctly because all of the parts needed to represent an accurate progress bar are strewn across multiple threads and sometimes you have no way of actually knowing when they have finished. At this point, most programmers just say "ahhh, f#ck it, put up a Spinner" and you end up with nothing more than an animated image showing a spinning circle/dots of some sort. It shows an expected pause in execution but you have no way of knowing if the application is still alive because it hasn't been coded in any way to tell you that.
[1] - http://en.wikipedia.org/wiki/Callback_(computer_programming)
[2] - http://en.wikipedia.org/wiki/Busy_loop
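For what it's worth, reporting progress from another thread doesn't have to end in a spinner. A rough Python sketch, with invented names (worker, run_with_progress, render) and a pretend job that does no real I/O: the worker posts how far it got on a queue, and the thread that owns the UI is the only one that draws.

```python
import queue
import threading


def worker(total_chunks, updates):
    """Pretend long-running job; posts (done, total) after each chunk."""
    for done in range(1, total_chunks + 1):
        # ... the real work for one chunk would go here ...
        updates.put((done, total_chunks))
    updates.put(None)  # sentinel: the job has finished


def run_with_progress(total_chunks, render):
    """Run the job on a background thread and draw progress on this one."""
    updates = queue.Queue()
    thread = threading.Thread(target=worker, args=(total_chunks, updates), daemon=True)
    thread.start()
    while True:
        message = updates.get()  # a GUI would poll with get_nowait() from a timer
        if message is None:
            break
        done, total = message
        render(done / total)
    thread.join()
```

Calling run_with_progress(4, print) prints 0.25, 0.5, 0.75 and 1.0. If the updates stop arriving, the UI thread at least knows that nothing has been reported lately, which a canned spinner animation can never tell you.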
Join the Slashcott! Feb 10 thru Feb 17!
Started, Done.
Problem solved
It is very easy to create a perfect, 100% accurate progress bar that works in all situations:
1 - During the first part of the job, show the words "Estimating time. Please wait."
2 - Do all the job, maintaining the "estimating" thing...
3 - After the job is complete, make up some numbers (e.g. "45 seconds left").
4 - Keep decrementing the time as accurately as possible. The user can't know that the job is already complete and you are wasting his time.
5 - The user will be pleased to see that your 45 seconds took exactly 45 seconds.
As Cypher said once, "Ignorance is bliss."
It is hard even if you don't try to do time prediction. Compare it to the classic errand of buying groceries:
make the list
watch the weather
dress accordingly
find your car keys
take the car and go to the store
find all items on your list
choose the good line
pay the cashier
go back to your car
drive back
unpack your items
How do you make an accurate progress bar? What is 30% done? How do you avoid sitting a long time on one percentage and then going really fast through others?
If you count percentages of tasks, some are fast and some are very slow. If you count time, it may work in your test, but it may not for someone else who lives farther from the store, drives faster, finds the store closed, etc.
Yes it's pretty difficult to choose good metrics for some complicated tasks.
There is a lot you can do to improve the accuracy of a loading bar, but at the end of the day life is too short. Your update bar code needs to be using the update process to benchmark the performance of the specific hardware, network etc., and use those data points to feed a model you have of expected timings. The more parameters and the more often you check them the better the accuracy will be. Don't forget to include estimates for the overhead of measuring.
If you have a lot of users then you could have the code send back to your server a bit of info about your system and a time measurement for the install. Then the next person to install would see "people with your type of setup took between 10 and 15 minutes to install. So far you are installing at the 95th percentile". Perhaps you have a checkbox next to the progress bar with "Enable insane progress stats". At least it would take your mind off the time it's taking to install....
You appear to be confusing progress with time remaining.
Dimensions at play include:
Progress:
Information (Bytes) downloaded
Current rate of information downloaded (Bps)
Avg rate of information downloaded
Time elapsed
Time remaining:
Information remaining to be downloaded
Expected avg download rate
Confidence/accuracy of expectation
Estimated time to go
Apart from genuinely broken progress paradigms, I think users have the expectation that progress gives them a meaningful representation of 'time remaining'. That is really hard to determine, due to all the variables mentioned in this thread. I am in my zen space if 1. I am reasonably warned ahead of time that the process might take as much as xx minutes, and 2. the indicator is actually giving me feedback that shit is happening, as opposed to "I'm stuck". Given the range of hardware and possible combinations of installed software, I don't think there's a problem at all if I get those two pieces of information.
DT
...I will make you an accurate progress bar.
Q: For how long should I microwave this popcorn?
A: 3 miles.
Welcome to the idiocy of measuring in the wrong units. We "all" know that just because you've driven 90% of the miles of a trip doesn't necessarily mean that you've driven 90% of the time of the trip. Your gps can usually give a decent time estimate, but it certainly won't predict that flat tire you're about to get. Expecting that from a progress bar is naive.
I never mention time remaining in my progress bars.... why do you assume that is what I am displaying?
I am very small, utmostly microscopic.
I honestly would be fine with a progress bar that simply displays a sequence of pseudo-random integers until it is finished, at which point it displays 0 (yes, this is stolen from Futurama). The most important aspect of a progress bar is usually just knowing that the process hasn't frozen. Sure, the estimates of how much longer I might be waiting are helpful, but short of the time estimate actually being correct, the time estimate is a secondary concern.
Making a perfectly accurate progress bar that works in any situation is equivalent to solving the halting problem. Can't be done.
And if your task is interacting with the network, how can it predict things like your DSL connection dying for a couple of minutes?
It's because you haven't invented a crystal ball yet, dumbass.
Actually, this has been done. The most useful progress indicators do the following:
1) Show overall progress
2) Show progress of subprocess
3) Have some type of message display that actually tells us what is happening (in fact having this may be more useful than showing progress of the subprocesses).
Here are some examples of great progress indicators (granted, not all are installers, but they are informative):
http://stackoverflow.com/questions/14684652/how-do-create-progress-bar-while-clicking-remove-button-in-nsis
http://doc.zarafa.com/7.0/Migration_Manual/en-US/html/images/MGR_Progress.png
http://www.codeproject.com/KB/files/Copy_files_with_Progress/copyfiles.jpg
http://help.comodo.com/uploads/Comodo%20Backup/b11a8045cb003891d886ace8f138a534/5eac818f1e1c4adc19d335055b06586b/d871fb826e82b18c4d9d5f28f76278a5/cbu_restore_final1_022012.png
http://openchrom.files.wordpress.com/2011/09/openchrom-installer-unpack.jpg?w=640
The last one I want to show is actually from a game I like, and I was having a ton of issues trying to find a screenshot of the progress indicator, so instead, I found a Youtube video. The installer is about 5 minutes in - when you first launch the game, you have a progress indicator. It's a little dark in this video, but in the upper left hand corner you can see how many files there are, what file it is on, if it's downloading or installing, etc. Probably one of the most helpful progress indicators I have ever seen:
http://youtu.be/ROOJFT6ae7M?t=5m2s
This is one of the dumber Ask Slashdots that I've seen in a while... "I'm not an auto mechanic but why can't my car run on water? Is it really that hard?"
Most things are not straight line linear events.
Bram Cohen, the inventor of the BitTorrent protocol, wrote some interesting articles on the difficulty of calculating accurate ETAs in distributed systems. They seem to have vanished now, but I'll link to the Google cache of the two most relevant.
http://webcache.googleusercontent.com/search?q=cache:http://bramcohen.livejournal.com/24122.html
http://webcache.googleusercontent.com/search?q=cache:http://www.mccaughan.org.uk/g/remarks/time-left.html
I may be rehashing what most posters here have already pointed out in different ways, but it comes down to the fact that we can't predict the future. PCs in general allow arbitrarily defined operations to happen (copy a folder from A to B, with unpredictable contents, hardware timings, available operating system resources, etc). Added to this problem is one of interpretation: what KIND of progress does the bar measure? Is it time, disk space, ordered task number, or what? All of the above?
Suppose I make a progress bar that measures the time until nuclear winter? We call that the Doomsday clock (and yes, progress bars CAN go backwards, my fellow slashdotters). Is it accurate? No one knows, and I hope we never find out.
But let's say that we only want a progress bar that measures time to completion. I actually like the file-copy progress bars in Windows 7. I think they finally got it right. The underlying hardware will vary in its speed, so the progress bar cannot inerrantly give estimated time to completion. But it does give enough information to satisfy me while I wait. I see the current MB/s of data transfer, the approximate time remaining (and data remaining), and a bar that shows how much of the data has been moved so far compared to the total amount of data to move. Not perfect by any means, but I am satisfied to wait. And that's the whole reason you have a progress bar in the first place.
And over there we have the labyrinth guards. One always lies, one always tells the truth, and one stabs people who ask t
You need to define exactly what progress is being tracked by a bar. For instance, if I'm tracking CPU load the bar will react differently than if I'm tracking memory load; these two progress bars will react entirely differently, but neither is wrong. If you're tracking overall system load from the viewpoint of the kernel you'll get one progress bar, and if you track user space load you'll get another. So it's not that the progress bar doesn't work, it just isn't being defined well enough for the average or unaverage user.
It is not a simple problem but it could be done.
If you know exactly what you need to do: exactly how much data you need to unpack, how much you need to copy to the hard drive in how many files, how many registry entries to make, etc.
Then if you knew the time these would take on an average computer, and ran a few metrics on the computer you are installing onto (and kept doing this throughout), as well as possibly looking at the hardware.
Do all this and you could get a very accurate progress bar. It is simply easier to semi-randomly put progress++; markers throughout your code that are accurate enough that the user does not give up and cancel or reboot the PC in frustration.
Troll is not a replacement for I disagree.
Give me a crystal ball and I will give you an accurate progress bar
Wait, what are you doing here then?
The progress bar is estimating an aspect of the behavior of a very complex system. Remember that the underlying system is not only your software, but all the other running software (OS included), RAM, peripherals such as storage devices and network interfaces, contents of the storage, and network traffic. The behavior of such a system can be at best captured in a stochastic model of some sort. A model, I must add, where a lot of the state variables are not subject to direct observation. What your progress bar can show is then, at best, something close to the expected behavior of the system. Capturing the model of the underlying system to produce a model requires hard core domain specific knowledge in stochastic modeling. It's precisely the stuff that "experienced programmers" proclaim loudly they don't need -- that they can somehow do their job while resorting to essentially high school level maths. I'm sorry, but if you don't even know what's out there when it comes to applied math, then you're in no position to boast about getting through all your programming without ever having to resort to any college-level mathematics.
Almost any decently performing progress bar would need to use some sort of progress monitoring framework. Such a framework would need to constantly capture system performance and estimate the state of the system. When the time comes for your progress-monitored task, your model will be "aware", for example, that there is a disc burn operation that already does 30 hard drive transactions/s, and that a network download into a fragmented filesystem takes another 15 transactions/s. In spite of the hard drive being able to stream 65 megabytes/s, you're IOPS bound and can maybe get 1 megabyte/s for your own use, and won't get more than 10-15 transactions/s. This is just a most basic example of the stuff you'd need to take care of. So, the performance monitoring framework belongs in the kernel, and it's usually there to one extent or another, but you actually have to build a stochastic model that can consume performance data and properly use it. It's nowhere near a trivial problem, and everyone who tries to trivialize it just makes a fool of themselves.
The good old progress bar is perhaps the clearest example of how you can never be a truly good software developer without knowing your math past high school. Yet everyone does it like they were still 16, and you get what you complain about. Real software engineering is hard, and requires lots of knowledge in applied mathematics -- simply because applied mathematics is the only tool we got that can do the job.
A successful API design takes a mixture of software design and pedagogy.
There has actually been a lot of research into how to make progress bars "feel" right -- it turns out that certain psychological tricks can help with that, too. Roughly speaking, it tends to be better to be conservative at the start (i.e. give a worst case estimate) and then improve it over time, than the other way around. Hence a simple trick to improve the user experience is to take whatever you think will be an accurate value for the progress, but then apply a scale to it to make it appear slower at the beginning, and faster at the end. This is studied and discussed in depth in the paper "Rethinking the Progress Bar": http://www.chrisharrison.net/projects/progressbars/ProgBarHarrison.pdf
For example, if x is the progress ranging from 0.0 to 1.0, then instead of using x directly, use f(x) = (x+(1-x)/2)^8 to calculate the progress estimate you are going to display to the user.
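In Python the formula above is a one-liner. The function name displayed_progress and the clamping of the input to the 0.0 to 1.0 range are my additions; the expression itself is the one given in the comment.

```python
def displayed_progress(x):
    """Map true progress x in [0, 1] to the value shown to the user."""
    x = min(max(x, 0.0), 1.0)  # clamp out-of-range input (an added safety net)
    return (x + (1.0 - x) / 2.0) ** 8
```

At true progress 0.5 this shows only about 0.10, so the bar crawls at first and speeds up towards the end. Note that it starts at 0.5 ** 8, roughly 0.4%, rather than exactly zero, and reaches exactly 1.0 when x is 1.0.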
The key observation here is that if I am told something will be finished in 1 minute, but then it turns out to take 2 minutes, I am upset; if instead I am told it will take 2 minutes, but then it finishes already after 1 minute, I'll be happy. Of course this has limits, and one needs to strike a delicate balance: if the original estimate is too far off and bad, the negative reaction to the initial bad estimate and how far it is away from reality will create a strong negative reaction on its own (what would you think if you were regularly told "performing this operation will take ~2 days" when it ends up needing only 1 minute each time...)
There are other tricks to make the UI feel "faster" when it comes to progress bars, see e.g. http://uxmovement.com/buttons/how-to-make-progress-bars-feel-faster-to-users/
Because you don't know how long it will take to process the stuff - and sometimes even how much *stuff* there is to process - until it's all been processed.
Trying to make a linear progress bar is a useful exercise, and worth what you learn about the problem. Not possible to be perfect, but being able to characterize run-time *should* be a part of the solution.
Consider the file download problem. You get questions like:
1) what are the relative sizes of the files?
2) can I get the server to indicate size before download starts?
3) how often are bandwidth drops? Can I predict them for the current user based on past trends?
4) what is the average transfer rate over days, including high load times?
Progress bars can be *more* linear. They got less so because we're doing more complex tasks now. When we really understand those tasks, they'll go back to being somewhat linear again.
The lazy programmers probably assign each of your 5 steps one fifth of the progress bar. The reasonable thing to do would be to keep track of the time taken by each step when they're testing the installer, and hard-coding the average time taken by each step in the progress bar code. This is not hard and I bet it's not being done.
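A small Python sketch of that idea. The step names and the average durations in STEP_SECONDS are made up for illustration, standing in for whatever was measured while testing the installer, and overall_fraction is an invented helper name.

```python
# Hypothetical installer steps with average durations (seconds) from test runs.
STEP_SECONDS = {
    "download": 50.0,
    "unpack": 25.0,
    "copy": 15.0,
    "register": 5.0,
    "cleanup": 5.0,
}


def overall_fraction(step_seconds, current_step, fraction_within_step=0.0):
    """Overall progress with each step weighted by its measured duration."""
    total = sum(step_seconds.values())
    finished = 0.0
    for name, seconds in step_seconds.items():  # dicts keep insertion order
        if name == current_step:
            return (finished + seconds * fraction_within_step) / total
        finished += seconds
    raise KeyError(current_step)
```

Halfway through "unpack", overall_fraction(STEP_SECONDS, "unpack", 0.5) gives 0.625, whereas one fifth per step would show 0.3. The weights are only right for hardware that resembles the test machine, of course.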
OK, I admit, random acts of nature seriously screw with the halting problem... OK, 50% done, 60%, 70%... Zzzrrrrchkipt!... Lightning fries the computer. Problem halted. Progress = 0.
But seriously, can we agree that random events should be dealt with using feedback? Hey, you lost your network conn... Do you want to proceed?
In general we can do a better job of estimating our task completion. Let's not use random acts of god as an excuse not to better characterize the run time of a task.
The real reason progress bars suck is that most developers are too lazy to calculate the total bytes in a set of files and base their progress bars on bytes handled. Most bars are written in the wrong metric because we're either too busy, or because the bar is low priority compared to the task itself. It's the same reason we hate estimating coding tasks... How long will it take you to solve these 7 tasks? I don't freakn know. Why don't I just do them and I'll tell you when I'm done.
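Here is roughly what the bytes-based version looks like in Python, as a sketch rather than a hardened copy tool. The names total_bytes and copy_tree and the report callback are invented, and it assumes a plain tree of regular files (no broken symlinks, destination outside the source).

```python
import os


def total_bytes(root):
    """Pre-pass: sum the sizes of all files under root."""
    total = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def copy_tree(src_root, dst_root, report, chunk_size=1 << 20):
    """Copy a tree chunk by chunk, calling report(done_bytes, total_bytes)."""
    total = total_bytes(src_root)
    done = 0
    for dirpath, _dirs, files in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        target_dir = os.path.normpath(os.path.join(dst_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            with open(os.path.join(dirpath, name), "rb") as src, \
                    open(os.path.join(target_dir, name), "wb") as dst:
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    dst.write(chunk)
                    done += len(chunk)
                    report(done, total)
    return done
```

The pre-pass walks the tree twice, which is the extra cost people complain about, but in exchange one huge file no longer counts the same as a thousand tiny ones.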
You know the bar chart which indicates resistance & progress.
http://resources.sport-tiedje.com/bilder/kettler/crosstrainer_ctr1_cp_detail.jpg
Difficulty could be added
Just be clear about what's going on. When downloading a package, indicate which package you're downloading, how much of it has been downloaded and the total size, as well as how many more packages there are to download.
If you don't know how much longer something is going to take, then don't give me a "x minutes remaining" estimate. Just show me what is currently going on so I can see that *something* is happening.
In order to have an accurate progress bar, one should be able to predict how long the overall process will take, and this can be very complicated or totally impossible to predict. Additionally, most software is quite badly coded and based on simple boolean control logic, and more sophisticated techniques [complex mathematical algorithms] are not routinely developed or used. How I would do it:
Run test code to measure operation speeds, then create and use a mathematical model (black box solutions like neural networks could be fine here) to predict correct progress. Create a learning dataset [for the neural network] by running the algorithm on different computers. Similarly, measure real progress and time use on the same machine so that the next time a progress bar is needed it is more accurate.
In other words, creating accurate progress bars just makes a very simple problem very complex and difficult, and there are typically much more important things to do (most software has bugs - some very serious) than trying to get progress bars to work 100% correctly.
Still, significant improvement could be made if somebody just created generic, self-learning progress bar class that would get more accurate the more often it is being used [the coder would just call new bar("problem-code"), bar.start(), bar.progress(0.75), bar.finish() and the algorithm would self-learn how much real progress 75% typically means for this specific problem with given machine specs (cpuid etc)].
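A very stripped-down Python sketch of such a class, following the start/progress/finish calls suggested above. Everything here is an assumption made for illustration: the name LearningBar, keeping the history in a module-level dict instead of on disk, ignoring machine specs, and "learning" by simply replaying the timing curve of the previous run with linear interpolation rather than any neural network. It also assumes the reported fractions never decrease within a run.

```python
import bisect
import time

# problem code -> list of (reported fraction, fraction of wall time) from the last run
_HISTORY = {}


class LearningBar:
    """Calibrates reported progress against how long the last run really took."""

    def __init__(self, problem_code, clock=time.monotonic):
        self.problem_code = problem_code
        self._clock = clock
        self._start = None
        self._marks = []  # (reported fraction, timestamp) for this run

    def start(self):
        self._start = self._clock()
        self._marks = [(0.0, self._start)]

    def progress(self, reported):
        """Record a reported fraction; return the calibrated fraction to display."""
        self._marks.append((reported, self._clock()))
        return self._calibrate(reported)

    def finish(self):
        """Store this run's timing curve for the next run of the same problem."""
        duration = self._clock() - self._start
        if duration > 0:
            curve = [(r, (t - self._start) / duration) for r, t in self._marks]
            curve.append((1.0, 1.0))
            _HISTORY[self.problem_code] = curve

    def _calibrate(self, reported):
        curve = _HISTORY.get(self.problem_code)
        if not curve:
            return reported  # first run: nothing learned yet
        xs = [r for r, _ in curve]
        i = bisect.bisect_right(xs, reported)
        if i == 0:
            return curve[0][1]
        if i == len(curve):
            return curve[-1][1]
        (x0, y0), (x1, y1) = curve[i - 1], curve[i]
        if x1 == x0:
            return y1
        return y0 + (y1 - y0) * (reported - x0) / (x1 - x0)
```

If the first run reported 50% after 9 of its 10 seconds, the second run will display 90% when the code reports 50%, because that is where the time actually went last time.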
Anyone who's put any thought at all into it can solve this problem. On occasion I'll run into a piece of software that does this right. Usually open source. The bottom line is that in the grand scheme of a project, the managers driving the bus are more concerned with functionality and time to market than with an accurate progress bar.
There have been too many times that the system has hung with the "working on it spinner" still spinning away merrily.
Show me actual information. Show me how many bytes have been downloaded, or which file has been installed. Show me something that proves that the system is doing something useful rather than busy-looping doing nothing.
It's just like my fucking gas gauge! I don't want to know what dot it's on, that's meaningless! Just tell me how many fucking gallons are left! Why is that so fucking hard?!?
Why can't the operation just tell me xxxxxx bytes out of xxxxxxxxx bytes (or KB, MB or GB, etc.) transferred? Or copied, or deleted, or modified in whatever way I ask it? Why does it have to tell me in minutes? We all know these numbers are bullshit.
Or how about this for a novel idea: PERCENT! Just tell me out of 100 parts, how many are done. How the fuck hard is that?
Ditto the gas gauge. I would love for it just to tell me I have 5.3 gallons remaining. Especially if it's ACCURATE! I would switch to fucking metric if I had to to get a goddamned gas gauge that told me the fucking truth, and not, "oh, you have about half, jefe!"
Fuck you, I don't want to know ABOUT! I want to know how much is left! We put a man on the moon, and we have probes on Mars, and they can't make a fuel gauge accurate down to a quart? Is that too much to ask?
I'm right with you brother! Let's have progress bars that actually tell you something about the progress, and while we're at it, gas gauges that read out in a useable fucking unit, not some weird approximation.
Not necessarily - as someone else pointed out earlier, for some things, it's possible that a long-running process may have portions that are themselves long, but also may encounter problems and need to be rolled back and retried. Then you're stuck between showing zero progress for a long time while running the portion that can be rolled back, showing a reversal of progress should the rollback actually wind up having to happen, or extending the progress bar in some way to show that there's now more to do. Alternatively, there may be operations that rarely need to be done as part of a process - rarely enough that including them in the normal estimate of how much needs to be done doesn't make sense.
(To take that back to your analogy - if you get partway to work and realize you forgot something you need, then you're not going to drive the same distance that day. A progress meter of your drive would then either have to stop until you return to the point you'd gotten to when you had to turn around, show backwards progress as you go back home, or add additional distance that needs to be driven. For the second alternative, an analogy would be encountering a detour, traffic accident, or other blockage that causes you to take a different, longer route that day.)
Of course, what you really need there is some explanatory text, so the user knows what the heck is actually happening... which is why I personally like progress bars that have a way to "open them up" to get more information about what's going on.
Hi -
I worked in software for 20 years. Better (not perfect) progress bars are certainly possible. I would say many companies put little effort into them because they are a non-essential feature of the software.
TWR
[The overall progress is] the only part of the UI that is significant for the user. The rest is clutter. As a geek you might be interested
No software marketed for home use is perfect on its first release, and tidbits interesting to "a geek" are probably useful for support while taking fewer resources than a full debug build. For example, reviews that mention that "the progress bar stops for a while on discombobulating splines" or "the ETA stops decreasing for a while on discombobulating splines" are a signal to the developer that he incorrectly estimated the fraction of time that "discombobulating splines" takes on end users' hardware, and that the next version should make "discombobulating splines" faster or expand the step's time estimate. Or would you prefer to solve this by giving one manufacturer a monopoly in order to limit the variety of end users' hardware?
Before we can build an accurate progress bar we must first figure out how to accurately predict the future.
Because, unlike the computers you see on TV and the movies, computers in real-life cannot be programmed to be psychic.
Yes it is.
I don't know what kind of engineer you are, so for the sake of illustration I will assume you design and build bridges. Let's say I ask you to build a bridge across a river. Can you give me an accurate estimate (to the hour) on when said bridge will be complete without you having done any surveying first and without knowing anything about the availability of the materials and labor required to build said bridge, or what the weather patterns will be like?
You could make an estimated guess, sure -- and that's exactly what progress bars typically do. Sometimes they are just bad at guessing (i.e. poorly programmed) or things happen that just can't *reasonably* be taken into consideration when the time prediction algorithm is coded.
Modern copyright is theft of culture from everyone and it retards the progress of the useful arts and sciences.
I'm not quite that angry yet. Also, knowing that your program isn't just hung is a reasonably good thing.
There are 10 types of people in the world. Those that understand this sig, and those that beat up people who do.
The actual and psychological underpinnings of the word 'progress' are illusionary and fleeting.
It would have been better to focus on 'pain' or 'struggle', which are born in actual life and have intuitive meaning.
Then, all of a sudden, you grind to a halt when the cache fills. You won't see any progress, sometimes for minutes
Even if you can't predict the caching mechanism, you can control it in some cases. Issue a sync() call (or syncfs(fileno(fp)) if available) before the copy begins, before every change in top-level folder (e.g. /opt vs. /usr vs. /etc vs. /home), and after every 10 percent.
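A rough Python sketch of the "flush every 10 percent" idea, for a single file only. The name copy_with_sync and its parameters are made up for illustration, and os.fsync() on the destination file stands in for the sync()/syncfs() calls mentioned above; treat it as one possible reading, not a tested recipe for any particular installer.

```python
import os

def copy_with_sync(src, dst, block_size=1 << 20, steps=10, report=print):
    """Copy src to dst, flushing to disk each time another 1/steps is written.

    Hypothetical helper: forcing the cache out at regular intervals keeps the
    reported percentage close to what is really on disk.
    """
    total = max(os.path.getsize(src), 1)  # avoid dividing by zero on empty files
    copied = 0
    next_step = 1  # next 1/steps boundary that triggers a flush
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(block_size)
            if not block:
                break
            fout.write(block)
            copied += len(block)
            # Integer comparison avoids floating-point drift at the boundaries.
            if copied * steps >= next_step * total:
                fout.flush()
                os.fsync(fout.fileno())  # pay the write cost now, not at 99%
                report(f"{100 * copied // total}% written and flushed")
                next_step = copied * steps // total + 1
        fout.flush()
        os.fsync(fout.fileno())
    return copied
```

The trade-off is that every forced flush costs throughput, so the copy as a whole may take longer even though the bar moves more honestly.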
Yes I love to watch progress bars suffer the fact that new threads and new processes are not even known to the progress bar,
setup
---- msiexec.exe
---- msiexec.exe
even if you don't have this problem, I think you should get a smoother progress bar using logarithmic math in there. (good god where's my calculator, what is it PI, LOG, SIN, COS -- looks like remedial non-linear simultaneous equations for me tonight - shit... shit SHIT... oh well... maybe I can snatch some EXPRESSIONS from the adobe after effects training videos?? or not, where's that fuckin Malvino book)
I recall doing a big project for College. The program in question was a big one, and there were a lot of background processes going on using a large amount of data. To compound that, it was written using VB, and using a GIS addon which wasn't exactly lightning optimized. In any case I can't remember what all the thing did now, something to do with old age homes or something. In any case it had a horrific load time on startup. I recall trying different methods, of selectively loading, etc... but it was almost better to get it out of the way rather than annoy the user continuously. So I wrote a progress bar for the application so that the user (i.e the prof that was grading the work) wouldn't think the thing had simply crashed at startup.
I also cheated horribly. All I did was time how long it took to load several times. Since it was using the exact same data and the exact same hardware every time, the time elapsed was for all intents identical. I then added a nifty showy flash screen (to simply waste more load time, I assume this is why everyone does it now) attached to a timer, then added a progress bar that incremented to another timer for the balance of the time left over. It had nothing to do with the actual work being completed, nor was it in any way connected to anything other than I knew the process took about 12 seconds or whatever. Once the timer finished it would bring up the now-loaded program all ready for use. Since I tied it directly to the timer, even if you put it on a faster computer, or used less complex data, it would still load in the exact same time. The only thing that would mess it up would be if you made the data even more complex, or put it on a slower system, in which case the progress bar would likely finish, and the program would load, but be unavailable for a few more seconds while it finished loading.
So anyway, it was mainly subterfuge to simply hide a kludgy loading process.
Progress should be reported to the user in outline format. Give the user the list of tasks that the computer is working on, show progress in each task. This is much more informative, and as a user - i'd feel more intimate with the process and in turn more trusting. also this might allow me to troubleshoot things that are moving slowly. some games are great at this, i've never seen it in an OS though.
I've seen them once or twice before. There are two reasons to have a progress bar. One is to know the "rate of progress", and the other is to know the amount of work done so far. When a progress bar freezes, it could also be because some background process just died, or someone forgot to catch an exception in time, or an async call doesn't handle a timeout or has a long timeout. Most technical users just use -v or --verbose options to see what's happening, rather than how much has happened as an unquantified bar. Maybe the solution is to have a sort of speed dial like that on cars. Backend processes can ping the log collector or some other globally accessible entity every time they "do" something useful. That would allow the speed dial to show the frequency of operations, giving a good idea of whether or not something is happening.
The progress bar implies the total time is proportional to the length of the bar and the time left is proportional to the unfilled part of the bar. And there is the mistake because each step could be unbound in time (even to copy a bunch of files with a known length could be delayed by unbound events -nonresponding server due to load, for instance-) so, the program must have a crystal ball to do it right. I understand that there is no excuse for some “progress bars” out there, but at least, there is no way to do the progress reporting right automatically. Maybe some programs should deemphasize time as a measure of progress and resort to “steps” instead ;-)
As a user, I want two things from a progress bar: (1) Should I wait for it to finish, or go for coffee? (or take a nap, go home for the night, etc.) and (2) If the process is taking a long time, is it just slow, or has it frozen? A progress bar that stops moving is as bad as a spinning wheel: it leaves me in the dark about (2). I wonder how many times I've rebooted the machine when I didn't need to?
I'm going to get rated down for this but fuck it. What the fuck is up with this site? Who thought THIS would be a good question to ask? Isn't it blindingly obvious that progress bars lie because in almost all cases the task at hand cannot be accurately measured? Could the submitter not have googled this? It's stories like this one that make me not want to know what other humans are doing with their lives. Better to just pretend they're not completely fucking brain dead.
There are several problems with progress bars:
1) there are some tasks for which progress -- and even present state -- are not easily measurable. For example, your Li-ion batteries look to be in pretty much the same state when you have anywhere from 10% - 100% power remaining. It's not enough to be able to program a progress bar: you have to have knowledge of the thing being metered to do it right.
2) There are some tasks for which the total workload is not readily known. If you have a simple (light on metadata) filesystem with many nested directories and fragmented files, you have no idea what the full scope of the workload is until it's almost done, unless you do some pre-processing. Metadata collected in advance (file size info) can help, but it's basically the same thing: you're pre-processing before you know whether, or for what, you'll need the data. And naturally, this pre-processing can be hefty for a few rare cases.
3) For some operations, progress so far doesn't strictly map to remaining workload: your algorithm could handle the fast, "low-hanging fruit" first.
4) On complex systems, time spent and progress don't map well to time remaining. As I said before, some portions of the workload could be inherently faster (as with some coding/decoding problems). The system or signal path could impose unusual delays, then relieve them, spoiling the calculation. The bar has to account for this with some knowledge of the system outside the algorithm. In other words, it's not enough to understand the problem, you also have to understand progress bars.
5) The progress bar only matters when the user is in a hurry. If you were comfortable with a very coarse granularity ("Come back in about an hour") you probably wouldn't care about the ticking of the bar. The edge cases define user's perception of the bar's effectiveness.
6) The bar actually conveys more information than progress. The bar is actually used more than it should be -- that is, instead of the hourglass/spinning clock/spinner -- because it conveys more information that the user demands: "Yes, I'm really working. No, I haven't locked up. No, I'm not in an infinite loop, trying to free up a non-existent resource, or waiting for input to a hidden prompt. Yes, I promise we will get there at some point if you just sit there patiently and refrain from bothering your nephew 'because he knows computers'." For tasks where the progress bar moves imperceptibly, you usually also have a message to the extent of, "I'm manipulating this data point now. Notice it's different than the data point I was working with 5 minutes ago. See, progress! Not locked."
In general, you probably need to take a multi-pronged attack to user notification: a spinner for pre-processing and simple tasks that will complete in under a second, a progress bar based on pre-processing for longer tasks, and some sort of very nebulous, "come back in an hour; I'm working on it," dialog for really long tasks.
... as long as you only care about 0% and 100% done. If you want the progress bar to reflect finer increments of work, say 10%, then it might be hard for at least two reasons: (1) there may be a large variance in different portions of the task tracked by the progress bar; (2) exceptional occurrences (network lag, errors, the user suddenly increasing load on the system) can change how long things take. In general though, progress bars are no harder or easier than the estimation task for what they should track. The estimation task is hell, partially because of leaky abstractions, partially just inherently. Progress bars with milestones can help, but there's no easy answer to the basic problem: it's the estimation that's difficult.
All those bars that go from 0->99 in no-time are filling up your cache, then they close the output files or call something similar to fsync() and then you have to "pay" the time it actually does take to flush the cached bytes to disk.
And doing small synchs in between (at 10,20,30% and so on) will of course lower the overall total, but may make your install bar viewers a bit more happy.
Obligatory xkcd cartoon.
God is imaginary
I can accurately measure bytes. I cannot predict that you will decide to defragment your HDD halfway through the file copy.
I can accurately measure discrete steps of a process. I cannot predict that you'll start CPU-mining bitcoins shortly after starting step 3 of that process.
I can accurately measure network throughput (to date or right now). I cannot predict that, 30 seconds after you start downloading the latest Debian ISOs, you will fire up a game of Minecraft that crushes your network connection.
I can't predict that your playing Solitaire while waiting for my progress bar will consume the last of your physical RAM and the system will start paging. I can't predict when your neighbor (on the same network segment) will decide to fire up a torrent session or 20. I can't predict that your CPU fan has clogged with cat hair and the system will go into thermal throttle.
And all that, just for processes with a known end-point. Plenty of scenarios don't have a known number of steps or bytes or CPU cycles - From something as simple as searching for a file containing a given string (might get it on the first try, might need to scan 8TB of porn before we find the right one), to recursively fetching dependencies with apt-get.
As much as we want everything our computers do to appear deterministic, it might help to consider most progress bars as less of a measurement of "% complete", and as more of an indication that the program hasn't crashed after three hours of doing nothing obvious.
for an indeterministic process is like providing a creationist with an accurate prediction of where evolution is going next. That's how difficult the progress bar problem is.
I have a script I wrote to copy photos from my camera that has an excellent progress indicator.
Before it starts, it finds the size of each file, so it knows how much data it will copy. Then it copies the files manually -- opening the source and destination and moving the data itself -- so that it can track the progress within each file. After it copies each block of data, it multiplies the time that has passed so far by the amount of data remaining, then divides by the amount of data copied so far, then rounds the result to an integer. If this new estimate is the same as the currently displayed estimate, or it is the same as the currently displayed estimate plus one second, then the display is not updated. Otherwise it converts the value to minutes:seconds and displays it to the user.
Watching it work is a dream come true. You tell it to copy the photos, and it tells you how many minutes and seconds the process will take, and if there's any jitter at all, it only occurs in the first few seconds. After that, the timer always goes down one second each second until it hits zero, at which point it's finished.
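For anyone who wants to try the estimate described above, here is a minimal Python sketch of just the arithmetic and the "don't redraw for a one-second wobble" rule. The class name EtaDisplay and its method are invented for illustration; the original script's language and structure are not known.

```python
class EtaDisplay:
    """Time-remaining estimate with the jitter suppression described above."""

    def __init__(self):
        self.shown = None  # estimate (in seconds) currently on screen

    def update(self, elapsed, copied, total):
        """Return a new "m:ss" string to display, or None to leave it alone."""
        if copied <= 0:
            return None  # nothing copied yet, so there is no rate to work from
        # time so far * data remaining / data copied so far, rounded
        estimate = round(elapsed * (total - copied) / copied)
        # Same value, or the displayed value plus one second: do not redraw.
        if self.shown is not None and estimate in (self.shown, self.shown + 1):
            return None
        self.shown = estimate
        return f"{estimate // 60}:{estimate % 60:02d}"
```

Calling update() after each block, with elapsed seconds and byte counts, gives "1:30" for 100 of 1000 bytes copied in 10 seconds; a later estimate of 90 or 91 seconds would then be ignored rather than redrawn.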
Honestly, this stuff isn't that hard. If you're copying files, you track how many bytes/second you're able to process. If you're processing images, you track how many pixels/second you're able to process. If you're doing both, you do both at the same time, rather than doing all of one then all of the other, so that you know sooner rather than later how long each process will take. (It also goes faster, if one task is I/O-bound and the other CPU-bound, if you do both at the same time.) If you're downloading files, you track the download rate. If you're accessing multiple servers, you download from all of them simultaneously, so that one of them being slower than the others doesn't mess up your calculations because you're still downloading at the full bandwidth of the internet connection anyway. (Also, this just plain saves the user time whether you're displaying a progress indicator or not, when one of the servers is far slower than their internet connection.) Obviously some situations will be difficult, like when the one large file you need happens to come from the slowest server, and it isn't apparent just how slow it is until all of the other files have completed because, until then, for all you knew, it was slow just because all of the bandwidth was consumed, but fuck... There are solutions to that as well. I've seen progress indicators that are full-screen, with a separate bar & estimate for each file. May not tell the user exactly how long the process is going to take, but gives them an excellent view of what is happening, and certainly tells them whether or not now is a good time to go to lunch.
I can understand if not all progress indicators run as smoothly as my own. What I don't understand is why essentially none of them run so smoothly.
There are four main resources used when installing software: CPU, mass storage, RAM and network. The problem is that these resources can vary greatly in speed. In general the process goes in the following order: download (network speed + HD speed), unzip (CPU + HD speed + RAM capacity), install (CPU + HD speed + RAM capacity and maybe some network). The issue comes in the estimate as to how long each stage takes. For example, a slow system hooked up to a fast net connection would zip through stage 1 and then stall on stage 2.
There is also the issue of multitasking. The install may be part way through stage 2 and then some other disk intensive process may kick in and grind it to a halt. One may be part way through a download and your torrent client may find lots of peers to talk to.
The problem with progress bars is that they attempt to use recent past performance to predict how much longer something should take. Due to the variability in performance and resource allocation the past performance has little to do with future performance; even seconds later. I have seen a number of install programs that no longer have a progress bar but a scanner that keeps moving to indicate something is happening but they don't know how long it will take. I prefer a scanner or a % complete bar.
No, in your drive to work analogy the progress bar going backwards is a detour. Because of the detour the distance has increased. I think the bar going backwards gives the user information. It tells the user that the install is not going as planned.
Progress bars do not make sequences of actions complete any faster. In fact, they make them slower.
The progress bar may use CPU cycles, but this will only make the task slower if the task is CPU bound. IO bound tasks won't go faster in the absence of a progress bar.
It makes sense to use a time estimate alongside a progress bar (as long as it is clear that it is an estimate, which is usually the case.) If the time estimate gets stuck (except, perhaps, on the last increment on which an update is possible to the progress bar, where it's quite possible that even a reasonably non-optimistic method would eventually asymptotically approach 0), it's a sign that the estimation method being used to produce the time estimate is probably poor.
You're confusing the issues of indicating progress and estimating the time that remains. Progress bars show you how much progress has occurred, not how much time remains. Ideally, "progress" is defined as how long it takes to do something, so that the progress increases linearly with time, but that doesn't mean that you label one end of the bar as "1 hour remains" and the other end as "0 hours remain" and move the bar back and forth when your estimate changes.
Imagine you're building a house, and you estimate a week to lay the foundation, a week for framing, a week for interior work, and a week for finishing. (...because, like the average programmer, you totally suck at creating estimates.) Then you spend two weeks laying the foundation. Regardless of whether you assume the remaining work will take three weeks or six weeks, you're still 25% complete. The only way the progress goes backwards is if, after the house is complete, someone points out that you forgot to have the foundation inspected, and the inspector rightly points out that you should go back to writing computer software where there are no engineering standards and your "it looks good to me" attitude is considered acceptable.
When people get confused over the simple concept of how a progress bar is even supposed to work, it's no wonder that they so often don't.
Captcha: Pompous
My $699 progress bar add on works perfectly. I didn't even have to reboot.
I don't think the progress bar measures time, but progress of the task being done. If the computer goes off to lunch, then the progress stops.
...has the most annoying progress bar I've ever seen. It doesn't measure the size of the files but the number of files so you have no idea how long it's going to take. It could be 95% done after 3 minutes and still take hours to finish. Do I have time to run to the bathroom or do I have time to go grocery shopping? (Or in the case of the preliminary download I might have time to significantly remodel my house.) That's all I want to know...
train your users better? Users have been exposed to uneven progress rates for years, you would think they could just understand that it is an approximate representation of a large amount of work that the computer is working through. I'm not a user; is it really that hard?
That's because the ideas to communicate are complicated.
Writing one large file IS NOT THE SAME AS
Writing a bunch of small files IS NOT THE SAME AS
Copying a file from the internet IS NOT THE SAME AS
Copying a file from the local network IS NOT THE SAME AS
Querying and updating the database IS NOT THE SAME AS
Reading files from a cd-rom IS NOT THE SAME AS.......
all of which a status bar tries to sum up, poorly.
...especially about the future.
It is not impossible, but it is not worth the effort to create or the extra processing time required to do it. It could even potentially add minutes to your wait time if done poorly.
Acronis does this every time I use it.
of course, .. once the outer bar is allowed to grow.
There is at least 1 progress bar which is perfectly accurate!
...Had this been an actual emergency, we would have fled in terror, and you would not have been informed.
CPU usage is not the bottleneck for file transfers. Your sensation of speed up is imaginary.
Use a moving average of N samples. All you have to do is figure out the correct N....
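A minimal Python sketch of that moving-average approach, assuming throughput is sampled at intervals. MovingAverageEta and its methods are made-up names, and the default N of 5 is arbitrary, which is exactly the joke: picking N is the hard part.

```python
from collections import deque

class MovingAverageEta:
    """Estimate time remaining from the average of the last N rate samples."""

    def __init__(self, n=5):
        # deque with maxlen silently drops the oldest sample once N are stored
        self.rates = deque(maxlen=n)

    def add_sample(self, units_done, seconds):
        """Record one measurement: units processed over a time interval."""
        if seconds > 0:
            self.rates.append(units_done / seconds)

    def remaining_seconds(self, units_left):
        """Return the estimated seconds left, or None if no usable rate yet."""
        if not self.rates:
            return None
        average = sum(self.rates) / len(self.rates)
        return units_left / average if average > 0 else None
```

A small N reacts quickly but jitters; a large N is smooth but slow to notice that the disk or network has changed speed.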
especially about the future ....
- Yogi Berra
My Favorite "progress indicator" is the one on GParted. Some "steps" can be 100-1000x as long as others and often there is no possible way to know (like chkdsk). So, it lists all the steps & their micro-steps (with timers?), shows you their command-line equivalent (educational), and gives you the most sensible progress possible within each micro-step (sometimes none). The aggregate is logical: In step 1 of 4? 25%, within there if you are on micro-step 2 of 5: 35%.
It helps because you aren't concerned when the progress is halted at an annoyingly long step. It tells you what to fix if you want better progress here. And it becomes easier to guess if something's halted vs if it's simply taking a while: If writing to MBR is the first step & takes minutes, something's wrong.
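The aggregate arithmetic can be sketched in a few lines of Python. This is one reading of the numbers above (one of four steps finished, plus two of five micro-steps finished in the current step, gives 35%), and the function name is invented; it says nothing about how GParted itself computes its display.

```python
def aggregate_progress(steps_done, steps_total, micro_done=0, micro_total=1):
    """Overall fraction complete, giving every top-level step an equal share.

    steps_done counts fully finished steps; micro_done / micro_total is the
    fraction of the step currently in progress.
    """
    per_step = 1.0 / steps_total
    return (steps_done + micro_done / micro_total) * per_step
```

For example, aggregate_progress(1, 4, 2, 5) comes out at about 0.35. Equal shares per step are obviously wrong when one step is 1000x longer than another, which is why showing the step list alongside the number matters.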
This is similar to the kernel boot process & some video compression software. When in doubt, give us more data.
Science & open-source build trust from peer review. Learn systems you can trust.
Getting it right all the time is impossible. Getting it right 80% of the time should be easy. Much of the problem is due to bad assumptions about 'progress'. E.g. the classic Windows file copying progress bar: I think it works on the basis of the number of files. The estimation doesn't take into account the size of the files, or the depth of the directory tree, or how busy the disk subsystem is.
What *should* happen:
* A better estimation algorithm taking into account the 4-5 leading factors.
* Smart modification of that estimate based on a weighted average of the progress to date.
Sure it's not going to take into account the 'getting hit by a bus' scenario mentioned in another response. But it should be able to make better and better estimates for MOST of the circumstances.
In addition, if the program is smart, and sees random events interfering with the progress, it could express a range.
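One way to sketch the "weighted average plus a range" idea in Python: keep an exponentially weighted mean and variance of the observed rate, and report a best-case and worst-case time instead of one number. Everything here is an assumption for illustration: the name RangeEstimator, the smoothing factor alpha, the one-standard-deviation band, and the arbitrary 10% floor on the slow rate.

```python
import math

class RangeEstimator:
    """Exponentially weighted rate tracker that reports an ETA range."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # weight given to the newest sample (assumed value)
        self.mean = None    # weighted mean rate, units per second
        self.var = 0.0      # weighted variance of the rate

    def add_rate(self, rate):
        """Fold one new rate sample into the running mean and variance."""
        if self.mean is None:
            self.mean = rate
            return
        diff = rate - self.mean
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)

    def eta_range(self, units_left):
        """Return (optimistic, pessimistic) seconds remaining, or None."""
        if self.mean is None or self.mean <= 0:
            return None
        std = math.sqrt(self.var)
        fast = self.mean + std
        # Arbitrary floor so a noisy rate never yields an infinite estimate.
        slow = max(self.mean - std, self.mean * 0.1)
        return units_left / fast, units_left / slow
```

When the rate is steady the two ends of the range coincide; when something starts interfering, the range widens, which tells the user something a single confident number would hide.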
Third Career: Tree Farmer Second Career: Computer Geek First Career: Teacher, Outdoor Instructor, Photographer.
On the one hand, the logical implication is clear: if we can't predict whether a given program will halt, we certainly can't predict when it will halt.
But, on the other, who in their right mind would tack a progress bar on a problem that is equivalent to the halting problem, like for example searching for proofs of mathematical theorems with no limitation in length?
An essential ingredient for Turing-completeness is precisely the ability to express potentially infinite loops, and in all cases of progress bars the loop is obviously bounded, like for example by the amount of data being transferred, or the amount of processing done for some algorithm.
What you are describing is stuff to go into log files.
In which case displaying the title of the current step becomes an option to expand or collapse a view of the tail of the log file inside the progress window. I've seen such a "show details"/"hide details" control in Nullsoft installers on Windows and in Update Manager on Ubuntu. Or are you talking about hiding the entire log file from the user's view until the entire process has completed or failed, so that the user has nothing to Google until it's too late?
The problem with progress bars is that computer science hasn't stressed measuring the time and resource cost of different kinds of computation. We talk about the Big-O notation and discuss profiling as if it were a curiosity. This is contributing to making operating systems sloppy about what kind of resources they'll allocate to a given process to the point where the idea of a "real-time" operating system is considered novel and different instead of the norm.
If we learned a holistic time and resource cost to our code and built in self-profiling in from the start, we could have extremely accurate progress bars, assuming they reflected a deterministic process, which the vast majority are.
Hey fuckface, what you're talking about isn't a progress bar, it's a time estimate. Yes, time estimates are often next to useless, but they aren't progress bars. Progress bars are, well, exactly that... Bars to indicate how much of something has been done, not how much longer that something will take to complete. Just because you often have a time estimate along with a progress bar doesn't mean one is the other.
The simple fact that a task is non-deterministic in nature should lead the developer to change "2 minutes remaining" to "approximately 2 minutes remaining" or something similar.
that the author is so stupid he doesn't understand the problem,
or that slashdot has sunk to such new lows as to even entertain this question in the first place.
A progress bar can be accurate for processes that perform one single action, when all relevant factors are under the program's control or at least predictable. Such as copying one single file, or burning an ISO.
But processes that perform multiple actions depending on a variety of requirements are quite another story.
For example, adding or updating an application through the Ubuntu Software Manager in simple terms involves downloading the software (which requires bandwidth) and installing and configuring it (which mostly depends on hard drive/CPU speeds). In many cases, especially for software from non-Ubuntu repositories, software must be downloaded from different servers, each with its own bandwidth, etc. The progress bar for that process is rather inaccurate.
Progress bars for processes that build or update data warehouses or produce complex reports from multiple sources are also very tricky for the same reasons.
Modern operating systems are non-deterministic in terms of resource allocation. In other words, you don't know if you're going to have 90% of the CPU or 10% and you don't know if you've got an empty hard drive, or one that's got 1% left and is completely fragmented. On top of that, you don't know if you're running on a single processor generation 1 Pentium, or the most modern multicore processor, so 90% "of what" is undefined (in terms of CPU). You _could_ create a lab of like 20 different PCs (like one for every year since the Pentium came out or something) running various flavors of hardware and doing various things (attempting to broadly simulate your audience), and you could get "in the ball park" for estimated time for your multistep task using statistical analysis of these varying machines, but we're talking about a progress bar here.... is it really worth all that money, time, expense, just to make you feel better about an installation that's going to happen once, maybe twice?
Now, we could be developing on top of real time operating systems, in which case resource allocation is deterministic (or at least very close to deterministic), and your progress bar would behave very predictably. However, you probably wouldn't like the trade off. Your "looky looky, I can do 15 random things at once, and I don't have to know anything to create Powerpoint presentations" computer would suddenly become much more demanding of your brain, and your software selection would be reduced significantly.
For a file transfer process why not use a picture of two tanks separated by a pipe. Water level in the tanks for source and destination. Size of the pipe and a flow rate for bandwidth and read/write speed. You could tell what an estimated completion time was based on and why it was changing.
You seem to regard science as some kind of dodge... or hustle.
Progress bars are accurate when they measure progress (which many do).
People complain about their accuracy when they fail to predict/indicate how much longer it's going to take to finish the task. News flash: the application has no idea what the future holds! It knows only about the past, and when the future doesn't match the past, any assumptions predicated on past behavior turn out to be inaccurate as the future comes.
I would like to add: duh.
You're probably right about it not being CPU related, but for large tasks verbosity definitely slows the procedure down.
One of the reasons that progress bars are so hard is that people architect their applications incorrectly.
Progress bars are an overview of the whole process. If you don't design your application for that use case, you'll get small, atomic operations with no real cohesion - ie: what you tend to get when you do OOP design. Data encapsulation makes it so the left and right hands have no idea what's happening. Plus, your UI code is somewhere off in the distance, and you can't really get to it.
It can be very difficult to create a single progress bar that communicates the progress of several unlike tasks. I attempted this once for a project I was doing. I had several file system intensive processes and many other types of processes. (About 40 different unlike tasks in all)
I gave a 'weight' to the different processes based on my observation of how long each took. It worked great "on my computer" when I was done. (Though the progress bar accounted for about 30 percent of my code at that point and added quite a bit of complexity)
When I ran it on the production machine it was way off. I adjusted my 'weight' values for different processes to make it better and eventually changed it to give a text description of each time consuming step with percentage complete of each major task. My customer said the exact same thing "is it that difficult to make an accurate progress bar?" The answer is yes.
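The hand-weighted approach described above might look roughly like this. It is only a minimal Python sketch, not the poster's actual code: the task names and weights are invented for illustration, and the weights stand in for whatever durations were observed on the development machine.

```python
class WeightedProgress:
    """Overall progress across unlike tasks, each with a guessed weight."""

    def __init__(self, weights):
        # weights: task name -> relative cost observed on ONE machine (an assumption)
        self.weights = dict(weights)
        self.done = {name: 0.0 for name in self.weights}  # fraction done per task
        self.total = sum(self.weights.values())

    def update(self, name, fraction):
        # Clamp to 0..1 so a misreporting task cannot exceed its share of the bar.
        self.done[name] = min(max(fraction, 0.0), 1.0)

    def percent(self):
        finished = sum(self.weights[n] * self.done[n] for n in self.weights)
        return 100.0 * finished / self.total


# Hypothetical tasks and weights, tuned "on my computer".
bar = WeightedProgress({"copy files": 60, "update registry": 10, "build index": 30})
bar.update("copy files", 1.0)
bar.update("update registry", 0.5)
print(f"{bar.percent():.0f}%")  # prints 65%
```

The bar says 65% because the weights say so, not because 65% of the wall-clock time has passed. On a machine with a slower disk the same weights would be wrong, which is the "way off in production" effect.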
I agree that little progress bars flickering all the time are not useful other than that they show that it is doing something. The lack of any feedback would cause the user to want to end the process before it is complete. (Worst case scenario if you don't give the user any progress indicators)
I think that unless extensive testing is done, most developers put a relatively small amount of effort into creating a progress bar and go on in ignorant bliss believing that it is fairly accurate. It is difficult to get project managers to classify progress bar accuracy as a high priority item. (Especially since most of them are so inaccurate and we are all used to it)
If you really want an elaborate and accurate progress bar, you would first have to come up with a benchmarking system to test the resources of the machine you are installing on. (This would run before anything else happens.) You would then need to apply a weight to each type of process based on those tests, run it on hundreds of different configurations with different amounts of available resources, and write something to capture the results and adjust your weighting system accordingly on the fly.
All of this will add cost and delays to the project and the installer will take more time to account for the benchmarking and constant checking and adjustments to the weighting system. I guess anything is possible given enough time and money.
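A much cheaper cousin of that idea is to skip the up-front benchmark and correct the estimate as steps finish. The Python sketch below is an assumption-laden simplification, not the full scheme described above: it uses a single slowdown factor for all step types, and the step names and reference timings are made up.

```python
def remaining_seconds(expected, actual_so_far):
    """Estimate the time left by scaling reference timings.

    expected:      list of (step, seconds measured on a reference machine)
    actual_so_far: dict of step -> seconds actually measured on this machine
    """
    done_expected = sum(sec for step, sec in expected if step in actual_so_far)
    left_expected = sum(sec for step, sec in expected if step not in actual_so_far)
    if done_expected == 0:
        return left_expected  # nothing measured yet: trust the reference numbers
    # One global correction factor: how much slower (or faster) this machine has been.
    slowdown = sum(actual_so_far.values()) / done_expected
    return left_expected * slowdown


# Hypothetical reference timings in seconds.
expected = [("unpack", 10), ("copy", 30), ("configure", 20)]
print(remaining_seconds(expected, {}))                # 60
print(remaining_seconds(expected, {"unpack": 20.0}))  # 100.0
```

Once "unpack" takes twice as long as expected, the estimate for the remaining 50 reference seconds doubles to 100. That only helps if the later steps are slow for the same reason as the earlier ones, which is exactly what a per-type weighting would try to address.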
In my opinion after adding hundreds or even thousands of lines of code and a self adjusting weighting system you may or may not be happy with the accuracy of the progress bar. It would most certainly have to go forward and backwards to be as accurate as possible if estimates are constantly changing or it would have to freeze at a certain percentage until the new estimate catches up to the old estimate. (either way would not be perfect)
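The "freeze until the new estimate catches up" option is easy to sketch. This is a minimal Python illustration under the assumption that some other code produces a percentage estimate that can jump around; the class name is hypothetical.

```python
class MonotonicBar:
    """Displayed percentage never moves backwards; it holds until the estimate catches up."""

    def __init__(self):
        self.shown = 0.0

    def update(self, estimate):
        # Cap at 100 and keep whichever is larger: the old display or the new estimate.
        self.shown = max(self.shown, min(estimate, 100.0))
        return self.shown


bar = MonotonicBar()
readings = [bar.update(p) for p in (10, 40, 25, 30, 55)]
print(readings)  # [10, 40, 40, 40, 55]
```

The bar sits at 40 while the underlying estimate drops to 25 and 30. The user never sees it go backwards, but it looks stalled during that stretch, so neither choice is perfect.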
Programmers love to go down rabbit holes like this, but in business you will go broke spending thousands of dollars on progress bars unless the project is really huge and the progress bar is an essential part of it. (I cannot think of a scenario that meets these criteria, but one may exist.)
...that you can't ask a software developer when the code will ship.
The honest ones will say "when I'm done with it." Others will make up a date and then miss the deadline anyway, because unexpected shit happens.
Seriously, computers can't magically solve problems that humans themselves haven't any hope of solving.
Read up on Turing's halting problem.
While I am possibly the least qualified reader of Slashdot to attempt to answer this question, my guess is that it's the same reason I have trouble telling my boss when I'll have a given task or project done. Namely, different parts of the project take different amounts of time depending on difficulty, some of those processes depend on yet other processes that I can't directly measure myself (because others are involved), and background processes occasionally spring up as a high priority event that interrupts whatever it is you're asking me to measure (occasionally even causing me to never get back to the original process).
I can tell you that it's not always easy to give back a progress indicator which is meaningful to the user. The user generally wants to know how *long* is left, time-wise, whereas most progress bars indicate how much of the overall activity is left to complete. The OP actually makes this point quite clear.
And therein lies the proverbial rub: if you're, for example, unpacking a small app, but you send some kind of statistic or registration information over a network at the end, even though that last sub-action is small (in comparison to the overall process), you're at the mercy of network latency, so that could be anything from 5 seconds to whatever network timeout has been set.
Trying to give a useful ETA on a progress bar / percentage feedback: now that's a challenge. Just for chuckles, check out http://code.google.com/p/fappy -- it's a playlist generator written in Python. I wanted some kind of ETA on there, but I'll be the first to admit that it takes a while to settle and the ETA may rise. You can only make future predictions based on past experience, so whilst you may have zipped through the first 1000 of 20000 files really quickly, you could then hit a bunch of super-fragmented files, wait longer on disk IO, and see your ETA rise.
So the short answer is that it's quite easy to provide a progress bar displaying, essentially, a percentage of completed tasks within a procedure. But tying a progress indicator to an ETA, or making sure that every percentage point costs the same amount of time -- that is far from trivial.
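To make the rising ETA concrete, here is a minimal Python sketch of the usual extrapolate-from-the-past estimate. It is not taken from the project linked above; the function name and the timings are made up for illustration.

```python
def eta_seconds(items_done, items_total, elapsed_seconds):
    """Naive ETA: assume the remaining items cost, on average, what the finished ones did."""
    if items_done == 0:
        return None  # no history yet, so there is nothing to extrapolate from
    return elapsed_seconds * (items_total - items_done) / items_done


# The first 1000 of 20000 files fly by in 10 seconds.
print(eta_seconds(1000, 20000, 10.0))   # 190.0
# The next 1000 are badly fragmented and take 90 seconds: 2000 done in 100 seconds.
print(eta_seconds(2000, 20000, 100.0))  # 900.0
```

More work is done, yet the estimate goes from about three minutes to fifteen. Nothing is broken; the past simply stopped being a good guide to the future.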
We could use the same algorithm to report to the boss "How far along is your project?"
Star Trek transporters are just 3d printers.
Your progress bar may have been accurate as far as the total task completion, but not as an estimate of the time remaining to complete it.
It is well known in computer science that there is no general way for a program to determine how long an arbitrary process will take to complete, or even whether it is guaranteed to complete at all. Combine that with the fact that your program does not get complete control of the CPU and instead shares time with other tasks, and an accurate time estimate is out of reach.
Even if the ETAs are increasing rather than decreasing because of the slowdowns you mention, they will still be reassured that the process hasn't frozen.
If a time remaining display ends up fluctuating between (say) 1 minute and 1 hour depending on what step the process is on, the user gets the impression that the estimation is uselessly inaccurate. In this case, showing the title of the current step assures the user that the time remaining display isn't just wired up to display random numbers to placate the user.
Ideally, the program would write a log file containing the title and completion timestamp of each step, and it would send that log file to the developer to help improve the estimation in the next version. But I imagine that a lot of users aren't willing to enable that out of phobia against applications that "phone home". Do you check the "Customer Experience Improvement Program" box (or other publishers' counterparts) when you install software? Showing the title of the current step gives the user something to talk about in reviews even if the user chooses not to share the log file, as a form of indirect customer feedback.
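A step log of that kind takes very little code. The Python sketch below is only an illustration: the step titles are invented, the "log file" is an in-memory buffer, and nothing is sent anywhere.

```python
import io
import json
import time


def run_steps(steps, log, clock=time.time):
    """Run (title, action) pairs, showing each title and logging when each step finishes."""
    for title, action in steps:
        print(f"Now: {title}")  # the user always sees what the installer is doing
        action()
        # One JSON object per line: the step's title and its completion timestamp.
        log.write(json.dumps({"step": title, "finished_at": clock()}) + "\n")


log = io.StringIO()  # stand-in for a real log file
run_steps([("Unpacking", lambda: None), ("Registering", lambda: None)], log)
print(log.getvalue(), end="")
```

Differences between consecutive timestamps give per-step durations, which is the data a developer would need to improve the estimate in the next version if the user agrees to share the file.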
The reason you put up with the hoop jumping is because the overall gain in cleanliness and layout of the UI dominates the very rare occurrence of wanting to see this information.
Perhaps it's because I'm a geek, but I disagree that it's a "very rare occurrence". Say the progress dialog has a show/hide button to show or hide this tail display, placed next to the random number generator labeled "estimate of remaining time". How exactly does removing this show/hide button produce an overwhelming "overall gain in cleanliness and layout of the UI"? What it does is provide an incentive to keep the "estimated remaining time" honest.
The problem is the faulty conception that a progress bar has to predict how long something is going to take, when in reality a "progress" bar should indicate how much has been done out of the total to be done. Remove the time prediction and put in something that really tells the user that something is happening. For example, if you are moving a file, show a "Bytes moved" counter or something similar. The user will know that the computer is working on the request, and your progress bar can then show how much has really been done.
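A "Bytes moved" counter along those lines could be sketched like this in Python. The function name and the chunk size are assumptions for illustration, and in-memory buffers stand in for real files.

```python
import io


def copy_with_counter(src, dst, total_bytes, chunk_size=64 * 1024, report=print):
    """Copy src to dst in chunks, reporting bytes moved so far. No time prediction at all."""
    moved = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means end of input
            break
        dst.write(chunk)
        moved += len(chunk)
        report(f"{moved} of {total_bytes} bytes moved")
    return moved


data = b"x" * 150_000
src, dst = io.BytesIO(data), io.BytesIO()
copy_with_counter(src, dst, len(data))
```

Every number shown is a fact about the past rather than a guess about the future, so it cannot be wrong in the way an ETA can.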
This gave me an idea for a better progress bar. I even made a rough animated GIF.
Some details:
To my thinking, this solves the problem of having to 'lie' to the user, while still giving them some useful information about remaining time and assurance that something is happening.
I'd value some feedback on this idea.
That sounds like you spend a lot of time using a lot of broken software.
No software is shipped perfect.
[logs] will simply look like "some linux scrolling by" or "the matrix". Those users will instantly be made fearful of the application
You claim that the availability of logs necessarily induces fear. I'd like to see evidence of this.
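; AutoHotkey (v1 syntax): ask how many tasks to run, then update a progress window once per task.
InputBox, usrlp,
Loop %usrlp%
{
    ; Set the bar's range to 0..usrlp and put the current task number in the window title.
    Progress, R0-%usrlp%, , , Working on task %A_Index% of %usrlp%
    ; Move the bar to the current task number.
    Progress, %A_Index%
    ; [TASK] is a placeholder for the real work.
    [TASK]
}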
software that can't even install without failing is pretty spectacular.
That depends on the difference between "installation" and "initial configuration". Is a device driver "installed" if no device is connected? Is a program "installed" if it hasn't been able to download essential components through the network? Here, I am including any initial configuration step that does not involve user interaction in installation.
Consider programs that install by downloading components through the network, programs designed to communicate with a network service, and programs that verify the user's license to execute them through the network. If the server is unreachable, the installation will fail. If the network is unexpectedly slow, such as 0.05 Mbps dial-up or EDGE where the developer was expecting 5 Mbps cable, the estimation of remaining time will jump around, and the user will be curious as to why. A user who can't easily view why will suspect that the program is hiding something from him.
Or consider the installer for a printer driver. If the user hasn't connected the printer, seated the ink cartridges, and inserted paper correctly, initial configuration will fail.
the user should be informed "something is happening"
The program has already proven itself untrustworthy by showing the inaccurate progress bar and ETA. It needs some way to assure the user that "something is happening" in a way that the user isn't inclined to immediately distrust.