Ask Slashdot: Why Is It So Hard To Make An Accurate Progress Bar?
hyperorbiter writes "How come after 25 years in the tech industry, someone hasn't worked out how to make accurate progress bars? This migration I'm doing has sat on 'less than a minute' for over 30 minutes. I'm not an engineer; is it really that hard?"
Yes it is "that hard".
I did it all for the penguins!
Yes it is. And to be fair, it's a lot more accurate than Nostradamus ever was.
Things are asynchronous. You wait for things from disk, RAM, user input, the network, etc. How long each will take is non-deterministic, so a task composed of a bunch of these little pieces will be non-deterministic too.
One reason is that the progress bar starts out as just a generic tool to show that your loading hasn't frozen. At first it matches the elements to be loaded, but as scope increases and more things load, it can get sketchy later on.
Another reason is that it is difficult to estimate time left. If you look at some old FTP programs, they'd estimate the rest of the download's time based on how fast the transfer had gone so far. Future lag, fragmented files, etc. aren't taken into consideration.
There are a bunch more reasons, but the progress bar's main purpose is to show you that the whole system isn't locked up, which they've been doing well for the past 30 years or so.
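The old-FTP-client style of estimate mentioned above can be sketched in a few lines. This is only an illustration, not any particular client's code; the function name `naive_eta_seconds` and its parameters are made up for the example.

```python
def naive_eta_seconds(bytes_done, bytes_total, elapsed_seconds):
    """Estimate time remaining by assuming the average speed so far continues."""
    if bytes_done <= 0 or elapsed_seconds <= 0:
        return None  # nothing measured yet, so no estimate is possible
    average_rate = bytes_done / elapsed_seconds  # bytes per second so far
    return (bytes_total - bytes_done) / average_rate


# 50 MB of 100 MB transferred in 10 seconds: the guess is 10 more seconds.
print(naive_eta_seconds(50_000_000, 100_000_000, 10.0))
```

Nothing in this calculation knows about future lag or a slow disk, which is exactly why the number can be badly wrong.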
God spoke to me
Copying files. Sure, get a list of the files to be moved, get the size, and as files go across, start the % progress meter. What if the network starts slowing down as you start to copy? New files are added. You used a rough calc to get a vague idea as it was 10x faster, but when you start copying, there are a lot of files bigger than you thought. Network's fast, but the end machine you're copying to is having problems keeping up. You start hitting cache; it was fast (and skewed the result) till then, now it's crawling.
Installations. All the fun of copying files, but you're updating existing files too, the file system may be fragmented, and some of the .ini files may need extra work as you get to them. Drivers to install may take longer than expected. Once installed, you have to generate/compile/download extra, and that's more rough guesses.
As long as the hourglass/cursor/spinner is spinning, and the % is going up now and then, that's probably the best you can ask for. The trend for guesstimating time remaining seems to be diminishing, as surely the main thing most people want to know is 'is this still working or has it hung?' For anything else, use logcat, catch stderr, or click 'more details' to find out what it's actually doing.
It COULD be more accurate perhaps, but you'd spend so long working it all out in advance, for 9/10 things, it'd have been quicker to just do it.
Waiting for an amusing sig.
It's very hard to predict how long something will take, particularly in relation to other things, if what you're writing is going to be on any number of platforms with different processors, storage, memory and network situations.
You can be reasonably accurate with it, far more than my favorite 99% in 1 second, the last 1% in one hour scenario. There are cleverer and cleverer ways of making it ever more precise, but those methods usually involve spending time on getting it right, and not many people do it.
There's probably a patent for a "method or apparatus for an accurate display of progress", and nobody wants to mess with that. (But seriously, most of my inaccurate progress bars deal with unpredictable things like I/O, or non-uniform sets like loading textures and meshes and animations all together, so who knows how much time it will actually take to process the same amount of data?)
--
Stay tuned for some shock and awe coming right up after this messages!
You know someone is going to take your suggestion literally as a tutorial on how to implement a progress bar - later they'll come back with some mystical crash always happening at 0%.
Patience is a virtue, but haste is my life.
See http://scribblethink.org/Work/kcsest.pdf and http://scribblethink.org/Work/Softestim/softestim.html
(No, I'm not being serious. The topic just reminded me of when I once jokingly justified a poorly estimated ETA on a "simple" development project by referencing the above paper.)
My favorite terrible progress bar was Internet Explorer, back in its early days of essentially being a renamed version of NCSA Mosaic. When attempting to load a site that wasn't available, the progress bar would slowly creep towards complete, despite the server being completely unresponsive. Then after a long while the browser would give up and stop the progress bar. Why on earth would the progress bar move if the server is completely unresponsive? Who programmed this? I would hope that they, like the inventor of Clippy, suffered a terrible death by fire.
Consider this: Once you've put progress on a bar, you can't take it off. Suppose you start a process that should take 20 minutes, and do the first 5 minutes, progress is now at 25%. But then, partway through, something unexpected happens and you realize the process is actually going to take 40 minutes. You can't take the progress "back" now, that would disorient the user. So you have to rescale the remainder of the bar.
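A rough sketch of that rescaling, assuming work is measured in some unit such as minutes. The class `RescalingBar` and its method names are invented for illustration; the idea is only that the bar freezes what it has already shown and spreads the rest of its length over the newly estimated remaining work.

```python
class RescalingBar:
    """Bar position that never moves backwards when the estimate grows."""

    def __init__(self, estimated_total):
        self.total = float(estimated_total)  # current estimate of total work
        self.done = 0.0                      # work completed so far
        self.anchor_done = 0.0               # work done at the last re-estimate
        self.anchor_shown = 0.0              # bar position at the last re-estimate

    def advance(self, amount):
        self.done = min(self.done + amount, self.total)

    def re_estimate(self, new_total):
        # Freeze what is already on screen, then stretch the rest of the
        # bar over the work that remains under the new estimate.
        self.anchor_shown = self.shown()
        self.anchor_done = self.done
        self.total = max(float(new_total), self.done)

    def shown(self):
        """Fraction of the bar to draw, between 0.0 and 1.0."""
        remaining_at_anchor = self.total - self.anchor_done
        if remaining_at_anchor <= 0:
            return 1.0
        since_anchor = (self.done - self.anchor_done) / remaining_at_anchor
        return self.anchor_shown + (1.0 - self.anchor_shown) * since_anchor


bar = RescalingBar(20)   # expected: 20 minutes
bar.advance(5)           # bar shows 25%
bar.re_estimate(40)      # surprise: it will really take 40 minutes
print(bar.shown())       # still 25%, but the bar now moves more slowly
```

The price is that the bar's speed no longer means the same thing before and after the re-estimate.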
Progress bars do not make sequences of actions complete any faster. In fact, they make them slower.
That being said, take for example an installer that must perform the following steps during an upgrade:
0. Figure out how many files need to be replaced.
1. Replace 30 files of varying sizes.
2. Add 10 files.
3. Update half a million rows in a table with a million rows, setting a column to a computed value based on some predicates.
4. Run a third party installation mechanism (MSM?) for a supporting library, etc.
Modern computers are time-sharing systems. Each process that involves computation is at the mercy of the scheduler in the kernel to give it the cycles it needs to complete. That means that even if you measure the time it takes to complete some process, it's not going to be the same a second time, because the installation process doesn't get undivided attention.
Steps 0 - 2 - you're at the mercy of the IO buses, hard disk, antivirus software interfering, etc.
Step 3 - What shape are the database statistics in? How efficiently can you apply the predicates? What does the distribution of the data look like? You can't tell this ahead of time...
Step 4 - Does this third party installer provide you some sort of metrics as it runs?
These are the sorts of problems to be overcome to do an accurate progress bar. In short, they aren't worth overcoming.
For over 50 years rocket launch countdowns have not run in a linear fashion, sometimes even being set backwards.
...would be a PB combined with some way for the app and OS to tell me why things have slowed down, in plain language. This would not be impossible to do.
-- This sentence is false.
The programmer does what the boss tells him to do.
The Progress Bar lost its functionality when Windows 96 started to use it just as something that moves on the screen. No real processing is associated with the Bar or Animation activity!
On the other hand, there are programs that do it right! Unfortunately, one of the best examples I have is a console Linux program: The Midnight Commander - so very few people nowadays are exposed to a correctly written Progress Status notifier mechanism...
Lisias@Earth.SolarSystem.OrionArm.MilkyWay.Local.Virgo.Universe.org
You can work out where you are (% completed) or how fast you are going (rate at which the progress bar is growing), but not both at the same time.
It's simple quantum mechanics.
I am anarch of all I survey.
They have their progress bars sorted perfectly. Great game too!
I'm not signing anything
i'm a computer programmer. it's easy to make an accurate progress bar. take the total, take the current, divide. done. i don't know why windows progress bars and time estimated are so messed up. they're clearly doing something totally wrong. if not many things. as usual.
Clearly you're not a very good one.
As a happy middle, we could just change the UI widget to detect when a signal is lying to it and offer up a spinning (or otherwise infinite) processing ball.
I mean, what's worse, being stood up on a date or not knowing for sure if s/he was going to tag-along on a group activity that you invite him/her to?
Is there anything better than clicking through Microsoft ads on Slashdot?
The computer is able to measure its data throughput, read/write times, etc.
Network latency, seek times, a sudden Windows Update that decides to start installing packages in the middle of the process... there's a BILLION different things that could be happening, all of which can drastically alter the speeds and therefore throw the estimate right off the table. And that's only when you're just doing straight-up file transfer, but what if you're doing something that requires altering pre-existing files, to boot?
Basically I am saying that it is certainly possible to make a progress bar that mostly does what it's supposed to, but you're going to be writing lots and lots and lots and lots of code and using quite a bit of CPU-time just to keep updating the estimate at all times, not to mention that the bar must then always be written specific to that single application and task at hand. And to what end? A bar that shows an estimate for how long the process will take, nothing more -- all the time spent coding the bar could be used to do bugfixing, providing new features or honing existing ones and the CPU-time could be used by other programs or to, you know, do something more useful in the meantime.
Mandatory Car Analogy: I know that if my speedometer indicates 60 miles/hour, that in one minute I will have travelled one mile. That's predicting the future son!
If you were to compare this to computers then you'd have things like sudden goose swarms jumping right in front of the car at unexpected times, all 4 tires losing friction, a bunch of other drivers on a single-lane road all fighting each other for the privilege of being first in line, your car suddenly starting to perform maintenance and cleaning on itself while you're trying to drive, mandatory pit stops at variable distances where you always have to perform this or that manual task, and so on. Does it sound so easy at this point?
The progress bar on the HTTP download doesn't show the amount of remaining time in the bar. It shows the number of bytes remaining in the bar; the number of bytes remaining can't go into reverse. The time remaining is shown as a numeric value for how long it would take assuming the speed is the same as the speed so far; if the transfer suddenly slows down, this value can go into reverse.
It is hard to make an accurate progress bar because it shouldn't be a bar at all - it should be a graph.
Consider the humble download: bytewise, it might be 97 percent complete, but at the last moment, the bps rate has fallen. With a progress bar indicating a percentage and an estimated time, it might say 97% complete, 3 seconds to go. If the progress indicator was a graph, you could tell that the bps rate has fallen, and that the 3 seconds to go estimate (probably based on a linear extrapolation of progress to date) does not apply.
I have never seen it done though. Partly, because I have never done it.
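Short of drawing the graph, one way to see the effect described above is to compute the estimate twice: once from the average rate over the whole transfer and once from the last few samples only. The sketch below is an illustration under assumptions; the function names, the `(elapsed_seconds, bytes_done)` sample format and the `window` parameter are all invented for the example.

```python
def rate(start, end):
    """Bytes per second between two (elapsed_seconds, bytes_done) samples."""
    seconds = end[0] - start[0]
    return (end[1] - start[1]) / seconds if seconds > 0 else 0.0


def two_etas(samples, bytes_total, window=2):
    """Return (eta from overall average rate, eta from recent rate).

    samples is a list of (elapsed_seconds, bytes_done), oldest first,
    with at least two entries. An ETA is None when its rate is zero.
    """
    remaining = bytes_total - samples[-1][1]
    recent_start = samples[max(0, len(samples) - 1 - window)]
    etas = []
    for start in (samples[0], recent_start):
        r = rate(start, samples[-1])
        etas.append(remaining / r if r > 0 else None)
    return tuple(etas)


# Fast for 20 seconds, then the transfer slows to a crawl near the end.
samples = [(0, 0), (10, 480), (20, 960), (30, 970), (40, 980)]
overall, recent = two_etas(samples, bytes_total=1000)
print(overall, recent)  # under 1 second by the average, 20 seconds by recent rate
```

When the two numbers disagree this much, the linear extrapolation clearly no longer applies.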
... when we solve the halting problem. I'm not entirely joking. The main problem with progress bars is that, quite often, it is not possible to accurately estimate how much time is needed to complete a problem (i.e. for the program to halt).
Loban Amaan Rahman ==> Anagram of ==> Aha! An Abnormal Man!
The public opinion of the Progress Bar would be considerably more favorable if programmers would simply treat 100% as if it were 75%.
In other words, do all the stuff you have to do, measuring progress and whatnot, but when you're actually at 80%, report yourself at 60%. Likewise, when you're at 95%, say you're at 70%.
Then, only when you really are completely finished, you jump from 75% to 100% in under a second.
Complaints gone.
-David
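The trick above amounts to a one-line mapping. A minimal sketch, with `reported_percent` and the `ceiling` parameter as made-up names:

```python
def reported_percent(actual_percent, ceiling=75.0):
    """Map real progress onto 0..ceiling, and show 100 only when truly done."""
    if actual_percent >= 100.0:
        return 100.0
    return max(0.0, actual_percent) * ceiling / 100.0


print(reported_percent(80))   # 60.0
print(reported_percent(95))   # 71.25
print(reported_percent(100))  # 100.0
```

With a straight linear scaling, 95% real progress reports as about 71% rather than exactly 70%, which is close enough for the purpose.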
You know what I'd like to see more than a working progress bar? A "Cancel" button that actually stops the f*%! process!
I don't want to finish the sub-process I'm currently doing (which has probably stalled)... just FREAKING STOP.
If you (programmer) want to close connections, or save the changes to the disk, do it in the background. Making me sit there for another 10 minutes while you're "cancelling..." is not helpful. I will force close your program. Failing that I will hard-reset the computer. Seriously.
Unlike porn, which yada yada rimshot hey-ooh!
Give this man some upward moderation.
The problem is that the question is wrong. It's trivial to make a progress bar...just sum up all the things you have to do, and move the bar each time a "thing" is done, rounding to the nearest pixel. It doesn't matter if the "thing" is a byte to copy, a file to install, or any generic task. As long as you can add one to a counter each time you have done another "thing", you can then display it graphically.
The actual complaint is about displaying accurate time remaining to complete the task, which really has nothing to do with the display of the progress bar. Instead, it involves guessing about how long each remaining "thing" will take to complete, and then displaying that sum of those times. This is hard because no matter how accurate the data used to make the guess, something outside the control of the program can disrupt the processing.
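The "count the things" bar from the first paragraph really is only a few lines. A sketch, with `bar_pixels` and its parameters invented for illustration:

```python
def bar_pixels(things_done, things_total, bar_width_px):
    """Width in pixels of the filled part of the bar."""
    if things_total <= 0:
        return bar_width_px  # nothing to do, so draw the bar as full
    fraction = min(max(things_done / things_total, 0.0), 1.0)
    return round(fraction * bar_width_px)  # nearest pixel


total_files = 10
for done in range(total_files + 1):
    print(done, bar_pixels(done, total_files, 200))
```

Note that this says nothing about time: each "thing" moves the bar by the same amount whether it took a millisecond or an hour.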
I like the new Windows 8 transfer bar that shows instantaneous transfer rates (not average) as it progresses left to right. This gives you a very accurate idea of when the transfer will finish, as it can slow down or speed up based on either network congestion or file size (a bunch of small files transfer slower than a few big files).
Careful with names containing L slashdot.org/~AiphaWolf_HK slashdot.org/~AlphaWoif_HK slashdot.org/~AiphaWoif_HK
Once you know the past, it gets a little easier. I actually had to do this for a customer. A SQL select statement was taking about 5 minutes (lots of data), so the initial progress bar would be stuck at 1%. Knowing that it took about 5 minutes from testing, I increased the progress bar's total and incremented at a rate of 10 seconds appropriately via variable X -- which was initially stored as 5 minutes for all clients. Every time each client performs the operation, it computes the new time it took and adjusts variable X accordingly for each specific client. It's actually quite accurate now. However, the equation is trivial and only computes an average. An artificial neural network could perhaps make it even more accurate. But, if you don't know the past, then yes, predicting the future is hard.
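A minimal sketch of that kind of per-client history, not the poster's actual code. The class `DurationHistory`, its method names, and the choice to cap the bar at 99% until the query really finishes are assumptions; so is the decision to drop the 5-minute default once a client has real measurements.

```python
class DurationHistory:
    """Remember how long an operation took for each client and average it."""

    def __init__(self, default_seconds=300.0):
        self.default_seconds = default_seconds  # initial guess: 5 minutes
        self.totals = {}  # client id -> (sum of durations, number of runs)

    def record(self, client_id, seconds):
        total, runs = self.totals.get(client_id, (0.0, 0))
        self.totals[client_id] = (total + seconds, runs + 1)

    def expected_seconds(self, client_id):
        if client_id not in self.totals:
            return self.default_seconds
        total, runs = self.totals[client_id]
        return total / runs

    def percent_at(self, client_id, elapsed_seconds, cap=99.0):
        """Bar position after elapsed_seconds, never above cap."""
        expected = self.expected_seconds(client_id)
        if expected <= 0:
            return cap
        return min(cap, 100.0 * elapsed_seconds / expected)


history = DurationHistory()
history.record("client-a", 240.0)
history.record("client-a", 360.0)
print(history.expected_seconds("client-a"))  # 300.0
print(history.percent_at("client-a", 150.0))  # 50.0
```

The bar would be redrawn on a timer (every 10 seconds in the description above) by calling `percent_at` with the time elapsed so far.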
The G
No, wait. It seems to have stalled.
Have gnu, will travel.
Have you tried PowerShell? It has replaced the "MS-DOS Prompt", if that is what your experience is with. In some ways PS is more advanced than UNIX; for example, it allows you to pass data in an object-oriented fashion, avoiding the need to constantly parse space-delimited data with various commands (awk, sed, xargs...).
"How come after 25 years in the tech industry, someone hasn't worked out how to make accurate progress bars? This migration I'm doing has sat on 'less than a minute' for over 30 minutes. I'm not an engineer; is it really that hard?"
Yes, because all progress bars are inherently a prediction of things that will happen in the future. If there is any error condition, unusually large blob of data or weirdly structured hard drive to read from, varying bandwidth bottleneck, fritzy peripheral not responding as expected, etc., etc. times a million, then the unusual event will make the prior prediction incorrect and look silly in retrospect. As long as there is any "if-then" clause or error handling in the branches in the system, then the unexpected can happen and make the prediction (progress bar) invalid.
It's analogous to weather prediction. It can't be perfect, it's an extrapolation, but people will always complain about it.
We know where leadership by an anti-intellectual "strongman" who scapegoats minorities and likes boisterous rallies goes
Is why.
Hmm. I wonder if I wrote an app that was nothing BUT progress bar, if people would go for it.
Some developers have already come to the conclusion that installation is a prime advertising timeslot. So even if anyone was inclined to write a progress bar, it'll still end up ad-laden and annoying.
I'm trying to teach myself to set people on fire with my mind... Is it hot in here?
Show progress of the work, not the time. You never know what other things the machine is doing, and it is better to give no estimate than a bad one.
And don't settle for just a bar; also provide information on what the process is doing.
-- The Internet is a too slow way of doing things, you'd never do without it.
This problem was solved a long time ago... I still wonder why people don't know how to code a proper progress bar...
HOWTO:
I've left out the easy stuff but this provides for a 99% correct est. Please note that the computer must be from 1 year in the future for this to exceed 87% correctness without causing time dilation within progress meter by blocks remaining^66 msecs. Check Mfg date at start and adjust time est accordingly.
The above is copyrighted and may not be used....
Except even the number of tasks is often variable over the life of the task.
Take for example loading a web page. It starts out as 1 task: Get a page from the server. Once you've done that, how many more requests will that first request generate? Impossible to tell. It could be none. It could be hundreds, and some of those can generate their own requests. (etc, etc.)
The answer is this: Some feedback, no matter how incorrect, is better than no feedback at all.
Progress should be reported to the user in outline format. Give the user the list of tasks that the computer is working on, and show progress in each task. This is much more informative, and as a user I'd feel more intimate with the process and in turn more trusting. This might also allow me to troubleshoot things that are moving slowly. Some games are great at this; I've never seen it in an OS though.
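An outline like that could be as simple as one line of text per task. A sketch, where `render_outline`, the `(name, done, total)` tuple format and the status words are all made up for illustration:

```python
def render_outline(tasks):
    """tasks: list of (name, done, total) tuples; returns a text outline."""
    lines = []
    for name, done, total in tasks:
        percent = 100 if total <= 0 else min(100, 100 * done // total)
        if percent >= 100:
            status = "done"
        elif done == 0:
            status = "waiting"
        else:
            status = "working"
        lines.append(f"- {name}: {done}/{total} ({percent}%) {status}")
    return "\n".join(lines)


print(render_outline([
    ("Replace files", 30, 30),
    ("Add files", 4, 10),
    ("Update database rows", 0, 500000),
]))
```

A slow task then shows up as a line stuck on "working" while the others are done, which tells the user where the time is going.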