Slashdot Mirror


Preload Drastically Boosts Linux Performance

Nemilar writes "Preload is a Linux daemon that stores commonly-used libraries and binaries in memory to speed up access times, similar to the Windows Vista SuperFetch function. This article examines Preload and gives some insight into how much performance is gained for its total resource cost, and discusses basic installation and configuration to get you started."

46 of 144 comments

  1. LiveCDs do this... by SaidinUnleashed · · Score: 4, Informative

    This is exactly why live CDs like Damn Small Linux (and Knoppix, if you have a ton of RAM) run so fast if you load the CD image to RAM. RAM is fast!

    --
    Shiny. Let's be bad guys.
    1. Re:LiveCDs do this... by ozmanjusri · · Score: 2, Informative

      It felt like I was running XP on a machine with the minimum specs and running bloated software. Even the mouse was jerky.

      Something's not right there. Puppy's normally responsive on machines that'd be slow with 98SE.

      --
      "I've got more toys than Teruhisa Kitahara."
    2. Re:LiveCDs do this... by ozmanjusri · · Score: 2, Informative

      Puppy's like lightning on my Core 2 laptop. If the mouse is lagging, I'd suspect the graphics card/driver. Try selecting a different Vesa mode next time you boot and see what happens.

      --
      "I've got more toys than Teruhisa Kitahara."
    3. Re:LiveCDs do this... by Anonymous Coward · · Score: 4, Informative

      Make sure you pass the "toram" parameter when you boot the livecd at the kernel load prompt. You can press the various function keys at boot time to find the correct method.

    4. Re:LiveCDs do this... by tacocat · · Score: 3, Insightful

      Several things come to mind when I read the post.

      I thought Linux cached used libraries in RAM already, resulting in the appearance that Linux was always using up all my memory but wasn't really. If true, then this basically does what? Guesses what you want to use and loads them for me? Decides what I use a lot and makes certain they never fall out of memory? In both cases, someone is not using my resources in an optimum manner.

      If I take the price of my first desktop computer and use it to purchase a new computer at Dell, I am moving up 40 times in speed, 2x in architecture bus, 4x in cores, blah blah blah. Compared to the last computer I purchased (2006), I can still easily get something with double the performance.

      So, not sure what you need in performance, but between the stupid amount of computing power and Linux already doing a lot of in-memory caching there might be a pretty small margin for improvement. But I guess what I really struggle with is the idea of someone/something trying to proactively determine what I'm going to use and then force my computer into a certain behavioural pattern that is making assumptions about my use. Sure, it screams marketing demographics, but even without a PR department for Linux I still don't think there is sufficient need for something like this.

      Can someone elaborate on practical reasons where this is something I would really need?

    5. Re:LiveCDs do this... by Ibn+al-Hazardous · · Score: 5, Informative

      You start with a false presumption. I do not know what distro you use, and can't tell you if that does anything nifty - but "Linux" sure as hell does not do this already. If another app has already loaded a shared library, it may well be in RAM but it can just as well be swapped out. For all other cases, the answer is probably that your shared libraries are not cached or preloaded - and so this will give you quite a speed up.

      The thing that eats all your RAM is nothing Linux specific at all, it is your applications asking for more RAM than they are currently going to use. Why should they do such a thing? Well, what do you think memory management would look like if hundreds of apps, daemons and kernel threads ask for two bytes at a time? It'd paint a pretty fragmented picture, so they ask for gobs of pages at a time. Pages seldom touched get swapped out, but still there's an awesome amount of overallocation - thus your memory seems to be 100% allocated 100% of the time.

      So, preloading libs that are frequently used is probably going to use your RAM in a more meaningful way unless you already have a problem with constant swapping.

      --
      Yes, I am a biological organism. All rumors to the contrary are just that, rumors.
    6. Re:LiveCDs do this... by joto · · Score: 3, Insightful

      I disagree with both of you. There should not be a need for this. Linux memory management should be closer to optimal for desktop users, but unfortunately the current strategy just doesn't work. It's optimized for servers, paging out interactive apps whenever there's something going on in the background.

      In particular, the locatedb daemon makes everything unresponsive because Linux caches every file on your file-system it touches, even though it's pretty much guaranteed nobody else needs those files anytime soon. This may be theoretically "optimal" in the general case, but it certainly doesn't feel that way for desktop users. Most desktop users would be more than happy to have background jobs run slower if it didn't impact responsiveness. Also, I believe many people would prefer predictable response times; it's better for the disk to churn while loading a huge file, instead of it churning everywhere else to page in libraries that have been paged out because the huge file is in memory.

      Adding a daemon to predict shared library usage is a step in the wrong direction. Not because it doesn't fix the problem, unfortunately I haven't tried it, but sure, it might even work fantastic. It's a step in the wrong direction because it's a kludge, and not a proper fix for having memory management strategies in the kernel that the users actually want. Unfortunately, fixes to this problem are hard to do, and every time someone tries to do a proper fix, it is debated to death on the kernel mailing list, and then dies slowly as it ages out of tree. For all I know, it's also the right decision, if it should be in-kernel, it should also be *right*. A daemon might be a better place to experiment, and hell, if it solves the problem for 99% of the users, we might not even need to change the current strategy, which is certainly right for servers. After all, we live with kludges other places, such as the X Window System needing to be root and accessing raw kernel memory.

      But yeah, memory management is complicated. I doubt you can solve this on a piece of paper. If it works, I'm all for it! Maybe this is a proper kludge?

    7. Re:LiveCDs do this... by mhall119 · · Score: 2, Informative

      I thought Linux cached used libraries in RAM already, resulting in the appearance that Linux was always using up all my memory but wasn't really

      Linux uses a disk cache in RAM to keep from re-hitting the HD for often-accessed files. It actually is using up all your memory that hasn't been allocated to an application (which is good, because unused memory is wasted memory), but it will drop some disk cache space when other applications need more memory.

      If true, then this basically does what? Guesses what you want to use and loads them for me?

      Essentially, yes. One of the bigger bottlenecks to application startup is disk seek/read times. By performing this action in the background before it is requested, you won't hit that lag time. The new versions of Java are doing something similar to speed up the cold-start time for Java apps.

      Another benefit, and I'm not sure if Preload does this, is to arrange the files on the HD so they are not fragmented, and are in the same position on the disk, so that a single (or small number of) read can copy everything into RAM, instead of hitting the disk over and over.
      --
      http://www.mhall119.com
    8. Re:LiveCDs do this... by EvilRyry · · Score: 2, Informative

      The kernel already supports hinting like this. Indexing programs should throw the kernel the hint that the files it reads should not be cached. Whether the programs actually do this or not is another matter.
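
      The hinting mentioned above can be sketched roughly as follows. This is only an illustration, assuming a Linux or other POSIX system where Python exposes `os.posix_fadvise`; the function name `read_without_caching` is made up for this example, and the hint is advisory, so the kernel may ignore it.

```python
import os

def read_without_caching(path, chunk_size=1 << 20):
    """Read a file, then hint that its pages need not stay in the page cache.

    Returns the number of bytes read. The hint is advisory only.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            total += len(chunk)  # an indexer would inspect the chunk here
        # Not every platform provides posix_fadvise, so check first.
        if hasattr(os, "posix_fadvise"):
            # offset=0 and length=0 together mean "the whole file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

      An indexing program would call something like this for every file it scans, so that its one-off reads are less likely to push out pages that interactive applications still want.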

    9. Re:LiveCDs do this... by The+Mighty+Buzzard · · Score: 3, Interesting

      Unless you're still coming from the Windows mindset where you're used to closing an application after every use of it, preload isn't of much use at all. If you never close an application, startup time is not an issue. The firefox window I'm posting this response from now has an uptime longer than any windows box with automatic updates turned on and is only clocking in at 118M/22M resident/shared. I could possibly see it being of some use if you actually open and close OO.o regularly (it's a slow, bloated beast even by Microsoft standards) but that's an argument against OO.o not an argument in favor of preload.

      This is linux, people. We like tiny apps that require almost zero load time that you can chain together with pipes. Catering to bloated, poorly coded, Microsoftesque apps shouldn't be an issue for us.

      --
      Violence is like duct tape. If it doesn't solve the problem, you didn't use enough.
    10. Re:LiveCDs do this... by billnapier · · Score: 2, Informative

      Sigh. The reason Linux always reports all your memory as being used is the page cache, where it caches pages that are read from block devices (like your hard disk). Physical pages in memory that are unused (as opposed to virtual pages that your application just hasn't accessed yet) are used to store data read from disk in case you need to access it again. If your application starts to actually use pages that it allocated (like accessing things in shared libraries), Linux will dump those disk cache pages from physical memory and start using those pages for the data your app needs. It can easily do this because it knows it is just a copy of what is on the disk and could easily be recovered.

      Take a look at "Understanding the Linux Kernel" from ORA for 2 excellent chapters on how all this works.
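
      For anyone who wants to see these numbers directly, here is a small sketch that parses text in the format of `/proc/meminfo`. The helper names `parse_meminfo` and `reclaimable_kb` are invented for this example, and "buffers plus cached" is only a rough estimate of what the kernel can hand back, since dirty pages must be written out first.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (such as huge page counts) are not.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields

def reclaimable_kb(fields):
    """Rough estimate of memory holding disk data (buffers + page cache)."""
    return fields.get("Buffers", 0) + fields.get("Cached", 0)
```

      On a Linux box, passing `open("/proc/meminfo").read()` to `parse_meminfo` and comparing `MemFree` with `reclaimable_kb(...)` shows how much of the "used" memory is really just cached disk data.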

  2. ramdisk by wcpalmer · · Score: 2, Insightful

    I read a guide on the Gentoo forums a while ago about copying different directories into RAM to "preload" them.

    http://forums.gentoo.org/viewtopic-t-296892.html

    I never actually tried it, but I might now that I have 4GB of RAM! A daemon to help automate this process would be welcome, though.

    1. Re:ramdisk by arivanov · · Score: 2, Informative

      I do this on a couple of systems that see only "occasional" use so I can spin down the disks. Works quite well actually.

      --
      Baker's Law: Misery no longer loves company. Nowadays it insists on it
      http://www.sigsegv.cx/
  3. Re:What, what? by guruevi · · Score: 5, Informative

    I don't know what rock you were under, but preload has been available for a while:

    preload 0.2 release: 2005-09-01

    And it was there before as it was packaged in Gentoo (back when it was still popular) and Suse 9.3

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  4. Re:What, what? by bersl2 · · Score: 4, Insightful

    (or wherever Microsoft stole it from, if that's the case).

    OS X had prebinding before Vista had SuperFetch. And they got the idea from somewhere else.

    Just let it go. This pissing match over innovation serves no useful purpose.
  5. Hope... by deadmongrel · · Score: 4, Funny

    it doesn't make GNU/Linux as *fast* as Microsoft Vista.

  6. Apple OS X by Thomas+M+Hughes · · Score: 2

    Is this functionality available in Apple's OS X?

  7. Nice but... by pizzach · · Score: 4, Informative

    I never had any luck with preload the times I tried it (a year or two ago?). Nowadays I use alltray for preloading often used apps that are a bit chunky such as Firefox or Openoffice. Openoffice also has a built in preload feature...but you can use alltray anyway for the same effect.

    --
    Once you start despising the jerks, you become one.
    1. Re:Nice but... by Nemilar · · Score: 4, Informative

      Alltray and preload are two totally different things. With Alltray, you're talking about keeping the application open, just minimized to the system tray. With Preload, you're talking about caching the binaries/libraries in memory so when you do open the application, it's reading the data from RAM rather than the hard drive. Sure, AllTray moves the load to RAM, but at the cost of entire applications. The point of preload is that it just caches the most commonly used files.

      --
      Nemilar http://www.techthrob.com - Visit Me!
  8. Re:But I thought Vista doing it = RAM hogging? by bersl2 · · Score: 2, Interesting

    You have the option not to run this sort of program. If it sucks, turn it off.

    Also, Windows' VM system (IMNSHO) has always sucked and will continue to suck; predictive loading of entire bits of software has nothing to do with it.

  9. Re:Security Implications? by Skapare · · Score: 5, Funny

    The normal ramdisk vulnerabilities.

    You mean like losing the data after a few hours of no power?

    --
    now we need to go OSS in diesel cars
  10. Re:What, what? by markov_chain · · Score: 5, Insightful

    innovation = first time to do something from your point of view

    invention = first time to do something ever

    Note how MS is always careful to point out they innovate.

    *flush*

    --
    Tsunami -- You can't bring a good wave down!
  11. Re:But I thought Vista doing it = RAM hogging? by Anonymous Coward · · Score: 5, Informative

    Vista's implementation is marketed as being useful for older, slower machines with less RAM, where it actually may be unwanted, and could cause performance issues (unless it's disabled below a certain threshold - it might be). It's only really useful if you have lots of RAM (around 2GB or so). Yes, SuperFetch has an extra mode where it uses a USB-2 stick as a secondary disk cache, but that's not what we're talking about here. That mode is generally perceived as a gimmick.

    Linux handles having lots of RAM a lot better than Windows (XP) does, because of differences in the way the caching system was designed. Linux (and OS X) were intended to run entirely from RAM and use little swap. I've run, say, OpenOffice once, not used it for several weeks, and the next time I start it it loads almost instantly, because it was still sitting in the cache. My machines have 2GB of RAM, with much less than 500MB actually in use - the remaining 1.5GB is being used as disk cache. Swap usage is either zero, or very close. Of course, performance goes to hell if you do something that flushes the disk cache, or if you try using such a system on a machine with 256MB of RAM.

    Windows, on the other hand, was designed to run almost entirely from swap, and tends to drop stuff from the disk cache when it's not been used in a while, as well as moving stuff out to swap rather aggressively. That works great if you barely have enough RAM to run the OS, but it's terribly wasteful if you have more than enough RAM. In this case, SuperFetch is actually useful, allowing it to catch up to and actually surpass Linux, by monitoring which files are actually used and making sure they're already in the disk cache.

    That's great, although nothing new. Other OSes have had this for years (this Linux implementation dates back to 2005, Mac OS X has had it for ages, and neither implementation was original) - Microsoft were just the first to brand it.

    TFA said nothing about Vista's implementation.

    I think the primary problem people have with Microsoft's implementations is that they're typically very complicated, and have a tendency to degrade over time. XP is the typical whipping boy for this - none of the self-maintaining performance stuff (prefetching, or the prelinker) actually works for longer than about six months, meaning that an XP installation starts off fast, gradually gets faster, and then rapidly slows down as the system tries to speed itself up.

  12. Blogspam by bziman · · Score: 5, Informative

    The submitter is the author of the blog, and is merely paraphrasing the whitepaper written by the author of the software -- and that is two years old. Nothing new or interesting here, just someone trying to draw eyeballs to his blog.

    1. Re:Blogspam by DragonTHC · · Score: 2, Insightful

      what's wrong with trying to drum up a little readership?

      For those of us with our own blogs, how on earth do you get readers without tooting your own horn?

      --
      They're using their grammar skills there.
    2. Re:Blogspam by xenocide2 · · Score: 4, Insightful

      By doing something productive, not spamming me with shit I already knew about. Blog about new information you've generated. Maybe make some charts about disk head position during boot and demonstrate whether I/O is throughput or seek bound. Above all else, don't just copy someone else's shit and translate it into HTML.

      --
      I Browse at +4 Flamebait

      Open Source Sysadmin

  13. Re:What, what? by sqrammi · · Score: 5, Informative

    It's actually a little different than the preload that's been in Gentoo for years. The core functionality is of course the same, but now a daemon runs that caches libraries and updates the linkage periodically. So, it can possibly give much more performance, since everything is always up-to-date. It will be standard in Hardy Heron when it comes out.

  14. Re:What, what? by Anonymous Coward · · Score: 5, Interesting

    SuperFetch was one of the first things that I had to disable in Vista. I had downloaded a Linux distro (a large .iso file) using Firefox, and for the next two weeks, every time I rebooted my computer I would have to listen to my hard drive chug away for the next 10 minutes while it loaded the file into memory. (The new resource monitor in Vista is nice -- that is what helped me track down the problem).

    My computer is MUCH faster now that SuperFetch is disabled. Like night and day.

  15. Similar? by jsse · · Score: 4, Funny

    Preload is a Linux daemon that stores commonly-used libraries and binaries in memory to speed up access times, similar to the Windows Vista SuperFetch function

    I might be wrong, but the similar function in Windows Vista should be "Reload".

    Vista users respond positively toward the speed boost every time we "Reload" their Vista. The downtime and data loss as a result of "Reload" might irritate some disgruntled users, but most of them enjoy the free break at the expense of the company.

    Nothing in those Linus thingy could beat that user satisfaction. I might be biased though.
  16. Re:What, what? by ozmanjusri · · Score: 2, Interesting
    it's an old, but good idea.

    The Amiga one of the same era was very good. You had a recoverable RAM disk, which functioned the same as a standard RAM disk, but would maintain its contents on restart. That meant reboots were lightning quick, and any data you stored in the RRAM disk was still there.

    Shame we haven't got back to that level of functionality.

    --
    "I've got more toys than Teruhisa Kitahara."
  17. Re:What, what? by Assembler · · Score: 2, Insightful

    OS X had prebinding before Vista had SuperFetch.

    FYI: Prebinding != Preloading
  18. Re:What, what? by EvilIdler · · Score: 5, Funny

    And what do you call the first time it works properly and reliably? Fantasy.
  19. Re:Security Implications? by janzen · · Score: 4, Funny

    Only if you run out of cold spray.

  20. Re:What, what? by metalhed77 · · Score: 4, Funny

    I'm envisioning a sensible sort of preload program in gentoo right now:

    *Preloading commonly used data, libraries, and binaries...
    gcc OK
    make OK
    libc-dev OK ... *snip* ...
    emerge OK
    kernel-src OK
    *Preload done, 3827K of USE Flags, 2TB of source code, and one compiler, and firefox to surf forums.gentoo.org for better use flags while you compile loaded into main memory

    --
    Photos.
  21. Re:What, what? by Spy+der+Mann · · Score: 4, Funny

    In Linux, most things never reach the x.0 stage, no matter how mature they are.

    This reminds me of a geek girlfriend I had... she told me she was 29.9.9.12.1 years old. But when I met her in real life, I was surprised she had a daughter... 17.1.25 years old!

  22. Re:What, what? by Jah-Wren+Ryel · · Score: 4, Insightful

    You mean Linux adapted something from Windows instead of the other way around?

    Fundamentally, preload and superfetch are just gussied up versions of the sticky bit which I am sure wasn't unique to unix back in the 70s either.
    --
    When information is power, privacy is freedom.
  23. Not sure about the gain by JanneM · · Score: 3, Insightful

    I have a pretty good amount of memory on my current machine - 2GB - and I mostly just never close any applications, especially with the big ones like Gimp just reusing the already open instance when you open a new file. I suspect that preload would not actually be all that useful for me in practice; I'm still going to enable it to see if I'm wrong, though.

    --
    Trust the Computer. The Computer is your friend.
  24. Re:What, what? by nguy · · Score: 3, Informative

    You mean Linux adapted something from Windows instead of the other way around? What's next, a sane proactor i/o api?

    Not really. Caching policies like this have been around for longer than Windows has even existed. Most of the things that Linux "adopts" from Windows or Macintosh originally came from UNIX or mainframes. Even in 2008, there is hardly an original idea in any of those operating systems. And preload itself is, of course, older than Vista.

    You can be mad at Vista for a number of reasons, but SuperFetch is not one of them - I have noticed a decent speed improvement because of it, and look forward to having something similar in Linux.

    It's not clear to me why this should be a separate user process; what it's doing is simple enough that whatever it is doing can be done directly by the kernel. In fact, I wouldn't be surprised if you could get the same speedups by simply tuning a couple of kernel parameters.

  25. Difference with readahead? by pieleric · · Score: 4, Interesting
    Currently I use readahead which, at boot time, basically uses a special Linux syscall to tell the kernel to read some files ahead whenever it has nothing better to read.

    Does anyone know the difference between the two projects? Does preload have a better algorithm for selecting the files to read? Does it also use this special syscall?
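
    For readers unfamiliar with this kind of call, here is a rough sketch of asking the kernel to read files ahead. Python's standard library does not wrap the readahead syscall itself, so this uses `os.posix_fadvise` with `POSIX_FADV_WILLNEED`, a related hint, as a stand-in. The function name `hint_willneed` is invented here, and nothing in this thread says which call preload actually uses.

```python
import os

def hint_willneed(paths):
    """Ask the kernel to start pulling each file into the page cache.

    Returns the paths for which the hint was issued. Files that cannot be
    opened, or platforms without posix_fadvise, are silently skipped.
    """
    hinted = []
    for path in paths:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # missing or unreadable file
        try:
            if hasattr(os, "posix_fadvise"):
                # offset=0 and length=0 together mean "the whole file".
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
                hinted.append(path)
        except OSError:
            pass  # the filesystem refused the hint; nothing is lost
        finally:
            os.close(fd)
    return hinted
```

    The call returns immediately and the kernel does the reading in the background, which is what makes this style of hint suitable for boot-time use.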

  26. Re:What, what? by pipatron · · Score: 2, Interesting

    Except when you were coding something that ran away in memory, corrupted the RAD (the name of the recoverable RAM disk), destroying everything you hadn't saved for an hour or so. :)

    (The Amiga didn't have an MMU originally, and even when they got it, the OS didn't support memory protection due to the shared message passing)

    --
    c++; /* this makes c bigger but returns the old value */
  27. eeepc + preload = less waiting, more performance! by Ferret55 · · Score: 3, Informative

    Tried it out on my little eeepc and it definitely made a difference; on average it's sped up all loading times by about 30 percent. This is especially good because I upgraded to a 2 gig stick of RAM, but most programs hardly need that much, and on average I'm left with about 1.2 gigs just sitting there doing nothing. Now the RAM is more productive and the loading time is noticeably faster, e.g. Firefox on a cold start without preload took 10 seconds to load before, and now on a cold start it loads in 6 :). Also, since the CPU is relatively slow, it means fetching data and the overhead of moving it around is cut down a lot. I'd love to shake the creator's hand for this plucky little piece of software :) thanks!

  28. Re:is toram parameter really faster? by BobPaul · · Score: 2, Informative

    The series of comments to which you replied is about Linux LiveCDs, which don't require/touch the hard disk unless you explicitly tell them to. Using "toram" or "docache" or similar kernel switches allows the entire contents of the CD to be loaded to the ramdisk, dramatically speeding up loading and allowing one to remove the CD.

    Even if your particular LiveCD is set to watch for and automatically use swap partitions, a HD is still significantly faster than an optical drive. If you install Linux permanently to your HD those particular kernel switches no longer do anything.

  29. Re; Separate Process by Ayanami+Rei · · Score: 2, Insightful

    You could do it in the kernel, but you shouldn't. The kernel keeps track of files using inodes and device numbers, not paths, which may be volatile between reboots (udev+kernel can dynamically assign device numbers to kernel devices, filesystems are identified and mounted by scanning for UUIDs or labels in superblocks, etc). A tracking daemon can monitor system calls and keep a small database with logical paths, access patterns, and so forth; the user-space view of activity tracks intent better so the statistics can be more meaningful.

    Moreover the act of caching the file is easily accomplished by a low-priority user-space task which speculatively reads the files which may be referenced soon. In this fashion the kernel memory manager does not need to be changed in any way; we are not creating a new kernel memory pool which would need new logic under memory pressure. In the case that RAM is suddenly needed for storing application pages or an unexpected demand-loaded program binary or library, we can flush these buffered files just like any other cached file; it's not treated any differently. It's just a daemon touching files (with the hope of a benefit in startup times of applications that require them).
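
    The "small database with logical paths" idea can be sketched in a few lines. This is a toy illustration only, not how preload is implemented; the class name `AccessLog` and the example paths in the usage note are hypothetical.

```python
from collections import Counter

class AccessLog:
    """Toy path-keyed access statistics, in the spirit of a tracking daemon."""

    def __init__(self):
        self.counts = Counter()

    def record(self, path):
        """Note one observed access to a file, keyed by its logical path."""
        self.counts[path] += 1

    def candidates(self, limit):
        """Most frequently accessed paths first; ties broken by path name."""
        ranked = sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))
        return [path for path, _ in ranked[:limit]]
```

    After recording accesses to, say, `/usr/lib/libfoo.so` twice and `/usr/bin/bar` once, `candidates(2)` lists the library first. A real daemon would persist these counts across reboots and lower its own priority (for example with `os.nice`) before speculatively reading the top candidates.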

    --
    THIS THING CAN TURN ON A DIME, MACROSSZERO STYLE ALSO FUCK BETA, ~NYORON
    1. Re:Re; Separate Process by nguy · · Score: 2, Insightful

      You could do it in the kernel, but you shouldn't. The kernel keeps track of files using inodes and device numbers, not paths,

      The kernel can keep track of files in whatever way it likes, including paths. But I don't see a problem with using inode numbers anyway. I think "preload" is trying a little too hard.

      Moreover the act of caching the file is easily accomplished by a low-priority user-space task which speculatively reads the files

      These "low-priority user-space tasks" are spreading like a cancer. Linux is turning into a micro-kernel, without the "micro" or the efficient inter-task communication.

  30. Re:Puppy by Talkischeap · · Score: 2, Funny

    "Sounds like 7 years ago you had no idea how to build a decent computer..."

    Sounds like you're quite an arrogant asshole...

    --
    If it don't GO... chrome it. ~ Frank Banks
  31. Dammit, this is so easy to demonstrate... by Junta · · Score: 2, Interesting
    Ok, linux box here, free -m:

                       total    used    free  shared buffers  cached
    Mem:                2026    1512     513       0     770     379
    buffers/cache:               363    1663
    Swap:               2870       0    2870
    slashdot won't seem to let me format the way I want, but run free -m on your own. The cached column is the 379 figure.
    Note the 379M number. That is the amount of data read from disk and kept in RAM. When an application needs to malloc and there is no completely free memory, yes, it will free up those pages (it ideally picks the cache least likely to be needed again). But absolutely, disk contents are kept in disk cache, but only after load. And no, memory leaks aren't hopelessly pervasive.
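
    The buffers/cache line is just arithmetic on the Mem line, which a short sketch makes explicit. The function name `buffers_cache_line` is made up for this example; the figures in the usage note are the ones from the output above.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the buffers/cache row of `free` from the Mem row.

    Returns (memory used by applications, memory available to applications),
    in whatever unit the inputs use.
    """
    used_by_apps = used - buffers - cached   # "used" minus disk-backed pages
    available = free + buffers + cached      # "free" plus reclaimable pages
    return used_by_apps, available
```

    Calling `buffers_cache_line(1512, 513, 770, 379)` gives `(363, 1662)`. The second figure is one megabyte below the 1663 shown above, presumably because each column is converted to megabytes separately before being printed.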

    What preload does normally happens implicitly during boot. It's hard to demonstrate on init scripts effectively, but log into gnome right after boot, and the disk will thrash like crazy. Log out, kill every last process of that user, log in again. It will be quite dramatic. preload aims for that subsequent experience without the pain of the first.

    So what preload brings is simple, and all that has to happen is simple, know which files are relevant to typical usage ahead of time, and be aggressive about 'cat file > /dev/null' if the system during boot has IO idle time. Presumably, the key is identifying which files those are for a particular system.

    Linux implicitly aids this, but the user interface side still subjectively 'feels' bogged down because it won't proactively load things it doesn't know you'll need, despite the ability to derive this historical data in user space. If preload takes idle time (let's say, for example, while services with arbitrary sleeps are starting and while waiting for username and password) and proactively gets the cache populated, it is more IO work in the aggregate (disk will be hit up for things that will never be needed), but it will feel smoother out of the gate.
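
    The 'cat file > /dev/null' trick amounts to reading each file once and throwing the data away, so that the kernel keeps the pages in its cache. A minimal sketch, assuming the list of relevant files has already been worked out somehow; the function name `warm_cache` is invented for this example.

```python
def warm_cache(paths, chunk_size=1 << 20):
    """Read each file once and discard the data, like `cat file > /dev/null`.

    Returns the total number of bytes read. Paths that cannot be opened or
    read (missing files, directories, permission errors) are skipped.
    """
    total = 0
    for path in paths:
        try:
            with open(path, "rb") as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)  # data is dropped; only the cache matters
        except OSError:
            continue
    return total
```

    Picking which paths to pass in is the hard part, as noted above; the reading itself is trivial and could run at low priority during idle IO time.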
    --
    XML is like violence. If it doesn't solve the problem, use more.