APT Speed For Incremental Updates Gets a Massive Performance Boost
jones_supa writes: Developer Julian Andres Klode has this week made some improvements that significantly increase the speed of incremental updates with Debian GNU/Linux's APT update system. His optimizations make the apt-get program roughly 10x faster than the old code. These improvements also make APT with PDiff faster than the default, non-incremental behavior. Beyond the improvements that landed this week, Julian is still exploring other areas for improving APT update performance. More details via his blog post.
I'm missing something here. With all the flap about systemd, why the rush of all the distros to adopt it?
There has hardly been a "rush" to speak of. Fedora was first and switched in 2011; that's almost five years ago, and we still have distros that haven't switched yet. This is one of the absolute slowest tech migrations ever in the Linux community.
I'm on Mint, but even that is slated to go to systemd at the next major release. Binary logs, etc.? No thanks.
If by "binary logs" you mean regular text logs with a bit of metadata attached so that you can actually find stuff, then yes, that's a good thing.
Wow, reading one byte at a time unbuffered? Who does that in real life? It's been well-known for like 30 years that buffered reading is an order of magnitude faster than byte-at-a-time - which matches the above result. The standard C library does buffered reads, unless you turn them off explicitly.
Did someone really turn that off explicitly? Why?
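For anyone who wants to see the effect rather than take it on faith, here is a minimal sketch (not APT's actual code) that reads the same file once with one system call per byte and once in large blocks, and counts the calls. The helper names read_counting_calls and compare_read_strategies, the 64 KiB block size and the 100,000-byte test file are all assumptions made for the illustration.

```python
import os
import tempfile


def read_counting_calls(path, chunk_size):
    """Read a whole file with os.read(), returning (data, number_of_read_calls)."""
    fd = os.open(path, os.O_RDONLY)
    calls = 0
    data = bytearray()
    try:
        while True:
            chunk = os.read(fd, chunk_size)  # one system call per iteration
            calls += 1
            if not chunk:  # empty result means end of file
                break
            data += chunk
    finally:
        os.close(fd)
    return bytes(data), calls


def compare_read_strategies(size=100_000):
    """Create a throwaway file and read it byte-at-a-time and in 64 KiB blocks."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * size)
        path = tmp.name
    try:
        slow_data, slow_calls = read_counting_calls(path, 1)
        fast_data, fast_calls = read_counting_calls(path, 64 * 1024)
    finally:
        os.unlink(path)
    return slow_data == fast_data, slow_calls, fast_calls


if __name__ == "__main__":
    same, slow_calls, fast_calls = compare_read_strategies()
    print("identical contents:", same)
    print("read calls, 1 byte at a time:", slow_calls)
    print("read calls, 64 KiB blocks:", fast_calls)
```

Both strategies return the same bytes; the difference is only in how many trips into the kernel it takes to get them, which is where the time goes.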
Jesus, someone should check the XML parsers. Maybe the same guy wrote an XML parser and it's doing byte reads.
Mainly because that flap involved very few of the actual technical people and people got upset long after the decision was already made. It also seems to be the case that some trolls went off on a misinformation campaign complete with fake bug reports. Quite frankly, I was terrified of systemd from what I was reading here, but then I actually read up on the subject and realized most of what people have been saying about it is false.
Now that I've actually tried it, my desktops boot faster and I have had a much easier time customizing the boot sequence of some of the servers I maintain.
Really looking forward to having apt-get speeds that can be compared with pacman. Julian Andres Klode, if you read this, please continue the great work!
Of course. It's not just plain text appended to a set of files, so you have to use journalctl to access them. Nothing wrong with that. Now this gives you a number of benefits. Since journald actually knows about the data that it stores, it can make intelligent decisions. Everyone who has ever had to clean up a full /var/log partition knows how broken logrotate is. What if the journal could actually know about this and rotate based on size? It also makes it possible to actually search for things without having to take different date and log formats into account, and actually know that it came from the correct program. I know, crazy.
Someone did, the result is called dnf and has replaced yum in Fedora.
What's wrong with binary logs?
Text is a terrible format for efficient storage of and access to structured data
Access to binary logs is O(1) instead of O(n)
journalctl outputs a pixel-perfect copy of what /var/log/messages was
You can query more effectively and precisely than with awk, sed and grep
You can still use awk, sed and grep if you want
You can run syslogd in parallel and have your text file as well
The binary format is well documented
Traditional logs are binary as well as soon as they are rotated and compressed
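The O(1) versus O(n) point in the list above can be illustrated with a toy sketch. This is not journald's on-disk format (that is documented separately); it only contrasts scanning a text log line by line with jumping to a stored offset. The function names and the sample entries are made up for the example, and entries are assumed not to contain newlines.

```python
import io


def build_log(entries):
    """Serialize entries as newline-terminated text and record each start offset."""
    buf = io.BytesIO()
    offsets = []
    for entry in entries:
        offsets.append(buf.tell())
        buf.write(entry.encode("utf-8") + b"\n")
    return buf.getvalue(), offsets


def nth_entry_by_scan(blob, n):
    """Plain-text approach: walk the log line by line until entry n is reached."""
    for i, line in enumerate(io.BytesIO(blob)):
        if i == n:
            return line.rstrip(b"\n").decode("utf-8")
    raise IndexError(n)


def nth_entry_by_index(blob, offsets, n):
    """Indexed approach: slice directly at the recorded offset of entry n."""
    start = offsets[n]
    # The entry ends just before its trailing newline.
    end = offsets[n + 1] - 1 if n + 1 < len(offsets) else len(blob) - 1
    return blob[start:end].decode("utf-8")


if __name__ == "__main__":
    entries = ["sshd: session opened", "cron: job started", "kernel: link up"]
    blob, offsets = build_log(entries)
    print(nth_entry_by_scan(blob, 2))
    print(nth_entry_by_index(blob, offsets, 2))
```

The scan touches every earlier line before it finds the one it wants; the indexed lookup does the same amount of work no matter where the entry sits.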
For fuck's sake already, can we not have a single Linux-related discussion that has nothing to do with systemd without it spiraling into a systemd flame fest? Systemd is not the devil. All I read here from detractors are people who are regurgitating bullshit they overheard while riding the bandwagon they blindly jumped on without actually having a single clue what they are talking about. Talk about the blind leading the blind. Meanwhile anyone with a clue who tries to chime in with a voice of reason is simply drowned out. Does using the word binary in a sentence where you also refer to logs make you feel like some kind of super hacker? Sometimes I really think that's what all this never-ending bullshit is about.
Brought to you by Carl's Junior.
If something is XML-based and time-critical, I wouldn't bother to optimize the XML parsing, but rather exchange XML for a non-braindead format to start with.
Holy fuck, I just read the blog article and this is what it said:
How the fuck did that all happen to begin with?! Who the fuck wrote the code like that initially?! How fucking long has this been the case?!
I understand that bugs will happen. I really do. But this is like a total breakdown in process. Why the fuck weren't these problems detected sooner, like when the code was committed and reviewed? The code was reviewed, right?!
We aren't talking about some minor software project here. This is apt, for fuck's sake! This is one of the core pieces of Debian's (and Ubuntu's, and many other distros') basic infrastructure. This is the kind of shit that has to be done properly, yet it clearly wasn't in this case.
The Debian project needs to address how this shit even happened to begin with. This is fucking unbelievable. The entire Debian community deserves a full explanation as to how this debacle happened.
I thank God every day that I moved all of my servers from Debian to OpenBSD after the systemd decision. I know that the OpenBSD devs take code reviews and code quality extremely seriously, not just with the core OS itself, but even with software written by others. The OpenBSD project will even create and maintain their own custom forks of third party software if the original developers can't get their shit together!
If yes, I would wonder how much speed it could gain over yum... after all, whenever I used yum to update some OS, the vast majority of time was spent with the stuff "rpm" would do, not with the database maintenance yum adds.
Well, we do download in parallel if you use httpredir.debian.org and httpredir.debian.org returns different mirrors for different packages (which it does not do all the time, but reasonably often). I don't like installing in parallel, or downloading and installing at the same time, as they just make the error handling harder, for a modest speedup.
What's wrong with binary logs?
Nothing, if they're well-designed and ACID compliant, which journald is not:
It's rare that I choose to defend systemd, but ACID compliance does not mean you never have to deal with a corrupt database. It is a software technique to make sure transactions complete in an atomic, consistent, isolated and durable way, but it still presumes a "perfect" system, and if a bit flip happens in memory or on disk outside the programming logic then ACID will fail. That is why you have ECC, RAID1/5/6 and ZFS, but even they fail, and sometimes you have genuinely "impossible" results like you've added $2+$2 to the account and the result is $5 (bit flip from 0b0100 to 0b0101). If you're using plain text and UTF-8 this can happen there as well: there are byte combinations that are simply illegal to use. You expect the parser to ignore the "impossible" and carry on; apparently that's what journald is doing too.
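Both halves of that argument are easy to reproduce in a few lines. The sketch below is only an illustration: flip_bit and decode_tolerantly are hypothetical helper names, and it says nothing about how journald itself handles damaged entries. It flips the lowest bit of 2 + 2, then flips the top bit of the first byte of a UTF-8 string so that the result is no longer valid UTF-8.

```python
def flip_bit(value, bit):
    """Return value with the given bit position inverted, simulating a bit flip."""
    return value ^ (1 << bit)


def decode_tolerantly(raw):
    """Decode UTF-8, substituting U+FFFD for invalid bytes instead of failing."""
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    balance = 2 + 2                   # 0b0100
    corrupted = flip_bit(balance, 0)  # 0b0101
    print(balance, "->", corrupted)

    damaged = bytearray("log line".encode("utf-8"))
    damaged[0] = flip_bit(damaged[0], 7)  # 0x6C ('l') becomes 0xEC

    try:
        bytes(damaged).decode("utf-8")    # a strict parser rejects the input
    except UnicodeDecodeError as exc:
        print("strict decode failed:", exc.reason)

    # A tolerant parser marks the bad byte and carries on with the rest.
    print(repr(decode_tolerantly(bytes(damaged))))
```

The strict decode refuses the damaged input outright, while the tolerant one keeps every byte that is still readable, which is the "ignore the impossible and carry on" behavior described above.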
Live today, because you never know what tomorrow brings
I have no idea why people insist on boot times. You are aware that a machine spends little time booting and a hell of a lot more running things?
Sure, except when you do reboot it might be important to do it quickly some of the time.
It's absolutely true that it's more important to optimize the common case over the uncommon case, but that doesn't mean the uncommon case is necessarily insignificant.
Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
Do try harder.
http://www.freedesktop.org/wiki/Software/systemd/journal-files/
Is this a requirement for you? Set journald to output classic plain text logs as well. You get all the benefits of a nice utility to sort through your logs and the ASCII text files which you find so critical to your use-case.