Secure Syslog Replacement Proposed
LinuxScribe writes with this bit from IT World: "In an effort to foil crackers' attempts to cover their tracks by altering text-based syslogs, and improve the syslog process as a whole, developers Lennart Poettering and Kay Sievers are proposing a new tool called The Journal. Using key/value pairs in a binary format, The Journal is already stirring up a lot of objections."
Log entries are "cryptographically hashed along with the hash of the previous entry in the file" resulting in a verifiable chain of entries. This is being done as an extension to systemd (git branch). The design doesn't just make logging more secure, but introduces a number of overdue improvements to the logging process. It's even compatible with the standard syslog interface allowing it to either coexist with or replace the usual syslog daemon with minimal disruption.
Text is damn convenient to use. How are you gonna grep a binary file?
LiveJournal. Oh wait....
The binary format part of this is unnecessary, at least as far as I (with limited low level programming experience) can tell. Other people have been suggesting methods which would mean you just need a cryptographic hash in each otherwise plain text line, in a standard manner. Still at least it has got a discussion started.
Set your machine to also log over a secure channel to another machine. Perhaps one that only accepts the syslog entries and no other connections. Problem solved.
The real "Libtards" are the Libertarians!
Back in the late 90's when I first started connecting my home Linux systems to the Internet 24/7, I logged everything imaginable. To prevent tampering/falsification of the logs, I simply printed the log on a continuous-sheet dot matrix printer. Good luck tampering with the printout in my office.
After a while I got to be able to recognize certain types of activity, such as a web user browsing to /index.html, based on the sounds the printer made.
It doesn't really make logging more secure, you can easily just modify the entire log. Plus if someone's modifying your logs they have root permissions on your machine and then you cannot trust your system, they can put hooks on the log read to just hide certain entries if necessary. The only real solution is to NOT trust your own system - send all the data to a remote syslog server with no other services running. Why take a half-measure when you should have gone all the way?
I don't mind having a binary format as an option, but having a text format also available is absolutely essential IMHO. Rsyslog and Syslog-ng already can write to various databases too.
The "chained hashing" is handy to catch alterations, but beyond that, this thing doesn't seem to be bringing much to the table.
If I modify a single line in the log, thereby changing its hash, do I therefore invalidate (or worse, render unreadable) every entry that follows?
It's called a dot-matrix printer.
Log entries are "cryptographically hashed along with the hash of the previous entry in the file" resulting in a verifiable chain of entries.
So this means that in order for someone malicious to modify a log entry, all they really need to do is then re-hash all subsequent entries?
http://spamdecoy.net - free throwaway anonymous email - avoid spam!
Everyone knows that read-only is more secure.
Where will the journal be located?
Will tail on it give me any useful information (or will I have to read thousands of lines before finding the log of the application I want)?
How will it keep indices without unneeded overhead? (Let's get real, log files are rarely read. Why optimize for reading?)
When they change the format of the journal, will I have to update all my log parsers?
Rethinking email
Have fun rotating your logs!
It must have been something you assimilated. . . .
I'm all for making real improvements, and I'm sure that logging could be improved in various ways. However, when I'm looking at logs, it's generally because something is broken and I want to find information on how to fix it quickly and easily. Storing something in straight text makes it extremely accessible. It's not just about using grep, which many people are accustomed to, but also because text viewers are simple. If your computer can't run programs like cat, tail, or nano, then you've got big problems. However, even if you can't run those programs for some reason, you can copy a text file to another system-- any other system-- and read it without any special software or encryption keys.
If you want to make another logging system that also tracks security-related information in a way that's easy to audit, I suppose that's worthwhile. However, if you want even basic diagnostic information to be stored in something other than plain-text, then you'd better have a simple, robust, cross-platform method of reading that data. After all, worrying about hackers is a bit of a fringe case. Most of the time, problems are caused by misconfiguration, software bugs, or bad hardware.
In cases where avoiding tampering is crucial, just log to a write-once filesystem, or, indeed, a printer.
My karma ran over your dogma
There is no real problem this solves. You are far better off logging remotely. This does not stop an attacker from hiding his tracks: you'll just know the logs were altered, but you won't know what was removed, or likely if/when you can start trusting them again. Log remotely, use encryption, and use TCP. Your central/remote logger is your trusted source for logs. You close everything except incoming logs. Parse and alert on the logs from there. It's simple to do, it's real time, and it solves a lot more issues than this type of solution ever will.
The output hashes the last message with the current message:
No binaries, still grepable, single host and most importantly, there is now a trail that can be verified.
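The sample output that followed "current message:" did not survive in this copy, so here is only a rough sketch of what a plain-text chained log along those lines could look like. SHA-256, the all-zero starting value, the `message [hash]` line layout and the function names are all assumptions made for illustration, not the original poster's format.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the chain


def chain_hash(prev_hash: str, message: str) -> str:
    """Hash the previous entry's hash together with the current message."""
    data = (prev_hash + "\n" + message).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def write_chained(messages):
    """Return plain-text log lines, each ending with its chained hash."""
    lines, prev = [], GENESIS
    for msg in messages:
        prev = chain_hash(prev, msg)
        lines.append(f"{msg} [{prev}]")
    return lines


def verify_chained(lines) -> bool:
    """Recompute the chain and compare it with the hash stored on each line."""
    prev = GENESIS
    for line in lines:
        msg, _, tail = line.rpartition(" [")
        if not tail.endswith("]"):
            return False
        expected = chain_hash(prev, msg)
        if tail[:-1] != expected:
            return False
        prev = expected
    return True
```

Each line is still ordinary text, so grep keeps working. Editing a line without recomputing the hashes makes `verify_chained` fail from that line onward. Without a key or an off-host copy of a hash, though, an attacker can simply recompute the rest of the chain.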
Secure Syslog?!?
I'm still waiting for regular syslog on Windows.
Now, without getting into how much I dislike Pulseaudio (maybe because I'm an old UNIX fart, thank you very much), I think there are really serious issues with "The Journal", which I can summarize as such:
1. the problem it's trying to fix is already fixed
2. the problem isn't fixed by the solution
3. it makes everything more opaque
4. it makes the problem worse
The first issue is that it is trying to fix a problem that is already easily solved with existing tools: just send your darn logs to an external machine already. Syslog has supported networked logging forever.
Second, if you log on a machine and that machine gets compromised, I don't see how having checksums and a chained log will keep anyone from just trashing the whole 'journal':
rm -rf /var/log
What am I missing here?
Third, this implements yet another obscure and opaque system that keeps the users away from how their system works, making everything available only through a special tool (the journal), which depends on another special tool (systemd), both of which are already controversial. I like grepping my logs. I understand http://logcheck.org and similar tools are not working very well, but that's because there isn't a common format for logging, which makes parsing hard and application dependent. From what I understand, this is not something The Journal is trying to address either. To take an example from their document:
MESSAGE=User harald logged in
MESSAGE_ID=422bc3d271414bc8bc9570f222f24a9
_EXE=/lib/systemd/systemd-logind
[... 14 lines of more stuff snipped]
(Never mind for a second the fact that to carry the same amount of information, syslog only needs one line (not 14), which makes things actually readable by humans.)
The actual important bit here is "User harald logged in". But the thing we want to know is: is that a good thing or a bad thing? If it was "User harald login failed", would it be flagged as such? It's not in the current objectives, it seems, to improve the system in that direction. I would rather see a common agreement on syntax and keywords to use, and respect for the syslog levels (e.g. EMERG, ALERT, ..., INFO, DEBUG), than reinventing the wheel like this.
Fourth, what happens when our happy cracker destroys those tools? This is a big problem for what they are actually trying to solve, especially since they do not intend to make the format standard, according to the design document (published on you-know-who, unfortunately). So you could end up in a situation where you can't parse those logs because the machine that generated them is gone, and you would need to track down exactly which version of the software generated it. Good luck with that.
I'll pass. Again.
Semantics is the gravity of abstraction
It occurred to me shortly after posting that a simple hash could easily be forged, and that a key signing of sorts would be needed to make it secure, though the system would have to be able to sign its own log messages without giving the hacker access to the signing key.
Your answer is right in the summary. I can use standard syslog in conjunction with it, and then have a process running in the background that notifies me if the integrity of the text file is violated, thereby getting the best of both worlds.
Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
From the FAQ:
we have no intention to standardize the format and we take the liberty to alter it as we see fit. We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly.
Not only does it generate logfiles that are not human-readable, they're also in a format that in two years not even their own tool will be able to read. If it is still around in two years, which I doubt.
The summary states that it can be used with your usual syslog daemon. Therefore you can use your usual tools to analyze your logs, but you still have an audit trail to identify log tampering. The downside of this may be more disk i/o.
Can someone explain to me why you can't simply edit the log entry you want to change, and then recalculate every hash for the rest of the file?
Unless you want to change some log entry from months ago, with gigabytes of log entries to re-hash, this should be doable, right?
Unless you keep some additional copy somewhere to compare to, in which case the hash doesn't really add anything.
Nice security, erm, feature..?
http://www-cs-students.stanford.edu/~blynn/gitmagic/ch08.html If this doesn't explain why you are wrong, keep googling. You'll figure it out eventually.
That's a little better. I don't see it as comparable to syslog with unix/linux shell tools. Then again, I'm a powershell noob and would miss my vim key bindings.
I searched but couldn't locate any way to read the log files via the recovery console. Maybe (I hope) someone will enlighten me here.
This is on the same crack as the rest of GNOME 3. They've invented the Windows event log, well done! Now I hand you a trashed system, but you can read the disk. You look into /var/log/syslog ... no, you don't. "We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly. The access is granted by a shared library and a command line tool."
Speaking as a sysadmin, I shudder at this incredibly stupid idea. Are they even thinking of how to get something actually readable in disaster?
http://rocknerd.co.uk
Is this a joke? Or is it someone just trying to push their ideology of what they think should be done to the rest of the world to make their idea a standard?
Doing something like this would be a sure way for Linux to shoot itself in the foot. For evidence, one only needs to look as far as Microsoft, who insist on doing it their special way and expecting everyone else to do what they deem "good". The concept of syslog messages is that they are meant to be 'open' so disparate systems can read the data. How do you propose to integrate with large syslog reporting/analysis tools like LogZilla (http://www.logzilla.pro)?
The authors are correct that a format needs to be written so that parsing is easier. But how is their solution any "easier"? Instead, there is a much more effective solution available known as CEE (http://cee.mitre.org/) that proposes to include fields in the text.
> Syslog data is not authenticated.
If you need that, then use TLS/certificates when logging to a centralized host.
>Syslog is only one of many logging systems on a Linux machine.
Surely you're aware of syslog-ng and rsyslog.
> Access control to the syslogs is non-existent.
To locally stored logs? Maybe (if you don't chown them to root?)
But if you are using syslog-ng or rsyslog and sending to a centralized host, then what is "local" to the system becomes irrelevant.
> Disk usage limits are only applied at fixed intervals, leaving systems vulnerable to DDoS attacks.
Again, a moot point if admins are doing it correctly by centralizing with tools like syslog-ng, rsyslog and LogZilla.
>"For example, the recent, much discussed kernel.org intrusion involved log file manipulation which was only detected by chance."
Oh, you mean they weren't managing their syslog properly so they got screwed and blamed their lack of management on the protocol itself. Ok, yeah, that makes sense.
They also noted in their paper that "In a later version we plan to extend the journal minimally to support live remote logging, in both PUSH and PULL modes always using a local journal as buffer for a store-and-forward logic"
I can't understand how this would be an afterthought. They are clearly thinking "locally" rather than globally. Plus, if it is to eventually be able to send, what format will it use? Text? Ok, now they are back to their original complaint.
All of this really just makes me cringe. If RH/Fedora do this, there is no way for people that manage large system infrastructures to include those systems in their management. I am responsible for managing over 8,000 Cisco devices on top of several hundred linux systems. Am I supposed to log on to each linux server to get log information?
Local Security Policy Tool/secpol.msc has MANY options 4 logging w/in its tree items!
Follow them like so down its left-hand side pane:
Security Settings
Advanced Audit Policy Configuration
System Audit Policies - Local Group Policy Object
Beneath that last tree item in the left-hand side pane, are 10 major categories of possible auditing.
Beneath those are 57 subitems for logging as well...
* The rest can be done in other tools (e.g.-> like Windows Firewall logging for IP access etc.)
APK
P.S.=> The SAME can be accomplished from an AD Group Policy GLOBAL NETWORK LEVEL as well, using gpedit.msc/Group Policy Editor (so you don't have to manage EVERY SINGLE WORKSTATION NODE to do it, machine-by-individual-machine)!
Programming custom apps to do logging via API calls to the EventViewer's easy as well...
SO....There you go, "here endeth the lesson"
... apk
It seems pointless. If somebody already has enough privileges on your server to mess with the logs, how is a hash going to help? There's a whole bunch of things an attacker can do that makes this useless.
Most obviously, they can corrupt or erase the contents of the file. Noticeable, but the traces of you accessing can be deleted, so that the admin can't figure out who did it.
The attacker can save an old file, do whatever needs hiding, and replace the file with the old copy. Depending on how it works, this may result in logging being continued to the replaced file, or in the log daemon continuing to write into a now nameless file while an old one is visible in the directory instead.
The hash seems pointless. If the attacker can modify the logs directly, they likely have root access, which means they can debug any process, and subvert any cryptography that might be happening. They can also regenerate the log file with the correct hashes but with a few deleted lines, or replace the daemon with one that doesn't log some things.
Attitudes like yours cost the industry jobs. It is best for us if we store data away in increasingly inappropriate places so that lusers have to pay us to get their own data.
Hell, going back to standard data formats and reusable tools would be the death of a thousand increasingly bizarre specialty languages alone.
As a penance, you should rewrite diff in python to work on sqlite databases. That should set the industry back another few years.
If only this had been done before... Oh, it has. It's called "asl".
http://opensource.apple.com/source/syslog/syslog-132/
-- Terry
Oh, look, Poettering is breaking something else. It must be a hobby of his.
What Linux really could use for secure logging is mounting /var/log as an append-only file system. If you can only read from and append to a file, it makes it awfully difficult to tamper with it. http://www.freebsd.org/doc/handbook/securing-freebsd.html
I can mostly agree with you. There is one thing you might be missing.
[...]
I think what you are missing is this replacement is intended to prevent "undetected" tampering with the logs. Currently, a cracker can delete the log entries that would identify his or her activities on the machine, thereby going unnoticed. Deleting the log files or destroying the tools, as you suggested, would certainly be a detectable sign that the machine was compromised.
My point is: even with git if someone has access to the repository it *can* be tampered with. It's harder and may take longer than with a plain text file, but it's completely possible. With git, there's even an easy way to do it (git rebase) and I suspect that cracking toolkit will adapt and also make that easier. Note that I assume here that you save the first hash of the tree to a secure location, as documented:
Inspired by git, in the journal all entries are cryptographically hashed along with the hash of the previous entry in the file. This results in a chain of entries, where each entry authenticates all previous ones. If the top-most hash is regularly saved to a secure write-only location, the full chain is authenticated by it. Manipulations by the attacker can hence easily be detected.
If only the topmost hash is saved to a backup location, then I just need to reroll all the logs from that first topmost hash and tampering goes undetected. The only argument for this technique is that you could keep more than just the first hash (say N hashes), which we could argue goes back to logging to a different machine, for a sufficiently high N.
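To make the re-rolling argument concrete, here is a small sketch. It assumes SHA-256 chaining and a set of hashes saved off the machine; the `checkpoints` mapping and the function names are hypothetical, and the journal's real on-disk format is not modeled.

```python
import hashlib


def chain(entries, start="0" * 64):
    """Return the chained hash after each entry, starting from `start`."""
    hashes, prev = [], start
    for entry in entries:
        prev = hashlib.sha256((prev + entry).encode("utf-8")).hexdigest()
        hashes.append(prev)
    return hashes


def consistent_with_checkpoints(entries, checkpoints):
    """checkpoints maps entry index -> chain hash saved off the machine."""
    hashes = chain(entries)
    return all(i < len(hashes) and hashes[i] == h for i, h in checkpoints.items())


log = ["boot", "user harald logged in", "attacker logs in", "cron ran"]

# Assume the chain hash after entry index 1 was saved somewhere the
# attacker cannot write to.
checkpoints = {1: chain(log)[1]}

# Dropping an entry AFTER the checkpoint and re-rolling the chain
# still matches the saved hash.
forged_late = [log[0], log[1], log[3]]

# Changing anything at or before the checkpoint does not.
forged_early = ["boot", "user bob logged in", log[2], log[3]]
```

In this sketch `forged_late` passes the check and `forged_early` fails it. Everything after the last saved hash can be rewritten freely, so the protection is only as fine-grained as the number of hashes kept elsewhere.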
...who mistakenly think they can fix something that's not actually broken.
I'd like to see an OS where all logging is recorded in a database. With encryption and access control, and replication to remote instances.
--
make install -not war
"just send your darn logs to an external machine already"
They are probably referring to not sending clear text over the network as syslog does. Yes, you can set up an ssh tunnel, etc., but it's a pain.
I like it how Slashdot is full of tech wisdom. Until there is a story on something you actually understand well, and then it's full of blithering idiots opening their mouths before spending like 10 minutes to learn.
My exception safety is -fno-exceptions.
There are RFCs that cover the transmission of syslog messages in a secure fashion. 5424, 5425, etc.
There are tools that store syslog messages - in plain text - in a secure fashion.
syslog-ng is just one of them.
This post is "old" and nothing more than a group of people reinventing the wheel.
The *only* way to solve tampering with log data is to store it on another machine and hope hackers don't get to that.
If a hacker gains access to a system with log files on it, the best you can do is make the logging tamper-evident. This means that if the hacker modifies the data, in any way, it can be detected. This includes hash recalculation.
Making the system tamper-evident with hashes simply means that all hashes require a secret input and that the input is only ever stored on the system for the next entry. If you know the secret input for hash#0, then you can calculate the secret input for hash#n, but knowing the secret input for hash#n does not tell you what it was for hash#(n-1). Similarly, the secret input for hash#0 is not stored on the system.
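A minimal sketch of that evolving-secret idea, assuming HMAC-SHA256 for the per-entry tag and a plain SHA-256 step to derive each next secret. The class and function names and the `b"evolve"` label are made up for illustration, and this is not claimed to be any existing tool's scheme.

```python
import hashlib
import hmac


def next_key(key: bytes) -> bytes:
    """One-way step: secret n+1 is derived from secret n, never the reverse."""
    return hashlib.sha256(b"evolve" + key).digest()


def tag_for(key: bytes, message: str) -> str:
    """Keyed hash of one log message under the secret for that entry."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


class ForwardSecureLog:
    """Holds only the secret for the NEXT entry; earlier secrets are discarded."""

    def __init__(self, initial_key: bytes):
        self._key = initial_key
        self.lines = []  # (message, tag) pairs, messages stay plain text

    def append(self, message: str) -> None:
        self.lines.append((message, tag_for(self._key, message)))
        self._key = next_key(self._key)  # overwrite the old secret


def verify(lines, initial_key: bytes) -> bool:
    """Whoever kept secret #0 off the host can recompute every later secret."""
    key = initial_key
    for message, tag in lines:
        if not hmac.compare_digest(tag_for(key, message), tag):
            return False
        key = next_key(key)
    return True
```

An intruder who reads the current secret cannot recompute tags for entries written before the break-in, so editing or deleting those entries shows up in `verify`. This sketch does not detect chopping entries off the end of the log, and the intruder can forge anything written after the break-in.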
Even suggesting the move to binary logs is so stupid that its proponents should be shot to death...
However, you can have the best of both worlds:
log message 1 in clear text + (CRYPTO HASH of this message plus previous log messages)
log message 2 in clear text + (CRYPTO HASH of this message plus previous log messages)
But binary logs. Really?
Who's lacking neurons *that* badly?
Pulse Audio, systemd, Journal ...
Poettering must be .. emmm... disabled to create software :)
This kind of nonsense has been standard operating procedure in the linux world for years. Adding unnecessary complexity to everything is what they do. That's why people who like unix are using openbsd, netbsd or plan 9.
If someone with technical skills sufficient to break into a machine and affect syslog data has gained root on a machine, how is an alternative syslogd going to make it more difficult for them? If they have root, an alternative logging daemon is not going to prohibit them from replacing any binary on the machine. It isn't going to prohibit them from using disk tools to make changes to inodes. It isn't going to affect any of the 10 or so things I can think of off the top of my head that can be used to cover your tracks.
1) Access machine.
2) Gain root
3) Stop the alternative logging daemon
4) Replace this binary alternative with my own that reports disk sector errors at the inode location of the log if a log file is deleted or replaced.
5) Replace the disk tools, in addition to the usual replacements -- cat, zcat, grep, zgrep, w, last... - dropped with a rootkit.
6) Replace the log files with random binary data.
Or! You can use a network detection software to log connections to ports not running known services. Or! You can use tripwire. Or! You can send log data to syslogd like almost everyone here has suggested.
Fuckholio'ing one of Linux's greatest attributes is unacceptable.
Having to work for a living is the root of all evil.
If we were to accept a binary format, then at least it shouldn't be from a group that says up front:
At this point we have no intention to standardize the format and we take the liberty to alter it as we see fit. We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly. ... we don’t want any other software to read, write or manipulate our journal files directly
This is absolutely unacceptable for projects in *nix land intending to serve such a central role as logging.
Reading the actual original document, I don't think it focuses so much on security. But to the extent it does, it's pretty pointless. They make noise about an authenticated chain of entries so you can't just modify the middle, *but* that provides no benefit as the attacker can then just rebuild the chain from that point forward. Their answer is to send it to some place that cannot be modified once transmitted. This is exactly the same as remote syslog policies, no additional security, but added complexity for no gain.
Additionally, they *could* have a system with plaintext and a binary format in place, and I recommend they change their minds to do so. The binary blob can contain offsets into a corresponding text file. Thus the good old unix way (which the systemd people seem intent on destroying) is preserved while at the same time they get their enhancements.
They *do* have some valid points. Syslog can't cope with binary data, it doesn't provide a good per-user logging facility, large text files are hard to search, and syslog has insufficient service/event type facilities, making complex analysis a requirement in some scenarios. Even in a simplistic case, I have been left at a loss for 'what string *should* I grep for?' Many services ignore syslog because of its limitations, as pointed out in the article, making things that much more complicated.
But at the exact same time they bemoan so many services doing different logging, they propose making yet another facility and recommend keeping rsyslog running because they aren't going to handle syslog messages. They tell people 'tough you have to use systemd' and 'tough you must use our logging'.
They dismiss java-style namespace management due to variable width, which I think is just going *too* far to achieve theoretical performance gains. They get *very* defensive about UUIDs, and I accept that when managed correctly they are unique, *but* it adds a layer of obfuscation unless you have a central coordinating master map of UUID to actual usable names. Uniqueness is an insufficient criterion. Have both worlds. An application submits a message with both a human-readable namespace *and* a UUID. If your logging facility already has the UUID, ignore the namespace. If your hash table does not have that UUID, store a mapping between the UUID and namespace. Then your tool has the added bonus of having a way to dump a quick list of currently observed message types to search by.
XML is like violence. If it doesn't solve the problem, use more.
I am not sure how hashing log entries in sequence is going to help. If you were to tamper with the logs, you could still recalculate the hashes from the first entry you modified.
However, if you would add a hardware device, through which all the log entries would be filtered, and this hardware device would have a read-only register containing the last (chained) hash, then it could be made secure.
There's lots of flak over the binary log format. Syslog entries are pretty clearly delineated, so why not just store the hashes for each entry separately from the text-based log file? You can then just provide a tool to check the integrity, without losing any of the advantages of the current log format.
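A rough sketch of that separate-hash-file idea, assuming one chained SHA-256 hash per log line, stored in a second file next to the untouched text log. The file layout and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def _next(prev: str, line: str) -> str:
    """Chain the previous hash with the current log line."""
    return hashlib.sha256((prev + line).encode("utf-8")).hexdigest()


def write_sidecar(log_path: Path, hash_path: Path) -> None:
    """Write one chained hash per log line to a separate file."""
    prev, out = "", []
    for line in log_path.read_text(encoding="utf-8").splitlines():
        prev = _next(prev, line)
        out.append(prev)
    hash_path.write_text("\n".join(out) + "\n", encoding="utf-8")


def first_bad_line(log_path: Path, hash_path: Path):
    """Return the 1-based number of the first mismatching line, or None."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    hashes = hash_path.read_text(encoding="utf-8").split()
    prev = ""
    for number, line in enumerate(lines, start=1):
        prev = _next(prev, line)
        if number > len(hashes) or hashes[number - 1] != prev:
            return number
    # More hashes than lines means the log was truncated.
    return None if len(hashes) == len(lines) else len(lines) + 1
```

The log itself stays greppable and readable with cat or tail, and the checker is an optional extra. As with any unkeyed hash, the second file only helps if it (or its latest hash) is kept somewhere the intruder cannot rewrite.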
Higher Logics: where programming meets science.
We've had much success using something as simple and widely supported as a remote syslog server (with a twist - more later), you know the @ notation from /etc/syslog.conf.
Using this you'll get a complete history of what the hacker does up until the time where syslog is stopped. The hacker can't mess with stuff on a different server, so just let him delete/modify logs to his heart's content on the compromised server.
Now, someone might argue that the hacker just grabs the logging server name from /etc/syslog.conf and attacks that as well. Well, good luck with that. We stream to a server that doesn't exist except as a honey pot, and let another server sniff the traffic and save it the usual way. But this might be overkill - we've had a dozen successful attacks and all deleted logs but none went after the remote server, nor bothered to kill off the syslog daemon.
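For applications that log for themselves rather than through the local daemon, the same ship-it-elsewhere approach can be sketched with Python's standard `logging.handlers.SysLogHandler`, which sends over UDP by default. The `myapp` tag, the logger naming and the host passed in are placeholders, and this is not the sniffing setup described above.

```python
import logging
import logging.handlers


def make_remote_logger(host: str, port: int = 514) -> logging.Logger:
    """Return a logger that sends each record to a remote syslog listener."""
    logger = logging.getLogger(f"remote.{host}.{port}")
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP unless a different socktype is given.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("myapp: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # do not echo through the root logger
    return logger
```

Usage would be something like `make_remote_logger("loghost.example").warning("disk almost full")`, where the host name is whatever machine collects the logs. UDP gives no delivery guarantee and no encryption, which is the trade-off others in this thread point out.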
"For every complex problem, there is a solution that is simple, neat, and wrong." -- H.L. Mencken (1880-1956) --
More insanity from Redhat developers.
First there was NetworkManager, created back in 2004 (per wikipedia), but find someone that actually likes it. I have removed that piece of junk software from every Fedora install since it appeared. NetworkManager is known for more issues across most Linux distributions than anything else I have ever read about, especially strangeness due to upgrades of that software.
Then there is PulseAudio. I'm glad I never had to use any of my Linux boxen for audio use using PulseAudio since it's another piece of junk software that I remove...provided I don't load any "X" style desktops that seem to always require it.
The Redhat developers decided that network interface naming needed to be fixed. They thought the OS software would get confused with multiple device names that were not reflected in the BIOS. That "genius idea" did not break anything in my world, but why do certain Redhat developers always stir up stuff that doesn't need stirring up? System admin work is hard enough, but having a vendor unilaterally decide/impose a new network device naming method for you? Geez!
Then certain Redhat developers came out with "systemd" in Fedora. Yet another piece of poorly documented crap. It seems to be a recurring pattern in Lennart Poettering's work given "The Journal" proposal, not to mention the snarky remark in the FAQ on Google Docs that says "go read the source code" to learn the binary file format. When I reported bugs to the developers of "systemd" they acted like they were doing me a favor by fixing them. Geez! Every version of Fedora (all the way back to FC1 and before to Redhat 4.0) that I ever ran, installed, or upgraded would retain the run/stop settings of system services during upgrades...except for upgrading from F14 to F15. That upgrade messed up a few system services settings for me (lucky I check that stuff after upgrades and updates) and the "systemd" developer tried to tell me, "We don't retain those settings across upgrades so I'm doing you a favor by accepting your bug report and fixing it. Now give me brownie points in your bug report for doing this fix." That attitude from the Redhat Fedora developer community permanently pushed me away from supporting Fedora & Redhat in my personal use. Now I use ArchLinux and love it!!
I really wish the Redhat Fedora developer community would work on real bugs (like the half-witted Marvell Thor SATA chipset implementation by the ATA/SATA subsystem maintainer [and Redhat developer]...and well-documented on the web) and not look for "imaginary" issues to stir up. Perhaps Redhat Fedora developers should focus on making their OS upgrade seamlessly across releases while retaining all of the user's settings across those upgrades. If Redhat Fedora is really trying to move their OS into the desktop world, that is something that will be greatly appreciated by desktop admins & help desk staff. FWIW, two other well-known non-Linux OSes seem to get the system upgrade process right, without mucking up the current settings unless they absolutely have to.
Microsoft's logs are in binary format, and are a huge PAIN to deal with. Thankfully, based on the information posted about this, the old syslog daemon can still be used which is good. I am not overly concerned with security and logs, that's what SELinux is for. If the system is compromised, there are more problems to deal with. For me, I'm sticking to the same syslog daemon which has been around for decades and just works every time. Plus, having logs in ASCII text is a huge benefit to me.