Oracle Adds Data-integrity Code To Linux Kernel

Yay for Linux! by Anonymous Coward · 2008-12-11 08:16 · Score: 0

Seems like Linux is going to have great momentum in the coming months.

Re:Yay for Linux! by Anonymous Coward · 2008-12-11 08:18 · Score: 0

It's finally the YOLOTD!...
Re:Yay for Linux! by morgan_greywolf · 2008-12-11 08:24 · Score: 3, Insightful

The Year of Linux on the Database? Nah, that happened a long time ago.

--
My blog
Re:Yay for Linux! by Anonymous Coward · 2008-12-11 09:45 · Score: 0

Year Of Linux ... Oh, The Drama, I knew it would happen?

Dumb question... by geminidomino · 2008-12-11 08:24 · Score: 3, Insightful

How badly does this affect performance?

Re:Dumb question... by afidel · 2008-12-11 13:42 · Score: 3, Informative

It doesn't matter much. This patch adds T10-DIF which basically brings minicomputer level data integrity to the commodity computer market. It adds about 1.56% to data storage requirements (8 bytes of protection information per 512 byte block, just like the AS/400) and some small amount of code at the OS and APP layers to check the CRCs. With Oracle I would imagine this might actually INCREASE performance for the most fault intolerant environments since it wouldn't need to do a read after write if the storage system acknowledged a successful T10 DIF block save.
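
For anyone who wants to see the arithmetic, here is a minimal sketch of the 8-byte tuple that rides along with each 512-byte sector: a 2-byte guard tag (CRC-16 with the T10-DIF polynomial 0x8BB7), a 2-byte application tag, and a 4-byte reference tag. The function names and the Type 1 style choice (reference tag = low 32 bits of the LBA) are assumptions for illustration, not the kernel's code.

```python
import struct

SECTOR_SIZE = 512
PI_SIZE = 8  # guard tag (2) + application tag (2) + reference tag (4)


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16, polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_pi_tuple(sector: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Hypothetical helper: build the 8-byte tuple for one sector.

    Assumes a Type 1 style layout where the reference tag is the
    low 32 bits of the logical block address.
    """
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected a 512-byte sector")
    return struct.pack(">HHI", crc16_t10dif(sector), app_tag, lba & 0xFFFFFFFF)


# Storage overhead of 8 extra bytes per 512-byte sector, in percent.
overhead_percent = 100 * PI_SIZE / SECTOR_SIZE
```

The overhead works out to 8/512 = 1.5625%, and the bit-at-a-time loop above is deliberately the slow, obvious version.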

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
Re:Dumb question... by afidel · 2008-12-11 16:05 · Score: 1

Hmm, on second thought after doing some more research it MIGHT matter:
CRC is somewhat expensive. 200-300 MB/s on a modern CPU looking into ways to optimize
SSE4 will have a CRC instruction (any poly)
linky(pdf)

Not sure what he means by CPU there. If he means core, then it's not so bad (other than freaking expensive per-core Oracle licensing); if that's per processor, then that's very bad. Guess I might only be able to implement this on the new Shanghai beast we got last week or a new Core i7 Xeon system we will probably get next year once they are available.
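
On the "ways to optimize" point, the usual first step in software is a lookup table that processes one byte per step instead of one bit. The sketch below assumes the T10-DIF polynomial 0x8BB7 and is only an illustration of the technique; the names are made up, and Python timings say nothing about what a kernel implementation in C would achieve.

```python
import time

POLY = 0x8BB7  # CRC-16 polynomial used for the T10-DIF guard tag


def _build_table():
    """Precompute the CRC of every possible top byte (256 entries)."""
    table = []
    for byte in range(256):
        crc = byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
        table.append(crc)
    return table


TABLE = _build_table()


def crc16_table(data: bytes) -> int:
    """Table-driven CRC-16: one table lookup per input byte."""
    crc = 0
    for byte in data:
        crc = ((crc << 8) & 0xFFFF) ^ TABLE[(crc >> 8) ^ byte]
    return crc


def throughput_mb_per_s(size: int = 1 << 20) -> float:
    """Hypothetical helper: rough MB/s for this interpreter on this machine."""
    buf = bytes(size)
    start = time.perf_counter()
    crc16_table(buf)
    elapsed = time.perf_counter() - start
    return size / elapsed / 1e6
```

Calling `throughput_mb_per_s()` gives a machine-dependent number, so no expected value is shown here.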

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

Oracle submitted a 2nd patch by aztektum · 2008-12-11 08:26 · Score: 3, Funny

It adds a 2nd layer of metadata that is used to verify the first layer of metadata wasn't corrupted so you can be EXTRA confident that your original data was actually handled correctly.

--
:: aztek ::
No sig for you!!

Re:Oracle submitted a 2nd patch by Vadatajs · 2008-12-11 11:05 · Score: 0

It's metadata all the way down.
Re:Oracle submitted a 2nd patch by Anonymous Coward · 2008-12-11 21:09 · Score: 0

didn't nsa selinux patch up the dying tubes for us all?? how many polish governors of illinois does it take to fill a senate seat?

It's been there already by Anonymous Coward · 2008-12-11 08:33 · Score: 0

Nice one, some time ago people used magic numbers to distinguish structures in the dumps, now they have meta data for that.

As long as it applies to all data in transit, it should bring stability into the kernel.

Treat Your Data Like Ice Cream.. by firespade · 2008-12-11 08:39 · Score: 0

Does one OR two layers make you feel more confident?.. Why not put more effort in back checking I/O operations rather than adding more data to the stream? Is this a band aid fix? Apparently not in the eyes of Oracle..

Re:Treat Your Data Like Ice Cream.. by Anonymous Coward · 2008-12-11 08:51 · Score: 0

I'll take your advice. From now on, I'd like my data covered in a nice chocolate sauce.
Re:Treat Your Data Like Ice Cream.. by Anonymous Coward · 2008-12-11 10:37 · Score: 0

Why are you asking me if three layers make me feel more confident?

Security??? by Anonymous Coward · 2008-12-11 08:40 · Score: 0

Did anyone else read the headline and think "Integrity" in terms of security (you know, as in the good ol' CIA Triad)? And then if you did, did you also knee-jerk a response like this: "What?! Oracle adding security code to the Linux kernel? With the security reputation of THEIR products!? $#@%"

Re:Security??? by Workaphobia · 2008-12-11 09:30 · Score: 4, Insightful

Integrity is a security principle, and that is the sense that they're using the word in the summary. It's pretty much the only definition of the word that makes sense in a computing context. More precisely, we're talking about confidence that the data stored in the system is the same as the data retrieved at a later time. The only difference between this and a more cryptographic sense of the word is that this doesn't attempt to guard against malicious attacks if an adversary had offline access to the disk. (Or so I presume, having not RTFA'd).

--
Evidently, the key to understanding recursion is to begin by understanding recursion. The rest is easy.

Then Oracle submitted a 3rd patch by butalearner · 2008-12-11 08:48 · Score: 2, Funny

It pre-corrects a future corruption in the as-yet-unimplemented third layer of metadata. Kernel developers have decided to add the third layer and accept the patch on the grounds that the corruption might still have occurred even if Oracle hadn't said anything.

Data integrity is great and all... by Anonymous Coward · 2008-12-11 08:50 · Score: 0

I thought I'd be uber pimp after using XFS, boy was I wrong. That file system becomes corrupt about every six months for me, with no way to repair it (nothing I've tried works). I'm running ext3 now for my root partition, with my data split between another ext3 filesystem and a separate xfs drive (luckily this filesystem never became corrupted). No problems so far for about a year.

I wonder if these patches make a difference in different filesystems or what exactly the meaning is? It says transactions "in transit". Well, if the hardware is bad, I wonder how the software knows whether what it thinks is valid really is?

Sounds like a layer of code to slow things down without really providing any more protection against errors.

Re:Data integrity is great and all... by Anonymous Coward · 2008-12-11 10:46 · Score: 0

xfs corruption was fixed in something like 2.6.9. Get off your ass and update your kernel.
Re:Data integrity is great and all... by jabuzz · 2008-12-11 12:04 · Score: 1

I have been running it since well, when it became available for Linux and I had to hand patch and compile my own kernel. Not once have I had any file system corruption in the intervening years. Well apart from when a disk developed bad sectors, but that is hardly the fault of XFS...
Some time ago (I forget when) I did have a few files truncated to zero on a kernel panic usually a failed restore, and usually my bookmarks. Not had that in six or seven years now though.
They have even fixed the issue where you needed scads of RAM to check a large file system. My only beef with XFS is that you cannot size it smaller.

Terribly old news by zdzichu · 2008-12-11 08:56 · Score: 3, Informative

Block integrity patches were discussed in an excellent article on LWN in July 2008. Kernel 2.6.27 was released in October 2008. This is old news.

--
:wq

Re:Terribly old news by XaXXon · 2008-12-11 09:10 · Score: 4, Insightful

Oh my god! Not news from October. That was going on two months ago. Everyone knows everything that happened two months ago. What were the editors thinking? Fire them immediately. Let's all go to digg or reddit or myspace where they don't do things like post things that are almost two months old. PANIC PANIC PANIC!!!
Wow, just let other people read it and go on about your business not caring.
Re:Terribly old news by Nick+Ives · 2008-12-11 10:06 · Score: 4, Informative

That LWN writeup is far better, though; TFA is terrible. LWN makes it clear that this adds device checksum support, i.e. if your SATA drive supports adding checksum data to blocks this patch will enable that functionality.

--
Nick
Re:Terribly old news by Anonymous Coward · 2008-12-11 11:57 · Score: 0

LWN states "Stories of blocks which have been corrupted, or which have been written to a location other than the one which was intended, are common."
How can I verify that? Does anyone have ideas for experiments?
Re:Terribly old news by commanderfoxtrot · 2008-12-12 03:36 · Score: 1

Thanks for that summary. It's been many years now since I kept up with the kernel changelogs and articles... the kernel now Just Works.

--
http://blog.grcm.net/

Congratulations... Oracle by HighOrbit · 2008-12-11 09:07 · Score: 2, Insightful

You've invented the Checksum

On a more serious note (yes I did RTFA), somebody please explain where this fits. Other than network or disk errors (which generally already have error detection schemes), I'm not sure what the target problem is that this is supposed to fix. The article says "the code helps maintain integrity as data moves from application to database, and from Linux operating system to disk storage", that it checks I/O operations, and that "code contribution includes generic support for data integrity at the block and file-system layers". That's still not clear what they think the problem is. Don't most of the modern file systems already check data operations?

PR by Anonymous Coward · 2008-12-11 09:09 · Score: 0

This is obviously pure PR. It reads like a press release - the journalist understood even less about it than (s)he usually does.

Seems like Oracle is being clever at creating buzz about their participation in Linux development - while using the words that they use to sell their products. Having "Linux", "Oracle" and "Data integrity" in the same sentence, without paying for it, is very nice.

Now, it doesn't mean that they haven't done something good. But until someone explains to me what it is, I'll take it with a grain of salt.

Re:Bloatware by Anonymous Coward · 2008-12-11 09:09 · Score: 0

Sure it might bloat the size of the source code, but hopefully you can choose to include or not include it during a recompile. So it's not so bad.

erasure codes by two+basket+skinner · 2008-12-11 09:12 · Score: 1

link is skimpy on details. any word if there is error correction or is it just detection? what does this add that say erasure coding (reed-solomon) lacks?

Re:erasure codes by Detritus · 2008-12-11 15:31 · Score: 1

It's just error detection. Recovery is handled at a higher level.

--
Mea navis aericumbens anguillis abundat

Dumber question(s)... by Anonymous Coward · 2008-12-11 09:15 · Score: 0, Flamebait

Is this as much of a paranoiac's nightmare as NTFS Alternate Data Streams?

What exactly is the "Data Integrity Payload" that we will be adding to these files... how can this be abused?

And most importantly... most people won't need this, so will this sucker be turned on by default? Will there be an easy way to purge your files of unwanted metadata?

How long will it be before systems start requiring certain metadata attached to files before accepting the file? Is this the foot in the door for DRM in the kernel?

Re:Congratulations... Oracle by setagllib · 2008-12-11 09:25 · Score: 2, Informative

I don't know where it fits either, but ZFS and eventually BTRFS actually have checksums at the block level, and can heal over corrupted blocks using redundant copies whose checksums do work. That alone is enough reason to use ZFS for a file server, but similar improvements could be made inside the Linux stack without a new filesystem on top. However ZFS' reliability also comes from copy-on-write updates which is not trivially installed into an existing filesystem.
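
To make the "heal over corrupted blocks" idea concrete, here is a toy sketch of a mirrored read: the checksum is stored apart from the data, the first copy that matches wins, and any copy that does not match is rewritten from the good one. SHA-256 and the function names are assumptions picked for the example; this is not how ZFS or BTRFS is actually implemented.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Checksum kept separately from the block it describes."""
    return hashlib.sha256(block).digest()


def read_with_heal(copies, expected: bytes) -> bytes:
    """Return a verified block and repair bad mirror copies in place.

    `copies` is a list of bytearray objects standing in for mirrored disks.
    """
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        raise IOError("all copies failed checksum verification")
    for copy in copies:
        if checksum(bytes(copy)) != expected:
            copy[:] = good  # overwrite the corrupted copy with the good data
    return good


# Usage: flip one bit in the first mirror, then read through the checker.
data = b"important row" * 8
mirrors = [bytearray(data), bytearray(data)]
mirrors[0][5] ^= 0x01
result = read_with_heal(mirrors, checksum(data))
```

Without a second copy the same check can only report the damage, which is the detection-only situation a single disk is in.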

--
Sam ty sig.

vs a journaled fs? by pak9rabid · 2008-12-11 09:37 · Score: 1

Excuse me if this is a dumb question, but how does this differ from the journal in many existing filesystems?

Re:vs a journaled fs? by JohnFluxx · 2008-12-11 11:30 · Score: 2, Informative

The purpose of a journal is to make sure that operations either happen or they don't happen - i.e. you don't leave the filesystem in some half way state if the power goes out.
It doesn't verify the actual data written or anything.
Re:vs a journaled fs? by pak9rabid · 2008-12-12 04:10 · Score: 1

Ah ok...thanks for clearing that up.

Re:Congratulations... Oracle by phantomcircuit · 2008-12-11 09:51 · Score: 2

I'm not certain but it appears to be checksumming data while it is moving around the kernel after a write or read call is made.

Seems like something that should be handled in hardware with ECC, but what do I know.

Looks like it is T10 SCSI by phantomcircuit · 2008-12-11 09:55 · Score: 1

... whatever that means.

Info

Re:Congratulations... Oracle by scheme · 2008-12-11 10:07 · Score: 3, Informative

I'm not certain but it appears to be checksumming data while it is moving around the kernel after a write or read call is made.

Seems like something that should be handled in hardware with ECC, but what do I know.

Kernel bugs can cause data to get corrupted and hardware ECC won't correct that. Likewise with transfers from memory to disk. Ultimately it'll need to be a hardware/software thing but the software portion is needed as well.

--
"When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it

Re:Congratulations... Oracle by afidel · 2008-12-11 13:48 · Score: 1

This is industry standard checksumming to ensure end to end data integrity from the disk to the storage system to the HBA to the OS to the app. I'm quite stoked for this since my SAN vendor (Xiotech) has the first system to support the standard (Emprise 5000/7000) and we have Oracle 10/11G already in our environment.

--
There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.

Re:Congratulations... Oracle by MikeBabcock · 2008-12-11 14:20 · Score: 1

RAM errors.

Spontaneous bit flips change data in transit.

It also helps against errors in kernel code or malicious data injection attacks.

--
- Michael T. Babcock (Yes, I blog)

ECC ram officially obsolete? by Anonymous Coward · 2008-12-11 14:29 · Score: 0

I'm confused...
Did Oracle just remove the whole point in having ECC RAM in the first place? Could this also possibly lead to the kernel making the system much more resistant to say... radiation? (certain particles can cause a bit to get flipped when they pass through the computer, causing kernel panics and various unpleasantness) I am but a lowly college student from Mississippi so I know very little about this subject, any ideas from the more educated slashdotters?

Re:Congratulations... Oracle by Detritus · 2008-12-11 15:13 · Score: 2, Informative

One of the problems that this is supposed to detect is blocks getting written to the wrong place or being read from the wrong place. I think it's one of those rare problems that stops being quite so rare when you have huge amounts of data stored on cheap hardware.
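
A rough sketch of how a misdirected write gets caught: each block carries a reference tag derived from the address it was meant for, so a block that lands at the wrong address still has a valid checksum but the wrong tag. The dict-based "disk", the function names, and the use of `binascii.crc_hqx` (CRC-CCITT, standing in for the real T10-DIF guard CRC) are all assumptions for illustration.

```python
import binascii
import struct


def protect(data: bytes, lba: int) -> bytes:
    """Append an 8-byte tuple: guard CRC, application tag, reference tag."""
    guard = binascii.crc_hqx(data, 0)  # stand-in CRC, not the T10-DIF polynomial
    return data + struct.pack(">HHI", guard, 0, lba & 0xFFFFFFFF)


def verify(stored: bytes, lba: int) -> str:
    """Check a stored block against the address it was read from."""
    data, pi = stored[:-8], stored[-8:]
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    if guard != binascii.crc_hqx(data, 0):
        return "guard mismatch"
    if ref_tag != lba & 0xFFFFFFFF:
        return "reference mismatch"
    return "ok"


# Simulated misdirected write: the block was meant for LBA 100 but lands on 200.
disk = {}
block = protect(b"A" * 512, lba=100)
disk[200] = block
status = verify(disk[200], 200)
```

Here `status` comes back as "reference mismatch" even though the data itself is intact, which is exactly the case a plain per-block checksum would miss.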

--
Mea navis aericumbens anguillis abundat

Re:Ooh wow! by Timothy+Brownawell · 2008-12-11 15:22 · Score: 1

Maybe linux will break .8% of the market with this groundbreaking advance.. *snicker*

Windows 37.4% (268), Linux 34.6% (248), Unknown 19.2% (138), Macintosh 7.6% (55), FreeBSD 0.5% (4), Solaris 0.4% (3).

I think it depends on which market you're talking about.

Re:Congratulations... Oracle by Detritus · 2008-12-11 15:27 · Score: 1

ECC usually covers specific paths or devices, but it doesn't give you an end-to-end integrity check. A similar situation happens with IP packets. You can disable packet checksums if you like to live dangerously. Then, all you need is a bit of noise or a hardware problem to silently corrupt data that flows over the network.
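
For comparison, the checksum used in IP headers is the 16-bit one's complement sum described in RFC 1071. A minimal sketch, with the function name being my own:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Example bytes from RFC 1071: 00 01 f2 03 f4 f5 f6 f7
sample = bytes.fromhex("0001f203f4f5f6f7")
value = internet_checksum(sample)  # 0x220D
```

A receiver recomputes the sum over the data plus the transmitted checksum and expects zero; the sender computes it once and the far end verifies it, no matter how many links the packet crossed in between.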

--
Mea navis aericumbens anguillis abundat

Re:Congratulations... Oracle by Anonymous Coward · 2008-12-11 15:37 · Score: 0

I don't know where it fits either, but ZFS and eventually BTRFS actually have checksums at the block level, and can heal over corrupted blocks using redundant copies whose checksums do work. That alone is enough reason to use ZFS for a file server, but similar improvements could be made inside the Linux stack without a new filesystem on top.

Without changes to the filesystem where do you put the new checksum for each filesystem block?

Re:Congratulations... Oracle by illumin8 · 2008-12-12 09:05 · Score: 1

The article says "the code helps maintain integrity as data moves from application to database, and from Linux operating system to disk storage", that it checks I/O operations, and that "code contribution includes generic support for data integrity at the block and file-system layers". That's still not clear what they think the problem is. Don't most of the modern file systems already check data operations?

You might not understand why we need it, but trust me, it is needed. Not all storage device drivers are created equally, and some will happily report to the kernel that the write operation was successful even if it wasn't, and you end up with corrupted data. When Oracle operates on a trusted environment like Solaris on Sparc, this type of integrity is built in to the operating system and it's not necessary to do a read after write to verify the data was written correctly. On Linux, and other untrusted operating systems where this doesn't happen, Oracle has to do a read after write to make sure the data was written correctly. This slows things down quite a bit.

--
"When the president does it, that means it's not illegal." - Richard M. Nixon

Re:Congratulations... Oracle by Xabraxas · 2008-12-12 11:53 · Score: 1

You might not understand why we need it, but trust me, it is needed. Not all storage device drivers are created equally, and some will happily report to the kernel that the write operation was successful even if it wasn't, and you end up with corrupted data. When Oracle operates on a trusted environment like Solaris on Sparc, this type of integrity is built in to the operating system and it's not necessary to do a read after write to verify the data was written correctly. On Linux, and other untrusted operating systems where this doesn't happen, Oracle has to do a read after write to make sure the data was written correctly. This slows things down quite a bit.

The article I read about this states that Linux is the first operating system to implement these standards, T10 PIM and DIX. It says that they are looking to implement the same technology in Windows, Solaris, and other Unixes. It also states that this is implemented both in hardware and software. In what way does this differ from Solaris/SPARC's data integrity implementation?

--
Time makes more converts than reason

Re:Congratulations... Oracle by illumin8 · 2008-12-13 05:33 · Score: 1

The article I read about this states that Linux is the first operating system to implement these standards, T10 PIM and DIX. It says that they are looking to implement the same technology in Windows, Solaris, and other Unixes. It also states that this is implemented both in hardware and software. In what way does this differ from Solaris/SPARC's data integrity implementation?

It may not be the same implementation, but Solaris on Sparc has built in ECC across all data paths, including CPU -> memory, CPU -> I/O, etc. I believe it is a combination of hardware and software (kernel) that does this.

The article, I believe, is talking about implementing the same level of software checking in x86 versions of the operating systems mentioned.

--
"When the president does it, that means it's not illegal." - Richard M. Nixon

Re:Congratulations... Oracle by setagllib · 2008-12-13 12:40 · Score: 1

Below the filesystem. ZFS can export zvols, which are just block devices whose storage maps on to ZFS blocks. They get checksumming and copy-on-write semantics, even snapshotting, and yet still support any filesystem on top, a whole partition table, a virtual machine disk, or whatever. They're just block devices, but they still get most of the reliability advantages of ZFS itself.

It's not as efficient as storing a ZFS filesystem which can track its used/free space and feed that in to the reliability systems, but if you want reliability, you get a lot of it for zero effort.

What I'm saying is that, as far as I understand, the same volume reliability features can be implemented for Linux without a new filesystem. You'd just have a new integration of existing RAID and block mapping features, plus block level checksumming and copy-on-write (feeding in to the device mapper). Even RAID-Z could be implemented in this way without needing a new filesystem. You can already run Linux filesystems over iSCSI to a zvol, so why not just implement the zvol directly in Linux?

--
Sam ty sig.

Re:Congratulations... Oracle by ToasterMonkey · 2008-12-13 12:57 · Score: 1

SAN vendor (Xiotech) has the first system to support the standard (Emprise 5000/7000)

Which standard?

Has Xiotech added Emulex HBAs to the compatibility matrix for those systems yet? ROFL.

Slashdot Mirror

53 comments