For MSDOS 2 Microsoft made both slash and backslash work. This seems reasonable so that you could run old command-line programs and type something in that would get past their assumption that '/' started switches and should be parsed out.
This is all perfectly reasonable.
However after MSDOS 2 Microsoft went downhill fast, becoming exactly what everybody here hates. It would have been *TRIVIAL* to make all the new software prefer slash. But some arrogant asshole there realized that this would make it easier to port Unix software back & forth and thus possibly harm their lock-in, and we got the shit they are producing now.
Actually this backfired. I'm pretty certain that if NT was a Unix clone with Windows GUI, there would be no Linux, as a vast number of people, including me, would probably have happily used it.
Whoosh!
NASA didn't build most of the stuff that went into space; private companies did.
And food stamps are not socialist because the food was manufactured by private companies.
I think the "dumb client" would be fine with Unix style behavior, where it keeps getting the previous version of the file. Most of them are just opening it to load it into memory and will close it soon afterwards.
The reason for the default is for back-compatibility with older versions of Windows. But it would be nice if they risked it and changed this. All we want is for delete (and the atomic rename) to work and let the program reading the file to continue reading the old version. It is ok (in fact probably better than Unix) if other programs are prevented from writing the file. I just want to allow rename and delete.
Gah. Rather unfortunate that this is an option on the file reader, since I have little control over that. I would like to see an option on the rename function instead.
Renaming files has worked for remote mounts for quite a while, and we rely on that to work around the file locking bug. I think it is broken if SMB/NFS is not being used, however.
That all makes sense. I should have looked at the MSDN pages and seen that there was a transaction argument to the calls.
You are right that a crash had better act like rollback for all the unfinished transactions! I would be reasonably confident they got that right.
No I meant "what happens if another process has the file open for read?".
On Unix this is irrelevant, it keeps reading the old copy of the file, which is deleted when the file is closed.
On Windows I know that you can't delete() these files, so I am guessing that rename() even with this flag does not get around this annoyance.
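The Unix behavior described here can be checked with a short sketch. This is only an illustration under stated assumptions: it is Python on a POSIX system, the file names and the function name `demo_reader_keeps_old_copy` are made up for the example, and a second process is simulated by simply keeping a file object open. On Windows I would expect the replace step to fail instead, which is exactly the annoyance in question.

```python
import os
import tempfile


def demo_reader_keeps_old_copy():
    """Return (what the already-open reader sees, what a fresh open sees)."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "file")      # hypothetical file names
        new = os.path.join(d, "file.new")

        with open(target, "w") as f:
            f.write("old contents")

        # Stand-in for another process that has the file open for read.
        reader = open(target)

        with open(new, "w") as f:
            f.write("new contents")

        # Rename over the file while it is still open elsewhere.
        os.replace(new, target)

        # The reader is still attached to the old file's data.
        seen_by_reader = reader.read()
        reader.close()                        # the old copy can be freed now

        # Anyone opening the name after the rename gets the new version.
        with open(target) as f:
            seen_by_fresh_open = f.read()

        return seen_by_reader, seen_by_fresh_open
```

The point of the design is that the name and the data are separate things: the rename only changes which data the name refers to, so nobody who already has the old data open is disturbed.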
That sounds right but I never learned about it. They need to fix their rename() function to call this.
I guess I can kludge in a macro to fix this.
Found the man page and it does not work for directories so this does not do the job. But it gets the majority of uses which is single file updates. It also is not clear what happens if another process has the old file open though I'm guessing it results in an error?
I'm a bit confused about the api. There must also be some call to indicate the break between the transactions, right? What are they?
What I want to make sure is if I *ignore* the error from the delete, that the rename is still done. I don't want it to say "there was an error in this transacted set and therefore I should roll it all back", I want it to complete.
I'm guessing you can make non-transacted calls in the middle of transacted ones. Is this useful? It seems like a reasonable api would be to add start/stop calls but then reuse the existing calls, but it would prevent this mixing.
That's acceptable I think. Single file update by many programs is not really used anymore except by databases, and they can implement their own locking.
The ability to lock a whole lot of file operations into a single block is a great idea and it would be nice to see it elsewhere.
On Windows, if it can be used to group a delete and rename together then we can finally have atomic rename, which would remove one of the big obstacles to making software port between Windows and Unix. Do you know if this works? I can imagine problems like unacceptable overhead, or that errors break the atomicity (and thus the need to detect if the file exists, which would make it non-atomic).
In what way rename in Windows is not atomic?
If the file already exists it returns an error and does not do the rename. You have to delete the old file and then do the rename as two separate calls and thus it is not atomic.
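A minimal sketch of the two-call sequence, to show where the gap is. The function name `two_step_replace` is made up for this example, and Python's `os.remove`/`os.rename` are used as stand-ins for the underlying delete and rename calls; it is not meant as a recommendation.

```python
import os


def two_step_replace(src, dst):
    """Replace dst with src using delete + rename, as two separate calls."""
    try:
        os.remove(dst)              # call 1: delete the old file
    except FileNotFoundError:
        pass                        # nothing to delete is fine

    # Gap: a crash here, or another process looking for dst right now,
    # finds no file under that name at all. This is why the sequence
    # is not atomic.

    os.rename(src, dst)             # call 2: move the new file into place
```

For comparison, Python documents `os.replace(src, dst)` as a single call that overwrites an existing destination, and as atomic on POSIX systems; whether a given platform can offer the same guarantee is the subject of this thread.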
Want to make sure that no unauthorized applications are secretly recording your activities? This denies access to the frame buffer from remote viewing.
Wrong. This is only possible if you control the keys to the TPM. If you cannot set the keys you cannot implement any method of making sure unauthorized applications (who do have the keys) are not running.
The reason you cannot set the keys is because it would also allow you to set the keys the same as another machine, and thus play media that is authorized only for that other machine. Otherwise it would be brain-dead obvious that the owner would be able to set their own keys. The fact that this is not allowed makes the lies behind this whole scheme obvious.
closing a file descriptor does an fsync()
This has been false for decades, on every operating system in common use, not just Linux.
Considering that Microsoft has so far failed to make an atomic rename() call, I don't think they should feel very proud of their results. The bug here is that EXT4 breaks a feature that Microsoft has failed to implement in the first place!
Can you combine delete and rename into one of these?
Congratulations, you have finally implemented atomic rename, which Unix has had for THIRTY YEARS on machines with 16K of memory. And it requires 4 calls instead of one. Wow I am so impressed.
Transactions like this are pretty interesting but don't fool yourself into thinking the complicated solution is what is needed all the time.
Atomic rename can be used for this for most uses, including the ones causing the problems.
NTFS style transactions are useful if there are either many files that have to be in sync, or several programs writing portions of the same file. However both these cases are quite rare compared to the "save the new version" api that atomic rename allows you to do.
You added a lot of unnecessary steps that are needed on Windows but not POSIX. Here are the only steps needed:
- open file.new for writing
- write new contents to file.new
- fdatasync() file.new
- link file.new to file
- unlink file.new
Big nasty thing is that if you want to be perfectly safe, "file.new" has to somehow be a unique name, just in case another program tries to update the file at the same time. It would be nice if the fs or operating system supported this directly.
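Here is a rough sketch of those steps with a unique temporary name, in Python on a POSIX system. Assumptions, stated loudly: the function name `save_new_version` is invented for the example; `tempfile.mkstemp` is used to get the unique name in the same directory; and the link + unlink pair is swapped for a single rename (`os.replace`), because link() fails with EEXIST when the destination already exists.

```python
import os
import tempfile


def save_new_version(path, data):
    """Write data (bytes) to a uniquely named temp file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))

    # Unique name in the SAME directory, so the rename stays on one
    # file system. mkstemp creates the file with mode 0600.
    fd, tmp = tempfile.mkstemp(
        prefix=os.path.basename(path) + ".", suffix=".new", dir=directory
    )
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            # fdatasync() as in the steps above; fall back to fsync()
            # on platforms where Python does not provide it.
            sync = getattr(os, "fdatasync", os.fsync)
            sync(f.fileno())
        # Swapped-in step: rename instead of link + unlink.
        os.replace(tmp, path)
    except BaseException:
        # Something went wrong: drop the temp file, leave path untouched.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

Usage would be something like `save_new_version("settings.conf", b"key=value\n")`. Two programs doing this at the same time each get their own temp name, and the last rename wins.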
You are thinking in a Windows-centric way.
The correct way to do it is:
Write data to a NEW file.
Close file
Rename to the old filename (an atomic operation)
After a crash the program should see either the old or new file. There is no need to worry about finding a backup file.
EXT4 breaks the rename so that both files are lost.
Yeah, sticking a sync in there "fixes" the symptom, but only by slowing everything down. Please try to understand that the old file (i.e. the "backup" in your solution) is OK!!!
POSIX implies that rename() (well actually link()) is atomic. This breaks that assumption, as far as most programs are concerned.
Yes, you can redefine what "atomic" means in order to somehow imply that EXT4 is obeying it. I mean, we could say that each letter is individually changed, and thus it is OK if the crash leaves only the first letter of the file name changed.
This violates POSIX for all practical understandings of the text. It has nothing to do with write(), it is rename()/link() that is at fault.
Actually one important detail:
If the program crashes, the new file should disappear as though it had no effect. It should not act like close().
Also it would be useful, though not a requirement, that there is something that can be done to the file so that it just goes away without changing anything. Programs can use this if they determine something is wrong while they are writing it. I'm not sure what this action should be.
This extra flag is an EXCELLENT idea!
One error you made is that it can crash between or during step 4 and the file is not truncated. I.e. the crash point is before 5, not 4 as you implied.
I think your cons can be addressed:
Must create new open() flag:
I suspect O_CREAT can be reused, and that the vast majority of programs using this flag actually expect the behavior you define. Whether O_CREAT acts this way or not could be controlled with some global switch the program can set at startup.
Need free space for temp copy:
This may be tricky but the only safe way to update a file now needs the same space. It is plausible that a file system could be designed to need a lot less free space, since it is only needed *during* step 4 (otherwise the file is just in memory). Thus only the space for one file at a time may be needed.
Can still lose data:
Make calling fsync() on this file do steps 4+5 above, and after that it is as though the file acts normally.
You are misunderstanding the bug.
Any description of the real bug must contain a rename() call.
The question is what happens if you close the file and then rename() it. Now I know that Windows lacks atomic rename, so let's first delete the destination file. The possible results after a crash and recovery should be:
1. The old file is still there (OK)
2. The old file is missing and the new file exists with the new data in it (this is a bug with Windows and does not happen on Unix with either EXT3 or 4. It is not good but at least a good copy of the data is in the new file)
3. The old file has the new contents and the new file is missing. (OK)
Something like EXT4's bug results in the following results on Windows:
4. The old file contains data different from either its previous contents or what was written to the new file (typically it is empty) and the new file is missing.
5. The old file is missing and the new file contains data different than what was written to it.
I suspect Windows does NOT have this bug, as it really is an incredible annoyance that a crash can destroy all copies of the data on your disk and they would have noticed and fixed this pretty quickly.
Wrong. The longer delay is not the bug. If EXT3 crashed during those 5 seconds, the bug did not happen. If EXT4 was changed to flush after 5 seconds, and it crashes during those 5 seconds, the bug still will happen.
The problem is that rename(A,B) does not force the contents of A to be up to date before the rename happens. Virtually every Unix program in existence that tries to safely update files assumes this is true.
Sticking fsync() in there, as about a thousand idiots have suggested, is NOT the solution. It will "fix" it but only at the cost of slowing everything greatly. The thing they are missing is that it is ok if the *old* file is still there after a crash. What is unacceptable is that neither the old nor the new file is there in EXT4.
Wrong. KDE opened another file. This file was truncated to zero at some time. They then wrote the data and closed the file. They then renamed it to replace the file in question.
At no point did they truncate the file in question. So this result is certainly unexpected!
Bullshit. NTFS is not the "only" system. Unix has had atomic rename for 30 years. This is the function that is wanted, and is the function that EXT4 breaks.
Windows of course has never implemented atomic rename, and manages to come up with 50 different options and file flags to try to make up for it. And now the same idiots are invading Linux, saying "call fsync!" as though fixing the symptoms by making the system slow is how to fix a bug.
This is getting really sad as it is obvious that the knowledgeable parties are in the tiny minority here.
Every single person who mentions "fsync" as a solution is, for lack of a better word, stupid.