Domain: auckland.ac.nz
Stories and comments across the archive that link to auckland.ac.nz.
Comments · 387
-
Re:openbsd rm
These methods have been proven NOT to work as well as one might think. The best way to delete a file on any drive: burn the drive until it melts.
-
Re:'Greatest and Luckiest of Mortals' indeed
not invented, discovered
In 'META MATH! -- The Quest for Omega', Gregory Chaitin writes:
also, Leibniz also independently devised the system of calculus at the same time
Newton was a great physicist, but he was definitely inferior to Leibniz both as a mathematician and as a philosopher. And Newton was a rotten human being---so much so that Djerassi and Pinner call their recent book Newton's Darkness.
Leibniz invented the calculus, published it, wrote letter after letter to continental mathematicians to explain it to them, initially received all the credit for this from his contemporaries, and then was astonished to learn that Newton, who had never published a word on the subject, claimed that Leibniz had stolen it all from him. Leibniz could hardly take Newton seriously!
But it was Newton who won, not Leibniz.
Newton bragged that he had destroyed Leibniz and rejoiced in Leibniz's death after Leibniz was abandoned by his royal patron, whom Leibniz had helped to become the king of England. It's extremely ironic that Newton's incomprehensible Principia---written in the style of Euclid's Elements---was only appreciated by continental mathematicians after they succeeded in translating it into that effective tool, the infinitesimal calculus that Leibniz had taught them!
Morally, what a contrast! Leibniz was such an elevated soul that he found good in all philosophies: Catholic, Protestant, Cabala, medieval scholastics, the ancients, the Chinese... It pains me to say that Newton enjoyed witnessing the executions of counterfeiters he pursued as Master of the Mint.
[The science-fiction writer Neal Stephenson has recently published the first volume, Quicksilver, of a trilogy about Newton versus Leibniz, and comes out strongly on Leibniz's side. See also Isabelle Stengers, La Guerre des sciences aura-t-elle lieu?, a play about Newton vs. Leibniz, and the above mentioned book, consisting of two plays and a long essay, called Newton's Darkness.] -
Re:Could be argued
Could be argued, for that matter the entire concept of "random" is truly just a human thought construct.
You might want to read some of Chaitin's work, and then reconsider. -
Re:A good start, a long way to go.
It's okay. Despite other posters muttering things about crime rates, I've never been attacked or pickpocketed, and haven't heard of it happening to anyone I know - this includes the friend who walked through the Wellington CBD at 3am holding an iBook (not in a bag). But I don't know how this compares to other countries.
Technology prices aren't too bad: see http://www.pricespy.co.nz/
Unlike Australia, we haven't signed a free trade agreement with the US giving us a local equivalent of the DMCA (yet).
And our major cities hardly ever have blackouts! -
Re:Already have oneDick from your speakers?
Maybe you Thought you could replace SSL/SSH with something much better you designed this morning over coffee.
Alternatively, maybe you thought Hey, skins, what a cool idea. -
They left out a couple
For a start, they left out the S programming language (started in 1976), for which John Chambers won the ACM Software Systems Award. This, and its Libre dialect R (thanks to Robert Gentleman and Ross Ihaka at the University of Auckland), are in daily use by folks who have to write programs to use data.
-
quote on math books
I'm too lazy to look up which mathematician/physicist said this:
"There are only two kinds of math books: those you can't read past the first page, and those you can't read past the first sentence."
Anyway, Chaitin's other books are really interesting too. There is one called "The Limits of Mathematics" which discusses Godel's proof and even "shows" it interactively with some LISP code at the end. The whole book is free online here, which is a great deal for a very interesting Springer text. Some people think Chaitin too arrogant, but there's no denying he's a great mind. -
Re:Can someone explain why 35 times?
Once is sufficient if all you care about is someone connecting the hard disc up to a machine and attempting to recover confidential information via the standard IDE/SCSI protocol and bus.
But if you're concerned about someone ripping the drive open and using electron microscopy to work out the alignment of the molecules (and from that, the data they store), then theory (and experiments?) shows that the multiple-pattern-wipe technique is sufficient to guarantee data is destroyed.
For most data, therefore, one all-zeros wipe is probably sufficient and will take the least time. But for some users and some data, more wipes will be appropriate.
Peter Gutmann's paper is a good place to start for more detail.
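The single-pass versus multi-pass trade-off can be sketched in a few lines. This is a minimal illustration only: `overwrite_file` and its parameters are hypothetical names, not part of any wiping tool mentioned here, and on journaling file systems or flash media an in-place overwrite may never reach the original physical blocks.

```python
import os


def overwrite_file(path, passes, chunk=64 * 1024):
    """Overwrite a file in place, once per entry in `passes`.

    Each entry is a non-empty bytes pattern to repeat, or None for random data.
    (Hypothetical helper for illustration; not a secure-deletion guarantee.)
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for pattern in passes:
            f.seek(0)
            remaining = size
            while remaining > 0:
                if pattern is None:
                    data = os.urandom(min(chunk, remaining))
                else:
                    # Whole repetitions only, so the pattern stays in phase
                    # from one chunk to the next.
                    reps = max(1, chunk // len(pattern))
                    data = (pattern * reps)[:remaining]
                f.write(data)
                remaining -= len(data)
            # Ask the OS to push this pass to the device before the next one.
            f.flush()
            os.fsync(f.fileno())
```

A single all-zeros wipe would be `overwrite_file(path, [b"\x00"])`; three random passes would be `overwrite_file(path, [None] * 3)`. The run time grows linearly with the number of passes, which is the cost being weighed above.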
-
Re:BBC Coverage
There's also a nice site in New Zealand (even though we won't be able to see the transit this time round) about the Transit of Venus and Cook's 'voyage of discovery', which was partially funded by the Royal Society so that he could go to Tahiti and record the transit. http://transitofvenus.auckland.ac.nz/
-
Two sides of the same elephant
>Godel's Incompleteness Theorem doesn't apply to Turing's Theorem.
"A great many different proofs of Godel's theorem are now known, and the result is now considered easy to prove and almost obvious: It is equivalent to the unsolvability of the halting problem, or alternatively to the assertion that there is an r.e. (recursively enumerable) set that is not recursive."
International Journal of Theoretical Physics 22 -
Re:Cryptlib contains code that violates GPL
The funny thing is that Cryptlib is supposedly GPL
Is not. I know this is Slashdot, but is it too much to expect 6 seconds of research?
-
Re:Warnings... [University Attacks]
Yes, and the response from the IT people is now to delete all e-mails with a .zip attachment, with no warning to sender or recipient.
See their notice -
Re:Warnings... [University Attacks]
Someone is targeting broadband (perhaps University) connections.
Could be. I'm working at a University in New Zealand and got one.
It was pretty obviously faked. There was the fact that it had a password protected zip file and the appalling grammar in the message itself. Our IT staff aren't that bad.
But the funniest and most obvious part was where it told me to visit www.ac.nz for more information.
Clearly whoever wrote the virus didn't intend for it to escape into the rest of the world, as it just assumes the rightmost two parts of the URL are in fact the full domain.
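A guess at the kind of logic that would produce this mistake, as a minimal sketch. The worm's actual code is not shown anywhere in this thread; the function name and the host names below are made up for illustration.

```python
def naive_domain(hostname):
    """Keep only the rightmost two dot-separated labels.

    Hypothetical reconstruction of the faulty assumption: this works for
    names like example.com but not under second-level suffixes like ac.nz.
    """
    labels = hostname.rstrip(".").split(".")
    return ".".join(labels[-2:])


# Illustrative host names, not taken from the original message.
print("www." + naive_domain("mail.example.com"))
print("www." + naive_domain("mail.cs.auckland.ac.nz"))
```

The first call yields a plausible site name, while the second collapses to www.ac.nz, which matches the address quoted in the fake message.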
-
Re:were FreeSwan users afforded "luxury of ignoran
I'd say it mostly depends on your distribution. Mandrake 9.2 comes with SuperFreeS/WAN. SuSE is excellent too. You can get it working within minutes by adding just a few lines to ipsec.secrets and ipsec.conf.
RedHat on the other hand preferred to distribute CIPE (which turns out to be insecure) instead of FreeS/WAN, so you had to compile your own kernel or use binary modules from the FreeS/WAN site. Unfortunately these binary RPMs only contain the X.509 patch and no extra features like SuperFreeS/WAN.
I believe Debian required some compiling too.
-
Re:Shocking!
OK, first off, I have a degree, so you know my perspective. You're right about some things. It's true that you can coast through a comp sci degree and pass. It's also true that competition for a good grade is something you haven't experienced. Getting good marks (like A's) is harder. You're in an environment where you are competing with 100 or more other people for that A, and only a certain percentage of A grades are allowed to be given.
The problem with certifications is that they don't teach problem solving skills or don't develop those abilities. A degree course fills those better. Have a look at the problems on the topcoder website. Note that Google and nVidia hire people from here. Note that the problems presented have nothing to do with the APIs or languages that nVidia or Google may or may not use. These problems are good examples of the problem solving ability that degrees push and certification fails to check (at least that's my impression of the MCSD stuff that MS sell)....You can pass a certification via parrot fashion learning since they teach a different set of skills - mainly basic API knowledge as far as I can tell, from having looked over the MCSD stuff 2 or 3 years ago.
If I were in a position to hire I wouldn't pass over someone that didn't have a degree, but I'd make sure that they had solid problem solving programming ability. I'd do the same for someone with a degree basically because I know that you can coast through and pass - (note that you can't coast and get good marks though because of the competition and percentage barrier on the number of people allowed certain grades). I'd do the same for people with so-called relevant work experience because I know that you can coast through work as well. Problem solving is something that will indicate their ability to learn whatever the next flavour API is.
Just don't make the mistake of tarring everyone with a degree with the same brush. I'll stick my neck out here on Slashdot ... here's a link to a piece of code that I've written that solves a hard problem. -
Animal Cultures
Over the past five years there's been a major research effort looking at primate cultures mainly under the guidance of Christophe Boesch (Chimps - Pan troglodytes spp) and Carel van Schaik (Orang-utans - Pongo pygmaeus), and even Monkeys (the village idiots of the primate family) have been shown to have culture traits.
Anyway, a great webpage on this from Boesch's team Chimpanzee Culture
See also -
Whiten et al. Nature, 399:682-685
van Schaik et al. (2003). Orangutan cultures and the evolution of material culture. Science 299:102-105.
Perry & Manson (2003). Traditions in Monkeys. Evolutionary Anthropology 12:71-81
Oh, and it's not only primates - Fish biologists have also jumped on board -
Bshary et al (2002). Fish cognition: a primate's eye view. Animal Cognition 5:1-13
which shows that fish can do all sorts of massively complex social behaviors - e.g. predator avoidance and something which is very cool, inter-specific (ie: different species co-operating) co-operative hunting. For example: Moray eels (Gymnothorax javanicus) and Red sea coral groupers (Plectropomus pessuliferus). The Morays sneak through holes whilst groupers wait to catch escaping fish - they actually 'go hunting together' and signal each other by shaking their bodies.
Oh, and let's not forget the bird-people:
Corvus moneduloides
Hunt & Gray (2003). Diversification and cumulative evolution in New Caledonian crow tool manufacture. Proceedings of the Royal Society of London, Series B, Biological Sciences.
Lefebvre et al (2002). Tools and Brains in Birds. Behaviour, 139, 939-973. -
Tell that to Professor Turing...
If you were to logically verify all the code to prove there were no bugs anywhere (yes, this is possible), it would cost orders of magnitude more to develop (which is why nobody ever verifies an entire program). -
Can you say "Kolmogorov complexity"?
One definition of randomness, and one that seems quite reasonable is that a string is "random" if it cannot be compressed to smaller than it is, i.e. listing its characters itself is the most compact possible description. Formally, a string is random if there exists no algorithm generating the string whose description on some universal Turing machine is smaller than the string itself (this is the definition used in the field of Kolmogorov complexity). A string of a billion digits making up Pi, for example, is not random by this definition, as one can easily write a short program, whose length would certainly be less than one billion characters, whose output is the digits of Pi. Think of it this way: the most general form of pattern matching device that we know of is a Turing machine, and if the best device you can construct to match that pattern is as complex or more complex than the pattern itself, then well, you have total randomness. Unfortunately, rigorously proving that a particular string is random by this very strong definition is extremely difficult, as you run into undecidability everywhere you turn.
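True Kolmogorov complexity is uncomputable, but an off-the-shelf compressor gives a rough upper bound and makes the idea tangible. The sketch below is an illustration under that assumption, using zlib as a stand-in for "the shortest description"; the helper name and the sample data are made up.

```python
import random
import zlib


def compressed_size(data):
    """Length of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


n = 10_000
regular = b"ab" * (n // 2)          # obvious pattern, very short description
rng = random.Random(0)              # fixed seed so the run is repeatable
noisy = bytes(rng.getrandbits(8) for _ in range(n))

print(compressed_size(regular), "of", n)
print(compressed_size(noisy), "of", n)
```

The patterned string shrinks to a tiny fraction of its size while the noisy one barely shrinks at all. Note the catch, which echoes the undecidability remark above: the "noisy" bytes come from a seeded generator, so a short program does produce them. zlib simply cannot find that description, which is why compression only ever bounds the complexity from above.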
This is the sort of stuff that real theoretical computer science is made of. For a very good overview of the theory of Kolmogorov Complexity and algorithmic information theory, Gregory Chaitin's home page is a good starting point.
To go back to the Voynich manuscript, if there is some sort of regularity that can be discerned from it, then perhaps a context-free or context-sensitive (or something in between) language may be found to characterize it. Once you have such a syntactic characterization, perhaps it might be possible to divine the semantics from context. The shape of the grammar that results may well prove whether the Manuscript is in fact a real language, a fabrication, an elaborate cipher, or just total gibberish.
-
Yes, "Moore's Law"
The term is correct -- see here for details. Within the realms of science, a law is specifically a generalization that may be made based *on observation*. Moore was making a generalization based on past observation. He was not making a theoretical claim about the future. All this is quite proper for a law. As a matter of fact, if he was hypothesizing that processor speeds will stop doubling in the next twenty years, that would be a hypothesis, since it's not a generalization based on a body of observation.
Also, WRT your mention of a mathematician -- mathematicians are not scientists -- they are in a class of their own, along with logicians and many computer "scientists". This group works with absolutes, with provable concepts. Scientists do not do this. Science is a system designed to deal with observation and produce effective models of observed things. -
Re:Don't read the originals
Oh come on. The problems in CS are the same today as they were 100 years ago: "how do we compute and how efficient can it get?" A lot of people grew up reading popular science accounts of the famous people who started computation theory in the early 1900s that did as much idolizing of the individuals as explaining the theory they created. But these people only lived a generation or two ago. I mean christ Godel was still alive in the 1970s. So I don't see what's so hard about reading the great papers--the language of the 1930s is almost identical to today's.
Wait, I do have a point. Which is this: the biggest problem in math is trying to find a good enough analogy for a problem that seems unsolvable (not in the technical sense). And what you'll get from reading the original papers is these famous people explaining their analogy and why the other analogies of the time aren't appropriate. I mean these are just regular people working these ideas out over decades and decades, in turn supported by thousands and thousands of others doing the same. By reading the papers, you get both sides of the story, rather than just "oh, this theorem says this".
That being said, this book by Gregory Chaitin, and his other writings as well, relate the limits of human/mathematical knowledge to programming languages. It is very easy to read, too. And Godel's paper: well everyone is intimidated by it, but 3/4 of the paper is just him writing 45 different programs, and then using those 45 to write the 46th, which is his famous theorem.
-
Re:Data Recovery?
Take a look at this paper relating to secure erasure of data on magnetic media.
It's a bit old (written in '96), but I don't think that disk technology has progressed any further in that time - HDDs from that age still work in computers these days.
The short version is that Gutmann discovered that the only way to remove data so securely that it's beyond any kind of retrieval is to use a magnet so strong that it physically destroys the disk. Anything less is not 100% secure - TEM gets around most things, particularly as track densities increase and bit over-lap becomes much more of a problem.
-
Shattered platters are readable
Use Google. I found these after a bit of searching:
- http://www.drlabs.com/faq.html#9
Using Magnetic Force Microscopy, even a shattered micro-fragment of a hard drive platter can be read.
- http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html
To start getting useful images of a particular track requires more than a passing knowledge of disk formats, but these are well-documented, and once the correct location on the platter is found a single image would take approximately 2-10 minutes depending on the skill of the operator and the resolution required. With one of the more expensive MFM's it is possible to automate a collection sequence and theoretically possible to collect an image of the entire disk by changing the MFM controller software.
- http://www.drlabs.com/faq.html#9
-
Re:Why is some software more secure than others?
How do secure software authors then avoid the kind of security holes that are difficult to find? By keeping the code simple.
You're way off base in this case. SSL requires the use of X.509 certificates, and it was in the cert parsing code that these new vulnerabilities were found. X.509 means ASN.1 formats, which have at least two different encoding rules, BER and DER that both must be supported; implicit versus explicit tags; several different ways of encoding packet lengths, and a host of other complexities. There's no way to write this kind of code and just keep it simple as you describe. Any implementation of SSL which is going to interoperate with other systems on the net is going to face these complexities.
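To make the point about length encodings concrete, here is a minimal sketch of decoding just the length field of a BER element. The function name is made up for illustration and is not taken from OpenSSL or any other library; a real parser also has to handle tags, nesting and the end-of-contents marker.

```python
def read_ber_length(buf, pos=0):
    """Decode an ASN.1 BER length field starting at buf[pos].

    Returns (length, next_pos); length is None for the indefinite form.
    Illustrative helper only, not a complete or hardened parser.
    """
    if pos >= len(buf):
        raise ValueError("truncated length field")
    first = buf[pos]
    if first < 0x80:
        # Short form: the byte itself is the length (0..127).
        return first, pos + 1
    if first == 0x80:
        # Indefinite form (BER only; DER forbids it): content runs until
        # an end-of-contents marker.
        return None, pos + 1
    count = first & 0x7F
    if count == 0x7F:
        raise ValueError("reserved length encoding")
    end = pos + 1 + count
    if end > len(buf):
        raise ValueError("truncated long-form length")
    # Long form: the next `count` bytes hold the length, big-endian.
    return int.from_bytes(buf[pos + 1:end], "big"), end
```

Even this one field has three forms and several ways to be malformed. Under BER a length of 5 may arrive as `05` or as `81 05`, whereas DER accepts only the shortest form, so a parser that must interoperate ends up carrying both sets of rules along with the error paths for each.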
I've written certificate handling code so I know how complicated it is. Also worth reading is Peter Gutmann's somewhat dated but still insightful X.509 Style Guide which describes some of the horrors an X.509 implementation has to deal with.
In this case the failures were mostly in the error handling, and any developer knows that this tends to be the hardest part of your program to get right. Not only are there a lot more ways things can fail than go right, but they can fail in many more places in your code and it is very difficult to make sure your program can recover gracefully from everywhere something might go wrong.
Also, I'm not sure if it's public yet, but a lot of other implementations are affected by this besides OpenSSL. See the CERT advisory when it comes out and you will find some of the biggest names in the security business got burned by this. It's absurd to suppose that your cosmic insights are somehow being overlooked by companies that base their reputations on security. -
Re:Peter Gutmann and tinc
There are some points where tinc could certainly be improved (some of it already planned for 2.0), but we don't believe the "real problem" he mentions actually exists. We have told him our objections to his writeup, and asked if he could prove or make it more plausible that an attack on the authentication protocol is possible.
Beautiful. Bloody beautiful.
You, with no real security experience, seem to think you know more than Peter Gutmann, who has been working on cryptography and computer security since the mid-1990s. That's rich.
He still hasn't convinced us.
Newsflash: That's not how security works. He has no obligation to convince people who are ignorant of security that their software is flawed. Security software needs to demonstrate that it is provably and verifiably secure.
Peter Gutmann believes there is a possible MITM attack ("Chess grandmaster attack"), but hasn't shown us how, just that he believes it's there.
I doubt Peter was imagining things. If you cannot follow his comments, I am seriously concerned about how prepared you are to write any security software. I found his writing quite clear, though perhaps you do not understand it is not a flaw in a number of lines of code, but a flaw in the entire protocol being used, a protocol you (your group) created.
We're still in discussion, and if we believe there is really a problem we will fix it.
Wow, I'm now inspired never to use your software.
-
Re:Edward the Great
Apparently nobody studies history anymore, but Mr. Teller should be recognized as a national hero of science.
Except, on closer inspection, he was also somewhat crazy. He thought nuclear bombs should be used for stuff like earthmoving. Once he proposed an idea along these lines to Greece's Queen Frederika. Supposedly she replied, "Thank you, Dr. Teller, but Greece has enough quaint ruins already."
Don't get me wrong; I am pro-nuclear. I just think Dr. Teller was a bit too rabid about flinging bombs around.
Just found this link. It's a chapter from Carl Sagan's "The Demon-Haunted World" that talks about Teller. -
Re:New Zealand
A friend of mine wrote up a rather entertaining summary of the Great Auckland blackout. Hope they don't mind the Slashdotting. http://www.cs.auckland.ac.nz/~pgut001/misc/mercury.txt -
Re:Suggestions
Good stuff, thanks. Don't know why someone modded that "troll."
We don't want to DIY, but we may have a packaging problem with standard rack-mounted components. For some obscure reason (or possibly no reason at all), the middeck lockers are just a hair too small to accommodate rack-mount equipment. So, ideally I'd have an internally redundant RAID controller that I could repackage easily.
-
Re:Oz
At the time you write your code you don't know the actual value of some symbols. The value varies depending on other run-time values. That's why it's called a "variable".
Your question shows that you don't know what functional programming is. To understand it, I advise you to read "Why Functional Programming Matters" (HTML short version).
-
I like the joke at the bottom of the X.509 link
At the bottom of the X.509 certificates link:
An engineer, a chemist, and a standards designer are stranded on a desert island with absolutely nothing on it. One of them finds a can of spam washed up by the waves.
The engineer says "Taking the strength of the seams into account, we can calculate that bashing it against a rock with a given force will open it up without destroying the contents".
The chemist says "Taking the type of metal the can is made of into account, we can calculate that further immersion in salt water will corrode it enough to allow it to be easily opened after a day".
The standards designer gives the other two a condescending look, gazes into the middle distance, and begins "Assuming we have an electric can opener...". -
Re:Thats the best way...
Why not use /dev/random or another pseudo-random number generator instead of /dev/zero, or at least do one round of zeros, one round of random data, and repeat say... 5-10 times? :)
Well, one problem with that method is that the data can still be recovered. Read this paper for more information.
-
Re:Why do I find that so funny?
If you're reading about crypto, and you have not heard of Peter Gutmann, then you are either just *starting* to read about crypto, or you have missed out some of the most important *practical* parts of your reading!
Check also the X509 Style Guide. Outstanding and insightful. Trust no one claiming to know about PKI unless they have read and understood this :-) -
Re:What about RAM?
I am sorry, but you do not know what you are talking about. You can recover data from SRAM and DRAM ... well after it has been powered off!
Take a look at this paper or this posting for examples.
No offense implied either. I just thought you might like to reconsider your opinion on this topic.
-
Re:What about RAM?
Yes, you can recover data from that. It involves a scanning electron microscope though, and reading every bit of data individually.
As the storage capacitor flips state, the electromagnetic field that forms at its junction stresses the thin oxide layer around the junction, and stresses it even more the longer the data is held for. This feature can last for a few hours at least, it all depends on how long you have just stored your data for. The only way that you can really prevent this from occurring is to implement continuous bit-flipping in memory.
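The continuous bit-flipping idea can be sketched conceptually as below. This is an illustration of the bookkeeping only, with a made-up class name: Python gives no control over where bytes physically live or how many copies the runtime makes, so a real implementation would need to sit much closer to the hardware.

```python
class FlippingBuffer:
    """Hold a secret while periodically inverting its stored bits.

    Conceptual sketch only (hypothetical class): the aim is that no memory
    cell holds the same bit value for a long stretch of time.
    """

    def __init__(self, secret):
        self._data = bytearray(secret)
        self._inverted = False

    def flip(self):
        # Invert every stored bit in place and remember the current polarity.
        for i in range(len(self._data)):
            self._data[i] ^= 0xFF
        self._inverted = not self._inverted

    def read(self):
        # Undo the inversion on the way out so callers see the real value.
        if self._inverted:
            return bytes(b ^ 0xFF for b in self._data)
        return bytes(self._data)
```

A caller would invoke `flip()` from a timer at some regular interval and use `read()` whenever the value is needed; the stored form alternates between the secret and its complement while the value seen by callers stays the same.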
Read Secure Deletion of Data from Magnetic and Solid-State Memory if you don't believe me. -
A story of DISK, SRAM and DRAM data recovery
Summary of the long posting below:
- Data from a hard disk that has been wiped multiple times can be recovered.
- Data left in SRAM and DRAM for a long period of time can be recovered even though the system has been powered off for a while and the SRAM has been cleared.
- While it is hard to recover wiped and old data, it is not impossible.
First, a little background:
I belong to a group that polls/tracks certain elections around the world. In one recent election, there were a number of claims of voting irregularities. Our group became part of a post-election analysis team to look into these irregularities.
We were able to determine that one desktop system in particular contained some critical raw voting data (raw precinct counts of per ballot slot data). The election officials were more than reluctant to give us a copy of that raw data. By the time we were granted an order requiring the election officials to let us access the data, someone had attempted to thoroughly wipe the desktop system of all traces of data.
We thought we had lost that critical data. But thanks to a chain of contacts we were referred to a consultant that specializes in extremely difficult data recovery. After checking some references (and obtaining more money from OUR client: the consultant was VERY expensive), we hired this consultant.
Much to the surprise of the election officials we obtained an order that allowed us to physically take possession of the system. The system was turned over to the consultant who recovered enough critical election data for our needs.
The recovery included data from the wiped system hard drive as well as from SRAM and DRAM.
Regarding disk recovery:
The disk drive had been wiped by a utility that, we presume, had been run from a CDROM. The wipe tool wrote over the entire disk 35 times: 8 of the passes were random and 27 were fixed patterns of 3 bytes each.
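For reference, a 35-pass schedule with exactly that 8 random plus 27 fixed split can be built as follows. The pattern list is the one recommended in Gutmann's paper as best I recall it, so check it against the original before relying on it; that the tool in this story used this exact list is an assumption, and the function name is made up.

```python
def gutmann_style_schedule():
    """Return 35 passes: None means random data, bytes means a fixed pattern.

    Assumption: patterns follow the table in Gutmann's 1996 paper; this is
    not the actual schedule of the wipe tool described in the story.
    """
    fixed = [b"\x55" * 3, b"\xAA" * 3]
    rotating = [b"\x92\x49\x24", b"\x49\x24\x92", b"\x24\x92\x49"]
    fixed += rotating
    # 0x00, 0x11, ..., 0xFF: sixteen passes of one repeated byte.
    fixed += [bytes([0x11 * k]) * 3 for k in range(16)]
    fixed += rotating
    fixed += [b"\x6D\xB6\xDB", b"\xB6\xDB\x6D", b"\xDB\x6D\xB6"]
    # Four random passes before and four after the fixed patterns.
    return [None] * 4 + fixed + [None] * 4


schedule = gutmann_style_schedule()
print(len(schedule), "passes,", schedule.count(None), "random")
```

Every fixed entry is written here as 3 bytes, which is consistent with the description of "fixed patterns of 3 bytes each".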
Not all disk data was recovered. Part of the reason was that the data recovery method was not 100% perfect. Part of the reason that some data was not recovered was a simple matter of time. (The consultant was in between two already committed projects and only had a limited amount of time to work for us.)
The consultant did recover some deleted files that were critical to our work. Not everything was recovered, however. Parts of the swap/VM-paging area that might have contained some useful data were not recovered. Also some disk data critical to file and directory layout was not recovered making recovery of parts of the file system layout difficult to map.
Still, some important files (a spreadsheet, simple database file, browser cache, some EMail, etc.) were recovered even though the drive had been wiped 35 times!
Regarding SRAM recovery:
n3rd posted a comment asking about recovering data from RAM.
There are methods that can recover RAM data. Both SRAM and DRAM can be recovered.
According to the consultant, the storage of the same data in SRAM over a long period of time has the effect of altering the preferred power-up state. They said that SRAM can ''remember'' data for days after it held it for a long period of time. This memory can be determined by a ''partial powerup'' (I presume they mean a lower than normal voltage?) and then going ''full on'' and reading the initial values of memory.
In the case described above, the SRAM had been deliberately cleared prior to our group taking possession of the system. The consultant was able to recover the original data even though the SRAM had been cleared and the system has been powered off for more than a day. A simple clearing of memory was not enough to wipe out the long held memory effect.
Regarding DRAM recovery:
DRAM data was also recovered. Data left in DRAM for a long period of time can leave an ''impression'' thru a process somewhat different from SRAM.
As explained by the consultant: With DRAM, recovery comes not from detecting any left over charge, but rather detecting the stress (or lack of stress) from the thin oxide of the cell's storage capacitor dielectric. The effect of this stress can be measured by using the DRAM self-test feature. In self-test mode, a small voltage is applied to a cell in order to measure its margin for error. The self-test margin is increased or decreased by the amount of oxide stress.
Not all of the DRAM memory was recovered. However certain critical portions of the DRAM held values for a long enough period of time that data was recovered, even though the system had been powered off for more than a day. Data recovered included memory associated with a browser and a spreadsheet. Even though both the browser and the spreadsheet were closed prior to the system being wiped, they were left running long enough to leave behind their DRAM oxide stress.
Based in part on the recovered data, we concluded that candidate A was declared the winner due to a ''mistake'' in mapping ballot slot numbers to candidates. In some cases the slots for candidate A and B were reversed.
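How a reversed slot mapping flips a result is easy to show with a toy tally. All numbers and names below are invented for illustration; none of them come from the election described here.

```python
def tally(slot_counts, slot_to_candidate):
    """Sum raw per-slot counts into per-candidate totals."""
    totals = {}
    for slot, count in slot_counts.items():
        name = slot_to_candidate[slot]
        totals[name] = totals.get(name, 0) + count
    return totals


# Invented numbers for illustration only; not data from the election above.
raw = {1: 5200, 2: 4800}
correct_map = {1: "B", 2: "A"}
swapped_map = {1: "A", 2: "B"}  # slots for A and B reversed

print(tally(raw, correct_map))
print(tally(raw, swapped_map))
```

The raw slot counts are identical in both runs; only the mapping differs, yet the winner changes from B to A. This is why matching raw slot numbers against the reported per-candidate counts was the critical step.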
An incorrect vote count was reported by the election officials. It is our guess that when we came around asking for the raw data, someone began to collect it. At some point some official(s) discovered the blunder. The system was left on while they stalled for time. When it was clear that we were going to force them to turn over the data someone wiped the system and shut it down.
BTW: The majority of the election officials involved were supporters of candidate B. Even though their blunder caused them to declare candidate A the winner, they still tried to cover up their mistake.
Our conclusion was that the attempt to cover up the mistake was motivated by not wanting to admit the major blunder rather than by candidate A's influence. This conclusion was reached in part because of messages that we recovered on another system that was not wiped. However we would have never been able to find that other system, nor would we have been able to match the raw slot numbers with the reported vote counts by candidate name without the help of the data recovery consultant and the critical data that they recovered.
I'll offer a few observations:
- Volatile storage such as SRAM and DRAM is not as volatile as you might think.
- With enough will, skill and effort, old data can be recovered from a disk that has been overwritten multiple times.
- Packages such as PGP file wipe, GNU shred or Boot and Nuke are likely to only make it harder, but not impossible to recover the data.
- To quote from a paper by Peter Gutmann:
'' Data which is overwritten an arbitrarily large number of times can still be recovered provided that the new data isn't written to the same location as the original data (for magnetic media), or that the recovery attempt is carried out fairly soon after the new data was written (for RAM). For this reason it is effectively impossible to sanitise storage locations by simple (sic) overwriting them, no matter how many overwrite passes are made or what data patterns are written.''
And even though that paper goes on to say:
'' However by using the relatively simple methods presented in this paper the task of an attacker can be made significantly more difficult, if not prohibitively expensive.''
for our consultant, the recovery process was hard but not extremely difficult. It was expensive for us, however. :-( But we were happy to pay to have it done. :-)
- Whoever wrote the 35-pass disk wipe tool must have read that paper, or one similar to it, because the overwrite patterns looked similar to the recommended list.
P.S. I know that some people doubt that one can obtain old data from SRAM and DRAM after poweroff. I did too until it was done for our group. To those who still doubt this: I will refer you to Peter Gutmann's paper on Secure Deletion of Data from Magnetic and Solid-State Memory for another source on data recovery methods.
-
Re:Data Layers debunked
According to various reports (scientific papers, etc. not stories) it is quite possible to recover multiple generations of data from harddrives.
Yes, I'm familiar with some of those, starting with Gutmann's now-ancient 1996 paper Secure Deletion of Data from Magnetic and Solid-State Memory. The OP's sentence that I was responding to was "Theoretically anything that has previously been on the drive should be recoverable through such methods." But it's nowhere near as simple or as "reliable" as that. Besides, I haven't seen any papers in the last few years that talk about doing this with today's drive capacities. Gutmann's paper talks about the more advanced drives at the time as being easier to securely erase:
The latest high-density drives use methods like Partial-Response Maximum-Likelihood (PRML) encoding [...] Since PRML codes don't try to separate peaks in the same way that non-PRML RLL codes do, all we can do is to write a variety of random patterns because the processing inside the drive is too complex to second-guess. Fortunately, these drives push the limits of the magnetic media much more than older drives ever did by encoding data with much smaller magnetic domains, closer to the physical capacity of the magnetic media (the current state of the art in PRML drives has a track density of around 6700 TPI (tracks per inch) and a data recording density of 170 kFCI, nearly double that of the nearest (1,7) RLL equivalent. A convenient side-effect of these very high recording densities is that a written transition may experience the write field cycles for successive transitions, especially at the track edges where the field distribution is much broader [15]. Since this is also where remnant data is most likely to be found, this can only help in reducing the recoverability of the data). If these drives require sophisticated signal processing just to read the most recently written data, reading overwritten layers is also correspondingly more difficult. A good scrubbing with random data will do about as well as can be expected.
In addition, remember that many parts of a disk undergo a *lot* of reading and writing of different bit patterns. Recovering a prior generation of data may in fact mean recovering what was written at a particular spot thousands of writes ago. That's just not always possible.
And even when it is, it can be guarded against, as I alluded to in my post. The thrust of the abovementioned paper, in fact, is how to delete data so that it can't be recovered, even with the use of advanced techniques.
In short, the notion of realistically recovering data that's been properly erased - not just by an OS-level format - even with hundreds of thousands of dollars at your disposal, is more of a myth than anything else. It's a possibility for security wonks to scare each other with and try to guard against, not something that's happening in practice. Companies that do professional recovery don't even remotely get into this kind of thing, for example, and they're the ones who might have the financial incentive to do so.
-
Re:Oh, man. Hear it comes.
Don't forget degaussing. Someone is going to have to make the obligatory link to Secure Deletion of Data from Magnetic and Solid-State Memory, so there it is.
-
Re:PGP!
Also, what's the one-line unix command (running MacOS X here).
There's dd, of course, but there's also shred, which is included in GNU's fileutils package that also includes stuff like chmod, so if you're running a system populated with GNU tools you probably already have it. (Don't know about OSX, since it's supposed to be BSD-based.)
From the man page:
DESCRIPTION
Overwrite the specified FILE(s) repeatedly, in order to make it harder
for even very expensive hardware probing to recover the data.
From the info page:
This uses many overwrite passes, with the data patterns chosen to
maximize the damage they do to the old data. While this will work on
floppies, the patterns are designed for best effect on hard drives.
For more details, see the source code and Peter Gutmann's paper `Secure
Deletion of Data from Magnetic and Solid-State Memory', from the
proceedings of the Sixth USENIX Security Symposium (San Jose,
California, 22-25 July, 1996). The paper is also available online.
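For anyone who wants to see the basic idea in code, here is a minimal Python sketch of a multi-pass overwrite. It is not how shred is implemented and it does not use the patterns from Gutmann's paper; it only writes random bytes over the file a few times. The function names (overwrite_file, overwrite_and_remove) and the default of 3 passes are assumptions made for illustration, and the sketch only helps if the filesystem really overwrites data in place.

```python
import os
import tempfile


def overwrite_file(path, passes=3, chunk_size=64 * 1024):
    """Overwrite an existing file in place with random bytes, `passes` times.

    Illustrative only: this assumes the filesystem writes the new bytes
    over the old ones, which is not guaranteed everywhere.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device


def overwrite_and_remove(path, passes=3):
    """Overwrite the file, then unlink it."""
    overwrite_file(path, passes)
    os.remove(path)


if __name__ == "__main__":
    # Demonstrate on a throwaway temporary file.
    fd, demo_path = tempfile.mkstemp()
    os.write(fd, b"secret" * 1000)
    os.close(fd)
    overwrite_and_remove(demo_path)
    print("still exists:", os.path.exists(demo_path))
```

The fsync call after each pass is there so that successive passes are not simply merged in the OS cache before anything reaches the disk.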
-
Not really deleted
It will be interesting to see just how much the AG wants those logs. It is very hard to really delete things. See this paper to find out just how hard it is.
-
Re:Godel, Omega, Algorithmic Complexity Theory???
Your analogy is incorrect.
Godel's theorem shows that if a set of axioms and an algorithm for proving theorems from these axioms are chosen (i.e. a formal axiomatic system is constructed), this system is either capable of proving a result which is false, or incapable of proving a result which is true. The proof of Godel's theorem does not proceed within the axiomatic system chosen, but constructs a statement and then appeals to the reader to _observe_ that the axiomatic system is either incomplete or incorrect. There is no way of proving Godel's theorem correct from within the axiomatic system under consideration as it will either consider the Godel statement unassailably true (incorrectly) or have nothing at all to say about it.
It has been shown that simply choosing a second formal system with which to analyse the first does not eliminate the problem as the second system will have its own Godel statement, with the same attendant difficulties.
The point is that an informed reader can "see" the validity of Godel's theorem, but it is impossible to prove it in its most general form within a given axiomatic system.
Since any algorithm is equivalent to an axiomatic system (I think Church proved this one), the mathematician is capable of doing something which an algorithm absolutely cannot.
I agree that Roger Penrose shot himself in the foot by pissing everyone off with his arrogance and going off on his quantum gravity tangent, but this argument does carry weight and is a hell of a lot more sophisticated and difficult to contradict than your simplified version. It may well be that the saddest thing about Penrose' fall from grace is that this argument has been unfairly dismissed by a large number of people who simply do not understand it.
As far as Chaitin and Omega are concerned, the results show Godel incompleteness to be not simply a matter of a single obscure statement within each formal system, but a far more general problem of how much complexity a formal system can exhibit. Regardless of whether you agree with the argument above, you should read this. (He has nothing to say about AI, etc. Though I have a feeling that might be because for a mathematician that might mean instant loss of credibility).
Incidentally, in claiming to be the same as the Godel-based argument, your simplified version assumes that Roger Penrose' mind is equivalent to a computer. Therefore to use your simplified argument to argue for that very proposition is circular.
-
not only are you paranoid...
you are wasting your time. Equipment to recover these files costs on the order of US $2000... you're not stopping many people, huh? When you really want them gone, you should burn your hard disk platters until very little remains and scatter the ashes over the sea.
-
More large sites in New Zealand
Massey University just announced that it is going to build a 128 node Beowulf cluster (no imagination necessary!). Auckland University have recently got an IBM Regatta class machine.
Just a (quite impressive) stone's throw away from Weta is NIWA's Cray T3E
bash-2.03$ uname -a
sn6908 kupe 2.0.5.51 unicosmk CRAY T3E
I love running that uname :-)
-
Secure Deletion of Data
I'd be interested to hear what Lee Tydalska has to say about secure deletion of data (i.e. how can you be sure you have destroyed data on a harddrive/cd-rom/floppy/etc). Peter Gutmann wrote a paper on how to destroy data. In the paper, he argues that by overwriting your harddrive multiple times with highly sophisticated patterns, it will be almost impossible to recover the data. I wonder if industry people agree with him.
-
This was my final year project thesis
This was my final year project thesis. Just remember the golden rule: unstructured 2 structured == convert 2 XML. I wrote a [very bad] program in C++/Perl/tcsh IPC=pipes to add XML tags to English, and then index them into a search engine which would use the lingual data stored in the XML tags to help the search.
NIST does a MASSIVE competition on this annually. I don't want to be an XML-buzzword whore <Arnold Schwarzenegger accent> (XML commando eats Green berets, C++, Java, Perl, COBOL for breakfast)</Arnold Schwarzenegger accent> but you can't beat XML for easily converting anything that you can make sense out of into computer readable format. Real h3cKoRs use SGML, but us underlings have to stick with things we can understand like XML. As for expandability, if we want to encode something else into the document, then just tag-it-and-go
It took me 200 hours to fish out all these links (before the Google days), I don't want anyone to have to waste as much time as I did feeding the search engines exotic foods. It's a year old so pardon me for the odd broken link, armed with these you could probably turn jello into XML ;-)
My favourite bookmarx
PROJect[21 links]
Beginners' Guide[13 links]
Berkeley Linguistics Dept. Course Summaries, general stuff
Cryptic IR Vocabulary defined
Explanations of weird words like hypernym
How do we produce and understand speech
How Inverted Files are Created - University of Berkeley
NLP Univ. of Indiana, very good basics e.g. word sense d
Simple language - useful....
What is Natural Language Processing, links
What is POS tagging........
Word Sense Disambiguation defined
Word Sense Disambiguation in detail, scroll down far
Word Sense Disambiguator - LOLITA (tested at MUC-7 and SENSEVAL competition as best)
XML for the absolute beginner
HTML, XML stuff + parsers[19 links]
Apache plug-in that uhhh does stuff with XML
Convert COM to XML
convert XML, HTML to Unix pipeable formats
converters to and from HTML
expat XML parser
HTML Tidy - converts HTML 2 XML + source code!!
Parse DB (RDBMS, whatever) to XML
Perl-XML Module List
PHP Manual XML parser functions - what the hell are they talking about, PHP Virtual M...
Public SGML-XML Software
Pyxie - XML Processor for Python, Perl, etc.
SGML+XML tools.org
The XML Resource Centre - massive number of links
W4F wrapper - wrapper converts XML to HTML
XFlat - convert flat file into XML
XML Parsers and other XML stuff
XML.com - Parsers, etc.
XML-Data Catalog System - uhhhh looks close
XTAL's general converter - convert anything 2 XML
other Background[8 links]
Is Linux ready for the Enterprise, scalable...
Linux reliability
Linux Versus Windows NT, Mark(sysinternals bloke)
PC reliability (pcworld)
SPEC - Standard Performance Evaluation Corp.
Systems benchmarks
TPC - Transaction Processing Performance Council
Unix Beats Back NT In EDA Workstation Arena
Proper TREC(-8) QA systems[2 links]
pg. 387 LIMSI-CNRS pretty deep parsing[2 links]
More links....
NLP, IR links - lots to corpii, etc.
pg. 575 U. of Ottawa and NRL (shit system, got 0%)[1 links]
LAKE Lab
pg. 607! University of Sheffield (crap system, but OPEN SOURCE!)[2 links]
GATE - FREE IE app w`source code
LaSIE - ER, coreference, template (cv)
pg. 617 Univ of Surrey (inconclusive matches)[2 links]
System Quirk - Or is this their search system..... Hmmmmmm
Univ of Surrey - pointers (hopefully this is their WILDER search system...)
SMU - Pg. 65[1 links]
Natural Language Processing Laboratory at SMU
Textract[2 links]
Cymfony - Technology
Textract - State of the Art Information Extraction
Xerox uhhhhh maybe[1 links]
Xerox Palo Alto Research Center
(OVERVIEW) 1999 TREC-8 Q&A Track Home Page
NLP bloke, Univ Sussex
Tcl-Tk[4 links]
Tcl tutorial
Tcl-Tk Contributed Programs Index
Tcl-Tk Resources, sources
TclXML - manipulating XML using Tcl-Tk
Artificial Natural Language - Is this what I'm trying to parse into...
Comparison of Indexers - Prise vs. Inquery vs. MG, etc.
Eagles - Language Engineering Standards
Language Technology Group - lots of modules!
LDC - Linguistic Data Consortium, lots of corpora
Lexical Resources
Links 2 resources, indexers.....
Lots of IR stuff, University of uhhh
Managing Gigabytes Indexer
Managing Gigabytes Manuals and stuff
Htdig search system
NLP & IR (NLPIR, NIST) Group
OVERVIEW OF MUC-7-MET-2
Perl XML Indexing - XML search engine type thing
Phrasys Language Processing Software Components (money)
QA HCI bullshit
SIGIR - TREC-type thing, resources
SMART indexer system documentation
Text REtrieval Conference (TREC) Home Page
The Natural Language Software Registry
Thunderstone IE and IR products
WordNet - FREE DOWNLOADABLE lexical English database
Page created with URL+, nice utility for working with internet shortcuts
-
Further Information
Looking just at the aspects of data deletion on the hard disk (i.e. ignoring the problems arising when data is transmitted to other computers), the problems of irretrievably deleting data have long been known. Most filesystems' delete commands are, of course, trivially insecure, since, at most, they make a note that the disk sectors containing the file are no longer allocated. Even overwriting the data multiple times may not be sufficient. I believe Peter Gutmann's 1996 Usenix paper, Secure Deletion of Data from Magnetic and Solid-State Memory, is still one of the (if not the) authoritative references on the subject. Briefly, when a bit is altered on a disk, the previous bits leave their imprints on the new bit, and it is possible to look back through the layers of deletion for data. Furthermore, it is possible to do this (to a limited degree, but even so...) with relatively inexpensive equipment.
Gutmann then goes on to derive a set of patterns that are optimal for rendering deleted data irretrievable. GNU shred (part of the GNU fileutils) uses these patterns and is the recommended tool for secure deletion in a Unix environment.
Note, however, that shred has some limitations in that it assumes that, when writing data to a file, it is overwriting the old data. The info node notes that this is not the case for some filesystems, including some journaling filesystems. Also, modern hard drives may remap drive sectors on the fly if those sectors begin to fail, leaving the possibility for data to remain in the swapped-out sectors. The safest method is, as usual, complete destruction of the drive.
--Phil (Me? Paranoid? Why do you ask?)
-
Re:When you're done with a big job, always wipe.
Using random data as opposed to zeroes is more secure because writing zeroes may leave a readable residual magnetic signature on the media whereas random data tends to obscure the mag sig.
Actually, this is only a partial solution. Because of little movements in the read/write heads, you actually have to write a one, then a zero, then a one... and so on, depending on how securely you want to wipe out the data. It's the flipping of the polarisation of the little bits of oxide back and forth that actually wipes it out... anything else will, as you say, leave a residual magnetic signature which is recoverable with an oscilloscope and very fine motor control (still not easy though!).
For more information, see: Secure Deletion
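As a rough illustration of the alternating idea, here is a small Python sketch that builds a schedule of all-ones and all-zeros fill passes followed by a random pass, and applies it to an in-memory buffer. The helper names (pass_patterns, apply_passes) and the pass counts are made up for this example; it works on a bytearray only, says nothing about what physically happens on a platter, and is not the pattern list from the paper.

```python
import os


def pass_patterns(alternations=4, random_passes=1):
    """Return the fill byte for each pass: 0xFF, 0x00, 0xFF, ... then None.

    None stands for a pass of random data. The counts are arbitrary
    illustration values, not a recommendation.
    """
    patterns = [0xFF if i % 2 == 0 else 0x00 for i in range(alternations)]
    return patterns + [None] * random_passes


def apply_passes(buf, patterns):
    """Apply each pass in order to a bytearray, in place, and return it."""
    for p in patterns:
        if p is None:
            buf[:] = os.urandom(len(buf))
        else:
            buf[:] = bytes([p]) * len(buf)
    return buf


if __name__ == "__main__":
    data = bytearray(b"old data")
    apply_passes(data, pass_patterns())
    print(len(data), "bytes overwritten")
```

Keeping the schedule separate from the code that applies it makes it easy to swap in a different list of patterns.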
-
Re:All you need to do is...
Apparently, this isn't 100% effective:
Contrary to conventional wisdom, "volatile" semiconductor memory does not entirely lose its contents when power is removed. Both static (SRAM) and dynamic (DRAM) memory retains some information on the data stored in it while power was still applied. SRAM is particularly susceptible to this problem, as storing the same data in it over a long period of time has the effect of altering the preferred power-up state to the state which was stored when power was removed. Older SRAM chips could often "remember" the previously held state for several days. In fact, it is possible to manufacture SRAM's which always have a certain state on power-up, but which can be overwritten later on - a kind of "writeable ROM".
This is from Peter Gutmann's paper Secure Deletion of Data from Magnetic and Solid-State Memory
-
Who Cares!?
What about Who needs any other crypto solution?
-
Re:domino?
ARRGGGGGH ! Don't Do it !!!!!!!
Our facility was recently sold to a company that uses Lotus Notes/Domino. We had been running M$ Outlook/Exchange. I used to complain bitterly about Outlook/Exchange (and I was the e-mail admin), but after fighting with Notes/Domino for 3 months, I'd give my left nut to go back to Outlook/Exchange. Notes/Domino supports even fewer standards than M$ (Can you say "non-RFC822 compliant"?). At least running Exchange I could use a POP3 client to get my mail. I don't even have that choice with Domino. It's a memory hog, slow, and bug ridden.
Plus the fact that the corporation uses its "database" capability (Think Filemaker Pro v 2.0-3.0) for just about everything, which forces you to use the Notes client. A programmer I work with has pretty much either implemented or found Apache/PHP/MySQL equivalent solutions for all of them that are faster, more reliable, and easier to support.
I agree 100% with Peter Gutmann's assessment:
Notes Spotting
"Choose no life, Choose Lotus Notes"
-Neil
-
University of Auckland...
...Computer Science department has pretty much everything on the web.
link.
-
A paper on handling IIS in a secure manner:
The paper is here.
It's more involved than you might think. If you are a sysadmin, this might be important for your job security.