Researchers Expanding Diff, Grep Unix Tools
itwbennett writes "At the Usenix Large Installation System Administration (LISA) conference being held this week in Boston, two Dartmouth computer scientists presented variants of the grep and diff Unix command line utilities that can handle more complex types of data. The new programs, called Context-Free Grep and Hierarchical Diff, will provide the ability to parse blocks of data rather than single lines. The research has been funded in part by Google and the U.S. Energy Department."
I wonder what interest those two have in this.
-dZ.
Carol vs. Ghost
With these tools, you could make grep and diff work with binary files in a meaningful way - very useful at times. I bet you could even adapt the "Context-Free Grep" into a sort of packet sniffer with enough work. I'd sure like to try these new programs sometime.
They have HUGE amounts of data kicking around from various simulations/experiments.
Check out the Wikipedia article on supercomputers, and you'll see the DOE mentioned.
Tools like this could help with analysis and with finding certain data sets. IIRC, regexes are already used in DNA sequencing. There is probably a similar use for tools like this with their data.
But having to use quotes every time you call a command is a sure way to guarantee that your command never gets used.
Would you rather type this:
./"Context-Free Grep" ...
or this:
./cfgrep ..
As soon as I see "Context-Free Grep", I immediately think of a Context Free Grammar.
That basically implies much more sophisticated rules that match structural elements the way a compiler's parser does, which means that in theory you could do greps that take into account structures more complex than a flat file of lines.
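To make that concrete, here is a rough sketch of a block-aware grep. It is not how the Dartmouth tool works (the summary gives no implementation details); it just assumes a made-up config format with brace-delimited blocks. The names `brace_blocks`, `grep_blocks`, and `CONFIG` are invented for illustration. The depth counter is the part a classic regular expression cannot express for arbitrarily deep nesting.

```python
import re


def brace_blocks(text):
    """Yield each outermost '{...}' block, including the line that opens it."""
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                # Back up to the start of the line so the block header is kept.
                start = text.rfind("\n", 0, i) + 1
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                yield text[start:i + 1]


def grep_blocks(pattern, text):
    """Return whole outermost blocks that contain a match, not single lines."""
    rx = re.compile(pattern)
    return [block for block in brace_blocks(text) if rx.search(block)]


# Hypothetical config, assumed only for this example.
CONFIG = """\
interface eth0 {
    address 10.0.0.1
    options { mtu 1500 }
}
interface eth1 {
    address 10.0.0.2
    options { mtu 9000 }
}
"""

for block in grep_blocks(r"mtu 9000", CONFIG):
    print(block)
```

A plain line-oriented grep for "mtu 9000" would print only the options line; this prints the entire eth1 block, so you can see which interface the setting belongs to.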
Grep and diff tools that can be made aware of the larger structure of documents potentially have a lot of uses. Anybody who has had to maintain structured config files across multiple environments has likely wished for this before.
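For the config-file case, a structure-aware diff might look something like the sketch below. It assumes the configs have already been parsed into nested dicts with string keys (say, from JSON or YAML), and the function name `hier_diff` and the sample data are invented here; it is not the Hierarchical Diff tool from the talk.

```python
def hier_diff(old, new, path=()):
    """Return (path, change, old_value, new_value) tuples for two nested dicts."""
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        # Walk the union of keys so additions and removals are both reported.
        for key in sorted(old.keys() | new.keys()):
            here = path + (key,)
            if key not in new:
                changes.append((here, "removed", old[key], None))
            elif key not in old:
                changes.append((here, "added", None, new[key]))
            else:
                changes.extend(hier_diff(old[key], new[key], here))
    elif old != new:
        # Leaf values (or a dict replaced by a non-dict) that differ.
        changes.append((path, "changed", old, new))
    return changes


# Hypothetical environments, assumed only for this example.
staging = {"db": {"host": "db.staging", "pool": 5}, "cache": {"ttl": 60}}
production = {
    "db": {"host": "db.prod", "pool": 5},
    "cache": {"ttl": 60},
    "debug": False,
}

for path, change, before, after in hier_diff(staging, production):
    print("/".join(path), change, before, after)
```

The point is that changes are reported by their location in the hierarchy (db/host) rather than by line number, so reordering keys or reindenting the file produces no noise.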
Sounds really cool.
Lost at C:>. Found at C.
I know I'll be modded down, but I have to say it: what they describe is already available in PowerShell, where objects can be piped into search and filter functions.
lucm, indeed.
Alas, history and lots of shell scripts have probably made existing command names unchangeable. History in this case goes back to the time when people got RSI from ASR-33 Teletypes and didn't want to type very much, which produced names that make sense only if you know other programs (in ed, "g/re/p" prints all lines matching the regular expression "re", hence the name "grep").
That said, we programmers are users of programming languages as much as Joe Sixpack is a user of the desktop, and surely we deserve good design as much as he does, so we can get things done rather than take perverse pride in mastering needlessly ghastly syntax.