Slashdot Mirror


Researchers Expanding Diff, Grep Unix Tools

itwbennett writes "At the Usenix Large Installation System Administration (LISA) conference being held this week in Boston, two Dartmouth computer scientists presented variants of the grep and diff Unix command line utilities that can handle more complex types of data. The new programs, called Context-Free Grep and Hierarchical Diff, will provide the ability to parse blocks of data rather than single lines. The research has been funded in part by Google and the U.S. Energy Department."

8 of 276 comments

  1. Follow the money...? by dzfoo · · Score: 1, Interesting

    funded in part by Google and the U.S. Energy Department

    I wonder what interest these two have in this.

              -dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
    1. Re:Follow the money...? by bobaferret · · Score: 3, Interesting

      I wouldn't call it a cancer. But it's definitely useful if you don't ever want commercial companies to use your code in public. It matches up well with the open core model. Commercial people will only use it if you can give them a differently licensed copy of the code. Apache, MIT, and BSD are great if you truly want to give your code away and don't care what people do with it behind closed doors. AGPL is nice for making sure people always give back. LGPL and GPL are nice if you only want them to give back if they change it. Whether people should pay, and how much, is an age-old question. I have to balance the cost of support and development against the cost of the product. The more I lean on the community, the less I can charge and the more exposure I get. In the other direction I get more money, but have to spend more of it. And there is no one-size-fits-all solution to any of this.

  2. Interesting... by DangerOnTheRanger · · Score: 3, Interesting

    With these tools, you could make grep and diff work with binary files in a meaningful way - very useful at times. I bet you could even adapt the "Context-Free Grep" into a sort of packet sniffer with enough work. I'd sure like to try these new programs sometime.

  3. Re:DOE?????? by iced_tea · · Score: 3, Interesting

    They have HUGE amounts of data kicking around from various simulations/experiments.

    Check out the Wikipedia article for supercomputers, and you'll see DOE mentioned.

    Tools like this could help with analysis and finding certain data sets. IIRC, regexes are already used in DNA sequencing. There is probably a similar application for tools like this with their data.

  4. Re:Strange names by adonoman · · Score: 3, Interesting

    But having to use quotes every time you call a command is a sure way to guarantee your command is never used.

    Would you rather type this:
    ./"Context-Free Grep" ...
    or this:
    ./cfgrep ...

  5. Ooooh! by gstoddart · · Score: 3, Interesting

    As soon as I see "Context-Free Grep", I immediately think of a Context Free Grammar.

    That basically implies we can have much more sophisticated rules that match other structural elements the way a language compiler does. Which means that in theory you could do greps that take into account structures a little more complex than just a flat file.
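To make the idea concrete, here is a minimal Python sketch of a block-aware grep. It is not the Dartmouth tool: the brace-delimited config syntax, the function names `iter_blocks` and `grep_blocks`, and the sample data are all assumptions made up for illustration. The point is only that tracking nesting depth lets a match return a whole balanced block instead of a single line.

```python
import re


def iter_blocks(text):
    """Yield each top-level `header { ... }` block as one string.

    Nesting is tracked with a depth counter, so inner blocks stay inside
    their parent. Assumes braces never appear inside strings or comments.
    """
    depth = 0
    start = 0
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                # Back up to the start of the line holding the block header.
                start = text.rfind("\n", 0, i) + 1
            depth += 1
        elif ch == "}":
            if depth == 0:
                raise ValueError(f"unbalanced closing brace at offset {i}")
            depth -= 1
            if depth == 0:
                yield text[start:i + 1]
    if depth != 0:
        raise ValueError("unbalanced opening brace")


def grep_blocks(pattern, text):
    """Return every top-level block whose text matches `pattern`."""
    rx = re.compile(pattern)
    return [block for block in iter_blocks(text) if rx.search(block)]


# Hypothetical config used only for this example.
SAMPLE = """\
interface eth0 {
    address 10.0.0.1
    vlan 10 {
        mtu 1500
    }
}
interface eth1 {
    address 10.0.0.2
}
"""

for block in grep_blocks(r"mtu 1500", SAMPLE):
    print(block)
```

Here the search for `mtu 1500` prints the whole `interface eth0` block, including the nested `vlan 10` section, which a line-oriented grep would not give you.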

    Grep and diff that can be made aware of the larger structure of documents potentially have a lot of uses. Anybody who has had to maintain structured config files across multiple environments has likely wished for this before.
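As a rough illustration of the config-file case, the sketch below diffs two nested structures by walking them key by key instead of comparing lines. It is only a guess at the general flavor of a hierarchical diff, not the researchers' algorithm; the function name `hdiff`, the tuple format, and the `staging`/`prod` data are hypothetical, and it assumes configs already parsed into nested dicts with string keys.

```python
def hdiff(old, new, path=()):
    """Return a list of (path, change, old_value, new_value) tuples.

    Dicts are compared key by key, recursively; anything else is
    compared by value. `path` is the tuple of keys leading to the node.
    """
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            sub = path + (key,)
            if key not in new:
                changes.append((sub, "removed", old[key], None))
            elif key not in old:
                changes.append((sub, "added", None, new[key]))
            else:
                changes.extend(hdiff(old[key], new[key], sub))
    elif old != new:
        changes.append((path, "changed", old, new))
    return changes


# Hypothetical configs for two environments.
staging = {"db": {"host": "db1", "port": 5432}, "cache": {"ttl": 60}}
prod = {"db": {"host": "db2", "port": 5432}, "log": {"level": "warn"}}

for path, change, before, after in hdiff(staging, prod):
    print("/".join(path), change, before, after)
```

Reordering keys or reindenting the file would not show up as a change here, which is the main attraction over a line-based diff for this kind of data.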

    Sounds really cool.

    --
    Lost at C:>. Found at C.
  6. Microsoft Ad by lucm · · Score: 3, Interesting

    I know I'll be modded down, but I have to say it: what they describe is already available in PowerShell, where objects can be piped into search/filter functions.

    --
    lucm, indeed.
  7. Re:Strange names by jejones · · Score: 3, Interesting

    Alas, history and lots of shell scripts have probably made existing command names unchangeable. History in this case goes back to the time when people got RSI from ASR-33 Teletypes and didn't want to type very much, which gave us names that make sense only if you know other programs (in ed, "g/re/p" prints all lines matching the regular expression re, hence the name "grep").
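For anyone who has not used ed, the behavior behind the name is small enough to sketch in a few lines of Python. The function name `g_re_p` is made up for this example, and Python's `re` syntax is not the same as ed's regular expressions, so treat it as an approximation of the idea: go over every line globally, test it against the expression, print the ones that match.

```python
import re


def g_re_p(regex, lines):
    """Globally search `lines` for `regex` and return the matching lines."""
    rx = re.compile(regex)
    return [line for line in lines if rx.search(line)]


# Made-up input lines for illustration.
lines = ["grep", "diff", "egrap x"]
for line in g_re_p(r"gr[ae]p", lines):
    print(line)
```

Each line is judged on its own, which is exactly the limitation the block-oriented tools in the story are meant to get around.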

    That said, we programmers are users of programming languages as much as Joe Sixpack is a user of the desktop, and surely we deserve good design as much as they do, so we can get things done rather than taking perverse pride in mastering needlessly ghastly syntax.