Slashdot Mirror


Researchers Expanding Diff, Grep Unix Tools

itwbennett writes "At the Usenix Large Installation System Administration (LISA) conference being held this week in Boston, two Dartmouth computer scientists presented variants of the grep and diff Unix command line utilities that can handle more complex types of data. The new programs, called Context-Free Grep and Hierarchical Diff, will provide the ability to parse blocks of data rather than single lines. The research has been funded in part by Google and the U.S. Energy Department."

276 comments

  1. Strange names by gnasher719 · · Score: 4, Funny

    Space characters in the name of a Unix command line tool is asking for trouble.

    1. Re:Strange names by bobdinkel · · Score: 0

      There's nothing that says the name of the tool and the command you type must be the same. I wouldn't sweat it.

      --
      A publicly traded company exists solely to make profits for shareholders.
    2. Re:Strange names by realyendor · · Score: 4, Insightful

      I expect those are just the spoken names and that the commands will still be single words, similar to:
      "GNU awk" -> gawk
      "enhanced grep" -> egrep

    3. Re:Strange names by dougmc · · Score: 1

      "enhanced grep" -> egrep

      Well, except that egrep is already taken :)

      But yeah, your point is valid and probably correct.

    4. Re:Strange names by dougmc · · Score: 2

      and I really should spend a few more seconds thinking about what I'm responding to. Obviously gawk and egrep are existing tools, given as examples, not proposed names for these new tools.

    5. Re:Strange names by pclminion · · Score: 1

      If the FS supports spaces in filenames, then you have broken code if you can't tolerate it. MS wisely put a space in the "Program Files" name when they added long filenames to Windows. That'll put any delusions about being able to ignore it to a direct immediate stop.

    6. Re:Strange names by rwa2 · · Score: 2

      Yay, a tools thread!

      I am liking meld (python-based visual diff)

      But I suppose they have a different concept of hierarchical diff than diffing/merging two directory structures.

    7. Re:Strange names by Anonymous Coward · · Score: 0

      Try reading the post you replied to.

    8. Re:Strange names by adonoman · · Score: 3, Interesting

      But having to use quotes every time you call a command is a sure way to make sure your command is never used.

      Would you rather type this:
      ./"Context-Free Grep" ...
      or this:
      ./cfgrep ..

    9. Re:Strange names by ivoras · · Score: 2

      But of course, "eegrep" isn't :)

      (enhanced enhaced grep)

      --
      -- Sig down
    10. Re:Strange names by jandrese · · Score: 1

      Ironically, many of Microsoft's tools have trouble dealing with the space in the filename, including the blasted Run window.

      Just because there is a way to make it work doesn't mean there isn't a problem with it. All unix shells can handle spaces in filenames, but the methods to do so are not always intuitive and it's easy to mess up things like shell scripts. Even the "proper" solutions have problems.

      And I can't stand "Program Files", what a mess that has been.

      --

      I read the internet for the articles.
    11. Re:Strange names by ripler · · Score: 4, Funny

      Next thing you know we'll have CSIgrep. (enhance enhance enhance grep)

    12. Re:Strange names by sys_mast · · Score: 1

      I was going to say ./cgrep but your suggestion is better since it won't be confused with "Context Grep" Which would imply it is NOT Context free.

      so is the other command ./hdiff ?

      --
      Those who can, do.
    13. Re:Strange names by Longjmp · · Score: 4, Insightful

      Definitely
      I mean, where would we end up if unix commands actually give a hint what they are doing ;-)
      As a unix novice, if I wanted to search for something, my first choice of course would be grep
      Also if I wanted help on something, the first word that jumps to my mind would be man

      heh.

      --
      There are fewer illiterates than people who can't read.
    14. Re:Strange names by mytec · · Score: 4, Informative

      According to this paper, they are called bgrep and bdiff.

    15. Re:Strange names by Anonymous Coward · · Score: 1

      MS wisely put a space in the "Program Files" name when they added long filenames to Windows.

      You mean the PROGRA~1 directory?

    16. Re:Strange names by EdIII · · Score: 3, Informative

      and I really should spend a few more seconds thinking about what I'm responding to

      That's not what Slashdot is about........

    17. Re:Strange names by Anne Thwacks · · Score: 4, Funny

      CSIgrep would take 30 mins to get the result! (With ad breaks)

      --
      Sent from my ASR33 using ASCII
    18. Re:Strange names by iluvcapra · · Score: 4, Insightful

      If you don't like a tool's name, export an alias.

      It's not about typing commands as much as it's about making these work:

      $ find . -name "*.txt" | xargs wc
      $ for file in $*; do
      mv $file old/$file
      done

      Versus these:

      $ find . -name "*.txt" -print0 | xargs -0 wc
      $ for file in $*; do
      mv "$file" "old/$file"
      done

      A lot of scripts you run into are just broken because of braindead assumptions.
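
For illustration, here is a minimal sketch of the NUL-delimited pipeline, wrapped in a hypothetical helper called count_txt_lines (the name and the line-counting task are assumptions, not part of the post). It relies on find -print0 and xargs -0, which the GNU and BSD tools provide:

```shell
# count_txt_lines DIR: total number of lines in every *.txt file under DIR.
# Hypothetical helper, for illustration only.
count_txt_lines() {
    # -print0 and -0 separate file names with NUL bytes, so spaces or
    # newlines inside a name cannot split it into several arguments.
    find "$1" -type f -name '*.txt' -print0 | xargs -0 cat | wc -l | tr -d ' '
}
```

With the plain "find | xargs" form, a file such as "a b.txt" would reach the command as two separate arguments, "a" and "b.txt".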

      --
      Don't blame me, I voted for Baltar.
    19. Re:Strange names by gangien · · Score: 2

      in scripts, i pretty much quote everything. seems to be the way to avoid problems. of course, i'm not a sysadmin by trade, so maybe it's bad for some reason or something.

      when at the prompt i hit tab.

      We'd probably avoid a lot of problems, if people wouldn't be so lazy to not type a few extra characters.

    20. Re:Strange names by toadlife · · Score: 4, Funny

      "I have only been able to come up with one algorithm for creating Unix command names: think of a good English word to describe what you want to do, then think of an obscure near- or partial-synonym, throw away all the vowels, arbitrarily shorten what's left, and then, finally, as a sop to the literate programmer, maybe reinsert one of the missing vowels."

      Rachel Padman

      --
      I don't always use unix-like operating systems; but when I do, I prefer FreeBSD.
    21. Re:Strange names by berashith · · Score: 1

      yes, but it is nice to know that all of your expectations for the first 26 minutes are incorrect.

    22. Re:Strange names by Anonymous Coward · · Score: 0

      grep++

    23. Re:Strange names by Noughmad · · Score: 1

      Just wait until Microsoft sees your post and we'll have eeegrep.

      --
      PlusFive Slashdot reader for Android. Can post comments.
    24. Re:Strange names by Tomato42 · · Score: 1

      sign me in if it will search a 1TB data set in those 30min!

    25. Re:Strange names by Anonymous Coward · · Score: 1

      Actually, your "correct" code is also broken. It should read:

      for file in "$@"; do
          mv "$file" "old/$file"
      done

      You see, $* expands to string, not list.
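
A small sketch of the difference, assuming a POSIX shell with the default IFS. The helper names count_args and demo are made up for the illustration. Strictly speaking, unquoted $* is re-split on whitespace, quoted "$*" becomes one string, and only "$@" keeps each argument intact:

```shell
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# demo prints three counts for its own arguments:
# unquoted $*, quoted "$*", quoted "$@".
demo() {
    # Unquoted $*: every argument is split again on whitespace.
    unquoted=$(count_args $*)
    # Quoted "$*": all arguments are joined into a single string.
    joined=$(count_args "$*")
    # Quoted "$@": each original argument is passed through unchanged.
    preserved=$(count_args "$@")
    echo "$unquoted $joined $preserved"
}

demo "my file.txt" "notes.txt"   # prints: 3 1 2
```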

    26. Re:Strange names by Anonymous Coward · · Score: 0

      add an alias in your .kshrc and then call it what you want....

    27. Re:Strange names by Anonymous Coward · · Score: 0

      MS wisely put a space in the "Program Files" name when they added long filenames to Windows.

      And in Swedish Windows XP that dir was called "Program" (you'd end up having both the Swedish dir and "Program Files", since many programs didn't use any kind of $AppInstallPath variable but hard-coded "Program Files"). I've seen many an XP boot always opening the "Program" folder. I'm thinking it was some daemon or something-ware that was supposed to open something in "Program Files" but opened "Program" in Windows Explorer when it got to the space...

    28. Re:Strange names by marcosdumay · · Score: 1

      No, those 30min is per bit.

      But you'd be surprised by the amount of information you can gather from a single bit!

    29. Re:Strange names by bryan1945 · · Score: 1

      If you use a TV instead of a monitor, science and computer stuff runs really, really fast.

      --
      Vote monkeys into Congress. They are cheaper and more trustworthy.
    30. Re:Strange names by mfnickster · · Score: 4, Insightful

      There's nothing that says the name of the tool and the command you type must be the same

      Very true. Unix programmers seem to follow these rules:

      1. delete any spaces in the name
      2. delete any vowels in the name
      3. delete any superfluous consonants
      4. chuck the entire thing and just abbreviate it to the first letter of each word in the name

      So these tools will likely be run as "ctxtfrgrp" and "hierdiff" or just "cfgrep" and "hdiff"

      --
      "Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
    31. Re:Strange names by Mister Liberty · · Score: 1

      You meant '(enhanced enhanced grep)'.

      There, enhanced that for ya.

    32. Re:Strange names by unixisc · · Score: 1

      Like 'cat' for concatenate, or vi for what exactly?

    33. Re:Strange names by urdak · · Score: 2

      Like 'cat' for concatenate, or vi for what exactly?

      "vi" is short for "visual".
      First there was "ed", the, you guessed it, "editor". But "ed" was a real pain to use, because you wouldn't see what you were actually editing (if you ever used ed, you'd know what I mean). So the "visual" editor "vi" was invented.

    34. Re:Strange names by serviscope_minor · · Score: 1

      Also if I wanted help on something, the first word that jumps to my mind would be man

      If you want help, perhaps you should read the MMMAAANNNual.

      hint.

      --
      SJW n. One who posts facts.
    35. Re:Strange names by Dog-Cow · · Score: 1

      The Run window has no problems with spaces. The problem is that you expect it to read your mind and figure out which part is the command, which is arguments, and which arguments are really one argument. If you quote, everything works just fine.

    36. Re:Strange names by emj · · Score: 1

      ./C makes that ok, but that's not the problem. The problem is that you lose one level of quotation.

    37. Re:Strange names by kelemvor4 · · Score: 1

      Definitely I mean, where would we end up if unix commands actually give a hint what they are doing ;-) As a unix novice, if I wanted to search for something, my first choice of course would be grep Also if I wanted help on something, the first word that jumps to my mind would be man heh.

      It's a reasonable assumption that unix was designed specifically to be counter intuitive.

    38. Re:Strange names by TheSpoom · · Score: 1, Funny

      Bonus points if the command is an inscrutable acronym that refers to itself.

      --
      It's better to vote for what you want and not get it than to vote for what you don't want and get it.
      - E. Debs
    39. Re:Strange names by jandrese · · Score: 1

      Bingo! You've discovered the basic problem with spaces in names, space is reserved as a delimiter and thus you're forced to quote anything you type that has a space in the name. It's the textbook example of the awkward workaround. If it were rare it wouldn't be a big deal (like on most Unix systems), but in Windows you end up having to use it all the damn time if you do any work at all on the commandline, even for simple operations. It's bad ergonomics.

      --

      I read the internet for the articles.
    40. Re:Strange names by unixisc · · Score: 1

      I've used both ed & vi. vi wasn't all that much better. vim is, but I really would prefer emacs. Another editor I liked during my brief stint in programming was crisp on unix, and Borland Brief on Windows.

    41. Re:Strange names by jd · · Score: 5, Funny

      You have to figure in two's complement notation. If it's sufficiently counter-intuitive, the sign bit flips over and it becomes totally intuitive.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    42. Re:Strange names by iluvcapra · · Score: 1

      touché, I've gotten it right before but not this time!

      --
      Don't blame me, I voted for Baltar.
    43. Re:Strange names by jd · · Score: 1

      What if you pipe the results through /dev/tivo?

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    44. Re:Strange names by Anonymous Coward · · Score: 0

      I totally grok you here. I mean, who do they fnort we zzzzzrt?

      W

    45. Re:Strange names by jejones · · Score: 3, Interesting

      Alas, history and lots of shell scripts have probably made existing command names unchangeable. History in this case goes back to the time people got RSI from ASR-33 Teletypes and didn't want to have to type very much, and names that make sense only if you know other programs (in ed, "g//p" prints all lines containing the specified regular expression, hence the name "grep").

      That said, we programmers are users of programming languages as much as Joe Sixpack is a user of the desktop, and surely we deserve good design as much as they do, so we can get things done rather than taking perverse pride in mastering needlessly ghastly syntax.

    46. Re:Strange names by bzipitidoo · · Score: 1

      That's why I always create these 2 directories on Windows installations: "C:\Software" and "C:\Hardware". I change "Program Files" to "Software" in every installer that gives the user that option. Except driver software goes in Hardware. Quick way to sort out what I've installed from what something else installed. And it fits in 8 characters, in case that old limit is ever an issue.

      --
      Intellectual Property is a monopolistic, selfish, and defective concept. It is "tyranny over the mind of man"
    47. Re:Strange names by jejones · · Score: 1

      "Catenate" is actually a word and means the same thing as "concatenate". Unfortunately, 1 - epsilon of people associate "cat" with F. domesticus, so "cat" was a really lousy choice.

    48. Re:Strange names by jejones · · Score: 1

      Shame on me for typing literal greater than and less than. That should have been "g/<regular expression>/p".
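
As a rough illustration of that ed idiom: sed shares ed's /regular expression/p addressing, so a sketch can avoid assuming ed itself is installed. The helper name lines_matching is hypothetical:

```shell
# lines_matching RE: print the lines of standard input that match RE,
# much as ed's g/RE/p would print them for the buffer being edited.
# Hypothetical helper; RE must not contain an unescaped "/".
lines_matching() {
    # -n turns off sed's default printing, so only lines selected
    # by /RE/p are written out.
    sed -n "/$1/p"
}
```

For simple patterns, printf 'alpha\nbeta\ngamma\n' | lines_matching 'mm' selects the same lines as grep 'mm' on the same input.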

    49. Re:Strange names by morgauxo · · Score: 4, Insightful

      GP was a joke I am sure.

      As to yours though.. I wouldn't want spaces in my commands. How do you tell where the command ends and the arguments begin?

      As for man... man is the MANual. That's not that bad is it? Ok, help might be a little better but it's not a big deal unless you are very closed minded. It's really a history thing. Man wasn't just somebody's idea of a help command. Unix (or Unics as it was called back then) originally actually had a manual. As in dead trees paper! It got big. Real big. One day Dennis Ritchie accidentally dropped a copy and killed his dog. Flattened the poor girl like a pancake. After that he decided it needed to be digital. Man is a digital copy of that original dog killing book plus decades of additions and updates. Thus it is man(ual).

      Now should manual have been "manual" or maybe the real whole title "Unix Programmers Manual"? It might be easier to remember. 5 years after you learned that command and you are still typing it 5 times a day would you still appreciate the ease of using real whole English words? Are you that abc? (abbreviationally challenged) Or do you just really love typing? Is your r/l name Mavis Beacon?

      That's how a lot of Unix commands are, they make plenty of sense with history. I'm sure grep and the others all have their own stories. Well.. not all. How much of a story does it take to come to ls is a lazy way to type list? Oh, yah, you are AbC. Sorry about that.

      Yes, the history of decades old programming decisions isn't really something you want to learn to use an OS (or any other software). But what's the alternative? Throw everything out x number of years and start over? It sounds great when you are a hopeless newbie but once you actually learn something do you want to do it all over again every 10 years just to make it easy for the next batch of basement kiddies? Your clock is ticking too you know! Now get off my lwn!!!! (lawn)

      P.S. Ok, Ok, I made up the dog part of the story. But it COULD have happened! The rest was real. Actually, I don't KNOW that it didn't happen... hmm....

    50. Re:Strange names by Longjmp · · Score: 1, Funny

      I'd just say woosh, but anyway:
      I basically grew up with PDP-11 and then VAXen running VMS (thanks to my father).
      I think I was first "exposed" to unix roughly 25 years ago, but I still think, the first command I entered AND returned some kind of result was "man this is plain shit"

      ;-)

      --
      There are fewer illiterates than people who can't read.
    51. Re:Strange names by Anomalyst · · Score: 2

      Don't forget the 'n' prefix to indicate the previous flavour is deprecated.

      --
      There is no right to feel safe thru security vaudeville at the expense of everyone's freedom, privacy and tax money.
    52. Re:Strange names by Anomalyst · · Score: 1

      Screw it. We're going with 5 e's: eeeeegrep.

      --
      There is no right to feel safe thru security vaudeville at the expense of everyone's freedom, privacy and tax money.
    53. Re:Strange names by Anomalyst · · Score: 1

      It wasn't laziness, it was the latency of 10cps teletypes.

      --
      There is no right to feel safe thru security vaudeville at the expense of everyone's freedom, privacy and tax money.
    54. Re:Strange names by swordgeek · · Score: 1

      If the FS supports delimiters in filenames you will necessarily have to quote them. This is dumb.

      An accurate statement is "even if the FS supports delimiters in filenames, it doesn't mean you should use them."

      --

      "People who do stupid things with hazardous materials often die." -- Jim Davidson on alt.folklore.urban
    55. Re:Strange names by Anonymous Coward · · Score: 0

      man is a bad example... another word that pops into my head is 'manual'

    56. Re:Strange names by amRadioHed · · Score: 1

      Really? You don't think actually being able to see what you're editing is much better? 90% of the time I'm not using any of the fancy additions vim added, so vi and vim are not much different for me.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    57. Re:Strange names by amRadioHed · · Score: 1

      It would be a lousy choice, if any unix commands actually had anything to do with felines. But none do, so where would any confusion over the name come from?

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    58. Re:Strange names by Rary · · Score: 1

      Next thing you know we'll have CSIgrep. (enhance enhance enhance grep)

      Wouldn't that need a GUI created in Visual BASIC?

      --

      "You cannot simultaneously prevent and prepare for war." -- Albert Einstein

    59. Re:Strange names by Anonymous Coward · · Score: 0

      the first word that jumps to my mind would be man

      "Man, I need help."

    60. Re:Strange names by GumphMaster · · Score: 1

      Yeah, 'cause pressing F1 for help (sometimes) and F3, Ctrl-F or some other key or menu item to search (sometimes) is so much more obvious. Average Joe doesn't seem able to find the clearly labelled Help menu in most programs, so the difference between that and not knowing man is negligible.

      --
      Patent litigation: A doctrine of Mutually Assured Destruction... in which everyone seems willing to push the button
    61. Re:Strange names by InterGuru · · Score: 1

      If you're a feminist, you can type "man bash"

    62. Re:Strange names by 93 Escort Wagon · · Score: 3, Funny

      Just wait until Microsoft sees your post and we'll have eeegrep.

      No, I expect they'd call it grep#. And when Apple forks their own version, it'll be objective grep.

      --
      #DeleteChrome
    63. Re:Strange names by Richard_J_N · · Score: 1

      You do realise there is a "dog" command? It's written as "an improved version of cat".

    64. Re:Strange names by Anonymous Coward · · Score: 1

      There are two issues with this. The first one is that ever since tab-completion came on the scene (and has had a good implementation) this issue has been moot, because you only have to type out the number of letters that make a command unambiguous, and if you do double tab "completion", it shows you what commands it thinks you might mean. It is an early implementation of the Awesome Bar in Firefox, and some will probably argue that it is a better implementation thereof of something helpful.

      The second issue is that somebody (namely, me) came up with the great idea of making up intuitive, abbreviated aliases for commands and installing them by default on Ubuntu (check out Ubuntu Brainstorm for this idea). In this way, the commands are still there, but there are useful and DISCOVERABLE aliases that "normal" people can use as they learn the command line, and as they get more comfortable with it (and get more annoyed at typing out the whole command name) they can switch over AS THEY LIKE to using the short version of the command. The idea was immediately flamed down to the 9th circle of hell because OMG YOU R CHNGING TEH COMMAND LINE U R BAD PEARSON U DIAF, or for those of you who don't like hyperbole, because it would make learning the command line a lot easier on the uninitiated.

      I think people who run Macs, Linux, Unix, and use iPods, iPhones, iPads, and Android phones and tablets are all annoying when they try to convert you, but you know what? At least the makers of their devices made a token effort to make it easy to learn and use. The developers of Unix originally did not mind making the command line a black art because back then computers were a black art. But in modern society, being a black art is a bad thing, and moving to a system that is just as powerful but even easier to learn is a good thing, as those systems are (nowadays) becoming more and more valuable.

      And that is why Unix will either have to adapt to the much gnashing of teeth and beating of breasts of the neckbeards, or Unix as we know it will become something like Mac - underneath a slick interface, but totally unknown by the everyday user because of its clunkiness.

    65. Re:Strange names by rk · · Score: 4, Funny

      Unix is user-friendly, it's just picky about who its friends are.

    66. Re:Strange names by benjymouse · · Score: 1

      Just wait until Microsoft sees your post and we'll have eeegrep.

      No, it would be called Where-Object and have two built-in aliases called where and simply ?.

      And then you would be able to write something like ps | ?{$_.cpu -gt 10}

      --
      Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
    67. Re:Strange names by Pseudonym Authority · · Score: 1

      How much can you improve a 100 line program that does nothing but concatenate streams?

    68. Re:Strange names by GrandTeddyBearOfDoom · · Score: 2

      I remember when I first installed Linux in 1995, which came on a cover CD with essentially no instructions. I had to reinstall two or three times and watch carefully the list of packages installed to get an idea of what to type. I took a while to find my way around the /bin and /usr/bin stuff, and it took me a week, and the confirmation dialog in the openlookalike file manager (do you want to Remove), including getting X up and working, before I guessed that rm and not del or era was the command to delete a file. I guessed man correctly, having seen that the package man was installed and the installer screen indicated that this was the manual. But I was determined to play with this new toy, and few users today will try so hard. What fun.

      --
      -- The Grand Teddy Bear has Spoken: "Windows 8 Source Code Available NOW! more disgusting than your pr..."
    69. Re:Strange names by GrandTeddyBearOfDoom · · Score: 1

      If you haven't read The Unix Haters Handbook, it's a great read for all those *nix lovers. I say this as one who prefers Linux for most everyday stuff.

      --
      -- The Grand Teddy Bear has Spoken: "Windows 8 Source Code Available NOW! more disgusting than your pr..."
    70. Re:Strange names by Anonymous Coward · · Score: 0

      You should see the C library! You just have to love things like wcsncat() and atan() [maybe arctan was just too long to remember].

    71. Re:Strange names by GrandTeddyBearOfDoom · · Score: 1

      Or 'killall men' or 'kill husband' (or even 'kill -9 ex-husband') etc.

      --
      -- The Grand Teddy Bear has Spoken: "Windows 8 Source Code Available NOW! more disgusting than your pr..."
    72. Re:Strange names by jbolden · · Score: 2

      People were using terminals that were as slow as 110 baud. No one wanted to type extra characters.

    73. Re:Strange names by jbolden · · Score: 1

      Using symlinks and alias. We could easily fix the command line problems.

    74. Re:Strange names by X0563511 · · Score: 1

      Funny how much of a pain it is to learn something new when you fail to do any sort of research before diving in...

      --
      For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
    75. Re:Strange names by Anonymous Coward · · Score: 0

      When Unix was first developed, no one would have searched for a "help" command. They would have wanted a MANual. Some folks who remember VMS appreciate short names for commands.

    76. Re:Strange names by amRadioHed · · Score: 1

      If everyone will hate the name of a command, as would certainly be the case for a command called "Context-free grep", then it is a bad name. Everyone would be using their own aliased names, scripts wouldn't be portable between any two users, it would just be a mess. Save everyone the trouble and just give the thing a name that isn't a PITA to type.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    77. Re:Strange names by Anonymous Coward · · Score: 0

      The flaw in your story is the story. That's the problem with *nix, there's a story. There's always a story, about everything. In the end learning the system is an apprenticeship that involves memorizing hundreds of stories. And then deciding that all that lore and all those stories are actually an advantage and any talk otherwise is an abomination unto the flock.

      So take a look at the result. The OP mentioned that the command 'help' would be appropriate, and you told a story about how 'manual' was close (it isn't), and failed to address that it isn't even 'manual', it's 'man'. Which is de-facto unguessable.

      You also failed to close with the typical dismissal: alias 'help' to 'man' and you're done. Except this is a custom change and not a standard feature of *nix. To the beginner there is no way they can rely upon such assistance. They have to go through the apprenticeship and learn all the stories.

      What's next, implying that Google searches are a 'feature' of *nix?

    78. Re:Strange names by Anonymous Coward · · Score: 0

      kensama@vt.edu

      Ken-sama! I thought you were dead! How's life in Japan? Have you gone to that prestigious high school you wanted to go to?

    79. Re:Strange names by ChipMonk · · Score: 1

      True, but the lessons learned tend to be much more memorable.

    80. Re:Strange names by hawk · · Score: 1

      Yeh, but it drops to 20 minutes if you fast-forward through the weird music while they show the characters operating the algorithms . . . :)

      hawk

    81. Re:Strange names by Anonymous Coward · · Score: 0

      Thats the ugliest pile of shit ive ever seen in my life. Hahahaha.

    82. Re:Strange names by mattack2 · · Score: 1

      As a unix novice, if I wanted to search for something, my first choice of course would be grep

      One of my aliases is:

      alias search "grep -rs --binary-files=without-match \!^ ."

      So I can just type
      search whatever
      to search for the text whatever.. which works most of the time, and I don't have to remember the rest of the grep options the vast majority of the time. One big improvement would be to avoid results in .svn directories.
      (This is a csh/tcsh alias.)

      Another one I use a lot is

      alias ff "find -x . -name \!:1 -print"

      to just quickly search for a file underneath the current directory.
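
For sh or bash users, a rough equivalent of the first alias can be written as a function. This is only a sketch: it assumes a grep that supports --exclude-dir (GNU grep does), which also covers the wish to skip .svn directories:

```shell
# search PATTERN: recursively search the current directory for PATTERN,
# skipping binary files and anything under a .svn directory.
# Assumes a grep with --exclude-dir, such as GNU grep.
search() {
    # "--" ends option parsing, so a pattern starting with "-" still works.
    grep -rs --binary-files=without-match --exclude-dir=.svn -- "$1" .
}
```

Usage stays the same as with the csh alias: search whatever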

    83. Re:Strange names by Anonymous Coward · · Score: 0

      > If you're a feminist, you can type "man bash"

      That's not really feminism, it's female chauvinism. We now return you to your regularly scheduled levity.

    84. Re:Strange names by Anonymous Coward · · Score: 0

      vi is for VIsual editor. "ed" is the line editor on which it was based. Most systems still have ed available too.

    85. Re:Strange names by unixisc · · Score: 1

      I just loved the Unix haters handbook - bought it years ago, and it was a funny read. Particularly how they dissected the shortcomings of just about every concept in Unix - starting w/ everything being a file, to things like how a user has the power to do anything once he is root - including deleting system files.

    86. Re:Strange names by unixisc · · Score: 1

      Feminists could have demanded that man be changed to woman, or person. That way, all that male chauvinism can be core dumped.

    87. Re:Strange names by unixisc · · Score: 1

      ed's problem is that it doesn't let you go back to previous lines - it just forces you to keep editing on that line itself - making it suitable only for single line or two line files. A carriage return will take you to the next line, and when you are done, you have to press Control-_ (I forget what it is). With vi, one would have to know what to type to get into edit mode, although leaving it is easier. I do wish the Unix specification would be altered to require the global presence of some editors other than vi and ed. Emacs & pico come to mind.

      One major headache I always had was Unix using the Delete key instead of Backspace, and not having a key to delete the text after a cursor.

    88. Re:Strange names by CAIMLAS · · Score: 1

      Yes, the history of decades old programming decisions isn't really something you want to learn to use an OS (or any other software). But what's the alternative?

      Um, then you're probably a fan of PowerShell?

      Oh, wait.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    89. Re:Strange names by CAIMLAS · · Score: 1

      And, contextually, hdiff and cgrep will make perfect sense.

      There's already egrep, as well as a handful of *diff type implementations. Strange how common use by professionals who know what they're doing tends to result in a lot of similar, technical things!

      (If you can't hack it, go back to Windows desktop support. I hear they're hiring in India.)

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    90. Re:Strange names by Anonymous Coward · · Score: 0

      so we can get things done rather than taking perverse pride in mastering needlessly ghastly syntax.

      This from someone who has demonstrably mastered English grammar and spelling.

    91. Re:Strange names by jeffb (2.718) · · Score: 1

      I wouldn't want spaces in my commands. How do you tell where the command ends and the arguments begin?

      Ya\ quote\ the\ embedded\ spaces, ya\ lazy\ moron.

    92. Re:Strange names by guruevi · · Score: 1

      CON was also used in DOS (console or stdin/stdout)

      --
      Custom electronics and digital signage for your business: www.evcircuits.com
    93. Re:Strange names by smellotron · · Score: 2

      "cat" was a really lousy choice.

      The distinguished artist sees "cat" as an excellent choice—a palette for the creative file-namer, a mad-lib left incomplete!

      At least, that's how I justify log files named dog and crap.

    94. Re:Strange names by smellotron · · Score: 2

      How much can you improve a 100 line program that does nothing but concatenate streams?

      Make it a shell built-in and chide the user if only a single input was used (e.g. cat file | grep blah).
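
      For anyone who hasn't met the "cat file | grep blah" pattern being mocked here, a minimal sketch follows. The sample file and its contents are made up for illustration; the point is only that grep can read the file itself.

```sh
# Hypothetical sample file for the demonstration.
tmp=$(mktemp)
printf 'foo\nblah one\nbar\nblah two\n' > "$tmp"

# The pipeline being poked fun at: cat only forwards a single file.
via_cat=$(cat "$tmp" | grep blah)

# grep can open the file itself, or read it through a redirection.
direct=$(grep blah "$tmp")
redirected=$(grep blah < "$tmp")

rm -f "$tmp"
printf '%s\n' "$direct"
```

      All three forms should select the same lines; the last two simply start one process fewer.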

    95. Re:Strange names by Nexus7 · · Score: 1

      I understand you post in jest, but if you consider that those commands are typed thousands of times, the names make a lot of sense.

    96. Re:Strange names by bmo · · Score: 1

      Why use cat when you can use dog?

      NAME
                    dog - better than cat
      DESCRIPTION
                    dog writes the contents of each given file, URL, or the standard input if none are
                    given or when a file named '-' is given, to the standard output. It currently sup-
                    ports the file, http, and raw URL types. It is designed as a compatible, but
                    enhanced, replacement of cat(1).

    97. Re:Strange names by haruchai · · Score: 1

      Nope, that's the ASUS warehouse inventory program, but it's properly written Eeegrep.

      --
      Pain is merely failure leaving the body
    98. Re:Strange names by Anonymous Coward · · Score: 0

      ps | ?{$_.cpu -gt 10}

      Which does what, exactly? List all processes except those assigned to the first 12 processors?

    99. Re:Strange names by MoreDruid · · Score: 1

      alias help='man'

      --
      The best weapon of a dictatorship is secrecy, but the best weapon of a democracy should be the weapon of openness.
    100. Re:Strange names by Anonymous Coward · · Score: 0

      My guess is:
      List all processes that use greater (-gt) than 10% cpu

    101. Re:Strange names by justforgetme · · Score: 1

      I can remember the first time I landed on a bash shell and typed help...
      Apparently me being a total noob the first thing I wanted to learn was how to program loops into bash commands, not how to find information....

      --
      -- no sig today
    102. Re:Strange names by spongman · · Score: 1

      huh? the Run dialog works just fine with spaces in filenames.

      for example:
          c:\some directory name with spaces\application name with spaces.exe argument1 argument2

      no quoting necessary. or did you mean something else?

    103. Re:Strange names by Anonymous Coward · · Score: 0

      Thinking that "era" would be a command to delete a file, shows that you were not going by logic. You were going by what you learned by using a different OS with just as obscure commands (ok, COPY actually does spell out exactly what it does).

    104. Re:Strange names by justforgetme · · Score: 1

      Intellectually I don't have a problem with the proposition of aliasing commands to deobfuscate their functionality. What I do have a problem with is
      a) people trying to convert people regardless of their actual potential or intent "cmon, I'll show you teh cmd line and thn U'll b 1337 lkie me"
      b) people impeding themselves (and occasionally others) to help some outsiders get to a place they don't belong.

      Those two things do not lead to expansion or domination (except in the twisted managerial tongue). The only thing they contribute to is fragmentation, lowering of standards, misinformation and quarter-knowledgeable toons that go out and sell themselves as linux gurus while they couldn't awk themselves out of a bowl if their life depended on it.

      Now let me clarify, I do not promote elite cultism. I promote dialogue and sharing. Just don't go put everybody out of their way to make some random guy something he is not. Not everybody has to live in virtual teletypes, not every dev needs to run linux desktops. For all I know that makes the ones that do know their cli much more valuable and distinct.

      --
      -- no sig today
    105. Re:Strange names by justforgetme · · Score: 1

      you dirty little pixie you!

      --
      -- no sig today
    106. Re:Strange names by justforgetme · · Score: 1

      Wish I hadn't posted already.
      Joey 2-pack will always find something to complain about.

      --
      -- no sig today
    107. Re:Strange names by justforgetme · · Score: 1

      -1.7183 people?

      --
      -- no sig today
    108. Re:Strange names by tehcyder · · Score: 1

      surely that's a *whoosh*?

      --
      To have a right to do a thing is not at all the same as to be right in doing it
    109. Re:Strange names by tehcyder · · Score: 1

      Or you could just have bought a book explaining the basics of UNIX/Linux. Shockingly old-fashioned, I know.

      --
      To have a right to do a thing is not at all the same as to be right in doing it
    110. Re:Strange names by tehcyder · · Score: 1

      True, but the lessons learned tend to be much more memorable.

      Possibly, if you don't give up in frustration first.

      Personally, I find that enjoyable experiences are more memorable than horrible ones. Also, doing something the quickest and easiest way is a sign of intelligence, not laziness.

      --
      To have a right to do a thing is not at all the same as to be right in doing it
    111. Re:Strange names by KlaymenDK · · Score: 1

      Funny how much of a pain it is to learn something new when you fail to do any sort of research before diving in...

      or how seemingly relevant research yields no results.

      I learned rather early on that "man" gave you a small manual to a command. But I could never figure out how to EXIT the manual ... there was a period where I had to reboot after looking up each reference. So typing "man man" should yield appropriate research --- only, the man page for man does not in fact tell you that you need to type "Q" to return to the command line.

      Overall, it was a very sobering experience to come from a DOS/Windows world and being quite experienced, and going ("back") to "a black screen with a blinky in the upper left corner".

    112. Re:Strange names by tehcyder · · Score: 1

      Yeah, 'cause pressing F1 for help (sometimes) and F3, Ctrl-F or some other key or menu item to search (sometimes) is so much more obvious. Average Joe doesn't seem able to find the clearly labelled Help menu in most programs, so the difference between that and not knowing man is negligible.

      That would be true if there was a "man" key. It is not difficult to press all the Fn keys in turn and see what happens, but no one is going to guess at "man". People on this thread keep confusing "intuitive" with "easy enough to remember once you've been told".

      --
      To have a right to do a thing is not at all the same as to be right in doing it
    113. Re:Strange names by TheRaven64 · · Score: 1

      Now try using it in an environment variable, or as an argument to xargs. The first thing that reads the string will turn the escaped spaces into spaces. The second thing will interpret them as multiple values. UNIX shells are notoriously bad at handling things with spaces in them.
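
      The splitting described above can be reproduced with a short sketch. The file name and the count_args helper are hypothetical, and the default $IFS is assumed.

```sh
# Hypothetical file name with embedded spaces.
name='my file name.txt'

# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $name)      # unquoted expansion is word-split
quoted=$(count_args "$name")      # double quotes keep it as one word

# xargs splits its input on blanks and newlines by default.
via_xargs=$(printf '%s\n' "$name" | xargs sh -c 'echo "$#"' sh)

echo "unquoted=$unquoted quoted=$quoted xargs=$via_xargs"
```

      Quoting every expansion handles the first case; for xargs, a NUL-delimited input or find's own -exec is the usual workaround.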

      --
      I am TheRaven on Soylent News
    114. Re:Strange names by JigJag · · Score: 1

      you could replace "--binary-files=without-match " with "-I" (that's uppercase i) so you get:
      alias search "grep -rsI \!^ ."

      JigJag
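
      The alias above is csh syntax. A rough Bourne-shell equivalent might look like the sketch below; the function name search and the scratch files are assumptions for the demo, and -r, -s and -I are taken from GNU and BSD grep rather than from POSIX.

```sh
# Hypothetical search helper: recursive, quiet about unreadable files,
# and skipping files that grep classifies as binary.
search() {
    grep -rsI -- "$1" .
}

# Demonstration in a scratch directory with one text and one binary file.
tmp=$(mktemp -d)
printf 'needle in text\n' > "$tmp/notes.txt"
printf 'needle\0in binary\n' > "$tmp/blob.bin"

result=$(cd "$tmp" && search needle)
rm -rf "$tmp"
printf '%s\n' "$result"
```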

      --
      "The hallmark of humanity is the ability to move beyond sensory inputs" - Mary Helen Immordino-Yang
    115. Re:Strange names by Qzukk · · Score: 1

      only, the man page for man does not in fact tell you that you need to type "Q" to return to the command line.

      That's because man simply formats the page, then passes it to your $PAGER of choice (more or less) for display.

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    116. Re:Strange names by Qzukk · · Score: 1

      This is why God invented $IFS

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    117. Re:Strange names by KlaymenDK · · Score: 1

      Interesting. It shows how much I have still to learn, a number of years hence. But also shows how utterly chanceless it is for a newbie to know how to do proper research. Ah well, all the more "Getting Started With..." books sold... :-)

    118. Re:Strange names by queBurro · · Score: 1

      what about "fegrep"? ~ fegrep (is an) enhanced g/re/p, and we've shoehorned in the recursive acronym

      --
      sag
    119. Re:Strange names by morgauxo · · Score: 1

      I'm aware of that option but it's ugly and hard to read. You could also of course use "Ya quote the embedded spaces, ya lazy moron". That's not so hard on the eyes but it could be easier to mentally lose track of the quotes.

    120. Re:Strange names by morgauxo · · Score: 1

      No, see the part about not wanting to relearn everything every so many years.

    121. Re:Strange names by g253 · · Score: 1

      You could have at least tried Ctrl-C (which you should know from DOS) before rebooting :-)

    122. Re:Strange names by morgauxo · · Score: 1

      Yes, tab completion is awesome. It definitely makes long commands easier to deal with. It doesn't solve all the issues though.

      Abbreviations are great too. I've used some languages and protocols where the first few characters of the name are the abbreviation. As a shortcut, most parsers were written to just ignore characters past those important few, meaning you could type anything after them.

      There are two problems I still see though...

      One is that having abbreviations means twice the reserved words. This is especially an issue with the longer, 'real english' words. Every time you make a command out of a word you are making it more difficult to use it as a filename w/o ambiguity.

      The bigger issue... permanence. I know DOS and Unix fairly well. I don't want to learn a new commandline. PowerShell for example has intrigued me ever since it was first released. Guess what I have not and still will not be spending my Friday night doing... learning PowerShell. I suspect that 10s of thousands if not 100s of thousands of admins and users out there feel the same way.

      Is this just one generation being selfish? Will we eventually retire or die off finally leaving the new generation to drop a legacy from times when memory was expensive (one part of the reason for short commands) and user interface design was a new and unknown science?

      Maybe... To an extent. But if so it is going to happen again to every generation forever. Which 'English' words do you choose for your commands? Let's take another look at the original post's examples, in particular 'man' again. Once upon a time everything came with manuals. I bet it was pretty natural for people to think of a manual when they needed help and a 'man' command is not a very big step from there. Now... everything has been cheapened and pennies can be saved by not including manuals. Most things do not. I think I remember reading that 'grep' had some significance outside of computing at the time it became a unix command. I don't remember what though. The point is that things change and the words that are chosen today will not make as much sense in 10 years, 20 years, etc...

      Should computing be renewed for every generation with a whole new set of commands? I don't think so. For one thing, admins and users don't come in discrete generations. Any time you make this change somebody is caught in the middle and has to re-learn everything. Nobody wants to do that!

      Besides... while I could argue all day that a commandline IS an easy and natural way to communicate with a computer (we communicate with each other through language, not by clicking pictures) it is not something 'normal' people do anymore. The commandline might as well cater to the admin and not the general public because that is who will use it.

    123. Re:Strange names by jeffb+(2.718) · · Score: 1

      Actually, it would be "Ya quote the embedded spaces," "ya lazy moron". Which, of course, strongly supports your "hard to read" claim. I was being facetious.

    124. Re:Strange names by jeffb+(2.718) · · Score: 1

      As an OS X user, I'm painfully aware of this. But I've also run up against software that failed a default install on Windows because it couldn't cope with "Program Files". Really, people, how can you not test your own defaults?

    125. Re:Strange names by KlaymenDK · · Score: 1

      Ooh, but that doesn't work. I must admit: I haven't tried this in Linux, but that is certainly the case on the BSDs.

    126. Re:Strange names by jpvlsmv · · Score: 1

      -print0 and xargs -0 are foolish hacks.

      The correct way to do this would be `find . -name "*.txt" -exec wc {} +`

      the -exec in all modern (i.e. anything that supports -print0) find will expand the {} to proper filenames when given a +. And you get the bonus that you don't have to shell-escape the \; at the end.

      --Joe
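
      A small sketch of the "{} +" form, using made-up files in a scratch directory. It counts the arguments the invoked command receives, to show that a name with spaces arrives as a single argument.

```sh
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/a file with spaces.txt"
printf 'three\n' > "$tmp/plain.txt"

# With "{} +", find appends as many path names as fit to one command line,
# each as a single argument, so no shell ever re-parses the names.
nargs=$(find "$tmp" -name '*.txt' -exec sh -c 'echo "$#"' sh {} +)

# The same form with a real command: total line count across both files.
lines=$(find "$tmp" -name '*.txt' -exec cat {} + | wc -l)

rm -rf "$tmp"
echo "arguments=$nargs lines=$lines"
```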

    127. Re:Strange names by mfnickster · · Score: 1

      Um, I've been able to "hack it" for twenty years and it's still hilarious after all this time, the lengths CLI users will go to to avoid a little typing!

      --
      "Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
    128. Re:Strange names by morgauxo · · Score: 1

      " or Unix as we know it will become something like Mac - underneath a slick interface, but totally unknown by the everyday user because of its clunkiness."

      Isn't it already? Well.. up until the 'clunkiness' part. That's your opinion, you are entitled to it but I'm sure many here don't share it.

      I think we are talking about the Desktop here. On the desktop 'Unix' is mostly Linux or FreeBSD so that's what I am referring to. The 'Year of the Linux Desktop' may never arrive but I've known people who use it and know nothing about the kernel, the shell or anything like that. It just boots into KDE or Gnome and then it works. Just click on Chrome or Firefox, no different from any other OS.

      "all annoying when they try to convert you"
      Usually attempts at conversion of any kind are annoying. I don't try much anymore because I have learned that people aren't interested. Trying usually bothers me even more than the person I am talking to. I can sympathize with those trying to do so though.

      First, there is no such thing as geeks and normal users. It's geeky normal users and normal users. I for one want my U/Linux power tools but I want all or most of the things 'normal' users want too! I don't know if there will ever be a 'Year of the Linux Desktop' but I sure wish there would.

      The thing about all the 'normal' users changing, I just don't see how they would suffer from it. You click the application you want to run and it does. On that level it's just like everything else. The opposite however is not true. Geeky power users have suffered.

      I've been using Linux since about 1998. At first I just wanted my own web server and wasn't at all considering replacing Windows on my Desktop. I had the server in the common area of my dorm and found myself surfing on it when I wanted to use the internet but also wanted to hang out, watch tv, etc. with my roommates. Having it there was convenient. I also hooked it to the stereo and could play 100s of MP3s on it which was a big deal when my friends were still drooling over CD changers. IM clients were big but not available on Linux yet. I remember when Tik was released and suddenly I didn't need Windows for daily use anymore. I had instant messaging, MP3 playing, and web browsing: the three things I needed for everyday use. Next up came Wordperfect. Great!! No more rebooting to write papers. Then came Linux drivers for 3dfx. Quake II for Linux, the Sims. (I didn't have the Sims, but it was a big deal at the time.) I was finding I actually preferred Linux and there was less and less reason to boot Windows.

      It's not like I had set out on some OSS ideological quest or even out of a hatred of Windows. I still have never read 'The Cathedral and the Bazaar'. I just wanted a web server, in Linux a full featured one was free, in Windows that was still very expensive. But then I discovered I liked it better. For me, the Linux Desktop had arrived and it was good.

      And then stuff happened. Wordperfect lost out to Microsoft Office which will likely never be available under Linux or fully compatible with anything that is. Flash became an important part of web browsing. It got to the point where it seemed like the majority of the interesting content was unavailable without a recent Flash. But Macromedia stopped developing for Linux. 3dfx went out of business. ATI and NVidia refused to release 3d drivers or enough information about their products to allow anybody to write them for years. To a generation of computer geeks (aren't most gamers, at least when they are young) desktop Linux was dead. Games stopped even being released with Linux versions.

      Things are better now. Flash is available (and going away anyway). Open Office handles MSOffice formats well enough for most usage (IMHO). NVidia has been developing working (though not quite perfect) drivers for a while now and ATI is playing nice with OSS developers. Wine runs most things good enough, even games. (come on Photoshop!) Still, this could all happen again. I

    129. Re:Strange names by WorBlux · · Score: 1

      I believe Unix originally had a 6-letter limit on filenames. Manual is 7 letters, hence they had to shorten it. In addition, when you only have six letters to work with, making them case-sensitive squares the possible file-names.

    130. Re:Strange names by iluvcapra · · Score: 1

      What's the matter with -print0? I'm curious.

      --
      Don't blame me, I voted for Baltar.
    131. Re:Strange names by LingNoi · · Score: 1

      First thing I do is google it. I never look at man pages.

    132. Re:Strange names by Anonymous Coward · · Score: 0

      Only because windows tries

          "c:\some" ...
          "c:\some directory" ...
          "c:\some directory name" ...
          "c:\some directory name with" ...
          "c:\some directory name with spaces\application" ...
          "c:\some directory name with spaces\application name" ...
          "c:\some directory name with spaces\application name with" ...

      until it finally finds that

          "c:\some directory name with spaces\application name with spaces.exe" ...

      works. This is a security nightmare.
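
      The sequence of attempts listed above can be modelled with a few lines of shell. This is a simplified sketch, not Windows' actual lookup code: the candidates function and the example command line are hypothetical, and the real search also considers file extensions.

```sh
# Print every prefix of an unquoted command line that ends at a space.
# Each prefix is a path the loader might try to run in this simplified model.
candidates() {
    set -f                      # disable globbing while word splitting
    prefix=
    for word in $1; do
        prefix=${prefix:+$prefix }$word
        printf '%s\n' "$prefix"
    done
    set +f
}

# Hypothetical unquoted command line.
cmdline='c:\some directory\app name.exe argument1'
tried=$(candidates "$cmdline")
printf '%s\n' "$tried"
```

      If an attacker can create a program matching one of the earlier prefixes, it would be picked up before the intended one, which is why quoting the full path matters.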

    133. Re:Strange names by jc42 · · Score: 1

      Average Joe doesn't seem able to find the clearly labelled Help menu in most programs, so the difference between that and not knowing man is negligible.

      Or maybe Joe has just learned that that Help menu only rarely provides any help with the current problem.

      I'm typing this (to Firefox) on a Mac, perhaps the most vaunted "user-friendly" system in the industry. I've tried using that Help menu on many occasions, but most of the time, I seem to just get links to anything anywhere in the system that contains the keywords that I type into the Search widget. These hardly ever have anything to do with the app that I'm running at the moment. Actually, the FF Help menu does seem to have some things that deal with FF. When I move the pointer to the first (Sidebar > Bookmarks) item, a number of other menus suddenly pop up, and they're rather baffling. But there's a little wavering arrow pointing at the an item in one of the menus, so I clicked on it. All the menus instantly disappeared, and waiting didn't result in any new window to pop up with the expected Help information. I've worked with Macs off and on for a decade, and I still find most of Help's behavior this baffling. Occasionally it finds me help; usually it just wastes my time. So I ask google, which usually does find me information, though it's often buried in the 3rd or 17th page of totally irrelevant stuff. ;-)

      These days, I think I prefer the original unix man-based stuff. If it doesn't know, I discover that very quickly, and I ask google after wasting far less time than I waste on Mac's crappy Help menu.

      It's disappointing when it's so much harder to get good information about something on the "user-friendly" systems than on the "user-hostile" systems like unix and linux. ;-)

      --
      Those who do study history are doomed to stand helplessly by while everyone else repeats it.
    134. Re:Strange names by jpvlsmv · · Score: 1

      It's specific to GNU find. The -exec ... + syntax is POSIX-standard.

      --Joe

    135. Re:Strange names by cynyr · · Score: 1

      http://en.wikipedia.org/wiki/Grep

      ed had a construct g/re/p, where re is "regular expression". It was decided that this was used enough that it was made into a stand alone tool.

      sed has similar origins. cd (change directory), ls (listing), mv (move), rm (remove), etc. also make sense once you know why they are that way.
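
      To make the g/re/p lineage concrete, here is a small sketch with made-up sample text. It uses sed's address-plus-p form, which mirrors the ed construct, next to the standalone grep.

```sh
text='first line
second regular line
third line
another regular one'

# ed's g/re/p reads: globally, for lines matching re, print.
# sed kept the same address-plus-command form.
with_sed=$(printf '%s\n' "$text" | sed -n '/regular/p')

# grep is that single operation as a standalone tool.
with_grep=$(printf '%s\n' "$text" | grep 'regular')

printf '%s\n' "$with_grep"
```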

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
    136. Re:Strange names by cynyr · · Score: 1

      i love using $IFS to read in CSV files in bash. saves so much time. granted these days i just do that in python with split, but still neat bash trick for the new kids.
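
      A minimal sketch of the $IFS trick, assuming a simple record with no quoted fields or embedded commas (real CSV needs a proper parser). The record and variable names are made up.

```sh
# Hypothetical one-line record.
csv_line='alice,42,Boston'

# IFS is changed only for the duration of this read command.
IFS=, read -r name age city <<EOF
$csv_line
EOF

echo "name=$name age=$age city=$city"
```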

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
    137. Re:Strange names by cynyr · · Score: 1

      what about

      c:\some directory name with spaces\application name with spaces.exe argument1 with spaces argument2withoutspaces

      means you end up with

      c:\some directory name with spaces\application name with spaces.exe "argument1 with spaces" argument2withoutspaces

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
    138. Re:Strange names by dylan_- · · Score: 1

      I learned rather early on that "man" gave you a small manual to a command. But I could never figure out how to EXIT the manual ... there was a period where I had to reboot after looking up each reference. So typing "man man" should yield appropriate research --- only, the man page for man does not in fact tell you that you need to type "Q" to return to the command line.

      You never tried typing "help" when you were viewing the man page, did you? :-)

      --
      Igor Presnyakov stole my hat
    139. Re:Strange names by Anonymous Coward · · Score: 0

      Manual is not seven letters.

    140. Re:Strange names by CAIMLAS · · Score: 1

      I wasn't referring to you by that statement, it was a more global, "royal" you. And yes, I agree: people are afraid of typing commands and reading documentation. They'll search the web for -hours- copy/pasting shit they randomly find to see if it does the trick, and then ultimately give up saying "it doesn't work", blaming the tool not themselves.

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
    141. Re:Strange names by gangien · · Score: 1

      completely irrelevant to stuff written in the last, 30 years or so?

    142. Re:Strange names by mfnickster · · Score: 1

      I wasn't referring to being afraid to type commands, I meant the constant confusing abbreviation used in Unix, e.g. "cp" and "mv" instead of the straightforward "copy" and "move," whose only shortcoming is that they are four characters instead of two.

      Or try "umount," which does what? Unmounts filesystems. Some Unix programmer thought that typing an extra 'n' was too much work and the command would be just as clear without it. It isn't.

      --
      "Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
  2. How's it compare to Meld? by Compaqt · · Score: 1

    A nice GUI diff for Linux. (Has 3-way).

    Click here to install

    --
    I'm not a lawyer, but I play one on the Internet. Blog
    1. Re:How's it compare to Meld? by Anonymous Coward · · Score: 3, Insightful

      It is surprising that Slashdot even let you post a deb: url, as the filter usually seems to destroy most non-http(s) links. However, not everyone uses a Debian-based distro, and not everyone tries some random package (even from the repository) before reading a little about it, so posting the home page would have been a bit more useful.

    2. Re:How's it compare to Meld? by garry_g · · Score: 1

      Or ASCII GUI: vimdiff ... works fine, also with 3 files ...

    3. Re:How's it compare to Meld? by Compaqt · · Score: 2

      Yeah, I usually post a disclaimer ("for Debian/Ubuntu/Mint" -- now "Debian/Mint/Ubuntu").

      Second, yes, /. does allow that, and I hope they continue to do so, because deb:// and click to install is neat and handy (even a lot of old Linux hands don't even know about it).

      Finally, (as you mentioned) it's not a link to download software, but rather install software from the repositories, so there's that level of security.

      --
      I'm not a lawyer, but I play one on the Internet. Blog
    4. Re:How's it compare to Meld? by pak9rabid · · Score: 1

      I like kompare.

    5. Re:How's it compare to Meld? by amRadioHed · · Score: 1

      But it's still less helpful for anyone who isn't using a Debian based distro, or for anyone who wants to read about the program before installing it.

      --
      We hope your rules and wisdom choke you / Now we are one in everlasting peace
    6. Re:How's it compare to Meld? by doti · · Score: 1

      Another nice multiplatform GUI diff, and also has 3-way: vimdiff / gvimdiff

      --
      factor 966971: 966971
    7. Re:How's it compare to Meld? by cynyr · · Score: 1

      I'd bet the graphical gui, gvimdiff works fine with 3 files as well.

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
    8. Re:How's it compare to Meld? by Anonymous Coward · · Score: 0

      It's open source - hack it up!

  3. awk? by realyendor · · Score: 2

    Done! It's called "awk". Just set the RS and FS fields as appropriate. :P
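
    For readers who haven't used the RS and FS variables this way, a rough sketch follows. It only covers blank-line-separated blocks, which is far short of what the article describes, and the sample data is invented.

```sh
# Hypothetical data: records are blocks separated by an empty line.
data='name: alice
city: Boston

name: bob
city: Hanover'

# RS="" puts awk in paragraph mode (one record per block);
# FS="\n" makes each line of the block a field.
hit=$(printf '%s\n' "$data" |
    awk 'BEGIN { RS = ""; FS = "\n" } /Hanover/ { print $1 }')

printf '%s\n' "$hit"
```

    The pattern is matched against the whole block, and the first line of the matching block is printed.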

  4. DOE?????? by Anonymous Coward · · Score: 0

    What's the relevance of this work to DOE? Shouldn't DOD be the funding agency? Or does DOE simply have more money than they know what to do with?

    1. Re:DOE?????? by GameboyRMH · · Score: 1

      Well if we can use our computers more efficiently then we'll save energy. On the other hand I can't imagine what use the DOD would have for this, especially since they seem to run Windows at every opportunity...

      --
      "When information is power, privacy is freedom" - Jah-Wren Ryel
    2. Re:DOE?????? by amiga3D · · Score: 1

      I think maybe some of the scientist types at the DOE were behind the funding.

    3. Re:DOE?????? by iced_tea · · Score: 3, Interesting

      They have HUGE amounts of data kicking around from various simulations/experiments.

      Check out the wikipedia article for supercomputers, and you'll see DOE mentioned.

      Tools like this could help with analysis and finding certain data sets. IIRC, regex are already used in DNA sequencing. There is probably a similar application and use for tools like this with their data.

    4. Re:DOE?????? by PPH · · Score: 1

      Shouldn't DOD be the funding agency?

      The DOD has been parsing your data for years.

      --
      Have gnu, will travel.
  5. Follow the money...? by dzfoo · · Score: 1, Interesting

    funded in part by Google and the U.S. Energy Department

    I wonder what's the interest of these two in this.

              -dZ.

    --
    Carol vs. Ghost
    ...Can you save Christmas?
    1. Re:Follow the money...? by mvar · · Score: 0
      I was about to post the exact same thing. According to the article

      The DOE foresees that this sort of software could play a vital role in smart grids, in which millions of energy consuming end-devices would have connectivity of some sort.

      What a load of crap. These "new programs" sound more like a high school or an open source project. Since when does a government agency care about a Unix admin's toolbox so much that it decides to fund something that could be (and probably already has been) solved with a script? wtf?

    2. Re:Follow the money...? by Anonymous Coward · · Score: 0

      Cool. You could be lazy, and I could quote the article for you for sweeet karma. Win-win!

      Only... I'd feel... dirty.

      So forget it.

    3. Re:Follow the money...? by Tanktalus · · Score: 5, Insightful

      Context-free grep/diff can be used to search for data/changes in arbitrary non-line-record-based files. Such as XML, HTML, JSON, SQLite databases, other databases, Apache configs, and many other pieces of data. Heck, even most programming languages are not line-based, but statement terminated/separated. Imagine being able to grep for a function name, and getting its entire prototype/usage even when it spans multiple lines (very common in standard glibc headers). And, depending on the plugin's capabilities, you could grep for a function name as a function name and not get back any usage of that text as a variable or embedded in a string, or a comment (skip commented-out calls!).

      If there's sufficient configurability, you could ask for the entire block that given text is in, and such a grep would be able to display everything in the corresponding {...}. Makes grep that much more valuable.

      So, my question is, why aren't more IT-heavy corporations/government departments involved?
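
      The "entire block" idea can be approximated today with a naive awk sketch. This is not the researchers' tool: block_grep is a hypothetical helper that merely counts braces, so braces inside strings or comments will confuse it, which is exactly the gap a grammar-aware grep would close.

```sh
# Hypothetical helper: print each top-level {...} block that mentions a pattern.
block_grep() {
    awk -v pat="$1" '
    {
        block = block $0 "\n"
        # gsub returns the number of matches; the text itself is unchanged.
        depth += gsub(/[{]/, "{") - gsub(/[}]/, "}")
        if (depth == 0) {
            if (block ~ pat) printf "%s", block
            block = ""
        }
    }'
}

# Made-up source text for the demonstration.
source_text='int counter = 0;
void foo(void) {
    bar();
}
void baz(void) {
    qux();
}'

found=$(printf '%s\n' "$source_text" | block_grep 'bar')
printf '%s\n' "$found"
```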

    4. Re:Follow the money...? by neokushan · · Score: 1

      I wonder why people feel the need to "sign" their posts, when their username is quite clearly visible at the top.

                        -nk

      --
      +1 IDisagreeSoHeMustBeATrollOrAnAstroturferOrAShill
    5. Re:Follow the money...? by Doc+Ruby · · Score: 2

      Vast amounts of OS SW has been funded by the government. BSD was developed by UC Berkeley, which is largely funded by Pentagon contracts.

      And the Internet.

      Meanwhile, the vast majority of open source projects never get past the opening statement.

      You clearly don't know what it takes to accomplish a project like this one. What have you ever done, that gives you some standing to announce that this Usenix project is a load of crap?

      --
      make install -not war

    6. Re:Follow the money...? by Anonymous Coward · · Score: 0

      I wonder why people feel the need to "sign" their posts, when their username is quite clearly visible at the top.

      AND when they have a sig anyway. I guess their cut-and-pasted "witty" comment is worth more than who they are?

      -AC

    7. Re:Follow the money...? by hedwards · · Score: 1

      Why does that necessitate screwing around with grep? I can sort of see modifying diff, but with grep if you need that data you'd write a new program to parse it and pipe it.

    8. Re:Follow the money...? by bobaferret · · Score: 2

      So weird. I spent the last 6 months writing some Java libraries that do exactly this. There were some similar things out there, but they weren't licensed appropriately for my uses, or were WAY too expensive. Writing a hierarchical diff engine is the most complex thing I've ever done, hell writing an efficient pure diff engine is insane itself. You have to identify blocks/structure. then you have to diff the structures, then you have to diff the content in the structures. Once all of that is said and done then you have to find a way to represent the differences using the recognized structures. And from my point of view half the reason was to be able to represent ONLY the changes so that I'd have a nice size savings, on a constantly changing tree. You also have to choose a format that allows you to roll back to a previous diff given the initial state or final state. There are also a large number of trade-offs that have to be made including window size etc. You can't do a diff across a massive amount of data w/o a massive amount of processing power and memory. So you effectively have to diff independent streams against each other that have similar sized sliding windows on each stream. /rant Good stuff though, just funny to read about, and difficult to do.

      I don't have an answer to your question, but I wrote my software to deal with IT problems, because diff and grep just weren't good enough, and no one seems to do it for free.
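
      A minimal sketch of the three steps described above (identify the structure, diff the structure, then diff the content inside it), using nested Python dicts as a stand-in for already-parsed blocks. The function name `tree_diff` and the `(op, path, value)` record format are made up for illustration; this is not the Java library mentioned here, and it does none of the sliding-window stream handling.

```python
def tree_diff(old, new, path=()):
    """Return a list of (op, path, value) changes that turn `old` into `new`.

    Only the differences are recorded, so an unchanged subtree costs nothing.
    """
    changes = []
    if isinstance(old, dict) and isinstance(new, dict):
        # Structural step: compare the keys (the "blocks") first.
        for key in sorted(old.keys() - new.keys()):
            changes.append(("remove", path + (key,), old[key]))
        for key in sorted(new.keys() - old.keys()):
            changes.append(("add", path + (key,), new[key]))
        # Content step: recurse into blocks present on both sides.
        for key in sorted(old.keys() & new.keys()):
            changes.extend(tree_diff(old[key], new[key], path + (key,)))
    elif old != new:
        # Leaf (or type change): keep both values so the change can be rolled back.
        changes.append(("change", path, (old, new)))
    return changes


old = {"eth0": {"ip": "10.0.0.1", "mtu": 1500}, "eth1": {"ip": "10.0.1.1"}}
new = {"eth0": {"ip": "10.0.0.2", "mtu": 1500}, "eth2": {"ip": "10.0.2.1"}}
for change in tree_diff(old, new):
    print(change)
```

      Each record keeps the removed or replaced value alongside the new one, which is one simple way to make a diff reversible from either the initial or the final state.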

    9. Re:Follow the money...? by Anonymous Coward · · Score: 0

      Are they opensourced? Are you going to contact the team to offer some sourcecode ideas?

    10. Re:Follow the money...? by RocketRabbit · · Score: 1

      I'm no doctor but I can tell that wound is infected.

    11. Re:Follow the money...? by bobaferret · · Score: 2

      LOL and that my friend is the hard part. It cost me $4000 in legal fees to make sure they are not owned by the company I work for, and 6 weeks of work. I'm leaning towards an AGPL/open core model. I just see so many people NOT happy with open core stuff. Also, I didn't get a grant from Google or the D.O.E. And these are just small, yet integral, parts of a larger system that I don't really want to give away yet. Hell, deciding on licensing is harder than coding sometimes. Gotta feed the family you know, while at the same time pay back the OSS world for all of the great stuff that I use every day for free. How to do both is a hard ethical question. It's easy to say just consult, or write a book. It's much harder to actually _do_ these things. Hell, it's hard enough just to open up your code to the world's criticisms. The only thing I know at this point is that it's not doing me or anyone else any good just sitting on it.

    12. Re:Follow the money...? by Anonymous Coward · · Score: 0

      Unless you're personally gaining from your code, it might as well not exist. If it's really useful to others, you'll make a name for yourself, and probably get ripped off too. So many libs are being illegally bundled into win GUI products, it's shocking. But if your code is genuinely decent enough for other professionals to use, you'll have something pretty amazing to put on your CV/resume.

      The fact your code may be awful and icky to other pro programmers doesn't actually matter, enhancements will be sent to you, and you'll gain experience from distributed development, further enhancing your skill set. Heck, you could even farm out *your* desires to developers who'll do it for free. This is the beauty of real open source. If you get really lucky, a big outfit will take it in and sponsor you, or even buy you out. Neither of which will happen while it's not available.

      The longer you wait, the more chance there is of someone else scratching the same itch and pulling the carpet from under you.

      Good luck, but don't dither!

    13. Re:Follow the money...? by bobaferret · · Score: 3, Interesting

      I wouldn't call it a cancer. But it's definitely useful if you don't ever want commercial companies to use your code in public. It matches up well with the open core model. Commercial people will only use it if you can give them a differently licensed copy of the code. Apache, MIT, and BSD are great if you truly want to give your code away and don't care what people do with it behind closed doors. AGPL is nice to make sure people always give back. LGPL and GPL are nice if you only want them to give back if they change it. Should people pay and how much is an age-old question. I have to balance the cost of support and development vs. the cost of the product. The more I lean on the community the less I can charge and the more exposure I get. While in the other direction I get more money, but have to spend more of it. And there is no one-size-fits-all solution to any of this.

    14. Re:Follow the money...? by bobaferret · · Score: 1

      A New Year's resolution it is then...

    15. Re:Follow the money...? by Anonymous Coward · · Score: 0

      So, if you had studied the basics of automata theory, you would not have had any problem parsing such a thing. Just implement a CFG parser. Or better, locate and take the code from gcc.

    16. Re:Follow the money...? by bobaferret · · Score: 1

      I had no idea it was that simple. I wonder why Google and D.O.E. are funding it then? Perhaps they should just ask Wolfram and be done with it.

    17. Re:Follow the money...? by PReDiToR · · Score: 1

      narcissism noun
      1. inordinate fascination with oneself; excessive self-love; vanity.
      2. Psychoanalysis. erotic gratification derived from admiration of one's own physical or mental attributes, being a normal condition at the infantile level of personality development.

      That, and some people read /. logged in and have their preferences set to not show .SIGs.

      --
      Do not meddle in the affairs of geeks for they are subtle and quick to anger
    18. Re:Follow the money...? by JigJag · · Score: 1

      I tell maybe offer a reason since I'm guilty of doing the same thing. See, I've been a very long time reader of Slashdot, posted many comments, even submitted a story once, but one day I decided that I had finally found the perfect nick for my slashdot activities, namely JigJag, and so I created an account (boy, was I happy JigJag was available).
      I have a pretty good grasp of the slashdot crowd and I know it's important to make a name for oneself, so that next time someone sees a post by JigJag, they'll say "oh yeah, I remember that dude". For that to happen, at first you need to get your name out there more visibly (also pick a rather unique sig. I recognize posters by their sig most of the time without looking at their nick).
      It's like advertisement. Get the name out several times in a short span and people will remember it. Once it's been out there long enough, I'll probably stop.

      JigJag

      ps: is it me or is there something odd with the moderation points? I've been registered for what, 3 months? and I've been given moderation points 3 times already.

      --
      "The hallmark of humanity is the ability to move beyond sensory inputs" - Mary Helen Immordino-Yang
    19. Re:Follow the money...? by JigJag · · Score: 1

      oops. "I tell maybe" is supposed to be "I can maybe". Fingers were on autopilot when typing the first part of the sentence I suppose.

      JigJag (for good measure)

      --
      "The hallmark of humanity is the ability to move beyond sensory inputs" - Mary Helen Immordino-Yang
    20. Re:Follow the money...? by Anonymous Coward · · Score: 0

      > So, my question is, why aren't more IT-heavy corporations/government departments not involved?

      Because every time something like that is needed, it gets written as a special perl or python script by local staff and gets saved as 'jimmygrep', and the need remains masked.

    21. Re:Follow the money...? by Anonymous Coward · · Score: 0

      Yeah, so what would you call the resulting program?

      They went with something like grep, because that's what it did. I'm sure patches are welcome.

  6. Interesting... by DangerOnTheRanger · · Score: 3, Interesting

    With these tools, you could make grep and diff work with binary files in a meaningful way - very useful at times. I bet you could even adapt the "Context-Free Grep" into a sort of packet sniffer with enough work. I'd sure like to try these new programs sometime.

  7. No download link? by roguegramma · · Score: 1

    I would have wished for a download link ..

    --
    Hey don't blame me, IANAB
  8. Link to one of their papers on these tools by treerex · · Score: 4, Informative
  9. Re:bad, wrong and stupid by interval1066 · · Score: 2

    Do we really need to improve on something that works already? A grep that handles binary formats might be nice, but I think I'd rather see this spun off into some kind of new tool or two, like an "extended" grep and diff, maybe. Maybe they're doing that.

    --
    Python: 'And then suddenly you have a language which says "we're all stuck with whatever the whiniest coder wants".'
  10. Almost vaporware by gmuslera · · Score: 1

    The grep is "in design process", the diff is "not released yet". And there should already be a lot of alternative tools to those two, some of which go after the same goal (e.g. mailgrep). I'm all for improving those two venerable tools, but the announcement looks a bit out of time or scale.

  11. Re:bad, wrong and stupid by Anonymous Coward · · Score: 0

    It's a new program. They're not replacing grep. They're not going to break into your house and apt-get remove grep. If the data you need to grep is broken into lines, keep using grep. If you'd rather manually sort through data that's not broken neatly into lines, feel free to do that. Personally, this has the potential to be a huge help for me, though it depends a lot on what's required to make the necessary library for a given data type.

  12. sgrep by SgtChaireBourne · · Score: 1

    There used to be a utility, sgrep, for searching SGML/XML.

    --
    Beta is broken and the link to classic doesn't work. Stop wasting our time or there won't be anybody left here.
  13. Object grep by Doc+Ruby · · Score: 1

    I'd like a grep tool that could scan XML data for instances of objects (according to some XSD or DTD), and take object state values as arguments to search objects for.

    If it could scan objects in memory I'd love that better, but XML seems the only likely candidate for a format that a universal tool would parse.

    --
    make install -not war

    1. Re:Object grep by atisss · · Score: 1

      XPath? XSLT?

    2. Re:Object grep by jd · · Score: 2

      XML is ok, but there are many data formats that could really use a diff/grep utility that could make sense of them. HDF5 and NetCDF are nice in the scientific community, for example. Computer graphics geeks might find intelligent diff/grep tools for the Renderman format to be useful. Office users might want to know if two documents are genuinely different or were compressed differently. Hell, it would be incredibly useful if they could diff a MS Office file and LibreOffice file in their native formats to see if they were logically the same even if syntactically represented differently.

      I'm sure that's the kind of thinking on the Google side. If you can equate two files (even if they aren't absolutely identical when in file form) and search in a file-format-independent way, then you can eliminate duplicate indexing and boost searching. An obvious place for that would be Google Docs, where the internal format used for a file isn't necessarily the format used by you on your machine.

      A truly universal tool's only relationship to XML would be to use XML to define how different file formats worked. This would mean you could have a dictionary of file formats and object representations, without having to link to a billion libraries or having to stuff the utilities full of different kinds of parser. You'd have a single parser that would really be no different from the modern diff and grep, plus a layer in front that used the file format descriptions to convert the inputs into a usable representation.

      If you wanted to stick to the Unix convention, and make this capability universal to all tools, you'd have a single file decompiler utility that used the dictionary to turn any stdin/file input into a standardized output and a single file compiler utility that could take the output of something like diff or grep and convert the representation back into a format that's meaningful with respect to the original file format. Hey presto, any problems Google or the DoE have are solved without having to alter any specific tool or create any compatibility issue.
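
      A rough sketch of the "decompiler" half of that idea: flatten a structured document into one `path = value` line per leaf, so that ordinary line-oriented grep and diff can work on the result. JSON is used here purely as an example input format, and `flatten` and its output syntax are invented for illustration; the reverse "compiler" step and the format dictionary are not shown.

```python
import json


def flatten(node, path="$"):
    """Yield one 'path = value' line per leaf of a parsed JSON document."""
    if isinstance(node, dict):
        # Sort keys so that key order in the file does not affect the output.
        for key in sorted(node):
            yield from flatten(node[key], f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from flatten(item, f"{path}[{index}]")
    else:
        yield f"{path} = {json.dumps(node)}"


# Two documents that differ only in formatting and key order.
doc_a = '{"host": "db1", "ports": [5432, 6432]}'
doc_b = '{\n  "ports": [5432, 6432],\n  "host": "db1"\n}'

lines_a = list(flatten(json.loads(doc_a)))
lines_b = list(flatten(json.loads(doc_b)))
print("\n".join(lines_a))
print("logically identical:", lines_a == lines_b)
```

      Because the two inputs normalise to the same lines, a plain diff of the flattened output would report no change even though the files differ byte for byte. Note that empty objects and arrays produce no lines in this sketch.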

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    3. Re:Object grep by Doc+Ruby · · Score: 1

      Those are tools for using to decode (or reencode) an XML doc, but they don't do any searching for content matches. An "object grep" like what I described could use them to parse XML, but they're not the grepper.

      --
      make install -not war

    4. Re:Object grep by atisss · · Score: 1

      No, those are not tools, those are standards.

      You can use xpath to find anything in xml (For example third object with property A=10 and property B>20).
      As for actual tool - use any DOM engine from any language where your data is kept.

      There are also command line tools for that: http://xmlstar.sourceforge.net/docs.php
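
      As a hedged illustration of that "third object with A=10 and B>20" query, here is a sketch using Python's standard `xml.etree.ElementTree`. That module only implements a subset of XPath (attribute equality, but no numeric comparison), so the `B > 20` test is done in Python; with a full XPath 1.0 engine the whole query could probably be written as something like `(//object[@A=10 and @B>20])[3]`. The element and attribute names are invented sample data.

```python
import xml.etree.ElementTree as ET

XML = """
<objects>
  <object id="a" A="10" B="5"/>
  <object id="b" A="10" B="25"/>
  <object id="c" A="7"  B="99"/>
  <object id="d" A="10" B="30"/>
  <object id="e" A="10" B="21"/>
</objects>
"""

root = ET.fromstring(XML)

# ElementTree understands the equality predicate [@A='10'] ...
candidates = root.findall(".//object[@A='10']")
# ... but not numeric comparisons, so B > 20 is checked in Python.
matches = [el for el in candidates if float(el.get("B", "nan")) > 20]

third = matches[2] if len(matches) >= 3 else None
print(third.get("id") if third is not None else "no third match")
```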

    5. Re:Object grep by Doc+Ruby · · Score: 1

      Standards are tools for programmers.

      A programmer can use XPath to find something in XML by writing a program that uses XPath. XPath and XSLT cannot be used to find an object in XML any more than iambic pentameter tells a story - you need to write a poem in it to do that.

      The cmdline tools you link to are actual examples of what I'm talking about. Thank you for that.

      --
      make install -not war

    6. Re:Object grep by Doc+Ruby · · Score: 1

      +1 Insightful.

      What I want is Linux cmdline tools that work on object pipelines the way Windows PowerShell does. It's got a grep that works on objects in the pipeline. I think Unix/Linux could do it better. Though maybe Android is a better environment for that kind of shell to thrive, instead of fighting the momentum/inertia in Unix/Linux shells all geared to unstructured data in pipelines. With Google I'm surprised they're investing anything in Unix instead of Android. Maybe they leverage the Unix developer community into an Android port.

      --
      make install -not war

    7. Re:Object grep by atisss · · Score: 1

      I'm just not confident that processing XML on the command line is such a good idea. If you need that for a single use, sure, but otherwise you would still need some application that supports DOM and scripting/transformations. Most browsers have a JavaScript console where you can easily do all of that searching/modifying. Anyway, you have to write code. That's also true for grep: generally you are using underpowered regular expressions without all the power they provide from an actual programming language. I guess command line XPath would be the same.

  14. RTFA? by DragonWriter · · Score: 4, Informative

    funded in part by Google and the U.S. Energy Department

    I wonder what's the interest of these two in this.

    FTFA:

    Google's interest in this technology springs from the company's efforts in cloud computing, where it must automate operations across a wide range of networking gear, Weaver said. The DOE foresees that this sort of software could play a vital role in smart grids, in which millions of energy consuming end-devices would have connectivity of some sort. The software would help "make sense of all the log files and the configurations of the power control networks," Weaver said.

    1. Re:RTFA? by Anonymous Coward · · Score: 0

      funded in part by Google and the U.S. Energy Department

      I wonder what's the interest of these two in this.

      FTFA:

      Google's interest in this technology springs from the company's efforts in cloud computing, where it must automate operations across a wide range of networking gear, Weaver said. The DOE foresees that this sort of software could play a vital role in smart grids, in which millions of energy consuming end-devices would have connectivity of some sort. The software would help "make sense of all the log files and the configurations of the power control networks," Weaver said.

      tl;dr

  15. Ooooh! by gstoddart · · Score: 3, Interesting

    As soon as I see "Context-Free Grep", I immediately think of a Context Free Grammar.

    That basically implies we can have much more sophisticated rules that match other structural elements the way a language compiler does. Which means that in theory you could do greps that take into account structures a little more complex than just a flat file.

    Grep and diff that can be made aware of the larger structure of documents potentially have a lot of uses. Anybody who has had to maintain structured config files across multiple environments has likely wished for this before.

    Sounds really cool.

    --
    Lost at C:>. Found at C.
    1. Re:Ooooh! by skids · · Score: 1

      It will be interesting to see what they come up with. From the paper posted above it looks like it will definitely be taking "wisdom" about certain file types, but I hope they also work on some fuzzy guessing modes as well that do not require prior knowledge of the language being parsed.

      The main potential for ick factor is whether they can manage to get a set of commandline flags that can be used/learned incrementally so you don't have to memorize a ream of flags just to get something useful done, and can learn a few new ones every time you need to push the envelope of what you already know. (BTW, Judging from the number of times I see a scripting language launched to do things grep can do perfectly well, most people stopped reading the manpage before the -B and -A options.)

    2. Re:Ooooh! by steelfood · · Score: 1

      One of the reasons context-free searches isn't more prevalent in everyday computing is the increase in complexity and thus computation resources needed to process it. If regular grammar was linear, then context-free is closer to linearithmic (n*log(n)).

      Regular expressions can handle multi-line searches as easily as they do single-line searches (some people above were saying how the multi-line aspect of this would be real useful). The line delimiter is merely a convenience.

      What they can't handle are nested searches where the search criteria involve attributes of the nesting. For example, while regular expressions can find items with or in between a particular tag, they cannot, say, find items x levels down, or determine which tags nested to the 5th level are not named "xyz".
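
      To make the "5th level, not named xyz" example concrete, here is a small sketch that tracks nesting depth explicitly, which is exactly the bookkeeping a regular expression cannot do. It leans on Python's `xml.etree.ElementTree` for the parsing; the helper name `tags_at_depth` and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def tags_at_depth(xml_text, depth, exclude):
    """Return names of elements nested exactly `depth` levels deep
    (root = level 1) whose tag is not `exclude`."""
    found = []
    stack = [(ET.fromstring(xml_text), 1)]
    while stack:
        element, level = stack.pop()
        if level == depth:
            if element.tag != exclude:
                found.append(element.tag)
            continue  # nothing below this element can be at the target level
        # Push children in reverse so they are visited in document order.
        for child in reversed(list(element)):
            stack.append((child, level + 1))
    return found


doc = "<a><b><c><d><xyz/><foo/><e><bar/></e></d></c><c><d><baz/></d></c></b></a>"
print(tags_at_depth(doc, 5, "xyz"))
```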

      --
      "If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be."
    3. Re:Ooooh! by jd · · Score: 1

      Agreed, but just as the best compilers are multi-stage and the best Unix tools are single-purpose and piped, I'd far prefer to see the "context-freeing" done outside of grep or diff. That way, it can be applied universally and uniformly. It'd also be easier to debug, since two simple tools are much easier to QA than one complex tool.

      --
      It's a small world and it smells funny; I'd buy another if it wasn't for the money; Take back what I paid (SoM)
    4. Re:Ooooh! by vAltyR · · Score: 1

      One of the reasons context-free searches isn't more prevalent in everyday computing is the increase in complexity and thus computation resources needed to process it. If regular grammar was linear, then context-free is closer to linearithmic (n*log(n)).

      Regular grammar searches are linear, or they are in a proper implementation. In addition, LALR parsers also run in linear time. There is an increased space requirement; regular grammars are equivalent to finite-state automata, and therefore require constant state, whereas a context-free grammar is equivalent to a pushdown automaton, or an FSA with an extra stack. I couldn't find any sources on the space requirements of an LALR or an LR parser, but I should probably note that a linearly-bounded automaton requires O(n) space and is more powerful than a pushdown automaton, so the space requirements are likely less than linear for an LALR parser.

    5. Re:Ooooh! by Anonymous Coward · · Score: 0

      I think the space requirement for LALR is O(n) since you might have to push the entire input stream onto the stack before reducing it all.

      And in practice, once you start building ASTs to actually do something with the input besides say, "yes it matched," the linear stack space requirement isn't that significant... you are going to tend to have O(n) AST nodes since you don't tend to have languages full of tokens that don't have any corresponding AST element. Each stack-reduce stage tends to just push state into the AST.
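
      The worst case described above can be seen with a toy stack-based parser for balanced parentheses. This is a hand-written sketch, not an LALR parser, and the words "shift" and "reduce" in the comments are used loosely; `parse_nested` is a name invented for the example. It reports the deepest the stack got and how many AST nodes were built.

```python
def parse_nested(text):
    """Parse balanced parentheses into nested lists.

    Returns (tree, max_stack_depth, node_count); raises ValueError on
    unbalanced input.
    """
    root = []
    stack = [root]          # the pushdown store a finite automaton lacks
    max_depth = 0
    nodes = 0
    for position, char in enumerate(text):
        if char == "(":
            node = []
            stack[-1].append(node)   # attach to the enclosing AST node
            stack.append(node)       # "shift": remember we are inside it
            nodes += 1
            max_depth = max(max_depth, len(stack) - 1)
        elif char == ")":
            if len(stack) == 1:
                raise ValueError(f"unmatched ')' at offset {position}")
            stack.pop()              # "reduce": the group is complete
        else:
            raise ValueError(f"unexpected {char!r} at offset {position}")
    if len(stack) != 1:
        raise ValueError("unclosed '('")
    return root, max_depth, nodes


n = 1000
_, depth, nodes = parse_nested("(" * n + ")" * n)   # worst case: all opens first
print(depth, nodes)
_, depth, nodes = parse_nested("()" * n)            # flat input: shallow stack
print(depth, nodes)
```

      On the fully nested input the stack grows with the input length, while on the flat input it stays at depth one; in both cases the number of AST nodes is proportional to the input, which is the point about the stack not dominating once a tree is being built.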

    6. Re:Ooooh! by CAIMLAS · · Score: 1

      Does this mean I will one day be able to make sense of postfix logs?

      --
      ~/ssh slashdot.org ssh: connect to host slashdot.org port 22: too many beers
  16. Microsoft Ad by lucm · · Score: 3, Interesting

    I know I'll be modded down, but I have to say it: what they describe is already available in Powershell, where objects can be piped in search/filter functions.

    --
    lucm, indeed.
    1. Re:Microsoft Ad by Anonymous Coward · · Score: 0

      But pipes should ALWAYS be plain text! SOMEONE THINK OF THE CHILDREN!

      etc

      etc

    2. Re:Microsoft Ad by Anonymous Coward · · Score: 0

      So what? Maybe people want a non-proprietary solution that works on more than one OS.

    3. Re:Microsoft Ad by Mars+Saxman · · Score: 1, Funny

      That's great for all fifty people who use Powershell.

    4. Re:Microsoft Ad by grcumb · · Score: 1

      I know I'll be modded down, but I have to say it: what they describe is already available in Powershell, where objects can be piped in search/filter functions.

      Sure, and it's been possible in Perl (for example) forever:

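      use strict;
      use warnings;
      use File::Slurp;

      # Join all command-line arguments into one (possibly multi-line) pattern.
      my $multi_line_pattern = join("", @ARGV);
      # Slurp the whole file into one string so a match can span lines.
      my $text = read_file( 'filename' ) ;
      if ($text =~ /$multi_line_pattern/){
          # do something useful.
      }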

      The only problem with the above is that it fails in anything other than trivial situations.

      The issue isn't passing things through filters, it's doing so in a way that you don't have to write insanely devious and complex filters. This grep tool is still only at the design stage, so I'll not speculate about whether it actually succeeds at this goal or not.

      --
      Crumb's Corollary: Never bring a knife to a bun fight.
    5. Re:Microsoft Ad by lucm · · Score: 1

      That's great for all fifty people who use Powershell.

      You might need to get a better "grep" on reality, it's not 2004 anymore.

      --
      lucm, indeed.
    6. Re:Microsoft Ad by Anonymous Coward · · Score: 0

      Ummm there is nothing to stop you from piping anything in any popular shell. Windows does stuff differently is all. With Windows, it's life, the universe and everything in a single _huge_ _bloated_ tool (like .net+powershell).

      In unix you have a bunch of small commands that you string together to do useful work. If a small command doesn't exist, it's trivial to add another.

      Not sure one way is inherently better than the other, but they are different ways of thinking. I prefer *nix to perform real work. you can have your love fest with the windows way.

    7. Re:Microsoft Ad by Anonymous Coward · · Score: 0

      Wake me up when powershit runs on something other than windohs.

    8. Re:Microsoft Ad by lucm · · Score: 1

      Wake me up when powershit runs on something other than windohs.

      "H4X0R DuD3" Bingo!

      --
      lucm, indeed.
    9. Re:Microsoft Ad by Anonymous Coward · · Score: 0

      The only things i hack into with regularity are your moms pussy and your sisters mouth. skank bitches.

    10. Re:Microsoft Ad by lucm · · Score: 1

      The only things i hack into with regularity are your moms pussy and your sisters mouth. skank bitches.

      Basement dweller Bingo!

      --
      lucm, indeed.
    11. Re:Microsoft Ad by Anonymous Coward · · Score: 0

      You might need to get a better "grep" on reality, it's not 2004 anymore.

      You mean there are less than fifty users now? I still don't see any calls for Powershell-experts. What's "power" for MS, is just a shell in real operating systems. Can't blame people for not getting all enthusiastic over such a thing.

  17. XPath? by Anonymous Coward · · Score: 0

    Many years ago, I abused the capabilities of Flex (the fast lexical analyzer generator) to instrument students' C++ code. I was actually adding reference-counting code to check for leaks (part of that assignment's grading rubric). I just had to parse the code into a nested tree of { } bracing and adjoining text, and then pattern-match on that tree to find class and method definition boundaries, where I inserted code. I think it only broke on one out of about fifty submissions, where I had to intervene and instrument the code by hand instead.
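
    The brace-nesting step described above can be sketched in a few lines. This is not the Flex-based tool, just an illustration of turning source text into a nested tree of { } blocks and adjoining text; `brace_tree` is a made-up name, and unlike a real lexer it does not skip braces inside string literals or comments.

```python
def brace_tree(source):
    """Split text into a nested list: strings for plain text, lists for {...} blocks."""
    root = []
    stack = [root]
    buffer = []

    def flush():
        # Emit any pending plain text into the block currently being built.
        if buffer:
            stack[-1].append("".join(buffer))
            buffer.clear()

    for char in source:
        if char == "{":
            flush()
            block = []
            stack[-1].append(block)
            stack.append(block)
        elif char == "}":
            flush()
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
        else:
            buffer.append(char)
    flush()
    if len(stack) != 1:
        raise ValueError("unclosed '{'")
    return root


code = "class Foo { void bar() { x(); } };"
print(brace_tree(code))
```

    Pattern-matching for class and method boundaries would then look at the text that immediately precedes each nested list.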

    Something like this could be done to handle XML since it is essentially little regular languages embedded in a well-formed tree of angle brackets and quoted strings. But I wouldn't bother with this, since XSLT and XPath exist for your problem...

    I haven't read the original paper for this slashdot discussion, but the idea of a grep-like and sed-like tool that could use context-free grammars rather than regular expressions is very interesting to me. The hard part will be making it concise enough to use from the command-line rather than an edit/compile sort of parser-generator experience.

    1. Re:XPath? by Doc+Ruby · · Score: 1

      Your tool sounds interesting. But what I'm talking about is for actual instances of objects, not source code - which would give only classes, not objects. I described objects serialized in XML, and possibly the actual binaries in RAM, decoded according to either XML DTD or maybe binaries according to source code (which seems very ambitious).

      --
      make install -not war

    2. Re:XPath? by Anonymous Coward · · Score: 0

      Actually, xsltproc is an existing command-line utility that can be used to search XML documents and output whatever you want. The core of XSLT is the use of XPath to address elements in an XML document tree via patterns such as parent/child relationships, tag names, attribute values, or element body text. One idiom in XSLT is to iterate over the set of matching elements and then print out some text which can include content extracted out of the matched element.

  18. Re:bad, wrong and stupid by Anne+Thwacks · · Score: 1
    They're not going to break into your house and apt-get remove grep

    Are you sure?

    They can probably do it remotely on most OSes anyway. Quick - make friends with Theo.

    --
    Sent from my ASR33 using ASCII
  19. They should call it... by goombah99 · · Score: 3, Insightful

    perl. Isn't this exactly why perl was invented?

    --
    Some drink at the fountain of knowledge. Others just gargle.
    1. Re:They should call it... by The+Askylist · · Score: 1

      I've always thought perl should be renamed SOS - Self-Obfuscating Scripting. But then again I prefer languages to be human-readable.

    2. Re:They should call it... by Anonymous Coward · · Score: 0

      Actually people just wanted a way to execute random line noise as if it was a program.

      That's also why programming over a bad terminal connection in perl can have disastrous consequences.

      You never quite know if that garbly-gook is line noise or the last 10 minutes of your work.

      Off the top of my head..... @#$%&JVJDV)@#MSDC)(FGDSG(DF)GSDFG(SDFG

      That's probably valid perl that compiles and computes something.

    3. Re:They should call it... by marcosdumay · · Score: 1

      Yes, also sed, and awk.

      They are still ages behind prolog, that will parse context dependent texts....

    4. Re:They should call it... by FractalParadox · · Score: 1

      Yes, next up will be awk so that it can take many sequential operations at once ... we'll call it sqawk.

    5. Re:They should call it... by Anomalyst · · Score: 1

      For limited definitions of human.

      --
      There is no right to feel safe thru security vaudeville at the expense of everyone's freedom, privacy and tax money.
    6. Re:They should call it... by 93+Escort+Wagon · · Score: 1

      perl. Isn't this exactly why perl was invented?

      perl - the tool for people who find sed and awk too straightforward.

      --
      #DeleteChrome
    7. Re:They should call it... by wdef · · Score: 1

      I know you're trying to be funny but that is so wrong. Once you try to do anything more complex with sed or awk than their most obvious uses it's reach for the code snippets web page.

  20. Re:bad, wrong and stupid by gstoddart · · Score: 2

    Do we really need to improve on something that works already?

    This would work, but better. No, I'm not being flippant.

    If you have structured data (say XML), you could target hierarchies like config-root:server-name:name. That way if the text inside "name" is only being looked for in that one field, you won't hit a bunch of other stuff that also happens to be similar strings but is unrelated.

    I'm sure you'd still have your regular grep/diff utilities, but there's definitely places where being able to match these strings in-context would be of value.

    Of course, someone is going to need to write a corresponding context-free sed (and maybe awk as well) to go along with the grep. But there's actually a lot of places where this would be a huge improvement in terms of certain kinds of automation.

    Use of a context-free grammar also lets this be insensitive to whitespace and newlines, so it would work on "prettified" HTML or stuff that's all formatted haphazardly. This is basically how those things are parsed now ... the grammar rules define the structure, and don't need it to be all perfectly laid out in order to be able to handle it.
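
    A small sketch of the config-root:server-name:name idea above, using Python's standard XML parser rather than the proposed tools (whose interface isn't known). The helper `grep_path` and the sample config are invented for illustration; the sample is deliberately laid out haphazardly to show that the match does not depend on line breaks or indentation.

```python
import re
import xml.etree.ElementTree as ET

CONFIG = """<config-root>
  <server-name><name>
        web01 </name><alias>db01</alias></server-name>
  <backup-target>
    <name>db01</name>
  </backup-target>
  <server-name>
    <name>db01</name>
  </server-name>
</config-root>"""


def grep_path(xml_text, path, pattern):
    """Return stripped text of elements at `path` whose text matches `pattern`."""
    root = ET.fromstring(xml_text)
    regex = re.compile(pattern)
    return [
        el.text.strip()
        for el in root.findall(path)
        if el.text and regex.search(el.text)
    ]


# A plain text search for "db01" would hit three places; this hits one.
print(grep_path(CONFIG, "./server-name/name", r"db\d+"))
```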

    --
    Lost at C:>. Found at C.
  21. darcs by bcrowell · · Score: 1

    There's a version-control system called darcs (written by the son of a colleague of mine) that incorporates some interesting ideas along these lines. For example, say you have a program with 100,000 lines of code, and there's a function in it called Foo, which is called thousands and thousands of times. You want to change the name of the function to Bar. In a traditional diff-based system, this results in thousands of differences. Darcs is supposed to be able to handle changes like this and recognize that it's only *one* change. It's also supposed to be able to handle the case where programmer A makes this change and checks it in, and then programmer B, who has simultaneously been doing lots of other work on the code, checks in his own changes -- with the old name for the function.

    1. Re:darcs by Anonymous Coward · · Score: 0

      Darcs is very smart about patch commutation and as a result it's the tops at cherry-picking patches, but it certainly doesn't operate at all like you describe. Changes to an identifier in a hundred lines is still a hundred lines changed in that one changeset.

  22. Rob Pike called, he wants his idea back by Anonymous Coward · · Score: 1

    People have been trying to adapt line-oriented regular expressions to handle other sorts of data since at least the 1980s. Structured regular expressions were introduced with the Plan 9 system, but never seem to have caught on elsewhere.

    It certainly would be nice to have tools that readily handle multi-line data, rather than forcing everything to fit into a line oriented format. It would be wonderful to be able to fix up indentation in version controlled files without making the history unreadable, for example.
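
    For the indentation case specifically, a crude approximation is already possible with a line-based diff if leading whitespace is stripped before comparing. The sketch below uses Python's `difflib`; the function name is made up, and it is obviously wrong for languages where indentation is significant.

```python
import difflib


def diff_ignoring_indent(old_text, new_text):
    """Unified diff of two texts, comparing lines with leading whitespace removed."""
    old_lines = [line.lstrip() for line in old_text.splitlines()]
    new_lines = [line.lstrip() for line in new_text.splitlines()]
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


old = "if (x) {\n  y = 1;\n  z = 2;\n}\n"
new = "if (x) {\n\ty = 1;\n\tz = 3;\n}\n"   # re-indented with tabs, one real change

for line in diff_ignoring_indent(old, new):
    print(line)
```

    Only the `z` line shows up as changed; the re-indented `y` line is treated as identical. A structure-aware diff would go further and recognise moved or re-wrapped blocks, which this does not attempt.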

  23. existing tools and suggestion by khipu · · Score: 1

    PCRE has recursive patterns (available as pcregrep) and .NET has balancing groups, also allowing grep-like operations involving context-free grammars. For XML data, there are various XML query languages that allow wonderfully complex queries over XML structures. There are also refactoring tools that allow syntax-aware searches across source files.

    For diff, the situation is a bit more complicated. There are XML-based diff tools, programming language syntax aware diff tools, and complex edit distance based diff tools already. It seems difficult to come up with something more generic. Let's say you want to diff programming language source files in languages for which there is no diff tools. What good is a context free diff tool going to be? You'd need to specify the entire grammar for the language.

    I think the most useful way these people could spend their time and money would be to port PCRE-style recursive patterns and .NET like balancing groups to more UNIX regular expression libraries (foremost, Python).

    1. Re:existing tools and suggestion by Tetsujin · · Score: 1

      There are XML-based diff tools, programming language syntax aware diff tools, and complex edit distance based diff tools already. It seems difficult to come up with something more generic. Let's say you want to diff programming language source files in languages for which there are no diff tools. What good is a context free diff tool going to be? You'd need to specify the entire grammar for the language.

      I don't know if you didn't read this bit, or if I've misunderstood your post... But the basic (proposed?) approach of bgrep and bdiff is to provide a plug-in mechanism that would be used to extend the tools to new languages and data types. So, yes, you would have to specify the entire grammar for the language, but you'd only do it once... Or preferably, someone would have already done it for you. :)

      --
      Bow-ties are cool.
    2. Re:existing tools and suggestion by khipu · · Score: 1

      The use cases, options, and interfaces are different for searching programming language source files, XML files, and other text. So, you really need at least three tools: bgrep-lang, bgrep-xml, and bgrep-text. Each of those might then have a plugin mechanism. But these three classes of tools already exist. Trying to force them into a single command line tool makes little sense to me.

      "bgrep-text" is just pcregrep.

      "bgrep-xml" is any one of a number of XML query and search tools, using XQuery or similar languages.

      "bgrep-lang" is any one of a number of language-aware search tools (often with tons of options that make no sense for text or xml).

      It's similar for diff.

    3. Re:existing tools and suggestion by Tetsujin · · Score: 1

      The use cases, options, and interfaces are different for searching programming language source files, XML files, and other text. ... Trying to force them into a single command line tool makes little sense to me.

      You make a good point, but I think a tool that forces the different data types into a single mould could still be useful, even if it can't provide all the functionality that a specialized tool would.

      --
      Bow-ties are cool.
    4. Re:existing tools and suggestion by khipu · · Score: 1

      Well, reading the paper, I think they are actually aiming for something more limited, and although theoretically you might be able to handle program source code, I doubt that this will be a good tool for it. However, what they are aiming for doesn't even seem to require full context free parsing. Their motivating example, "files divided into sections", can be handled trivially using small Python or Perl scripts (that's what Perl is for). The paper also seems pretty weak on prior work. Sorry to be so negative, but I really think there are better things they could spend their efforts on (I listed them before).

  24. Terrible idea by deblau · · Score: 4, Insightful

    This violates so many rules of the Unix philosophy that I don't even know where to begin...

    FTFA:

    Grep has issues with data blocks as well. "With regular expressions, you don't really have the ability to extract things that are nested arbitrarily deep," Weaver said.

    If your data structures are so complex that diff/grep won't cut it, they should probably be massaged into XML, in which case you can use XSLT off the shelf. It's already customizable to whatever data format you're working with.

    FTFA:

    With [operational data in block-like data structures], a tool such as diff "can be too low-level," Weaver said. "Diff doesn't really pay attention to the structure of the language you are trying to tell differences between." He has seen cases where diff reports that 10 changes have been made to a file, when in fact only two changes have been made, and the remaining data has simply been shifted around.

    No, 10 changes have been made. The fact that only two substantive changes have been made based on 10 edits is a subjective determination. That is, unless you want to detect that moving a block of code or data from one place to another in a file has no actual effect, in which case good luck because that's a domain-specific hard problem.
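
    To make the disagreement concrete, here is a small sketch using Python's difflib (a stand-in for diff, not diff itself) on a file where one two-line block has been moved. The sample lines are made up. A line-oriented comparison can only describe the move as an insertion in one place plus a deletion in another; whether that counts as one change or two is exactly the judgment call above.

```python
import difflib

old = ["block_a_1", "block_a_2", "block_b_1", "block_b_2"]
# Same content, but block b now comes before block a.
new = ["block_b_1", "block_b_2", "block_a_1", "block_a_2"]

def edit_tags(a, b):
    """Return the opcode tags difflib uses to turn list a into list b."""
    matcher = difflib.SequenceMatcher(None, a, b)
    return [tag for tag, *_ in matcher.get_opcodes()]

# The move shows up as one 'equal' run plus an 'insert' and a 'delete'.
print(edit_tags(old, new))
```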

    --
    This post expresses my opinion, not that of my employer. And yes, IAAL.
    1. Re:Terrible idea by Anonymous Coward · · Score: 0

      This violates so many rules of the Unix philosophy [wikipedia.org] that I don't even know where to begin...

      So? The Catholic Church arrested Galileo for subverting the Aristotelian philosophy that the sun and stars revolve around the earth. Revered authority can impede progress. Thompson, Ritchie, et al. did great things way back when, but they are not the final word on everything.

      If your data structures are so complex that diff/grep won't cut it, they should probably be massaged into XML, in which case you can use XSLT [wikipedia.org] off the shelf. It's already customizable to whatever data format you're working with.

      Suppose our "data structures" are source code written in a language such as Java or Python....?

    2. Re:Terrible idea by bussdriver · · Score: 1

      no big deal; these features can become a FLAG option on grep and diff and I wouldn't care-- it could be useful; I just use perl for most stuff and grep for only really simple stuff. Ok, I'd probably still use perl... it's my Swiss army pocket knife I use to hammer everything ;-)

    3. Re:Terrible idea by Tetsujin · · Score: 2

      This violates so many rules of the Unix philosophy that I don't even know where to begin...

      I'll take this on. It's a subject that is of particular interest to me.

      First of all, you have to consider whether it even matters that a tool violates "rules" of the "Unix philosophy". I mean, seriously, why assume that some system design ideas cooked up 30-40 years ago are necessarily the One True Path? Because "those who do not understand Unix are doomed to reinvent it poorly"? What if the designers in question do understand Unix? Or what if <gasp> they might actually have some ideas that surpass those of Doug McIlroy, ESR, K & R, and so on?

      Second, how does one account for tools like Perl? By many accounts it is one of the greatest Unix tools ever created. By combining the functionality and syntax of several useful tools, incorporating a rich regexp syntax, and binding it together with a general-purpose programming language, it can be a very versatile and effective tool. But it runs afoul of various "rules" as well: (I will use a star to mark the rules I don't particularly agree with)

      • "Write programs that do one thing and do it well" (Doug McIlroy's summary of the philosophy, first clause)
      • "Clarity is better than cleverness" (ESR, second rule* - I think there are times when it's worth having a compact notation with a difficult learning curve.)
      • "Design programs to be connected to other programs." (ESR, third rule - I would argue that Perl encompasses as much functionality as it can to avoid having to connect to other programs - to avoid outside dependencies, to eliminate the problem of communicating with other processes, and to stabilize and simplify the interface to that functionality.)
      • "Design for simplicity: add complexity only where you must" (ESR, fifth rule... Though it could be argued that this is exactly how the design of Perl evolved.)
      • "Programmer time is expensive; conserve it in preference to machine time." (ESR 13th rule - Perl runs afoul of this if you accept the idea that Perl code is particularly hard to maintain. A language with a clearer syntax would, presumably, conserve programmer time.)
      • "Use shell scripts to increase leverage and portability." (Gancarz, 7th rule. I would argue that Perl scripting exists largely as a way to avoid solving problems in the shell language.)

      Perl's biggest "violation", which it shares with other scripting languages, is that first one: "do one thing and do it well." Perl, Python, etc. are perfectly capable of doing a fork/exec or popen or loading a .so or whatever - but generally if there's a piece of functionality that people want to have in those languages, they re-implement it as a native library for those languages. Why do we accept so blatant a violation of what may be rightly considered the Unix philosophy? Because it works. It's useful. So a better question, then, is why is it that violating such an important "rule" is apparently necessary to create such a useful tool? There are various reasons: First, any reliance on an outside program is a maintenance issue. If your script is written for GNU find, for instance, and you move it to a system that has some other implementation of find, it may not work. Things can change from revision to revision as well. Second, it actually makes it easier to access the functionality, since you don't have to deal with writing out a stream of values and/or reading back a stream of results - when you call a Perl module, everything is neatly packaged into a (usually) synchronous call/result function interface, and presented as native Perl data.

      Perl could be a contentious example - but I chose it because to me, it and other scripting languages are examples of people bypassing the shell environment, rather than augmenting it. I would go so far a

      --
      Bow-ties are cool.
    4. Re:Terrible idea by sonamchauhan · · Score: 1

      diff only _estimates_ the changes made.

      Without access to user keystroke information, it cannot be sure what changed. It reverse engineers that information.

      There are XML diff tools available. But I don't see how an XML transformation language can serve as a diff tool out of the box.

    5. Re:Terrible idea by cynyr · · Score: 1

      Is perl part of busybox? no? okay then it isn't a core tool that I can expect to rely on. The same goes for this new grep, and diff. There are times I have had to recover using the distro's provided rescue initramfs.

      Also doesn't git already handle the "we rearranged the order of the functions in module" case cleanly?

      --
      All of the above was encrypted with a Quad ROT-13 method. Unauthorized decryption is in violation of the DMCA.
  25. Mod parent up by Lakitu · · Score: 0

    This man has a point -- these government-sponsored dumbification programs have obviously already worked on him. You could be next.

  26. Wait, why? by RandomMonkey · · Score: 0

    Why do we need to write another perl?

    1. Re:Wait, why? by DrVxD · · Score: 1

      Because the existing one isn't (quite) unreadable enough?

      --
      Not everything that can be measured matters; Not everything that matters can be measured.
    2. Re:Wait, why? by Anonymous Coward · · Score: 0

      Hehe, that is awesome. You are referring to the comments made by people about perl not being readable. I have to agree, it is a bit unreadable if you don't understand unix or perl.

  27. the perl man page by goombah99 · · Score: 2

    From the header of 1988 perl man page:

    Submitted-by: Larry Wall
    Posting-number: Volume 13, Issue 1
    Archive-name: perl/part01

    [ Perl is kind of designed to make awk and sed semi-obsolete. This posting
          will include the first 10 patches after the main source. The following
          description is lifted from Larry's manpage. --r$ ]

          Perl is a interpreted language optimized for scanning arbitrary text
          files, extracting information from those text files, and printing
          reports based on that information. It's also a good language for many
          system management tasks. The language is intended to be practical
          (easy to use, efficient, complete) rather than beautiful (tiny,
          elegant, minimal). It combines (in the author's opinion, anyway) some
          of the best features of C, sed, awk, and sh, so people familiar with
          those languages should have little difficulty with it. (Language
          historians will also note some vestiges of csh, Pascal, and even
          BASIC-PLUS.) Expression syntax corresponds quite closely to C
          expression syntax. If you have a problem that would ordinarily use sed
          or awk or sh, but it exceeds their capabilities or must run a little
          faster, and you don't want to write the silly thing in C, then perl may
          be for you. There are also translators to turn your sed and awk
          scripts into perl scripts.

    --
    Some drink at the fountain of knowledge. Others just gargle.
    1. Re:the perl man page by mbkennel · · Score: 1

      Submitted-by: Larry Wall
      Posting-number: Volume 13, Issue 1
      Archive-name: perl/part01

      [ Perl is kind of designed to make awk and sed semi-

      Stop!!!

      You had me with 'kind of designed'.

  28. Structural Regular Expressions by vAltyR · · Score: 2

    This reminds me of a paper Rob Pike wrote a while back addressing this problem. His solution was a generalization of regular expressions, which he termed Structural Regular Expressions. I'm not sure how these stack up against context-free grammars, but it's an interesting approach that seems at least fairly similar to the Dartmouth work. In any case, I didn't see it as a reference, so I thought I'd mention it.

  29. Subject line is not part of the comment by Tetsujin · · Score: 1

    They should call it... perl. Isn't this exactly why perl was invented?

    Perl could do this - with the right libraries. But that's the real value they're adding here. They created tools that operate on files with knowledge of the structure of those files. So for instance a "diff" between two XML files with identical contents but differences in formatting could report that the files are identical... Or if you had some file structure that defined a directed-graph structure, a format meant to be edited in-place (and which therefore might sometimes have holes in it where data was removed - or which might have data presented in a different order depending on the sequence of operations used to store it) - the "diff" tool would decode the files, examining the data structure they're meant to represent - and show the differences in that.

    Obviously it could be done in Perl - but it wouldn't be a one-liner unless you had those libraries which translate the particular file format into the desired level of abstraction.
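
    A minimal sketch of the XML case, in Python rather than Perl and using only the standard library: two documents are parsed and their trees compared, so indentation and attribute order stop mattering. The helper name same_xml and the rule that surrounding whitespace is "just formatting" are assumptions for the example, not how bdiff is specified.

```python
import xml.etree.ElementTree as ET

def same_xml(a, b):
    """Compare two XML strings by structure, ignoring indentation.

    Assumption for this sketch: whitespace around text is formatting,
    and attribute order is irrelevant.
    """
    def norm(s):
        return (s or "").strip()

    def same(x, y):
        return (
            x.tag == y.tag
            and x.attrib == y.attrib
            and norm(x.text) == norm(y.text)
            and norm(x.tail) == norm(y.tail)
            and len(x) == len(y)
            and all(same(cx, cy) for cx, cy in zip(x, y))
        )

    return same(ET.fromstring(a), ET.fromstring(b))
```

    The parser is the "library which translates the file format into the desired level of abstraction"; once that exists, the comparison itself is only a few lines.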

    --
    Bow-ties are cool.
    1. Re:Subject line is not part of the comment by JanneM · · Score: 1

      I haven't read the story (I mean, obviously) but could you do a diff between two source files and it uses GCC to determine if the corresponding ASTs are the same? And, in my dreams, do a diff on the same file with two different compilers to see where their interpretation differs?
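
      For what it's worth, the first half of that idea is easy to try in a language that exposes its own parser. The sketch below uses Python's ast module as a stand-in for GCC (so it compares Python sources, not C); same_ast is a made-up name. It says nothing about the second half, comparing how two different compilers interpret the same file.

```python
import ast

def same_ast(src_a, src_b):
    """True if two Python sources parse to the same syntax tree.

    ast.dump() leaves out line and column numbers by default, so layout
    and comments do not affect the comparison.
    """
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))
```

      Reformatting, extra parentheses, and comments all compare equal; swapping operands or renaming a variable does not.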

      --
      Trust the Computer. The Computer is your friend.
  30. Oh, to suffer the slings and arrows... by Tetsujin · · Score: 1

    I know I'll be modded down

    Dude, the only part of your post that I find objectionable is this assumption that you're going to be crucified for posting your thoughts. I know that there are some people on Slashdot who are pretty predictably triggered to shout down certain opinions - just don't assume that everyone here is like that, OK?

    I think there's a lot to like about Powershell, and part of me will always be a bit jealous that Windows got a shell with those kinds of capabilities before Linux did. It does indeed seem that what they describe bgrep and bdiff doing could be accomplished in Powershell. I've never been too clear on some of the particulars of how that would be done, though. As I understand it, you can search/filter either XML data streams, or a sequence of .NET objects. Would the way to accomplish this in .NET, then, be to have a commandlet that opens the source file and passes them through as .NET objects? It would be a bit less compact than having the special type handling right in the "find" or "filter" command but it does lend a certain clarity to things, too...

    --
    Bow-ties are cool.
    1. Re:Oh, to suffer the slings and arrows... by lucm · · Score: 1

      I think there's a lot to like about Powershell, and part of me will always be a bit jealous that Windows got a shell with those kinds of capabilities before Linux did. It does indeed seem that what they describe bgrep and bdiff doing could be accomplished in Powershell. I've never been too clear on some of the particulars of how that would be done, though. As I understand it, you can search/filter either XML data streams, or a sequence of .NET objects. Would the way to accomplish this in .NET, then, be to have a commandlet that opens the source file and passes them through as .NET objects? It would be a bit less compact than having the special type handling right in the "find" or "filter" command but it does lend a certain clarity to things, too...

      Powershell and .Net can collaborate both ways. As an example, many recent Microsoft products are using .Net for the GUI but in the backend all the actual work is done in Powershell. The opposite is true - one can plug a .Net component in a Powershell script, as an example to do a custom filter.

      The best example for the PS pipe model is with the VMWare extensions (PowerCLI). You can get a full inventory by writing a script like this:
      Get-VMHost | Get-VM | Export-Csv c:\myinventory.txt

      In the CSV you get all the properties of each VM in the host. And if you don't want all the properties, you can put a filter between the Get-VM and the Export-Csv to match the exact output you want. This is done by an extensive use of reflection.

      --
      lucm, indeed.
    2. Re:Oh, to suffer the slings and arrows... by benjymouse · · Score: 1

      As I understand it, you can search/filter either XML data streams, or a sequence of .NET objects. Would the way to accomplish this in .NET, then, be to have a commandlet that opens the source file and passes them through as .NET objects?

      Indeed. The "extended grep" in PowerShell is the Where-Object (aliases where and ?). It works on a stream of objects; objects in PowerShell's extended type system actually being a superset of .NET objects which can also wrap WMI and COM objects.

      The common way in PowerShell to "grep" using where is indeed to read the source as objects. For example to read from an XML document you would cast it to the built-in [xml] type (which is really a wrapper around the .NET System.Xml.XmlDocument type) and then pass select nodes through the where cmdlet.

      My sig is an example of a PowerShell command which reads slashdot RSS feed, parses it as xml, selects the "items" xml nodes, filters away the biased postings by kdawson and finally displays the description property in a list.

      --
      Reading slashdot one-liner: (irm http://rss.slashdot.org/Slashdot/slashdot).rdf.item | fl title,desc*
    3. Re:Oh, to suffer the slings and arrows... by Qzukk · · Score: 1

      In the CSV you get all the properties of each VM in the host.

      What do you get printed on the screen if you remove the "Export-Csv"? binary gibberish, or just text?

      On Linux, it's trivial to detect whether your output is to a terminal or to a file, and likely possible to detect that your output is a pipe. Likewise, if people really cared, they could produce a set of programs that output "objects" to pipes and text to the screen. The key here being that even if you declared a standard form for your objects (for instance, bencode) and other people used your standard, you'd need a pretty complete set of programs to deal with it.
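
      A small sketch of that detection, assuming a POSIX system and using only Python's standard library. The function name output_kind is invented for the example.

```python
import os
import stat

def output_kind(fd):
    """Classify a file descriptor as 'terminal', 'pipe', 'file' or 'other'."""
    if os.isatty(fd):
        return "terminal"
    mode = os.fstat(fd).st_mode
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "file"
    return "other"
```

      A program could call output_kind(sys.stdout.fileno()) at startup and pick text for a terminal or a serialized object format for a pipe. The detection is the easy part; agreeing on the object format is the hard part.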

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    4. Re:Oh, to suffer the slings and arrows... by lucm · · Score: 1

      In the CSV you get all the properties of each VM in the host.

      What do you get printed on the screen if you remove the "Export-Csv"? binary gibberish, or just text?

      On Linux, it's trivial to detect whether your output is to a terminal or to a file, and likely possible to detect that your output is a pipe. Likewise, if people really cared, they could produce a set of programs that output "objects" to pipes and text to the screen. The key here being that even if you declared a standard form for your objects (for instance, bencode) and other people used your standard, you'd need a pretty complete set of programs to deal with it.

      The default output is a table format (just like ls -l on linux), but there are many options. Export-Csv is a command that will create a csv file, it is not per se an output option.

      --
      lucm, indeed.
  31. Powershell envy by Tetsujin · · Score: 1

    So what? Maybe people want a non-proprietary solution that works on more than one OS.

    If there are such people, and it's not just me, I'd love to oblige them. :) I really need to get crackin'...

    --
    Bow-ties are cool.
  32. Perl to wa chigau no da yo! Perl to wa! by Tetsujin · · Score: 1

    Why do we need to write another perl?

    Is it really "writing another perl"? The meat of these tools (which, I think, aren't yet implemented?) is that they filter and compare parsed data structures - and provide plug-in hooks so people can insert parsers for additional data types. Certainly this could be done as a Perl library - and doing so might have some advantages over creating new tools with their own plug-in mechanism. But implementing bgrep and bdiff is nowhere close to "writing another perl".

    --
    Bow-ties are cool.
  33. TXR! by Kaz+Kylheku · · Score: 1

    I'm also working on a text processing tool that deals with blocks of data, and it is already here.

    http://www.nongnu.org/txr

  34. 10 changes have been made - Disagree by roguegramma · · Score: 1

    Suppose you signal the nesting level by indentation, as most programmers today do.

    If you add a condition around some code, then for example 3 lines might be indented, resulting in 5 lines being altered instead of the 2 which actually have changed.

    For this, the proposed improved grep and diff might be good, at least better than the current state of diff. Okay, maybe I'm neglecting the -b flag, but the -b flag might be a problem if you code in whitespace or so ;-)
    http://en.wikipedia.org/wiki/Whitespace_(programming_language)
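
    The "5 lines altered instead of 2" effect can be reproduced with a short Python sketch. It uses difflib from the standard library, not diff, and simply strips leading whitespace before comparing, which is an assumption that is only safe where indentation carries no meaning. The sample lines and the name changed_lines are invented.

```python
import difflib

def changed_lines(old, new, ignore_indent=False):
    """Return the added/removed lines between two lists of lines.

    With ignore_indent=True, leading whitespace is stripped before
    comparing, so a pure re-indentation is not reported.
    """
    if ignore_indent:
        old = [line.lstrip() for line in old]
        new = [line.lstrip() for line in new]
    return [
        line for line in difflib.ndiff(old, new)
        if line.startswith(("+ ", "- "))
    ]

old = ["a();", "b();", "c();"]
new = ["if (ready) {", "    a();", "    b();", "    c();", "}"]
```

    With ignore_indent=True only the two lines of the new condition are reported; without it, every line shows up as changed.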

    The appropriate way to deal with this would be to convert all program code to intermediate language, including comments and if available assertions, and to only check this code into the versioning system.

    On check out, the code would be transformed according to either an agreed-on formatting, or even a different formatting for everyone.

    tags:programming languages versioning systems patents prior art

    --
    Hey don't blame me, IANAB
  35. Re:Perl to wa chigau no da yo! Perl to wa! by RandomMonkey · · Score: 0

    Right, but what is there not already a parser for in CPAN? And if you are handy with perl, what kind of comparison is difficult?

  36. Perl by wdef · · Score: 2

    Perl can context grep any ****ing thing any which way from Sunday. Much easier and more powerful than awk.

  37. binary logs by Anonymous Coward · · Score: 0

    Just in time for RedHat's move to binary logging...

  38. Re:Perl to wa chigau no da yo! Perl to wa! by Tetsujin · · Score: 1

    Right, but what is there not already a parser for in CPAN? And if you are handy with perl, what kind of comparison is difficult?

    I couldn't say, honestly. :) So write a Perl script that recognizes the input file types, chooses the correct module, implements some kind of matching rule syntax, and performs the comparison with whatever module you chose in step 2, and a plugin system so people can add more file types without modifying your script, and yes, you've pretty much got bgrep.

    I think at that point you're beyond "Perl's capabilities" and into the realm of "capabilities of things you can implement in Perl".

    --
    Bow-ties are cool.
  39. Re:Great idea by b4dc0d3r · · Score: 1

    So you would rather convert your data into XML instead of having a tool do it for you? That's pretty much the point of this, having a tool to do the work for you. Maybe it will even work by converting it to XML and using XSLT. But the data definitions will help everyone who uses it instead of everyone rolling their own.

    FTFA

    For each new type of data structure, a vendor would provide a pattern library identifying the basic structure of the data, which the software would then use to "extract the constructs of interest from the document," Weaver said.

    I have a binary file problem right now, I built a parser to convert it to text, and I can see the differences easily that way. Many files are exact duplicates in a different format (like a JPEG saved as a GIF and BMP), and many others are only slightly different (think French or Spanish converted to the English alphabet where the accents are lost). Weeding through the files is a lot easier, and if I could define the format and have the tool do it for me I would not have had to "roll my own".

    That is, unless you want to detect that moving a block of code or data from one place to another in a file has no actual effect, in which case good luck because that's a domain-specific hard problem.

    If your format definition says ordering is important, like for a programming language, that would be 10 edits. I can think of piles of examples, po translation files would be one, where the order doesn't matter. If someone sorts the list to be able to compare if one file is missing phrases, I don't care, I only want to see what's new and different. The format definition would say that order is not important.
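
    A minimal sketch of an order-insensitive comparison, in Python. The simple "key = value" layout is an assumption chosen to keep the example short; real .po files use msgid/msgstr pairs and need a proper parser. The name unordered_diff is made up.

```python
def unordered_diff(old_text, new_text):
    """Compare two 'key = value' files while ignoring entry order."""
    def parse(text):
        entries = {}
        for line in text.splitlines():
            if "=" in line:
                key, value = line.split("=", 1)
                entries[key.strip()] = value.strip()
        return entries

    old, new = parse(old_text), parse(new_text)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            k for k in old.keys() & new.keys() if old[k] != new[k]
        ),
    }
```

    Sorting or shuffling the entries produces an empty report, while a new or altered phrase still shows up.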

  40. no, there is another by mbkennel · · Score: 1

      the one letter per word algorithm, e.g First Unix Command Creator, and its obviously improved successor bj.

  41. USENIX Conference Videos by luk3Z · · Score: 0
    --
    Recipes for USA bankrupt - http://tinypaste.com/0d66f dd = dollar deluge (printed in the infinity)
  42. No, it'll be by Anonymous Coward · · Score: 0

    ...And when Apple forks their own version, it'll be objective grep.

    iGrep.

  43. Grep Paragraph option by Anonymous Coward · · Score: 0

    There is an already existing option in the AIX version of grep.

    -p[Separator] Displays the entire paragraph containing matched lines. Paragraphs are delimited by paragraph separators, as specified by the Separator parameter, which are patterns in the same form as the search pattern. Lines containing the paragraph separators are used only as separators; they are never included in the output. The default paragraph separator is a blank line.

    http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds2/grep.htm

    By default, -p separates the stanzas and outputs them separated by lines of the same character. (Very useful.) I miss that option in Linux or other Unix flavors.
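
    For systems without that option, the behaviour described in the quoted text can be approximated in a few lines of Python. This is a sketch based only on that description, not a reimplementation of AIX grep; the name grep_paragraphs is made up.

```python
import re

def grep_paragraphs(text, pattern, separator=r"^\s*$"):
    """Return each paragraph that has at least one line matching pattern.

    Separator lines delimit paragraphs and are never part of the output;
    the default separator is a blank line.
    """
    paragraphs, current = [], []
    for line in text.splitlines():
        if re.search(separator, line):
            if current:
                paragraphs.append(current)
            current = []
        else:
            current.append(line)
    if current:
        paragraphs.append(current)
    return [
        "\n".join(p) for p in paragraphs
        if any(re.search(pattern, line) for line in p)
    ]
```

    Fed stanza-style output such as a list of network interfaces, it returns the whole stanza for each interface whose lines match.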

    VladTepes

  44. BGrep & BDiff by Anonymous Coward · · Score: 0

    For anyone actually looking for the poster information, it can be found here: http://www.cs.dartmouth.edu/~gweave01/grepDiff/index.html