Slashdot Mirror


Preview Of New Beagle Search UI

An anonymous reader writes "The new Beagle Search UI was merged into Beagle CVS last week, after being developed as a separate module known as 'Holmes'. A preview is now online with plenty of screenshots. It currently doesn't look as smooth or well integrated as Spotlight, but it does look promising and it is still in a very early stage."

36 comments

  1. Another step forward by helmutvs · · Score: 1, Informative

    This looks like another (needed) step towards making Linux ready for the average computer user.

    --
    There are no uninteresting things. There are only uninterested people.
  2. Eyecandy by Eightyford · · Score: 1, Redundant

    Eyecandy, that is not.

  3. Hmm, I Wonder... by hzs202 · · Score: 1

    if it has web-search features will it give Slashdot servers a problem loading thousands of pages like Google Desktop does?

  4. the spotlight interface is horrible by John Nowak · · Score: 2, Informative

    Anything should be better. I'm not going to get into details here, but if anyone has actually used it, they'll know how limited and clunky it is. John Siracusa outlined the issues well in the Ars write-up on Tiger.

    1. Re:the spotlight interface is horrible by node 3 · · Score: 2, Insightful

      the spotlight interface is horrible

      Agreed, in the same vein as "all OS's suck, just that 'x' sucks less". Taken in the context of all available desktop search systems, Spotlight is pretty good.

      Anything should be better.

      I wouldn't bet on it. It's a really hard thing to get right. We're currently at the bear-skins and stone tools stage of full desktop search. Elegance in design takes time.

      John Siracusa outlined the issues well

      No he didn't. He critiqued Spotlight well. There's a huge difference. A Beagle developer cannot just use Spotlight for a while, read Apple's technical documentation, then read Siracusa's review and create a better Spotlight.

      His reviews are very good. They tell you how things work, how they don't work, how they are inconsistent, and how they don't match his dogmatic ideals. What his reviews do not do is provide solutions for any of the bigger problems. They are reactive, not creative.

      For example, he has this huge thing for a spatial Finder. A spatial Finder was very usable in the day of 800K floppies and 20MB hard drives. Today, the spatial consistency of the Finder is not as important as before. He provides no solution other than to keep the Finder spatial, as before. The current NeXT-style Finder is a good stop-gap as we transition into huge hard drives with hundreds of thousands of files. The iTunes, iPhoto and Mail interfaces are very useful for their specific data types, but Spotlight is what's needed to bring the modern Finder to be as usable with today's requirements as the old Finder was back then.

      So sure, compared to the "ideal", Spotlight sucks, but all desktop search systems suck. Spotlight just sucks a lot less. And Beagle (I'm glad it exists, and look forward to using it on Linux) is pretty sure to suck, probably less than Google Desktop and Windows Desktop Search suck, but will certainly suck more than Spotlight does. It's just the most rational set of expectations to hold.

    2. Re:the spotlight interface is horrible by Anonymous Coward · · Score: 0

      Mod parent up, it's the only really insightful comment on this story.

  5. Mono by MoogMan · · Score: 0, Troll

    Unfortunately, it uses MONO, which as we all know is a Bad Thing(tm).

  6. Interesting... by mellon · · Score: 4, Informative

    I'm working on a somewhat more flexible search tool for Qt/KDE right now. I'll put up some screenshots in a few minutes - I'd be interested in some insightful comments about it.

    1. Re:Interesting... by mellon · · Score: 1

      Screenshots are up. I realized after posting it that the above comment sounds like stealing thunder from Beagle - sorry about that. The Beagle stuff looks cool; my reason for putting this comment up was (a) to show that there is more that can be done with a search UI than merely what spotlight does, and (b) because the small subset of searchers who need the additional functionality provided in gofer and not beagle might see this and try it out or comment on it.

    2. Re:Interesting... by filipncs · · Score: 1

      You're not using indexing? That is kinda the selling point of Spotlight, Beagle, Google Desktop, etc.

      (or rather, the selling point is that the speed of searching is so fast that it's actually useful as a way of navigating your files)

    3. Re:Interesting... by mellon · · Score: 2, Informative

      Excessive reliance on indexing renders searching virtually useless. If you want to use searching to keep track of your files, you need to search quickly. Generally speaking, an indexed search will either find too many matches, or too few, because you can't do intelligent string matching with an indexed search - you can only search for indexed words, and of necessity the index can't store word relationships.

      For example, an index will have a pointer in it that says "acrobatic" occurs in files a, b, c, d, e and f. And the word "wombat" occurs in files c, e, k, l and m. So if you want to find the phrase "acrobatic wombat", which only occurs in file c, you're going to get either too long a list, or possibly no list at all.

      If the words you're searching for are rare, you can use an index to speed up the search by winnowing the list of files based on the search. However, if the words are common, chances are that most files will match, and so the time spent winnowing the file list via the index will not make any noticeable difference in the search speed.

      My point is not that indexing is wrong, but that it can be (and almost always is) abused to produce results that aren't very helpful. Spotlight, for example, while very cool in the abstract, almost always fails to find what I need, or finds so many things that although the thing I needed was on the list, I didn't save time by using spotlight to find it.
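
      As a rough sketch of the word-level index described in this comment: the dictionary layout and the function names below are illustrative assumptions, not a claim about how Spotlight or Beagle store their indexes. It uses the file names from the "acrobatic wombat" example and shows that an index of single words can only combine posting lists; it cannot tell which file holds the two words next to each other.

```python
# Toy inverted index: each word maps to the set of files that contain it.
# The file names come from the example in the comment; the structure is assumed.
index = {
    "acrobatic": {"a", "b", "c", "d", "e", "f"},
    "wombat": {"c", "e", "k", "l", "m"},
}


def files_with_all(index, words):
    """Files whose index entries include every query word (AND)."""
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings) if postings else set()


def files_with_any(index, words):
    """Files whose index entries include at least one query word (OR)."""
    return set().union(*(index.get(word, set()) for word in words))


query = ["acrobatic", "wombat"]
print(sorted(files_with_all(index, query)))  # ['c', 'e']
print(sorted(files_with_any(index, query)))  # ['a', 'b', 'c', 'd', 'e', 'f', 'k', 'l', 'm']
```

      With this data the AND query still returns file e, which contains both words but not the phrase, and the OR query returns nine files.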

    4. Re:Interesting... by filipncs · · Score: 1

      I've never used spotlight, and beagle only very little, but I've had extremely good results with google desktop. My primary use is searching pdf-files (whenever I want to save a web page I print it with pdfcreator). I can't really recognise the problems you're describing.

    5. Re:Interesting... by mellon · · Score: 1

      For example, an index will have a pointer in it that says "acrobatic" occurs in files a, b, c, d, e and f. And the word "wombat" occurs in files c, e, k, l and m. So if you want to find the phrase "acrobatic wombat", which only occurs in file c, you're going to get either too long a list, or possibly no list at all.

      Oops, I edited my point away here. In the above example, the index would actually do admirably, because both acrobatic and wombat appear in the index, and the only files that contain both are c and e, so the list shrinks to two files, one of which (file c) contains the phrase "acrobatic wombat."

      However, consider a pile of text that contains a lot of very similar phrases. There are seventeen papers on acrobats, fifteen on wombats, and twenty-five that talk about both acrobats and wombats, but only one of these papers talks about wombats who are acrobats. You want to be able to find just the one that talks about wombats who are acrobats. In an indexed search, you're probably going to find either twenty-five or fifty-seven papers in total. In a search that winnows from the index and does a full text search on what remains, you will find a single paper. So the latter is useful, and the former is useless.

      Furthermore, what if you don't know the exact phrase? Maybe it's "wombats who are acrobats" and maybe it's "acrobatic wombats." You want the full-text portion of the search to find either. So if your search string is to match "acrobat wombat", you may get zero matches if the phrase in the paper is "wombats who are acrobats."
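
      One way to make the full-text step tolerant of word order and word endings is a proximity check, sketched below. This is only an illustration under stated assumptions: the function name, the five-word window, and the use of prefix matching as a crude stand-in for stemming are all hypothetical, and nothing here describes how gofer itself matches text.

```python
import re


def near_match(text, stems, window=5):
    """True if every stem begins some word inside one span of `window` words.

    Word order is ignored, so "wombats who are acrobats" and
    "acrobatic wombats" both satisfy the stems ["acrobat", "wombat"].
    """
    words = re.findall(r"[a-z]+", text.lower())
    for start in range(len(words)):
        span = words[start:start + window]
        if all(any(word.startswith(stem) for word in span) for stem in stems):
            return True
    return False


stems = ["acrobat", "wombat"]
print(near_match("A study of wombats who are acrobats.", stems))      # True
print(near_match("Acrobatic wombats are rare.", stems))               # True
print(near_match("Acrobats toured the city while the zoo fed its wombats.", stems))  # False
```

      The third text mentions both animals but too far apart to fall inside one window, so it is rejected even though a word-level index would list it.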

      Now imagine that you are searching an entire library, and there are three books that would make good source material. And the word acrobat appears in ten thousand of the books in the library, and the word wombat appears in another ten thousand. Say the library contains a hundred thousand books. Searching a hundred thousand books on a computer takes only a few minutes at most, even if you do a full text search. But if you get five thousand matches, you might as well not have searched, because you're still going to have to go through five thousand synopses looking for the three books that contain what you want (and you may risk missing it, if the book doesn't mention acrobatic wombats in the synopsis).

      Adding an index to a search system that can find the three books will speed up the searches. But you want the thing to find the three books before you worry about indexing, because it takes a lot longer to manually search five thousand books for a particular phrase than it does to do a full-text computer search on a hundred thousand books. Indexing before you get the search right is putting optimization before functionality.
    6. Re:Interesting... by mellon · · Score: 1

      I suspect Google does something a bit smarter than the naive indexing scheme I'm talking about. They've historically been very smart about searching. You'll get a better idea of why Google doesn't do well, though, if you try to find something relatively obscure on Google for which all the keywords you can think of are common. Google may return ten thousand matches, and the one you need could be anywhere on the list. If you have a relatively constrained source base, and you're searching for things that you know about (e.g., an email message you can recall, but you don't remember where you filed it), you can construct a good search for that relatively easily, either by restricting the location of the search or by remembering the name of the person who wrote it, for example.

      However, if you are searching a domain whose contents you don't actually know, this is much harder.

    7. Re:Interesting... by filipncs · · Score: 1

      How long do your searches usually take, say if you let it search your entire home folder? Are we talking ms, s, minutes?

    8. Re:Interesting... by mellon · · Score: 1

      Hm. First of all, let me just be clear here in saying that I am not saying that indexing is bad, or that Beagle is bad, or that you should use my search engine instead of Beagle. I just tried a search of a ten gigabyte mail folder. It took eight minutes. This is on a modern laptop with a laptop drive, but still, that's 2.2 megabytes per second, which is pretty pathetic. I would hope to get a bit closer to the drive's native speed with some optimization, of which I have so far done none - not even so basic an optimization as making sure that I'm reading ahead, and that I'm not double-copying data, much less any CPU pipeline-based optimizations.

      Once those optimizations are done, some indexing might make sense, as long as it doesn't produce any false negatives. But that's really key - it has to produce no false negatives. Otherwise, a very precise search simply won't find anything. And it's difficult to produce no false negatives from an index. Consider if the search string includes the word "the". The index probably doesn't contain "the", because it's so common that it would appear in _any_ English document. There's no point in indexing "the". But if you don't, you get a false negative during the winnowing process. So you have to notice that "the" is a thing that's not indexed, and not use it to exclude any files. And then, potentially, you're left with the whole list anyway. So your indexing/winnowing has to be smart. Which is precisely why I'm doing it last.
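
      The rule about unindexed words can be sketched as follows. The stop-word list, the index layout, and the function name are assumptions made for illustration; the point is only that a word missing from the index by design must never be used to exclude a file.

```python
# Assumed: words deliberately left out of the index because they are too common.
STOP_WORDS = {"the", "a", "of"}


def winnow(index, all_files, query_words):
    """Narrow all_files using only the query words the index actually covers."""
    candidates = set(all_files)
    for word in query_words:
        if word in STOP_WORDS:
            continue  # not indexed, so its absence from the index proves nothing
        candidates &= index.get(word, set())
    return candidates


index = {"wombat": {"c", "e"}}
all_files = {"a", "c", "e"}

print(sorted(winnow(index, all_files, ["the", "wombat"])))  # ['c', 'e']
print(sorted(winnow(index, all_files, ["the"])))            # ['a', 'c', 'e']
```

      The second query shows the case described above: when every query word is unindexed, winnowing leaves the whole file list and the full-text pass has to do all the work.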

      Don't look for gofer as a replacement for spotlight on your kde desktop next week. Do look for it as a textual research tool next week.

    9. Re:Interesting... by Hanji · · Score: 1

      I would hope to get a bit closer to the drive's native speed with some optimization

      You'll never get particularly close to the drive's native speed like that. Hard drives (and the HD is gonna be your big bottleneck when you're searching more data than can fit in the block cache) are really, really good at burst reads of lots of consecutive pieces of data. Your 10GB mail folder, which presumably has thousands of files in it, is probably spread all over the disk, even if individual files are pretty unfragmented (many filesystems even *try* to do this to spread data around, to reduce individual file fragmentation).

      An index, on the other hand, will all be stored in one or two files, which on a decent filesystem will be pretty much contiguous, and you can burst-read it all into RAM, and scan it scary-fast.

      --
      A Minesweeper clone that doesn't suck
    10. Re:Interesting... by dthulson · · Score: 1

      What if you used the index to create a shorter list of candidate files? e.g. instead of looking at all 10 gigs of files for "acrobatic wombat", find the 30 or whatever files that contain "acrobatic" or "wombat" using the index and only do a full-text search of that? I guess it gets more complicated if you allow wildcards (something like "acrobat* wombat"), but I think it's worth thinking about...
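
      A minimal sketch of that idea, including the wildcard case, might look like this. The index layout, the in-memory `texts` mapping standing in for file contents, and both function names are assumptions for illustration; wildcards are expanded against the indexed vocabulary with Python's standard `fnmatch` module.

```python
import fnmatch


def candidates(index, patterns):
    """Files containing at least one indexed word that matches any pattern."""
    found = set()
    for pattern in patterns:
        for word in fnmatch.filter(list(index), pattern):
            found |= index[word]
    return found


def search(index, texts, patterns, phrase):
    """Run the full-text phrase check only on the files the index suggests."""
    return sorted(name for name in candidates(index, patterns)
                  if phrase in texts[name].lower())


# Hypothetical data: file name -> contents, plus a word-level index over them.
texts = {
    "a": "The acrobatic troupe arrived.",
    "c": "An acrobatic wombat escaped.",
    "k": "The wombat slept.",
}
index = {"acrobatic": {"a", "c"}, "wombat": {"c", "k"}, "troupe": {"a"}}

print(search(index, texts, ["acrobat*", "wombat"], "acrobatic wombat"))  # ['c']
```

      Only the candidate files are ever read in full, so the cost of the slow pass depends on how selective the index terms are rather than on the size of the whole collection.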

    11. Re:Interesting... by mellon · · Score: 1

      That's what I mean by "winnowing."

  7. Don't be surprised if... by Anonymous Coward · · Score: 0

    If you use this, don't be surprised if your UI crashes somewhere on Mars and is never seen again!

    1. Re:Don't be surprised if... by Anonymous Coward · · Score: 0

      Not to worry, I have two other desktop search tools which do a similar job and have managed to continue operating far longer than I expected!

  8. Not As Well Integrated!? by nathanh · · Score: 3, Interesting

    Is the submitter on crack? Beagle is equally or perhaps more integrated than Spotlight.

    To launch the Beagle search UI is a single keypress: F12. On Spotlight it's a double keystroke: command spacebar. Advantage: Beagle.

    Both Beagle and Spotlight have a single icon in the main panel that you can click for a search UI or to set preferences. Advantage: equal

    Both Beagle and Spotlight have a single search field that you can type into, hit enter, and see the results in the main window. Advantage: equal.

    Clicking on a search result in either Beagle or Spotlight will launch the appropriate application for that document. Advantage: equal.

    Beagle has helpers for mail, web pages, text documents, spreadsheets, image files, audio files, instant messaging, etc. Spotlight does not have the same breadth of helpers. Advantage: Beagle.

    Beagle is integrated with inotify which means it is aware of file changes as soon as they occur. The very latest versions of OS X can do the same thing for Spotlight. Advantage: equal.

    Beagle metadata is stored in the ext3 filesystem, associated with the file, so when you move the file the metadata moves with it. Beagle also provides a legacy database for filesystems that don't support file metadata. OS X does not provide a legacy database so you can't store metadata for files on filesystems such as found on removable drives. Advantage: Beagle.

    Neither Beagle nor Spotlight is integrated with any applications other than the Finder or the Finder equivalent. Some OS X applications give the illusion that they have Spotlight functionality by using the same magnifying glass icon. In fact, they are using a separate metadata database and their own search routines. Advantage: equal.

    Beagle looks ugly and Spotlight looks ugly. However, Spotlight is the less ugly of the two, though it fails a number of human interface design rules. Advantage: you decide.

    Spotlight has been rammed down everybody's throat when it's blindingly obvious that it was rushed for Tiger. Beagle is still an optional feature on most distros. Advantage: you decide.

    1. Re:Not As Well Integrated!? by cpt kangarooski · · Score: 1

      To launch the Beagle search UI is a single keypress: F12. On Spotlight it's a double keystroke: command spacebar. Advantage: Beagle.

      That's a simplistic analysis. Since users will generally type in their search terms, their hands will be in the main section of the keyboard. That means that it is easier to press Cmd-Space than F12, since the former is located closer to where their hands are. In fact, it's easy to do with one thumb, since the two keys are next to one another. To hit F12 on most keyboards, you can't just reach over, you have to move your hand over.

      --
      -- This and all my posts are in the public domain. I am a lawyer. I am not your lawyer, and this is not legal advice.
    2. Re:Not As Well Integrated!? by Anonymous Coward · · Score: 0

      To launch the Beagle search UI is a single keypress: F12. On Spotlight it's a double keystroke: command spacebar. Advantage: Beagle.

      That's a pretty arbitrarily-judged "advantage", especially considering that the Spotlight shortcut keys can be easily customised via the Spotlight Preference Pane.

    3. Re:Not As Well Integrated!? by ThinkingInBinary · · Score: 1
      OS X does not provide a legacy database so you can't store metadata for files on filesystems such as found on removable drives.

      Are you sure? Every time anyone with a Mac goes anywhere near a folder of mine on a PC network share or disk, it spatters files called ".DS_Store" on them, and for many files it creates a file named "._filename". Aren't these used for generic metadata storage? If not, what are they for?

    4. Re:Not As Well Integrated!? by Anonymous Coward · · Score: 0

      Beagle has helpers for mail, web pages, text documents, spreadsheets, image files, audio files, instant messaging, etc. Spotlight does not have the same breadth of helpers.

      I have to disagree - have you tried checking here?

    5. Re:Not As Well Integrated!? by CableModemSniper · · Score: 1

      .DS_Store is for Finder-type stuff such as icon positioning. ._filename is for when the file has an old-style Mac OS resource fork. Neither contains the metadata used by Spotlight.

      --
      Why not fork?
    6. Re:Not As Well Integrated!? by ThinkingInBinary · · Score: 1

      Ah, I see. Perhaps Apple should use it for Spotlight as well.

    7. Re:Not As Well Integrated!? by CableModemSniper · · Score: 1

      Well, it's been a while since I read the Ars dissection, but the metadata attributes that Spotlight uses are stored in the filesystem metadata of HFS+ (à la BeOS). The "problem" is that Spotlight doesn't search the files directly, but rather the index. Which is OK, because Tiger has kernel hooks akin to inotify, so the index is updated whenever you move, delete, copy, or modify a file or its metadata.

      --
      Why not fork?
  9. Is it not pretty? by karlto · · Score: 1

    A number of comments on the page linked claimed that the screen shots are ugly, with bad fonts etc. To me, they look fine - antialiased and the whole thing is a lot nicer than any Windows XP screen I've seen. It's clean and doesn't distract me.

    Do people really think this is ugly? Why?

  10. Desktop Search for KDE by WhiteFoxBR · · Score: 2

    If you use KDE and are looking for a desktop search application you should try Kat

  11. Who needs search? Beagle already found! by Myself · · Score: 1

    I thought the "Beagle search" had already been completed.

    Really, how much more time does this issue need? :)

  12. Wow by Makarakalax · · Score: 1

    Wow, apparently I'm not "anyone" because I find it great. In fact when I first used it I thought, "anyone who doesn't like this must be an effeminate moron!". So you're an effeminate moron? *scans posting history* Yep.

  13. Linux is Closing in on OS X, I think... by grouchofan · · Score: 1

    I wrote a series of articles on my web site a little while ago comparing Red Hat Linux FC4 to Mac OS X 10.4. While OS X has the advantage in a number of areas, I believe Linux has it in several others. Spotlight is one of those tools Apple makes a big deal about, but which Linux and open source have replicated with relative ease. Beagle is one good example of this. Other search tools noted in the replies here are good as well. I was a Mac advocate until just before the release of Mac OS X. Apple lost me there. I couldn't see the point of paying a premium price for hardware to run an OS that was (aside from eye candy) not much different or better than Linux (which is free and runs on far cheaper and more common hardware). I think OS X may be the best thing that's ever happened to Linux. It's helped the average user see the value of a UNIX-like desktop while taking major vendors like Adobe and Macromedia a step closer to supporting Linux on Intel (since OS X on Intel isn't THAT far-removed from Linux on Intel from a development standpoint).