Search Engine For Coders to Launch
karvind writes "According to Wired, 'Krugle' is set to launch next month. The search engine indexes programming code and documentation from open-source repositories like SourceForge, and includes corporate sites for programmers like the Sun Developer Network. The index will contain between 3 and 5 terabytes of code by the time the engine launches in March. According to the article, Krugle also contains intelligence to help it parse code and to differentiate programming languages, so a PHP developer could search for a website-registration system written in PHP simply by typing 'PHP registration system.'" Update: 02/17 21:04 GMT by Z : Summary edited for accuracy.
This sounds like a new company, not a product of Google.
Some other interesting features above and beyond simple searching could be:
- merge with semantic web work to be able to search on higher level concepts (e.g. if I type "bubble sort" it returns all bubble sorting code even if it doesn't explicitly say "bubble sort" anywhere).
- "community" features that allow developers to leave comments on code (no, not comments _in_ code, but on code, similar to epinions et al).
- if this index is available via an API like the main Google index, then people could do things like build automated lint-type tools.
- code chain. If I search for some code, then it'd be nice to be able to then peruse that code's hierarchy within the search engine (vs having to download it or CVS over to it).
...till Microsoft, SAP, SCO (remember them?) etc start polluting this repository with proprietary code?
Jaysyn
There is a war going on for your mind.
...to people that won't look at them.
I have thought about this a lot because I have some detailed plans for implementing a superficially similar system. I have looked at a list of similar existing sites, like Koders, CodeFetch, jdocs, etc. I haven't looked at Krugle yet because they only grant access to people that they think will help them in their extensive pre-launch publicity campaign. Krugle-related announcements, all with basically the same rehashed non-information, have appeared all over the internet (Digg, Infoworld, numerous blogs). Whoever runs Krugle's marketing program should get a raise. This is basically "PR 2.0" and we will, unfortunately, be subjected to it by many companies from now on.
AFAICT, the existing sites like this that are trying to make money seem to base everything on the idea that they can get programmers to click advertisements in the search results. But, there is no group in the world that is better at ignoring online advertisements than (open-source) programmers. Plus, some of the ads are really ridiculous. For example, on jdocs.com, the first ads I noticed were for _illegal street racing videos_ (no joke!). On the other hand, some of these sites have been around for a while, so perhaps the advertising model works better than I think.
Having said _that_, they still have to compete with Google. Just like Google has google.com/linux, they could easily add google.com/code. Even the normal Google search is pretty effective at finding code (I mean, SEO companies rarely use keywords like GetNextFileName or SwingUtilities.invokeLater).
So, these code-search companies would have to have major value-added features. To be honest, what I've heard about Krugle makes me think that they have yet to come up with such compelling features. And, if they make you sign up to access them, then the value of their advertised features decreases significantly.
In fact, it will be interesting to see if anybody can come up with the features that I think would compel people to use and even _pay_ to use such a system. I can think of several such features that I would pay for access to. But, I am not sure that it is profitable for anybody to sell them to me at a price I will pay. Plus, there are quite a few political roadblocks to implementing them.
OK, that code exits dirty at EOF. Needs a small modification to pass exit status of read back through function return to main while loop:
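ReadFields () {
    local IFS=:    # split each input line on colons
    Host="#"       # prime the loop so at least one line gets read
    # Skip comment lines (first character "#"). The quoting keeps a blank
    # line (empty Host) from producing expr and test syntax errors.
    while [ "$(expr substr "$Host" 1 1)" = "#" ]
    do
        # At EOF, read fails and its exit status is returned to the caller.
        read Host Key Interval Excludes Keep || return
    done
}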
Still ugly and inelegant, in my opinion, but at least it seems to work... and it has an explicit EOF return now, which is probably a good thing.
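For anyone wondering how the EOF return is meant to be used, here is a rough sketch of a main while loop driving the function. The sample lines, the process_config wrapper and the echo format are all made up for illustration; only ReadFields and its five field names come from the post above, and the colon-separated layout is assumed from them. It relies on bash (for local) and on an expr that supports substr, such as the GNU one.

```shell
#!/bin/bash
# Sketch only: the input data and process_config are hypothetical.

ReadFields () {
    local IFS=:
    Host="#"
    # Skip comment lines; hand read's failure status back at EOF.
    while [ "$(expr substr "$Host" 1 1)" = "#" ]
    do
        read Host Key Interval Excludes Keep || return
    done
}

# Main loop: runs once per non-comment line and stops cleanly when
# ReadFields returns non-zero at end of input.
process_config () {
    while ReadFields
    do
        echo "host=$Host interval=$Interval"
    done
}

output=$(printf '%s\n' \
    '# comment line' \
    'alpha:k1:60:none:7' \
    '# another comment' \
    'beta:k2:30:tmp:3' | process_config)

echo "$output"
# prints:
# host=alpha interval=60
# host=beta interval=30
```

Because the function returns read's status, the loop condition is all the EOF handling the caller needs.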