Slashdot Mirror


Finding New Code

tabandmountaindew writes "Too much time is wasted re-implementing code that someone else has already written, for the sole reason that it's faster than finding the existing code. Previous source code search engines, such as Google Code Search and Krugle, only considered individual files on their own, leading to poor-quality results and making them useful only when the time to re-implement was extremely high. According to a recent Newsforge article, a fledgling source-code search engine, All The Code, is aiming to change all of this. By looking at code not just on its own but also at how it is used, it is able to return more relevant results. This seems like just what we need to unify the open-source community, leading to an actual common repository of unique code, and ending the cycle of unnecessary reimplementing."

9 of 158 comments

  1. I call bullshit on this by analog_line · · Score: 5, Interesting

    I'm not a coder, but my impression of the vast majority of coders is that they reinvent the wheel because they believe that everyone screwed up their wheel implementation and if no one is going to do it right, they should.

    1. Re:I call bullshit on this by introp · · Score: 5, Insightful

      There's an old adage in the racing business: if you're building a parade float, buy your wheels. If you're racing a 300 kph Formula One car, consider building your own. If you place very high demands on a component because it is at the core of what you do, the stock component may not be good enough.

      So, in software, if the "wheel" is at the core of your product, you may have to re-invent it to get exactly what you need. This is not because everyone else screwed it up, just that the stock "wheel" serves 90% of the features for cheap. Look at the Mac iPhone's OS, Cisco's move to (cheaper, memory-wise) VxWorks on the WRT54, etc.

    2. Re:I call bullshit on this by j00r0m4nc3r · · Score: 4, Insightful

      I think a system like this might work if it has a user feedback system, where particular authors get a good reputation by positive feedback. So you know that an implementation is probably good if that author has good ratings. Think about system libraries. Nobody (well, mostly nobody) writes their own implementation of system libs, because they trust (usually) the implementation provided by the OS or the compiler. Why do they trust? Why is a Microsoft routine more trustworthy than S00p3rC0d3r's? Just because those functions have been tested by users over and over again. And they're (usually) well-documented.
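
      A minimal sketch of how such a feedback system could feed into result ranking. The smoothing formula, the function names, and the author names are assumptions made for illustration; nothing here describes how All The Code actually works. Each author's share of positive feedback is pulled toward a neutral prior, so one lucky vote does not outrank a long track record.

```python
def author_score(positive, total, prior_mean=0.5, prior_weight=10):
    """Smoothed share of positive feedback (assumed formula, not from the site).

    With little feedback the score stays near prior_mean; with a lot of
    feedback it approaches positive / total.
    """
    return (prior_mean * prior_weight + positive) / (prior_weight + total)


def rank_results(results, feedback):
    """Sort (snippet_name, author) pairs by the author's smoothed score.

    feedback maps author -> (positive_votes, total_votes); authors with no
    feedback get the neutral prior.
    """
    def key(item):
        _, author = item
        positive, total = feedback.get(author, (0, 0))
        return author_score(positive, total)

    return sorted(results, key=key, reverse=True)


# Hypothetical data: one well-tested author, one with a single vote,
# and one nobody has rated yet.
feedback = {"veteran": (90, 100), "newcomer": (1, 1)}
results = [("quicksort_a", "unrated"), ("quicksort_b", "newcomer"), ("quicksort_c", "veteran")]
ranked = rank_results(results, feedback)
```

      Here the author with 90 positive votes out of 100 ranks above the author with a single positive vote, which matches the point about trust coming from repeated testing by users.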

    3. Re:I call bullshit on this by NotQuiteReal · · Score: 4, Funny

      I often Push on the Pull doors, just to see if they work.

      Often the Push/Pull sign is just some control freak placing arbitrary rules on things. So what if you clock a little old lady on the other side once in a while.

      Freedom to swing both ways has its price!

      --
      This issue is a bit more complicated than you think.

    4. Re:I call bullshit on this by Lazerf4rt · · Score: 4, Insightful

      I don't think people are giving programmers enough credit for having common sense, and this project to reduce code re-implementation sounds pretty idealistic. I don't know if I call bullshit on it, but I smell a few flawed assumptions.

      First flawed assumption seems to be that the hard part in re-using code is simply to find it. But that's crap. When code is in the form where it can even be re-used, it's called a module, and a big chunk of this code is the module's interface. The interface is what lets you re-use it. But there are huge differences between interfaces. There are different calling conventions, different parameter orderings, different limitations on thread re-entry, different restrictions on the order in which you're allowed to call things, and entirely different approaches to specifying the interface. A streaming library can have a public function Read() or it can have a pair of public functions BeginRead()/EndRead(), and there are many valid reasons for both cases.

      Point being, you often have to refactor a module's interface before you can fit it into your project, and depending on the size and purpose of that module, the refactoring might be just as slow as writing a new implementation.
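
      To make the Read() versus BeginRead()/EndRead() mismatch concrete, here is a small sketch in Python. ChunkedSource, ReadAdapter, and checksum are hypothetical names invented for this example, not any real library. It shows only the easy case, where a thin adapter is enough; differences in threading rules or call ordering are where the real refactoring cost comes from.

```python
import io


class ChunkedSource:
    """Hypothetical third-party streaming module with a two-phase interface."""

    def __init__(self, data: bytes):
        self._stream = io.BytesIO(data)
        self._pending = None  # size requested by begin_read, if any

    def begin_read(self, size: int) -> None:
        if self._pending is not None:
            raise RuntimeError("previous read not finished")
        self._pending = size

    def end_read(self) -> bytes:
        if self._pending is None:
            raise RuntimeError("end_read called without begin_read")
        size, self._pending = self._pending, None
        return self._stream.read(size)


class ReadAdapter:
    """Wraps the two-phase interface so callers can use a single read()."""

    def __init__(self, source: ChunkedSource):
        self._source = source

    def read(self, size: int) -> bytes:
        self._source.begin_read(size)
        return self._source.end_read()


def checksum(reader, block_size: int = 4) -> int:
    """Project code written against read(); sums all bytes in the stream."""
    total = 0
    while True:
        block = reader.read(block_size)
        if not block:
            break
        total += sum(block)
    return total


# The project code works with a plain stream and, via the adapter,
# with the two-phase module as well.
direct = checksum(io.BytesIO(b"abc"))
adapted = checksum(ReadAdapter(ChunkedSource(b"abc")))
```

      Even in this toy case the adapter has to know the module's ordering rule (begin before end, never two begins in a row), which is exactly the kind of detail a source-code search result does not tell you.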

      Second flawed assumption is that developers aren't even able to find the code they need today. But that's crap too, because there are a lot of great, reusable libraries that programmers already commonly know about, or can easily find through Google or Sourceforge. First of all, the standard C/C++ libraries give you a lot. Then there's zlib, curl, glut, Allegro, and hundreds (thousands, depending on your standards) more, depending on what you're doing.

      Come to think of it, when you really want to re-use code, you look for libraries, not source code. Searching for source code mainly helps people who want to learn programming.

      I know that the site linked in the summary only contains Java code right now, and I'm mainly focusing on C/C++ here but I think a lot of what I said applies. (Don't tell me that Java's automatic turning of each class into a monitor solves the thread re-entry problem, because that really just substitutes one problem for another.)

  2. Three things that make this article suspicious by knightmad · · Score: 4, Interesting

    1) "Java Only for now, more coming soon!"
    2) "Alpha"
    3) The linked article is a "product announcement" on Newsforge

    This is a slashvertisement for a vaporware product. Although it looks promising, there is nothing concrete there to justify calling it "what we need to unify the open-source community", or even an alternative to Google codesearch.

    Btw, is alpha the new beta?

  3. Are we really making it better for us, or worse? by hacker · · Score: 4, Insightful

    If we create this grand, uber code-searching portal, which can search the context of the code, aren't we making it easier for commercial entities to go ahead and pick and choose those bits of code to use in their products, knowing full well that they're going to violate the GPL (or other OSS licensing models) by doing so?

    I've talked to NO LESS THAN a dozen commercial companies in the last 2-3 years where they're actively taking FOSS source and incorporating it into their products, because... (and I quote) "...It's freeware, so we can use it however we wish."

    The licensing differences between "freeware" and "free software" seem to escape them. Just google around and you'll see thousands of FOSS projects listed on sites like TUCOWS, download.com and others, as "freeware" and not the proper "free software" that they are. There are also people who think "free software" means just that (lowercase "F" there).

    Let's be sure that if we have a search engine that lets brainless developers look like experts by cutting and pasting bits of OSS code from here and there together to make their software work, they know what the license is and that they must be in compliance with it to use the code.

    Please?

  4. Doesn't work by Timesprout · · Score: 4, Funny

    I just ran a search for "the 500,000 lines of code I need to finish by friday all the stupid extra features the PHB wanted after we had set a deadline based on the original spec".

    0 results, rather disappointingly.

    --
    Do not try to read the dupe, that's impossible. Instead, only try to realize the truth
    What truth?
    There is no dupe

  5. Licensing by maxwell+demon · · Score: 5, Insightful

    I think in order to be really useful for not reinventing the wheel, it should allow intelligent searching by license. That is, it should let you restrict your search to code under certain licenses, or even better, to code under a license compatible with any given license (or set of licenses).

    For example, if you are working on code which you want to release as BSD, it's not much help if you find code licensed under the GPL, even if that code on its own is great. Likewise, if you are writing GPLed code, you are not interested in code under licenses incompatible with the GPL (e.g. the MPL).

    Of course, the search engine cannot make a guarantee that the license will fit your needs, but then, it cannot guarantee that about the code's functionality either.
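
    The license-aware search described above could be sketched roughly like this. The compatibility table is a deliberately tiny illustration built from the examples in this comment plus a couple of assumed entries; it is not a complete or authoritative statement of license compatibility, and Result and filter_by_license are made-up names.

```python
from dataclasses import dataclass

# Illustrative, deliberately incomplete table: for each license a project is
# released under, the licenses of third-party code it can incorporate.
# These entries are assumptions for the sketch, not legal advice.
CAN_INCORPORATE = {
    "BSD": {"BSD", "MIT"},
    "GPL": {"GPL", "LGPL", "BSD", "MIT"},
}


@dataclass(frozen=True)
class Result:
    """One hypothetical search hit: a piece of code and its license."""
    name: str
    license: str


def filter_by_license(results, project_license):
    """Keep only hits whose license the project can incorporate.

    Unknown project licenses yield no hits, on the theory that the
    engine should not guess at compatibility.
    """
    allowed = CAN_INCORPORATE.get(project_license, set())
    return [r for r in results if r.license in allowed]


# Hypothetical search hits.
hits = [
    Result("fast_sort", "GPL"),
    Result("tiny_json", "BSD"),
    Result("xml_kit", "MPL"),
]
for_bsd_project = filter_by_license(hits, "BSD")
for_gpl_project = filter_by_license(hits, "GPL")
```

    With this table, a BSD project sees only the BSD hit, while a GPL project sees the GPL and BSD hits but not the MPL one, mirroring the two examples above.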

    --
    The Tao of math: The numbers you can count are not the real numbers.